Tocopherol cyclase
The present invention is directed to the full-length amino acid and nucleic acid sequences of proteins having tocopherol cyclase activities from different organisms and their use for different purposes, including but not limited to the biotechnological production of vitamin E.
Vitamin E biosynthesis involves a series of enzyme-catalyzed reactions, starting with p-hydroxyphenylpyruvate and geranylgeranyl pyrophosphate. The enzymes involved include p-hydroxyphenylpyruvate dioxygenase, phytyl transferase, tocopherol cyclase, and γ-tocopherol methyltransferase, leading to RRR-α-tocopherol. Tocopherols are synthesized in photosynthetic organisms, i.e., in higher plants, algae and cyanobacteria.
The identification and cloning of the enzymes involved in vitamin E biosynthesis is highly desirable. Biosynthetic enzymes offer a number of advantages over chemical production:
1) Since enzymes are or can be engineered to be highly specific and stereoselective, it is possible to produce a given positional and stereoisomer of interest. Therefore, in the case of vitamin E, through the combination of selected wild-type or engineered enzymes, it will be possible to synthesize, by biological or biotechnological means, either RRR-α-tocopherol, RRR-γ-tocopherol, or any other vitamin E compound.
2) Since vitamin E compounds are rather hydrophobic, organic solvents have to be used in their chemical synthesis. Biotechnological production may help to reduce the requirement for organic solvents.
3) Specific accumulation of increased amounts of vitamin E in transgenic plants through overexpression of selected enzymes of the vitamin E pathway is an attractive way to produce healthier plant-derived feed and food for both animals and humans.
The only enzyme in the biosynthetic pathway to RRR-α-tocopherol for which no full-length DNA or amino acid sequence is yet available is tocopherol cyclase (2,3-dimethylphytylquinol cyclase). EP 531 639 discloses the purification and partial amino acid sequence determination of tocopherol cyclase from the green alga Chlorella protothecoides. However, despite extensive efforts, knowledge of the five partial amino acid sequences did not allow cloning of the tocopherol cyclase gene.
In the course of the present invention the purification and maintenance of proteins having tocopherol cyclase activity could be improved to get better results in a tocopherol cyclase activity assay.
Activity of enzymes having tocopherol cyclase activity was improved by increasing the potassium buffer concentration up to 100 mM.
Protein extracts were obtained from Anabaena variabilis, but are also obtainable from any photosynthetic organism, such as higher plants, algae and cyanobacteria. After a first purification by anion exchange chromatography, nine active fractions were analyzed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) (Example 2c). Tryptic in-gel digestion and subsequent fragmentation by tandem mass spectrometry revealed 24 peptide sequences (SEQ ID NOs: 1-24). The corresponding full-length protein (SEQ ID NOs: 25-31) and DNA sequences (SEQ ID NOs: 32-38) were identified by a search in the Kazusa DNA Research Institute's database. The protein extracts obtained as described above were further purified by different techniques, including but not limited to ammonium sulfate precipitation, hydrophobic interaction chromatography, cation exchange chromatography, hydroxyapatite or HiTrap affinity columns (Example 3). After each purification step, the eluted fractions were tested for tocopherol cyclase activity and analyzed on SDS-polyacrylamide gels. The active fractions were then applied to a further purification step.
Thus, the present invention is directed to amino acid sequences which are identical or at least 60% homologous to SEQ ID NOs: 25, 26, 27, 28, 29, 30, or 31. More particularly, the degree of homology is preferably at least 70%, more preferably at least 80%, and most preferred at least 90%.
Moreover, the present invention comprises isolated nucleic acid sequences coding for such proteins, said nucleic acid sequences being identical or homologous to SEQ ID NOs: 32, 33, 34, 35, 36, 37, or 38. The degree of homology is at least 60%, preferably at least 70%, more preferably at least 80%, and most preferred at least 90%.
"Homology" as defined in the present invention means that when the amino/nucleic acid sequences of two proteins/nucleic acid sequences are aligned at least the given percentage is identical. The remainder of the amino/nucleic acids may be different. The homology search can be performed using the BLAST program (Altschul et al., Nucl. Acids Res. 25, 3389-3402, 1997) .
Investigation of the substrate specificity of tocopherol cyclase has shown that it not only catalyzes cyclization of 2,3-dimethyl-5-phytyl-1,4-benzoquinol to γ-tocopherol, but that it also accepts other substrates/precursors to yield, e.g., α-tocopherol, β-tocopherol, δ-tocopherol, tocol, and/or tocotrienols.
As used herein, "tocopherol cyclase" or "protein having tocopherol cyclase activity" means proteins exhibiting the above mentioned substrate specificity and cleavage properties, i.e. resulting in the above mentioned products. The proteins of the present invention may also be part of a multienzyme complex with one or more components exhibiting the function of a tocopherol cyclase.
Thus, it is a further aspect of the present invention to provide combinations of the above mentioned proteins as of SEQ ID NOs: 25-31 and nucleotide sequences encoding such proteins, i.e. SEQ ID NOs: 32-38, with any other cellular components that are needed or are beneficial for producing or stimulating the tocopherol cyclase activity.
With those sequences exhibiting the highest degree of homology, a multiple alignment is performed using, e.g., the ClustalW program (Thompson et al., Nucl. Acids Res. 22, 4673-4680, 1994) or the GAP program (Genetics Computer Group, University of Wisconsin), in order to define the degree of homology between the aligned sequences or to define conserved regions within the aligned sequences. These conserved regions can be used to construct oligonucleotides, e.g. primers or probes. Such an alignment is also used to define the homology of proteins with tocopherol cyclase activity originating from different organisms, which is also covered by the scope of the present invention.
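The degree of homology between two aligned sequences, taken here as the percentage of identical aligned positions, can be illustrated with a short sketch. The function names, the gap character and the choice of denominator (all alignment columns except those where both sequences carry a gap) are assumptions made for illustration; the specification does not prescribe how gapped positions are counted.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption: columns where both sequences have a gap are ignored, and a
    column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the pair reaches the given homology threshold (60% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the default threshold the function tests the broadest range named above (at least 60%); passing 70, 80 or 90 tests the narrower preferred ranges.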
Thus, one aspect of the present invention is the provision of primers for the specific amplification of genes or parts thereof coding for proteins having tocopherol cyclase activity. The length of the oligonucleotide is at least 20 bases.
The nucleic acid sequences of the present invention include both deoxyribonucleic acid sequences and antisense ribonucleic acid sequences.
The full-length as well as fragments of the claimed nucleic acid sequences can be used as probes for the detection of coding and non-coding regions of a gene coding for a protein having tocopherol cyclase activity. The generation of antisense ribonucleic acids derived from the claimed sequences which can be used, e.g., as probes for in situ hybridization, is another aspect of the present invention.
The invention embraces probes for the detection of proteins having tocopherol cyclase activity as defined above, said probes having a length of at least 11 nucleotides, preferably at least 14 nucleotides, and most preferably at least 17 nucleotides.
Since the protein has been expressed and an improved method for purification of the protein is described in the examples, the person skilled in the art can use the protein or peptides in order to generate antibodies which react specifically with a protein having tocopherol cyclase activity and which is identical or homologous to the sequences as claimed in the present invention. It is possible either to produce polyclonal antibodies by immunizing laboratory animals such as rabbits, sheep or goats, preferably with an adjuvant, or to produce monoclonal antibodies by techniques well known in the art. The antibodies should react specifically with a protein as claimed in the present invention in order to avoid an unspecific cross-reaction. This means that the antibodies of the present invention should preferably react with an epitope which is present only on a protein of the present invention.
A further embodiment of the present invention is an antibody which specifically reacts with the proteins having tocopherol cyclase activities, especially with those proteins of the amino acid sequences SEQ ID NOs: 25-31 or proteins encoded by SEQ ID NOs: 32-38. A preferred use of such antibodies is the detection and/or quantification of tocopherol cyclase in different assay systems, including but not limited to Western blots, immunoprecipitation, and immunohistochemistry.
It is an integral aspect of the present invention that the tocopherol cyclases disclosed are used for an enzymatic cyclization step in a biotechnological and/or biotransformation process for the production of tocopherol, tocotrienols, or tocols.
More particularly, the present invention is directed to a process for the production of tocols, tocopherols or tocotrienols involving an enzymatic cyclization step catalyzed by a protein according to the present invention. Preferably, such process comprises the catalysis of the cyclization of 2,3-dimethyl-5-phytyl-1,4-benzoquinol to γ-tocopherol.
Another preferred embodiment of the present invention concerns the introduction of a gene coding for a protein having tocopherol cyclase activity into a suitable host cell. The gene can be selected, for example, from algae such as Anabaena variabilis or Synechocystis. The coding regions of such genes are PCR-amplified, with specific primers, from photosynthetic organisms and inserted into a suitable vector. The vector must fit with the host cell into which the gene is introduced. There are specific vectors available for bacteria, yeasts, plant cells, insect cells or mammalian cells, including but not limited to vectors pDS-His, pDS (EP 821 063), pFPMT 121 (Mayer et al., Biotechnol. Bioeng., Vol. 63, No. 3, 1999), pYES2 (Invitrogen, San Diego, CA, USA). Preferably, the DNA coding for a protein having tocopherol cyclase activity is combined with genetic structures which provide the required genetic regulation like promoters, enhancers, ribosomal binding sites, etc.
It is in accordance with the present invention to provide a method in which a cDNA coding for a protein with tocopherol cyclase activity is inserted into an expression vector suitable for the host cell, and said vector is introduced into said host cell. The recombinant vector having the gene coding for a protein having tocopherol cyclase activity and the other required genetic structures is introduced into suitable host cells by methods well-known to the person skilled in the art, e.g., transformation, transfection, electroporation or microprojectile bombardment. The host cell as used within this method includes prokaryotic cells, such as cells of E. coli, as well as eukaryotic cells, such as cells of plants, mammals, yeasts or other fungi, and algae. Preferred mammalian cells as used in this method are human cells.
Furthermore, a recombinant vector suitable for the expression in a host cell is provided, said vector comprising a DNA coding for the proteins of the present invention, i.e. such proteins having the amino acid sequence SEQ ID NOs: 25-31. In a preferred embodiment, the cDNA encoding a protein with tocopherol cyclase activity is integrated into the host cell's chromosomal DNA. The cells obtained by such methods can then be further propagated.
In one embodiment of the present invention the host cell is a prokaryotic cell, preferably E. coli. In a further embodiment, the host cell is of eukaryotic origin, including but not limited to cells of yeasts or other fungi, plants, and algae. Examples of such yeasts are Hansenula polymorpha and Saccharomyces cerevisiae.
In another aspect of the present invention the host cell is a mammalian cell, into which a gene coding for a protein of the present invention is introduced. A preferred mammalian cell is a human cell.
It is a part of this invention to provide host cells which comprise a tocopherol cyclase cDNA obtained from another species.
Besides expression of tocopherol cyclase in bacteria, yeasts, fungi, plants, mammals, etc., tocopherol cyclase can also be produced by cell-free translation, as described or referenced in Jermutus et al. (Curr. Opin. Biotechnol. 9, 534-548, 1998) and Ryabova et al. (Meth. Mol. Biol. 77, 179-193, 1998). Thus, the present invention also covers a method for the cell-free translation of proteins having tocopherol cyclase activity as well as proteins obtainable by such cell-free translation, said proteins being used for the production of tocols, tocopherols or tocotrienols. It also covers a test kit for the generation of a protein as claimed in this invention by cell-free translation.
Furthermore, it is an aspect of the present invention to provide a process for the production of vitamin E characterized in that a recombinant host cell as described above is grown under suitable growth conditions and the vitamin E secreted into the medium is isolated by methods known in the art. It is also an object of the present invention to provide a process for the preparation or production of a food or feed composition characterized in that such a process has been effected and the vitamin E obtained thereby is converted into a food or feed composition by methods known in the art. Such process comprises the following steps:
(a) introducing a nucleic acid sequence comprising an isolated nucleic acid sequence coding for a protein according to the present invention into a recombinant host cell or plant,
(b) biosynthetic production of vitamin E by overexpression of a protein encoded by the nucleic acid sequence of (a),
(c) isolating the biosynthetically produced vitamin E, and
(d) conversion of said vitamin E into a food or feed composition according to standard methods.
Thus, the scope of the present invention also encompasses a food or feed composition comprising vitamin E obtainable by a process involving an enzymatic cyclization step catalyzed by proteins as claimed in the present invention.
A further aspect of the present invention is related to a transgenic plant for the production of vitamin E, wherein the plant contains isolated nucleic acid sequences coding for proteins according to the present invention, i.e. proteins having tocopherol cyclase activity, and wherein said proteins are overexpressed to obtain an increased level of such protein.
The present invention is further illustrated by the following examples:
Example 1: Tocopherol cyclase activity assay
At the start of the present project, we observed that tocopherol cyclase activity is highly dependent on the composition and strength of the incubation buffer. Tocopherol cyclase activity cannot be detected in Tris buffer alone, but can be restored completely after buffer exchange against phosphate buffer or simply by addition of concentrated phosphate buffer to reach a phosphate concentration of approximately 100 mM. Similarly, enzyme fractions displaying reduced tocopherol cyclase activity in 15 mM phosphate buffer can be activated by increasing the phosphate concentration to 100 mM. These findings were included in the design of an improved tocopherol cyclase activity assay.
a) Preparation of the "substrate inclusion complex"
2,3-Dimethyl-5-phytylcyclohexa-2,5-diene-1,4-dione (10 mg, 24 μmol) was added to 2,6-di-O-methyl-β-cyclodextrin (600 mg, 0.45 mmol) dissolved in 10 ml of a 100 mM potassium phosphate buffer, pH 7.0. The mixture was stirred for 5 min, and ascorbic acid (440 mg, 2.5 mmol) was added. The yellow mixture was kept at 4°C.
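The molar amounts given in parentheses can be checked from the weighed masses. The sketch below assumes molecular formulae that are not stated in this document: C28H46O2 for the quinone substrate, C56H98O35 for a 2,6-di-O-methyl-β-cyclodextrin carrying exactly 14 methyl groups, and C6H8O6 for ascorbic acid. The helper names are illustrative only.

```python
# Standard atomic weights (g/mol), rounded to three decimals
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumed molecular formulae (not given in the text)
FORMULAE = {
    "quinone substrate": {"C": 28, "H": 46, "O": 2},
    "dimethyl-beta-cyclodextrin": {"C": 56, "H": 98, "O": 35},
    "ascorbic acid": {"C": 6, "H": 8, "O": 6},
}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


def millimoles(mass_mg: float, formula: dict) -> float:
    """Amount in mmol: mg divided by g/mol gives mmol directly."""
    return mass_mg / molar_mass(formula)


substrate_mmol = millimoles(10, FORMULAE["quinone substrate"])
cyclodextrin_mmol = millimoles(600, FORMULAE["dimethyl-beta-cyclodextrin"])
ascorbate_mmol = millimoles(440, FORMULAE["ascorbic acid"])
```

Under these assumptions the three values come out close to 0.024 mmol (24 μmol), 0.45 mmol and 2.5 mmol, in agreement with the amounts stated in the preparation.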
b) Enzyme reaction
0.5 ml of an enzyme sample in 100 mM potassium phosphate buffer, pH 7.0, 2 mM dodecyl maltoside (DDM), 2 mM 1,4-dithio-DL-threitol (DTT) was mixed with 50 μl of the "substrate inclusion complex" and sealed under nitrogen. After incubation for 15 hours at 35°C, the reaction was stopped by addition of 1.5 ml methanol. After centrifugation for 5 min at 10,000 rpm in an Eppendorf centrifuge, the supernatant was submitted to HPLC analysis (see below).
c) HPLC analysis
HPLC analysis was performed on a Merck-Hitachi system consisting of a gradient pump L-7100, an autosampler L-7200, a column oven L-7300 set at 40°C, a UV detector L-7400 set at 290 nm, and a fluorescence spectrometer (model Merck Hitachi F-1050) set at 290 nm excitation wavelength and 330 nm emission wavelength. Chromatographic analysis was performed on a Lichrosphere-100, RP 18 (5 μm) column (Merck, Darmstadt, Germany) with a flow rate of 1.5 ml/min. The column was equilibrated with a 10:90 mixture of water:methanol. Upon loading of the sample (typically 20 μl), the methanol concentration in the elution buffer was linearly increased over 3 min, and was then maintained at 100% for another 9 min. Under these conditions, the retention times of 2,3-dimethyl-5-phytylcyclohexa-2,5-diene-1,4-diol and of γ-tocopherol were 5.6 min and 7.6 min, respectively.
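The elution program can be written as a simple piecewise function of time. This is a minimal sketch that assumes the linear ramp starts at sample loading (t = 0) from the 90% methanol used for equilibration; the function name and default arguments are illustrative.

```python
def methanol_percent(t_min: float, start: float = 90.0,
                     ramp_min: float = 3.0, hold_min: float = 9.0) -> float:
    """Methanol content (% v/v) of the eluent at time t_min after loading.

    Assumed program: linear ramp from `start` to 100% over `ramp_min`,
    then 100% methanol for a further `hold_min`.
    """
    if t_min < 0 or t_min > ramp_min + hold_min:
        raise ValueError("time outside the 12 min elution program")
    if t_min <= ramp_min:
        return start + (100.0 - start) * t_min / ramp_min
    return 100.0
```

Under this reading, both reported retention times (5.6 min and 7.6 min) fall in the isocratic 100% methanol phase.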
Example 2: Enrichment of tocopherol cyclase from Anabaena variabilis
It is known from previous work (A. Stocker, PhD thesis, University of Zurich, Switzerland, 1992) that tocopherol cyclase is readily degraded and/or inactivated during the purification process. For example, Stocker reported that no tocopherol cyclase activity is observed in Tris buffer, and that chromatography on various hydrophobic interaction columns also destroys tocopherol cyclase activity.
a) Culturing of Anabaena variabilis
The culture conditions and the preparation of the acetone powder have already been described (Stocker et al., Helv. Chim. Acta 76, 1729-1738, 1993; Stocker et al., Helv. Chim. Acta 77, 1721-1737, 1994), but for completeness shall be listed here again. The strain used in the present experiments, Anabaena variabilis (Kützing) 1403-4b, was obtained from the collection of the University of Göttingen, Germany ("Sammlung von Algenkulturen"). The growth medium contained (per liter): 0.15 g MgSO4·7H2O, 0.6 g K2HPO4, 10 mg Ca(NO3)2·4H2O, 0.5 g KNO3, 0.165 g Na3C6H5O7·2H2O (sodium citrate), 4 mg Fe2(SO4)3·5H2O, 2.86 mg H3BO3, 1.81 mg MnCl2·4H2O, 0.222 mg ZnSO4·7H2O, 0.015 mg MoO3, 0.079 mg CuSO4·5H2O. The growth medium was autoclaved for 30 min at 120°C.
Anabaena variabilis from either fresh liquid cultures or agar slants was used to inoculate the sterile culture media. Cells were grown for one week at approximately 35°C under an air/CO2 atmosphere (with CO2 concentration not exceeding 1%) and under continuous illumination with neon light (Philips TLD 15 W/33). The cultures were checked daily for proper growth and gas flow through the media. At the end of the incubation period, the algae were allowed to settle. A large portion of the culture medium was decanted off. The algae pad was centrifuged (Kontron H-401, 15 min, 10,000 rpm, 4°C) and the supernatant discarded. The average yield of wet cells was approximately 2 g/l.
b) Preparation of the acetone powder
The cell pellet was resuspended in 30 mM potassium phosphate buffer, pH 7.0, 5 mM NaCl, 500 mM sucrose (ratio of cells to buffer 1:10 w/v), and centrifuged (Kontron, Centricon H-401, 15 min, 10,000 rpm, 4°C). This procedure was repeated twice. The final pellet was resuspended (at a ratio of 100 ml buffer per 10 g wet algae) in the same buffer containing 1 mg/ml lysozyme and EDTA (2 mM). The mixture was rotated in a water bath for 2 hours at 35°C. The formation of spheroplasts (isolated, oval cells) from the native slender threads of Anabaena was followed by phase contrast microscopy. The spheroplasts were collected by centrifugation (10,000 rpm, 15 min, 4°C) after addition of 2 mM MgSO4·7H2O, and washed twice with 30 mM potassium phosphate buffer, pH 7.0, 5 mM NaCl, 500 mM sucrose in order to remove the lysozyme. Subsequently, 10 g of the spheroplasts were suspended twice in approximately 200 ml of a 10 mM potassium phosphate buffer, pH 7.0, and centrifuged (10,000 rpm, 15 min, 4°C). The resulting pellet was suspended in approximately 10 ml of the same buffer, and 200 ml acetone (cooled to -20°C) was added under vigorous stirring. After 3 min of agitation, the precipitated proteins were collected by centrifugation (Kontron H-401, 10,000 rpm, 2 min, 4°C), and the resulting pellet resuspended in 100 ml of diethyl ether cooled to -20°C. After 10 s of vigorous stirring, the mixture was filtered through folded paper filters (Schleicher & Schüll) and the residue dried under vacuum for 3 hours, yielding 1 g of acetone powder which could be kept for at least 1-2 months at -80°C without loss of activity.
c) Initial purification with anion exchange chromatography
1 g of the acetone powder (Example 2b) was suspended by means of a glass-teflon homogenizer in 40 ml of a potassium phosphate buffer, pH 7.0, containing 15 mM dodecyl maltoside and 2 mM 1,4-dithio-DL-threitol, followed by stirring of the suspension for 2 hours at 4°C and centrifugation (Kontron H-401, 10,000 rpm, 15 min, 4°C). The strength of the phosphate buffer was 100 mM in case the samples were used for activity measurements (Example 1), and 15 mM in case the samples were used for further enrichment of the protein (Examples 2c and 3). 12 ml of the supernatant were loaded on a HiLoad 16/10 Q-Sepharose column (Amersham Pharmacia Biotech, Dübendorf, Switzerland) equilibrated with 15 mM potassium phosphate buffer, pH 7.0, 2 mM dodecyl maltoside, 2 mM 1,4-dithio-DL-threitol (buffer A). After washing of the column at a rate of 0.5 ml/min for 45 min with buffer A, tocopherol cyclase was eluted with a linear gradient over 90 min from buffer A to buffer B (= buffer A + 1 M KCl) followed by a further 45 min at 100% buffer B. The individual fractions were analyzed by SDS-PAGE.
Example 3: Improved purification procedure for tocopherol cyclase
All the buffers used contained a mixture of protease inhibitors consisting of benzamidine (Sigma, 100 mg), aminocaproic acid (Fluka, 100 mg) and trypsin inhibitor from soybean (Fluka, 40 mg, 11551 U/mg) dissolved in 10 ml of bidistilled water. This solution was added at a concentration of 0.1% v/v to all the buffers used in the different purification steps.
After denaturation of the protein solutions (20 μl) for 30 min at room temperature, SDS-PAGE analyses were performed in 40 μl Laemmli buffer without β-mercaptoethanol on precast gels (10% Ready gels, Bio-Rad, Tris-HCl). The gels were stained with colloidal blue (Invitrogen) or silver stain plus (Bio-Rad).
a) Ammonium sulfate precipitation
To 65 ml of the supernatant obtained as described in Example 2c above, 35 ml of a saturated ammonium sulfate solution, pH 7.0 (adjusted with diluted NH4OH) was added at 4°C. The mixture was gently stirred for 30 min at 4°C and centrifuged for 20 min at 14,000 rpm (Kontron H-401, 4°C). The blue pellet was discarded. To 30 ml of the new supernatant, 35 ml of the saturated ammonium sulfate solution described above was added, and the mixture treated as described before. After centrifugation, the supernatant was discarded. The tocopherol cyclase activity, located in the blue pellet, could be kept without loss of activity for several weeks at 4°C.
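The two additions of saturated ammonium sulfate correspond to two saturation levels, which can be estimated by simple mixing arithmetic. The sketch below assumes that volumes are additive and that the starting supernatant contains no ammonium sulfate; neither assumption is stated in the text, and the function name is illustrative.

```python
def saturation_after_addition(volume_ml: float, start_saturation: float,
                              added_saturated_ml: float) -> float:
    """Fractional saturation after adding saturated solution (additive volumes assumed)."""
    total_ml = volume_ml + added_saturated_ml
    return (volume_ml * start_saturation + added_saturated_ml * 1.0) / total_ml


# First cut: 65 ml supernatant + 35 ml saturated solution
first_cut = saturation_after_addition(65, 0.0, 35)

# Second cut: 30 ml of the first-cut supernatant + 35 ml saturated solution
second_cut = saturation_after_addition(30, first_cut, 35)
```

Under these assumptions the first step brings the solution to about 35% saturation and the second to about 70%, so the activity would be recovered in the 35-70% fraction.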
The activity of the ammonium sulfate pellet was tested as described in Example 1 after solubilization in 100 mM potassium phosphate, pH 7.0, 2 mM DDM, 2 mM DTT and removal of the ammonium sulfate by buffer exchange on a Sephadex G-25 column (Amersham Pharmacia Biotech) previously equilibrated with the phosphate-DDM-DTT buffer.
b) Hydrophobic interaction chromatography (HIC)
A HiLoad 26/10 Phenyl Sepharose HIC column (Amersham Pharmacia Biotech) was equilibrated with 100 mM potassium phosphate, pH 7.0, 2 mM DDM, 2 mM DTT, 1.5 mM ammonium sulfate (buffer A1). Most of the final ammonium sulfate pellet (obtained in Example 3a, see above) could be dissolved in 25 ml of buffer A1. After centrifugation (Kontron H-401, 10,000 rpm, 10 min, 4°C) the supernatant (solution 1) was set aside, and the pellet, insoluble in buffer A1, was dissolved in 5 ml of buffer B1 (B1 = buffer A1 without ammonium sulfate). This solubilized pellet (solution 2) and the supernatant (solution 1) obtained above were pooled and loaded onto the HIC column. The column was washed for 50 min with buffer A1 at a flow rate of 1 ml/min. Elution of bound proteins was performed with a linear gradient from 0% to 100% buffer B1 over 100 min, followed by another 50 min at 100% buffer B1. Tocopherol cyclase activity was eluted with 2 ml of mixed micelles and further elution with buffer B1. Mixed micelles were obtained as follows: 1.16 g of glycocholic acid (Sigma) were dissolved in 8 ml water and 470 μl 5 M NaOH under magnetic stirring for 30 min. The pH was set to 7.0-7.5 by addition of 5 μl conc. acetic acid; to this solution 80 mg of soybean azolectin (Sigma) were added under magnetic stirring and the total volume was adjusted to 10 ml. Tocopherol cyclase activity of the different fractions was analyzed as described in Example 1. The active fractions were pooled, and to 30 ml of the pooled fractions, 70 ml of saturated ammonium sulfate, pH 7.0, was added at 4°C. The ammonium sulfate pellet was kept at 4°C and analyzed by SDS-PAGE.
c) Anion exchange chromatography (Q-Sepharose/Tris buffer)
A HiLoad 16/10 Q-Sepharose column (strong anion exchanger; Amersham Pharmacia Biotech) was equilibrated with 20 mM Tris-HCl, pH 7.5, 2 mM DDM, 2 mM DTT (buffer A2). The ammonium sulfate pellet obtained after HIC (Example 3b) was dissolved in 8 ml of buffer A2 and buffer-exchanged on a Sephadex G-25 column against the same buffer A2. The eluate was loaded onto the Q-Sepharose column and the column washed for 60 min with buffer A2. Elution of bound proteins was attained with the following gradient: linear for 60 min from 0% to 30% B2 (B2 = buffer A2 + 1 M KCl), isocratic (30% B2) for 80 min, linear from 30% to 100% B2 for 30 min and finally isocratic (100% B2) for 70 min. The flow rate was 0.5 ml/min.
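The four-segment elution program above can be expressed as a lookup over time. The following sketch takes t = 0 as the start of the gradient (after the 60 min wash) and assumes that the KCl concentration scales linearly with the percentage of B2; segment table and function names are illustrative.

```python
# (duration in min, % B2 at start, % B2 at end) for each segment of the program
SEGMENTS = [
    (60, 0.0, 30.0),     # linear 0% -> 30% B2
    (80, 30.0, 30.0),    # isocratic 30% B2
    (30, 30.0, 100.0),   # linear 30% -> 100% B2
    (70, 100.0, 100.0),  # isocratic 100% B2
]
FLOW_ML_PER_MIN = 0.5
KCL_IN_B2_MOLAR = 1.0


def percent_b2(t_min: float) -> float:
    """Percentage of buffer B2 at time t_min after the start of the gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in SEGMENTS:
        if t_min <= elapsed + duration:
            return start + (end - start) * (t_min - elapsed) / duration
        elapsed += duration
    raise ValueError("time beyond the end of the elution program")


def kcl_molar(t_min: float) -> float:
    """KCl concentration (M) delivered at time t_min."""
    return percent_b2(t_min) / 100.0 * KCL_IN_B2_MOLAR


def elution_volume_ml(t_min: float) -> float:
    """Volume delivered since the start of the gradient."""
    return t_min * FLOW_ML_PER_MIN
```

For example, the isocratic 30% B2 phase corresponds to about 0.3 M KCl and ends 140 min into the program, i.e. after 70 ml at 0.5 ml/min; the whole program runs for 240 min (120 ml).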
Tocopherol cyclase activity of the individual fractions was measured after addition of 1 M phosphate buffer pH 7.0 to yield a final phosphate concentration of 100 mM. Tocopherol cyclase activity eluted at the end of the 30% B2 isocratic elution. The three active fractions were analyzed by SDS-PAGE and pooled.
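The volume of 1 M phosphate buffer needed to bring a fraction to 100 mM follows from a mass balance. This is a sketch under the assumption that the Tris-buffered fractions contain no phosphate initially; the 1 ml fraction volume used in the example is hypothetical, since fraction sizes are not given here.

```python
def stock_volume_needed(sample_volume: float, target_mM: float,
                        stock_mM: float, initial_mM: float = 0.0) -> float:
    """Volume of stock (same unit as sample_volume) to add to reach target_mM.

    Mass balance: V*c0 + v*cs = (V + v)*ct, hence v = V*(ct - c0)/(cs - ct).
    """
    if not initial_mM <= target_mM < stock_mM:
        raise ValueError("require initial <= target < stock concentration")
    return sample_volume * (target_mM - initial_mM) / (stock_mM - target_mM)


# Hypothetical 1000 ul fraction brought to 100 mM phosphate with 1 M stock
added_ul = stock_volume_needed(1000.0, target_mM=100.0, stock_mM=1000.0)
```

In other words, roughly one volume of 1 M phosphate per nine volumes of fraction (about 111 μl per ml) gives the 100 mM final concentration used in the assay.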
d) Cation exchange chromatography (SP-Sepharose)
A HiTrap SP HP 5 ml column (strong cation exchanger; Amersham Pharmacia Biotech) was equilibrated with 25 mM potassium phosphate buffer, pH 7.2, 2 mM DDM, 2 mM DTT (buffer A3). The pooled active fractions obtained after Q-Sepharose (Example 3c) were buffer-exchanged on Sephadex G-25 against buffer A3 and loaded onto the HiTrap SP column. Under these conditions tocopherol cyclase was not bound and passed through the column. Several contaminants were bound and could be eluted, after washing with buffer A3 for 10 min, by using a linear gradient from 0% to 100% buffer B3 (B3 = A3 + 1 M KCl) for 5 min and further washing with buffer B3 for 10 min. The active fractions were analyzed by SDS-PAGE and pooled.
e) Hydroxyapatite (HAP) column
A HAP column (Econo-Pac CHT-II, bed volume 5 ml, Bio-Rad) was equilibrated with buffer A3. The pooled active fractions obtained after SP-Sepharose (Example 3d) were loaded onto the HAP column and the column was washed with buffer A3 for 5 min. Elution of bound proteins was attained with a linear gradient from 0% to 100% buffer B4 (500 mM potassium phosphate buffer pH 7.2, 2 mM DDM, 2 mM DTT), followed by washing with buffer B4 for another 20 min. The active fractions (23-26) were analyzed by SDS-PAGE.
f) HiTrap affinity column
A 1 ml HiTrap Blue column (prepared by covalent attachment of a dye, Cibacron blue F3G-A, to agarose gel; Amersham Pharmacia Biotech) was equilibrated with buffer A3. An aliquot of the pooled active fractions obtained after SP-Sepharose (Example 3d) was loaded onto the Cibacron column and elution was performed as described in Example 3e. Tocopherol cyclase activity was recovered immediately after the loading phase. The active fractions were pooled, evaporated in a SpeedVac Concentrator and dissolved in 100 ml water.
Example 4: Identification of nine bands from an SDS-polyacrylamide gel by microproteinchemical methods
All nine bands labeled on the SDS-polyacrylamide gel (Example 2c) were in-gel digested with trypsin as described by Fountoulakis & Langen (Anal. Biochem. 250, 153-156, 1997). After digestion, the peptides were extracted and analyzed by nanoelectrospray tandem mass spectrometry (Wilm & Mann, Anal. Chem. 68, 1-8, 1996). In short, about 2 μl of the unseparated peptide mixture was desalted and further concentrated on a pulled capillary containing approximately 100 nl POROS R2 reverse phase material (Applied Biosystems, Framingham, MA). The peptides were eluted in one step with 1 μl of 60% methanol/5% formic acid/water directly into the nanoelectrospray needle. Electrospray mass spectra were acquired on an API 365 triple quadrupole mass spectrometer (Sciex, Toronto, Canada) equipped with a nanoelectrospray ion source developed by Wilm and Mann. Q1 scans were performed with a 0.2 Da mass step. For operation in the MS/MS mode Q1 was set to transmit a mass window of 2 Da, and spectra were accumulated with 0.2 Da mass steps. Resolution was set so that the fragment masses could be assigned to better than 1 Da.
Fragmentation of a peptide by tandem mass spectrometry yields a stretch of sequence together with its location in the peptide (peptide sequence tag). With these sequence tags appropriate sequence databases were searched. With this approach the following bands could be identified (in the following sequences, Leu = Ile [isobars], since they cannot be discriminated by the technology applied; the respective amino acid as found in the respective database was selected):
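Because leucine and isoleucine residues have the same mass (about 113.084 Da, monoisotopic), a sequence derived from mass spectrometry has to be compared with database entries in a way that treats the two as interchangeable. A minimal sketch of such a comparison is shown below; it uses plain substring matching rather than the mass-based sequence-tag search actually employed, and the protein fragment in the example is hypothetical.

```python
def collapse_isobars(sequence: str) -> str:
    """Map Ile (I) onto Leu (L) so that isobaric residues compare as equal."""
    return sequence.upper().replace("I", "L")


def find_peptide(peptide: str, protein: str) -> list:
    """Return 0-based start positions where the peptide matches the protein,
    treating Leu and Ile as indistinguishable."""
    query = collapse_isobars(peptide)
    target = collapse_isobars(protein)
    positions = []
    start = target.find(query)
    while start != -1:
        positions.append(start)
        start = target.find(query, start + 1)
    return positions


# SEQ ID NO: 4 against a hypothetical database fragment that differs
# from the measured peptide only in Leu/Ile assignments
hits = find_peptide("AVALVIPELK", "MKAVAIVLPEIKGG")
```

Once a match is found, the residue actually present in the database entry at each Leu/Ile position can be reported, which mirrors the convention described above.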
Band 1 : glyceraldehyde 3-phosphate dehydrogenase (SW:G3P2_ANANA)
Peptides sequenced: NLDLAELNAEK SEQ ID NO: 1
TITEEVNQALK SEQ ID NO: 2
VMAWYDNEWGYSQR SEQ ID NO: 3
AVALVIPELK SEQ ID NO: 4
Bold letters represent amino acids as derived from mass spectrometry which are identical to the respective amino acid in the Kazusa database (http://www.kazusa.or.jp/cyano/anabaena) , but differ from the amino acid in the respective peptide sequence as deposited in other publicly available databases. Italic letters (see below) represent amino acids as derived from mass spectrometry that differ from the respective amino acid in the Kazusa database.
Band 7: ATP synthase beta chain (SW:ATPB_ANASP)
Peptides sequenced: LPQIYNALTIK SEQ ID NO: 5
VVDLLTPYR SEQ ID NO: 6
NGLSGLTMAEYFR SEQ ID NO: 7
Band 8: ATP synthase alpha chain (SW:ATPA_ANASP)
Peptides sequenced: ATLVIYDDLSK SEQ ID NO: 8
TAIAIDTIINQK SEQ ID NO: 9
VANVGTNLQNGDGIAR SEQ ID NO: 10
For the following bands, good sequence tags were obtained, but the tags were not sufficient to assign a protein homologue from the database. After gathering more sequence information by MS/MS experiments, a sequence search was performed. With this approach, the following bands could be identified as homologues of:
Band 2: soluble dehydrogenase (SW:DHSS_SYNP1) not found in Anabaena!
Peptide sequenced: ALIVTHSETSTGVINDLEAINR SEQ ID NO: 11
Band 4: NADH-plastoquinone oxidoreductase (SW:NUCC_SYNY3)
Peptides sequenced: EEAINWGLSGPMLR SEQ ID NO: 12
VGGVAADLPYGWVDK SEQ ID NO: 13
LVTNNPIFR SEQ ID NO: 14
SN7MYVPYVSR SEQ ID NO: 15
Band 9: 60 kDa chaperonin 1 (SW:CH61_SYNY3)
Peptides sequenced: FGAPQIVNDGVTIAK SEQ ID NO: 16
IALVQDLNPVLEQVAR SEQ ID NO: 17
The remaining bands 3, 5 and 6 could only be identified by performing a homology search via the internet in the Kazusa DNA Research Institute's database (http://www.kazusa.or.jp/cyano/anabaena). No close homologues were found, however, in the EMBL/GenBank/DDBJ databases.
Band 3: Closest homology (55% identity) to phosphate binding periplasmic protein (PBP) (NRDBP:TR_Q55199)
Peptides sequenced: AVEAAIEYALTEGQK SEQ ID NO: 18
QPITVVYR SEQ ID NO: 19
NTYTDILLGK SEQ ID NO: 20
Band 5 and band 6: no homologue identified
Peptides sequenced: LQEEFSAELATLR SEQ ID NO: 21
VNELIATATADLVTKQ£LATLQR SEQ ID NO: 22
EGNTLGLIFGQSPK SEQ ID NO: 23
LSL£SSFTGTDJ SEQ ID NO: 24
The full-length protein and DNA sequences corresponding to the above peptide sequences were subsequently identified in the Kazusa DNA Research Institute's database (http://www.kazusa.or.jp/cyano/anabaena):
Protein sequences:
Band 1 SEQ ID NO: 25
Band 2 no hit found
Band 3 SEQ ID NO: 26
Band 4 SEQ ID NO: 27
Band 5 = Band 6: SEQ ID NO: 28
Band 7 SEQ ID NO: 29
Band 8 SEQ ID NO: 30
Band 9 SEQ ID NO: 31
DNA sequences:
Band 1 SEQ ID NO: 32
Band 2 no hit found
Band 3 SEQ ID NO: 33
Band 4 SEQ ID NO: 34
Band 5 = Band 6: SEQ ID NO: 35
Band 7 SEQ ID NO: 36
Band 8 SEQ ID NO: 37
Band 9 SEQ ID NO: 38
Example 5: Identification of homologues from other organisms through a BLAST search
A non-redundant protein database was searched for homologues with a BLAST search (Altschul et al., Nucl. Acids Res. 25, 3389-3402, 1997). While such a search for homologues can be performed with every single amino acid sequence (SEQ ID NOs: 25-31) or every single DNA sequence (SEQ ID NOs: 32-38) for bands 1-9 as input, the result is described, by way of example, for band 4 only. The output of the BLAST search (BLASTP 2.0.12) revealed 11 sequences showing a higher homology to the reference sequence having tocopherol cyclase activity (in our example the amino acid sequence of SEQ ID NO: 27) than the other sequences retrieved by the BLAST search. These sequences are likely to belong to the same enzyme class and to catalyze the same reaction.
Example 6: Multiple sequence alignment of known tocopherol cyclases
A multiple sequence alignment using the program ClustalW, Version 1.7 (Thompson et al., Nucl. Acids Res. 22, 4673-4680, 1994) was made with the sequences showing the highest homology to each other as revealed in Example 5. For the purpose of simplification, only the result for band 4 (NdhH proteins, see Example 5) is described here. This sequence alignment revealed that proteins having tocopherol cyclase activity, such as the NdhH proteins, are rather conserved between plants, algae and cyanobacteria. The sequence alignment also allowed the identification of the amino acid stretches that are most conserved among the different homologues. For the NdhH proteins, 5 highly conserved regions could be detected.
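How conserved stretches can be read off a multiple alignment may be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the `min_len` threshold and the three toy sequences are hypothetical and are not taken from the ClustalW output; only columns in which all sequences carry the identical residue are counted as conserved.

```python
from typing import List, Tuple


def conserved_regions(alignment: List[str], min_len: int = 4) -> List[Tuple[int, str]]:
    """Return (start_column, residues) for every run of at least min_len
    consecutive columns in which all sequences share the same residue."""
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    regions = []
    run_start = None
    width = len(alignment[0])
    for col in range(width + 1):  # one extra step flushes a run at the end
        conserved = (
            col < width
            and alignment[0][col] != "-"
            and all(s[col] == alignment[0][col] for s in alignment)
        )
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                regions.append((run_start, alignment[0][run_start:col]))
            run_start = None
    return regions


# Made-up toy alignment (not real NdhH sequences); "-" marks a gap.
toy = [
    "MKWDEALHGTPQ",
    "MRWDEAL-GTPQ",
    "MKWDEAIHGTPQ",
]
regions = conserved_regions(toy, min_len=4)
```

A stricter or looser definition of conservation (for example, allowing conservative substitutions) would change which stretches are reported; the identity criterion is used here only because it is the simplest to state.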
Using the GAP program with the endweight parameter (Genetics Computer Group, Wisconsin), the homologies between the tocopherol cyclase proteins and their respective genes were defined. As exemplified for the 11 homologous sequences of Example 5 (for simplification purposes only, see above), the minimal degree of homology was around 60% with regard to both the amino acid sequences and the DNA sequences. The highest homology between two sequences was nearly 100% at both the DNA and the amino acid levels.
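The degree of homology between two aligned sequences can be expressed as a percent identity. The sketch below assumes one common definition (identical residues divided by the number of columns in which neither sequence has a gap); the GAP program may count gaps and end gaps differently, and the aligned pair shown is made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped under this definition
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up aligned pair: 9 ungapped columns, 7 of them identical.
identity = percent_identity("MKWDEALHGT", "MRWDEAL-GS")
```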
The conserved amino acid stretches may be used to design oligonucleotides which, either alone or in combination, may be used to clone further ndhH homologues and proteins having tocopherol cyclase activity, respectively, from any living organism in which such a homologue exists (see Example 8). Parts of the conserved sequences or less conserved sequences from the alignment (see above) may also be used for this purpose, although in the latter case the chance of success decreases with decreasing sequence identity and/or with increasing degeneracy of the derived oligonucleotides.
Example 7: Overexpression of genes encoding for proteins having tocopherol cyclase activity in E. coli and yeast
As already mentioned in Examples 5 and 6, overexpression is only described, by way of example, for the ndhH genes (i.e. sequences derived from band 4 of Example 4). Nevertheless, this method is applicable to other genes encoding proteins having tocopherol cyclase activity.
a) Expression of ndhH in E. coli
The coding regions of the ndhH genes from Anabaena variabilis and Synechocystis PCC6803 were amplified from the respective genomic DNAs by PCR using the primers Av-Tocy-5 and Av-Tocy-3 for the A. variabilis gene and Sy-Tocy-5 and Sy-Tocy-3 for the Synechocystis gene. The primers were designed such that the ATG start codons constitute the second half of an NdeI site (cleavage recognition site CATATG), and that BamHI sites (GGATCC) are introduced immediately after the stop codons. Both PCR products were cloned into the pCR®2.1-TOPO vector (Invitrogen), resulting in plasmids TOPO-Av-Tocy and TOPO-Sy-Tocy.
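The primer layout described above can be sketched in Python. This is only an illustration of the design rule (start codon as the second half of CATATG, GGATCC directly after the stop codon): the function names, the annealing length of 18 nucleotides and the short coding sequence are hypothetical, and the actual sequences of Av-Tocy-5, Av-Tocy-3, Sy-Tocy-5 and Sy-Tocy-3 are not reproduced here.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds: str, anneal_len: int = 18) -> Tuple[str, str]:
    """Build a forward primer whose ATG is the second half of an NdeI site
    (CATATG) and a reverse primer that places a BamHI site (GGATCC)
    immediately after the stop codon. Both primers are written 5' to 3'."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    forward = "CAT" + cds[:anneal_len]
    reverse = reverse_complement(cds[-anneal_len:] + "GGATCC")
    return forward, reverse


# Hypothetical 42-nucleotide coding sequence, used for illustration only.
toy_cds = "ATGGCTAAAGTTCTGGAACGTCTGACCCCGGAAGGTTGGTAA"
forward_primer, reverse_primer = design_primers(toy_cds)
```

Because GGATCC is palindromic, the reverse primer begins with GGATCC at its 5'-end, and the amplified top strand ends with the stop codon followed directly by the BamHI site.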
The expression vectors pDS-His and pDS were derived from pDSNdeHis, which is described in Example 2 of European Patent Application EP 821 063. The plasmid pDS-His was constructed from pDSNdeHis by deleting an 857 bp NheI-XbaI fragment carrying a silent chloramphenicol acetyl transferase gene from E. coli. The plasmid pDS was constructed from pDS-His by replacing a small EcoRI-BamHI fragment with the annealed primers S/D-1 and S/D-2.
The coding regions of the ndhH genes from Anabaena variabilis and Synechocystis PCC6803 were excised from TOPO-Av-Tocy and TOPO-Sy-Tocy with BamHI and NdeI and ligated into the BamHI-NdeI cleaved vectors pDS-His and pDS, resulting in plasmids pDS-His-Av-Tocy, pDS-Av-Tocy, pDS-His-Sy-Tocy and pDS-Sy-Tocy. E. coli strain M15 (Villarejo, M.R. and Zabin, I., J. Bacteriol. 110, 171-178, 1974) carrying the lacI (lac repressor) containing plasmid pREP4 (EMBL/GenBank accession # A25856) was transformed with the ligations, and recombinant colonies were screened for the presence of the plasmid with the ndhH gene insert. All four E. coli strains, M15/pREP4/pDS-His-Av-Tocy, M15/pREP4/pDS-Av-Tocy, M15/pREP4/pDS-His-Sy-Tocy and M15/pREP4/pDS-Sy-Tocy, were obtained. For the expression of the ndhH gene, the strains were grown overnight at 37°C in Luria Broth (GibcoBRL, Life Technologies) with 25 mg/l kanamycin and 100 mg/l ampicillin. The next day, fresh medium was inoculated with 2% (v/v) of the overnight cultures and the new cultures were grown at 37°C. After 3 hours, expression of the cloned genes was induced by addition of IPTG to a final concentration of 2 mM and the growth of the cultures was continued. Samples were taken at various time points and analyzed by SDS-PAGE for appearance of NdhH. Expression of the ndhH gene and genes encoding proteins having tocopherol cyclase activity, respectively, was tested by using the standard assay as described in Example 1.
b) Expression of ndhH in Hansenula polymorpha
For expression of the ndhH gene of A. variabilis or a homologous gene encoding a protein having tocopherol cyclase activity from another organism in H. polymorpha, an EcoRI site was added to each end of the gene by PCR using two primers which cover the 5'- and the 3'-end of the gene, with both primers having an extended 5'-end containing an EcoRI restriction site. The PCR product was purified using the QIAquick PCR Purification Kit (Qiagen Inc., Valencia, CA, USA), digested with EcoRI, and purified by agarose gel electrophoresis.
If the gene of interest already contained one or more EcoRI restriction sites, these sites were eliminated by site-directed mutagenesis using the "QuikChange site-directed mutagenesis kit" from Stratagene (La Jolla, CA, USA) following the manufacturer's protocol.
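Whether a gene of interest already contains internal EcoRI sites can be checked with a simple scan for the recognition sequence GAATTC. The sketch below is a hypothetical helper, and the short sequence is made up for illustration; since GAATTC is palindromic, scanning one strand is sufficient.

```python
from typing import List


def find_sites(dna: str, site: str = "GAATTC") -> List[int]:
    """Return the 0-based start positions of every occurrence of site,
    including overlapping occurrences."""
    dna = dna.upper()
    site = site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


# Made-up coding sequence containing two internal EcoRI sites.
toy_gene = "ATGGAATTCAAAGGTGAATTCTAA"
ecori_positions = find_sites(toy_gene)
```

Each reported position marks a site that would have to be removed, for instance by a silent codon change, before EcoRI can be used for cloning the gene.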
The ndhH expression vector, used to transform H. polymorpha RB11 (Gellissen et al., in: Smith, A., ed., Gene Expression in Recombinant Microorganisms. Dekker, New York, pp. 395-439, 1994), was constructed by inserting the EcoRI-digested and purified PCR product into the multiple cloning site of the H. polymorpha expression vector pFPMT121, which is based on an ura3 selection marker from S. cerevisiae, a formate dehydrogenase (FMD) promoter element and a methanol oxidase (MOX) terminator element from H. polymorpha. The 5'-end of the ndhH gene is fused to the FMD promoter, the 3'-end to the MOX terminator (Gellissen et al., Appl. Microbiol. Biotechnol. 46, 46-54, 1996; European patent EP 299 108). The constructed plasmids were propagated in E. coli. Plasmid DNA was purified using standard state-of-the-art procedures and, as a control, the construct was sequenced by standard methods. The expression plasmids were transformed into the H. polymorpha strain RB11, deficient in orotidine-5'-phosphate decarboxylase (ura3), using the procedure for preparation of competent cells and for transformation of yeast as described in Gellissen et al. (1996). Each transformation mixture was plated on YNB (0.14% w/v Difco YNB and 0.5% ammonium sulfate) containing 2% glucose and 1.8% agar and incubated at 37°C. After 4 to 5 days, individual transformant colonies were picked and grown in the liquid medium described above for 2 days at 37°C. Subsequently, an aliquot of this culture was used to inoculate fresh vials with YNB medium containing 2% glucose. After seven further passages in selective medium, the expression vector was integrated into the yeast genome in multimeric form. Subsequently, mitotically stable transformants were obtained by two additional cultivation steps in 3 ml non-selective liquid medium (YPD, 2% glucose, 10 g/l yeast extract, and 20 g/l peptone). In order to obtain genetically homogeneous recombinant strains, an aliquot from the last stabilization culture was plated on a selective plate. Single colonies were isolated for analysis of ndhH expression in YNB containing 2% glycerol instead of glucose to derepress the FMD promoter. Expression of the ndhH gene and genes encoding proteins having tocopherol cyclase activity, respectively, was tested by using the standard assay as described in Example 1.
c) Expression of ndhH in Saccharomyces cerevisiae
Two EcoRI sites were added to the ndhH gene as described for Hansenula polymorpha above. The so-prepared gene was ligated into the EcoRI site of the expression cassette of the Saccharomyces cerevisiae expression vector pYES2 (Invitrogen, San Diego, CA, USA) or subcloned between the shortened GAPFL (glyceraldehyde-3-phosphate dehydrogenase) promoter and the pho5 terminator as described by Janes et al. (Curr. Genet. 18, 97-103, 1990). The correct orientation of the gene was tested by PCR. Transformation of S. cerevisiae strains, e.g. INVSc1 (Invitrogen, San Diego, CA, USA), was done according to Hinnen et al. (Proc. Natl. Acad. Sci. USA 75, 1929-1933, 1978). Single colonies harboring the ndhH gene under the control of the GAPFL promoter were picked and cultivated in 5 ml selection medium (SD-uracil, Sherman et al., Laboratory Course Manual for Methods in Yeast Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1986) at 30°C under vigorous shaking (250 rpm) for one day. The preculture was then added to 500 ml YPD medium (Sherman et al., 1986) and grown under the same conditions. Induction of the gal1 promoter was done according to the manufacturer's instructions. Expression of the ndhH gene and genes encoding proteins having tocopherol cyclase activity, respectively, was tested by using the standard assay as described in Example 1.
Example 8: Homology cloning of ndhH genes from other organisms, e.g., algae and plants
As described in the multiple alignment of Example 6, there is high overall homology among the proteins with putative tocopherol cyclase activity, and there are also several very highly conserved homology regions. Such conserved regions are used to design degenerate oligonucleotides for the cloning of unknown genes encoding proteins with tocopherol cyclase activity from a variety of organisms. Especially useful are conserved regions containing amino acids that are encoded by only one codon, i.e. methionine and tryptophan. On the other hand, regions containing arginines, leucines or serines are to be avoided, because these amino acids are encoded by six codons and would dramatically increase the degeneracy of the oligonucleotides. To reduce the degeneracy, a set of oligonucleotides can be designed from the same peptide, in which one (or more) ambiguous nucleotide(s) is (are) eliminated. For instance, two oligonucleotides could be made from a peptide, one with a C and the other with a T in the penultimate position, thus reducing the degeneracy to 32-fold for each oligonucleotide. This strategy can also be used to create more defined 3'-ends whereby mispriming can be effectively reduced. Some conserved stretches might contain a non-conserved amino acid, e.g. where the third position is either isoleucine or valine. Some organisms use preferred codons for certain amino acids. If the preferred codon usage for an organism is known, the degeneracy of the oligonucleotide can be reduced. Pairs of degenerate oligonucleotides are then used to amplify the coding region between them. The oligonucleotide closer to the beginning of the gene must be complementary to the coding strand and the oligonucleotide closer to the end of the gene must be complementary to the non-coding strand. Nested PCR can be performed to confirm a PCR fragment or to obtain a specific fragment out of a mixture of primary products. Genomic DNA can be used as template when using an organism containing no introns or only few or short introns. When large introns are expected, cDNA or RNA (RT-PCR) should be used as template. Alternatively, genomic DNA can be used for small fragments, which are expected to lie on the same exon.
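The effect of the amino acid composition on degeneracy can be made concrete with a short calculation. The sketch below multiplies the number of codons per residue in the standard genetic code; the function name and the example peptides are hypothetical. It counts the distinct coding sequences for a peptide, so a mixed-base oligonucleotide covering residues with six codons may in practice require more combinations than the figure computed here.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNT[residue]
    return total


# Hypothetical peptides, for illustration only.
favourable = degeneracy("MDEFKNH")    # Met plus six two-codon residues
unfavourable = degeneracy("LSR")      # three six-codon residues

# Fixing one two-fold ambiguous position (e.g. C in one oligonucleotide,
# T in the other) splits the pool into two halves.
per_oligo_after_split = favourable // 2
```

In this example a seven-residue stretch rich in two-codon residues is 64-fold degenerate, and splitting it at one ambiguous position yields two 32-fold pools, whereas just three residues of leucine, serine and arginine already give 216 combinations.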
Example 9: Production of antibodies against proteins having tocopherol cyclase activity
Polyclonal or monoclonal antibodies against either the complete proteins having tocopherol cyclase activity, e.g. against the NdhH protein, or against fragments thereof (= peptide antibodies) are produced according to state-of-the-art techniques (e.g., Morgan, Monoclonal antibody production, Mod. Methods Pharmacol. 2, 1984; Schook, Monoclonal antibody production techniques and applications, Immunology Series Vol. 33, Dekker, New York, 1987; Ritter & Ladyman HM, eds., Monoclonal antibodies: Production, engineering and clinical application, Postgraduate medical science series, Cambridge University Press, Cambridge 1995; Huet, in: Celis, ed., Cell biology: A laboratory handbook, Vol. 2, Academic Press, San Diego, 1998, pp. 381-391; Howard & Bethell, eds., Methods in Antibody Production and Characterization, CRC Press, 2001). Potential uses of antibodies are described, e.g., in Caponi & Migliorini, Antibody Usage in the Lab, Springer-Verlag Berlin/Heidelberg, Germany, 1999.