
AU2007202378A1 - Corynebacterium glutamicum genes encoding metabolic pathway proteins - Google Patents


Info

Publication number
AU2007202378A1
Authority
AU
Australia
Prior art keywords
nucleic acid
protein
seq
sequence
amino acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2007202378A
Inventor
Gregor Haberhauer
Burkhard Kroger
Markus Pompejus
Hartwig Schroder
Oskar Zelder
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BASF SE
Original Assignee
BASF SE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2006200807A
Application filed by BASF SE
Publication of AU2007202378A1
Legal status: Abandoned

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00: Technologies relating to chemical industry
    • Y02P20/50: Improvements relating to the production of bulk chemicals
    • Y02P20/52: Improvements relating to the production of bulk chemicals using catalysts, e.g. selective catalysts

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Description

P001 Section 29 Regulation 3.2(2)
AUSTRALIA
Patents Act 1990
COMPLETE SPECIFICATION
STANDARD PATENT
Application Number:
Lodged:
Invention Title: Corynebacterium glutamicum genes encoding metabolic pathway proteins
The following statement is a full description of this invention, including the best method of performing it known to us:
CORYNEBACTERIUM GLUTAMICUM GENES ENCODING METABOLIC PATHWAY PROTEINS
Related Applications
The present application claims priority to prior filed U.S. Provisional Patent Application Serial No. 60/141031, filed June 25, 1999, U.S. Provisional Patent Application Serial No. 60/142101, filed July 2, 1999, U.S. Provisional Patent Application Serial No. 60/148613, filed August 12, 1999, and also to U.S. Provisional Patent Application Serial No. 60/187970, filed March 9, 2000. The present application also claims priority to prior filed German Patent Application No. 19930476.9, filed July 1, 1999, German Patent Application No. 19931415.2, filed July 8, 1999, German Patent Application No. 19931418.7, filed July 8, 1999, German Patent Application No.
19931419.5, filed July 8, 1999, German Patent Application No. 19931420.9, filed July 8, 1999, German Patent Application No. 19931424.1, filed July 8, 1999, German Patent Application No. 19931428.4, filed July 8, 1999, German Patent Application No.
19931434.9, filed July 8, 1999, German Patent Application No. 19931435.7, filed July 8, 1999, German Patent Application No. 19931443.8, filed July 8, 1999, German Patent Application No. 19931453.5, filed July 8, 1999, German Patent Application No.
19931457.8, filed July 8, 1999, German Patent Application No. 19931465.9, filed July 8, 1999, German Patent Application No. 19931478.0, filed July 8, 1999, German Patent Application No. 19931510.8, filed July 8, 1999, German Patent Application No.
19931541.8, filed July 8, 1999, German Patent Application No. 19931573.6, filed July 8, 1999, German Patent Application No. 19931592.2, filed July 8, 1999, German Patent Application No. 19931632.5, filed July 8, 1999, German Patent Application No.
19931634.1, filed July 8, 1999, German Patent Application No. 19931636.8, filed July 8, 1999, German Patent Application No. 19932125.6, filed July 9, 1999, German Patent Application No. 19932126.4, filed July 9, 1999, German Patent Application No.
19932130.2, filed July 9, 1999, German Patent Application No. 19932186.8, filed July 9, 1999, German Patent Application No. 19932206.6, filed July 9, 1999, German Patent Application No. 19932227.9, filed July 9, 1999, German Patent Application No.
19932228.7, filed July 9, 1999, German Patent Application No. 19932229.5, filed July 9, 1999, German Patent Application No. 19932230.9, filed July 9, 1999, German Patent Application No. 19932922.2, filed July 14, 1999, German Patent Application No.
19932926.5, filed July 14, 1999, German Patent Application No. 19932928.1, filed July 14, 1999, German Patent Application No. 19933004.2, filed July 14, 1999, German Patent Application No. 19933005.0, filed July 14, 1999, German Patent Application No.
19933006.9, filed July 14, 1999, German Patent Application No. 19940764.9, filed August 27, 1999, German Patent Application No. 19940765.7, filed August 27, 1999, German Patent Application No. 19940766.5, filed August 27, 1999, German Patent Application No. 19940832.7, filed August 27, 1999, German Patent Application No.
19941378.9, filed August 31, 1999, German Patent Application No. 19941379.7, filed August 31, 1999, German Patent Application No. 19941380.0, filed August 31, 1999, German Patent Application No. 19941394.0, filed August 31, 1999, German Patent Application No. 19941396.7, filed August 31, 1999, German Patent Application No.
19942076.9, filed September 3, 1999, German Patent Application No. 19942077.7, filed September 3, 1999, German Patent Application No. 19942079.3, filed September 3, 1999, German Patent Application No. 19942086.6, filed September 3, 1999, German Patent Application No. 19942087.4, filed September 3, 1999, German Patent Application No. 19942088.2, filed September 3, 1999, German Patent Application No.
19942095.5, filed September 3, 1999, German Patent Application No. 19942124.2, filed September 3, 1999, and German Patent Application No. 19942129.3, filed September 3, 1999. The entire contents of all of the aforementioned applications are hereby expressly incorporated herein by this reference.
Background of the Invention
Certain products and by-products of naturally-occurring metabolic processes in cells have utility in a wide array of industries, including the food, feed, cosmetics, and pharmaceutical industries. These molecules, collectively termed 'fine chemicals', include organic acids, both proteinogenic and non-proteinogenic amino acids, nucleotides and nucleosides, lipids and fatty acids, diols, carbohydrates, aromatic compounds, vitamins and cofactors, and enzymes. Their production is most conveniently performed through large-scale culture of bacteria developed to produce and secrete large quantities of a particular desired molecule. One particularly useful organism for this purpose is Corynebacterium glutamicum, a gram positive, nonpathogenic bacterium. Through strain selection, a number of mutant strains have been developed which produce an array of desirable compounds. However, selection of strains improved for the production of a particular molecule is a time-consuming and difficult process.
Summary of the Invention
The invention provides novel bacterial nucleic acid molecules which have a variety of uses. These uses include the identification of microorganisms which can be used to produce fine chemicals, the modulation of fine chemical production in C. glutamicum or related bacteria, the typing or identification of C. glutamicum or related bacteria, as reference points for mapping the C. glutamicum genome, and as markers for transformation. These novel nucleic acid molecules encode proteins, referred to herein as metabolic pathway (MP) proteins.
C. glutamicum is a gram positive, aerobic bacterium which is commonly used in industry for the large-scale production of a variety of fine chemicals, and also for the degradation of hydrocarbons (such as in petroleum spills) and for the oxidation of terpenoids. The MP nucleic acid molecules of the invention, therefore, can be used to identify microorganisms which can be used to produce fine chemicals, e.g., by fermentation processes. Modulation of the expression of the MP nucleic acids of the invention, or modification of the sequence of the MP nucleic acid molecules of the invention, can be used to modulate the production of one or more fine chemicals from a microorganism (e.g., to improve the yield or production of one or more fine chemicals from a Corynebacterium or Brevibacterium species).
The MP nucleic acids of the invention may also be used to identify an organism as being Corynebacterium glutamicum or a close relative thereof, or to identify the presence of C. glutamicum or a relative thereof in a mixed population of microorganisms. The invention provides the nucleic acid sequences of a number of C. glutamicum genes; by probing the extracted genomic DNA of a culture of a unique or mixed population of microorganisms under stringent conditions with a probe spanning a region of a C. glutamicum gene which is unique to this organism, one can ascertain whether this organism is present. Although Corynebacterium glutamicum itself is nonpathogenic, it is related to species pathogenic in humans, such as Corynebacterium diphtheriae (the causative agent of diphtheria); the detection of such organisms is of significant clinical relevance.
The MP nucleic acid molecules of the invention may also serve as reference points for mapping of the C. glutamicum genome, or of genomes of related organisms.
Similarly, these molecules, or variants or portions thereof, may serve as markers for genetically engineered Corynebacterium or Brevibacterium species.
The MP proteins encoded by the novel nucleic acid molecules of the invention are capable of, for example, performing an enzymatic step involved in the metabolism of certain fine chemicals, including amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, and trehalose. Given the availability of cloning vectors for use in Corynebacterium glutamicum, such as those disclosed in Sinskey et al., U.S. Patent No. 4,649,119, and techniques for genetic manipulation of C. glutamicum and the related Brevibacterium species (e.g., lactofermentum) (Yoshihama et al., J. Bacteriol. 162: 591-597 (1985); Katsumata et al., J. Bacteriol. 159: 306-311 (1984); and Santamaria et al., J. Gen. Microbiol. 130: 2237-2246 (1984)), the nucleic acid molecules of the invention may be utilized in the genetic engineering of this organism to make it a better or more efficient producer of one or more fine chemicals.
This improved production or efficiency of production of a fine chemical may be due to a direct effect of manipulation of a gene of the invention, or it may be due to an indirect effect of such manipulation. Specifically, alterations in C. glutamicum metabolic pathways for amino acids, vitamins, cofactors, nucleotides, and trehalose may have a direct impact on the overall production of one or more of these desired compounds from this organism. For example, optimizing the activity of a lysine biosynthetic pathway protein or decreasing the activity of a lysine degradative pathway protein may result in an increase in the yield or efficiency of production of lysine from such an engineered organism. Alterations in the proteins involved in these metabolic pathways may also have an indirect impact on the production or efficiency of production of a desired fine chemical. For example, a reaction which is in competition for an intermediate necessary for the production of a desired molecule may be eliminated, or a pathway necessary for the production of a particular intermediate for a desired compound may be optimized. Further, modulations in the biosynthesis or degradation of, for example, an amino acid, a vitamin, or a nucleotide may increase the overall ability of the microorganism to rapidly grow and divide, thus increasing the number and/or production capacities of the microorganism in culture and thereby increasing the possible yield of the desired fine chemical.
The nucleic acid and protein molecules of the invention may be utilized to directly improve the production or efficiency of production of one or more desired fine chemicals from Corynebacterium glutamicum. Using recombinant genetic techniques well known in the art, one or more of the biosynthetic or degradative enzymes of the invention for amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, or trehalose may be manipulated such that its function is modulated. For example, a biosynthetic enzyme may be improved in efficiency, or its allosteric control region destroyed such that feedback inhibition of production of the compound is prevented.
Similarly, a degradative enzyme may be deleted or modified by substitution, deletion, or addition such that its degradative activity is lessened for the desired compound without impairing the viability of the cell. In each case, the overall yield or rate of production of the desired fine chemical may be increased.
It is also possible that such alterations in the protein and nucleotide molecules of the invention may improve the production of other fine chemicals besides the amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, and trehalose through indirect mechanisms. Metabolism of any one compound is necessarily intertwined with other biosynthetic and degradative pathways within the cell, and necessary cofactors, intermediates, or substrates in one pathway are likely supplied or limited by another such pathway. Therefore, by modulating the activity of one or more of the proteins of the invention, the production or efficiency of activity of another fine chemical biosynthetic or degradative pathway may be impacted. For example, amino acids serve as the structural units of all proteins, yet may be present intracellularly in levels which are limiting for protein synthesis; therefore, by increasing the efficiency of production or the yields of one or more amino acids within the cell, proteins, such as biosynthetic or degradative proteins, may be more readily synthesized. Likewise, an alteration in a metabolic pathway enzyme such that a particular side reaction becomes more or less favored may result in the over- or under-production of one or more compounds which are utilized as intermediates or substrates for the production of a desired fine chemical.
This invention provides novel nucleic acid molecules which encode proteins, referred to herein as metabolic pathway (MP) proteins, which are capable of, for example, performing an enzymatic step involved in the metabolism of molecules important for the normal functioning of cells, such as amino acids, vitamins, cofactors, nucleotides and nucleosides, or trehalose. Nucleic acid molecules encoding an MP protein are referred to herein as MP nucleic acid molecules. In a preferred embodiment, the MP protein performs an enzymatic step related to the metabolism of one or more of the following: amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, and trehalose. Examples of such proteins include those encoded by the genes set forth in Table 1.
The following embodiments of the invention, the subject of this application, are specifically disclosed herein:
An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:449, or a complement thereof.
An isolated nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:450, or a complement thereof.
An isolated nucleic acid molecule which encodes a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:450, or a complement thereof.
An isolated nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to the entire nucleotide sequence of SEQ ID NO:449, or a complement thereof.
An isolated nucleic acid molecule comprising a fragment of at least contiguous nucleotides of the nucleotide sequence of SEQ ID NO:449, or a complement thereof.
An isolated nucleic acid molecule which encodes a polypeptide comprising an amino acid sequence which is at least 50% identical to the entire amino acid sequence of SEQ ID NO:450, or a complement thereof.
An isolated nucleic acid molecule comprising the nucleic acid molecule of any one of claims 1-6 and a nucleotide sequence encoding a heterologous polypeptide.
An isolated polypeptide comprising the amino acid sequence of SEQ ID NO:450.
An isolated polypeptide comprising a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:450.
An isolated polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to the entire nucleotide sequence of SEQ ID NO:449.
An isolated polypeptide comprising an amino acid sequence which is at least 50% identical to the entire amino acid sequence of SEQ ID NO:450.
An isolated polypeptide comprising a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO:450, wherein said polypeptide fragment maintains a biological activity of the polypeptide comprising the amino acid sequence of SEQ ID NO:450.
An isolated polypeptide comprising an amino acid sequence which is encoded by a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:449.
A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the nucleic acid molecule is disrupted.
A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the nucleic acid molecule comprises one or more nucleic acid modifications as compared to the sequence of SEQ ID NO:449.
A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the regulatory region of the nucleic acid molecule is modified relative to the wild-type regulatory region of the molecule.
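Several of the embodiments above turn on a sequence being "at least 50% identical" to an entire reference sequence. As an illustration only, the following Python sketch shows one common way such a figure can be computed from two sequences that have already been aligned. The function names, the use of '-' as a gap character, and the choice of the full alignment length as the denominator are all assumptions made for this example; the specification may define the calculation differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption for this sketch: '-' marks a gap, and the denominator is
    the full alignment length (other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # Count columns where both sequences carry the same non-gap residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this convention, a candidate aligned against a reference either meets or fails the 50% figure; the alignment itself would have to come from a separate alignment tool.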
Accordingly, one aspect of the invention pertains to isolated nucleic acid molecules (e.g., cDNAs, DNAs, or RNAs) comprising a nucleotide sequence encoding an MP protein or biologically active portions thereof, as well as nucleic acid fragments suitable as primers or hybridization probes for the detection or amplification of MP-encoding nucleic acid (e.g., DNA or mRNA). In particularly preferred embodiments, the isolated nucleic acid molecule comprises one of the nucleotide sequences set forth as the odd-numbered SEQ ID NOs in the Sequence Listing (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, ...), or the coding region or a complement thereof of one of these nucleotide sequences. In other particularly preferred embodiments, the isolated nucleic acid molecule of the invention comprises a nucleotide sequence which hybridizes to or is at least about 50%, preferably at least about 60%, more preferably at least about 70%, 80% or 90%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more homologous to a nucleotide sequence set forth as an odd-numbered SEQ ID NO in the Sequence Listing (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, ...), or a portion thereof. In other preferred embodiments, the isolated nucleic acid molecule encodes one of the amino acid sequences set forth as an even-numbered SEQ ID NO in the Sequence Listing (e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, ...). The preferred MP proteins of the present invention also preferably possess at least one of the MP activities described herein.
In another embodiment, the isolated nucleic acid molecule encodes a protein or portion thereof wherein the protein or portion thereof includes an amino acid sequence which is sufficiently homologous to an amino acid sequence of the invention (e.g., a sequence having an even-numbered SEQ ID NO in the Sequence Listing), e.g., sufficiently homologous to an amino acid sequence of the invention such that the protein or portion thereof maintains an MP activity. Preferably, the protein or portion thereof encoded by the nucleic acid molecule maintains the ability to perform an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway. In one embodiment, the protein encoded by the nucleic acid molecule is at least about 50%, preferably at least about 60%, and more preferably at least about 70%, 80%, or 90% and most preferably at least about 95%, 96%, 97%, 98%, or 99% or more homologous to an amino acid sequence of the invention (e.g., an entire amino acid sequence selected from those having an even-numbered SEQ ID NO in the Sequence Listing). In another preferred embodiment, the protein is a full length C. glutamicum protein which is substantially homologous to an entire amino acid sequence of the invention (encoded by an open reading frame shown in the corresponding odd-numbered SEQ ID NOs in the Sequence Listing (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, ...)). In another preferred embodiment, the isolated nucleic acid molecule is derived from C. glutamicum and encodes a protein (e.g., an MP fusion protein) which includes a biologically active domain which is at least about 50% or more homologous to one of the amino acid sequences of the invention (e.g., a sequence of one of the even-numbered SEQ ID NOs in the Sequence Listing) and is able to catalyze a reaction in a metabolic pathway for an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose, or one or more of the activities set forth in Table 1, and which also includes heterologous nucleic acid sequences encoding a heterologous polypeptide or regulatory regions.
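The text pairs each odd-numbered SEQ ID NO (a nucleotide sequence) with a corresponding even-numbered SEQ ID NO (the encoded amino acid sequence), as in SEQ ID NO:449 and SEQ ID NO:450. A minimal Python sketch of that numbering convention follows. It assumes that the corresponding protein entry is always the next number after its nucleotide entry, and it takes the upper bound of 1156 from the range of SEQ ID NOs mentioned in the text; the function names are hypothetical and serve only as illustration.

```python
MAX_SEQ_ID = 1156  # upper bound of the SEQ ID NO range stated in the text


def is_nucleotide_seq_id(seq_id: int) -> bool:
    """Odd-numbered SEQ ID NOs are nucleotide sequences in this listing."""
    if not 1 <= seq_id <= MAX_SEQ_ID:
        raise ValueError(f"SEQ ID NO must be between 1 and {MAX_SEQ_ID}")
    return seq_id % 2 == 1


def paired_protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the even-numbered SEQ ID NO assumed to hold the encoded protein.

    Assumption: the protein entry directly follows its nucleotide entry
    (n -> n + 1), consistent with the 449/450 pair cited in the text.
    """
    if not is_nucleotide_seq_id(nucleotide_seq_id):
        raise ValueError("expected an odd-numbered (nucleotide) SEQ ID NO")
    return nucleotide_seq_id + 1
```

For example, `paired_protein_seq_id(449)` gives 450, matching the pair used in the embodiments listed earlier.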
In another embodiment, the isolated nucleic acid molecule is at least nucleotides in length and hybridizes under stringent conditions to a nucleic acid molecule comprising a nucleotide sequence of the invention (e.g., a sequence of an odd-numbered SEQ ID NO in the Sequence Listing). Preferably, the isolated nucleic acid molecule corresponds to a naturally-occurring nucleic acid molecule. More preferably, the isolated nucleic acid encodes a naturally-occurring C. glutamicum MP protein, or a biologically active portion thereof.
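Hybridization and the recurring phrase "or a complement thereof" both rest on base pairing between complementary strands. As a rough illustration, the Python sketch below derives the reverse complement of a DNA sequence written 5' to 3'. It assumes plain A/C/G/T input and leaves any other character unchanged; it says nothing about stringency conditions, which depend on experimental parameters not modeled here.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'.

    Characters other than A, C, G, T (either case) pass through unchanged;
    this is a simplification for the sketch.
    """
    return dna.translate(_COMPLEMENT)[::-1]
```

A probe with the sequence returned by `reverse_complement(target)` would, in principle, pair base for base with `target`.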
Another aspect of the invention pertains to vectors, e.g., recombinant expression vectors, containing the nucleic acid molecules of the invention, and host cells into which such vectors have been introduced. In one embodiment, such a host cell is used to produce an MP protein by culturing the host cell in a suitable medium. The MP protein can be then isolated from the medium or the host cell.
Yet another aspect of the invention pertains to a genetically altered microorganism in which an MP gene has been introduced or altered. In one embodiment, the genome of the microorganism has been altered by introduction of a nucleic acid molecule of the invention encoding wild-type or mutated MP sequence as a transgene. In another embodiment, an endogenous MP gene within the genome of the microorganism has been altered, e.g., functionally disrupted, by homologous recombination with an altered MP gene. In another embodiment, an endogenous or introduced MP gene in a microorganism has been altered by one or more point mutations, deletions, or inversions, but still encodes a functional MP protein. In still another embodiment, one or more of the regulatory regions (e.g., a promoter, repressor, or inducer) of an MP gene in a microorganism has been altered (e.g., by deletion, truncation, inversion, or point mutation) such that the expression of the MP gene is modulated. In a preferred embodiment, the microorganism belongs to the genus Corynebacterium or Brevibacterium, with Corynebacterium glutamicum being particularly preferred. In a preferred embodiment, the microorganism is also utilized for the production of a desired compound, such as an amino acid, with lysine being particularly preferred.
In another aspect, the invention provides a method of identifying the presence or activity of Corynebacterium diphtheriae in a subject. This method includes detection of one or more of the nucleic acid or amino acid sequences of the invention (e.g., the sequences set forth in the Sequence Listing as SEQ ID NOs 1 through 1156) in a subject, thereby detecting the presence or activity of Corynebacterium diphtheriae in the subject.
Still another aspect of the invention pertains to an isolated MP protein or a portion, e.g., a biologically active portion, thereof. In a preferred embodiment, the isolated MP protein or portion thereof can catalyze an enzymatic reaction involved in one or more pathways for the metabolism of an amino acid, a vitamin, a cofactor, a nutraceutical, a nucleotide, a nucleoside, or trehalose. In another preferred embodiment, the isolated MP protein or portion thereof is sufficiently homologous to an amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO in the Sequence Listing) such that the protein or portion thereof maintains the ability to catalyze an enzymatic reaction involved in one or more pathways for the metabolism of an amino acid, a vitamin, a cofactor, a nutraceutical, a nucleotide, a nucleoside, or trehalose.
The invention also provides an isolated preparation of an MP protein. In preferred embodiments, the MP protein comprises an amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO of the Sequence Listing). In another preferred embodiment, the invention pertains to an isolated full length protein which is substantially homologous to an entire amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO of the Sequence Listing) (encoded by an open reading frame set forth in a corresponding odd-numbered SEQ ID NO of the Sequence Listing). In yet another embodiment, the protein is at least about 50%, preferably at least about 60%, and more preferably at least about 70%, 80%, or 90%, and most preferably at least about 95%, 96%, 97%, 98%, or 99% or more homologous to an entire amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO of the Sequence Listing). In other embodiments, the isolated MP protein comprises an amino acid sequence which is at least about 50% or more homologous to one of the amino acid sequences of the invention (e.g., a sequence of an even-numbered SEQ ID NO of the Sequence Listing) and is able to catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway, or has one or more of the activities set forth in Table 1.
Alternatively, the isolated MP protein can comprise an amino acid sequence which is encoded by a nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, or is at least about 50%, preferably at least about 60%, more preferably at least about 70%, 80%, or 90%, and even more preferably at least about 96%, 97%, or 99% or more homologous to a nucleotide sequence of one of the even-numbered SEQ ID NOs set forth in the Sequence Listing. It is also preferred that the preferred forms of MP proteins also have one or more of the MP bioactivities described herein.
The MP polypeptide, or a biologically active portion thereof, can be operatively linked to a non-MP polypeptide to form a fusion protein. In preferred embodiments, this fusion protein has an activity which differs from that of the MP protein alone. In other preferred embodiments, this fusion protein, when introduced into a C. glutamicum pathway for the metabolism of an amino acid, vitamin, cofactor, nutraceutical, results in increased yields and/or efficiency of production of a desired fine chemical from C. glutamicum. In particularly preferred embodiments, integration of this fusion protein into an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway of a host cell modulates production of a desired compound from the cell.
In another aspect, the invention provides methods for screening molecules which modulate the activity of an MP protein, either by interacting with the protein itself or a substrate or binding partner of the MP protein, or by modulating the transcription or translation of an MP nucleic acid molecule of the invention.
Another aspect of the invention pertains to a method for producing a fine chemical. This method involves the culturing of a cell containing a vector directing the expression of an MP nucleic acid molecule of the invention, such that a fine chemical is produced. In a preferred embodiment, this method further includes the step of obtaining a cell containing such a vector, in which a cell is transfected with a vector directing the expression of an MP nucleic acid. In another preferred embodiment, this method further includes the step of recovering the fine chemical from the culture. In a particularly preferred embodiment, the cell is from the genus Corynebacterium or Brevibacterium, or is selected from those strains set forth in Table 3.
Another aspect of the invention pertains to methods for modulating production of a molecule from a microorganism. Such methods include contacting the cell with an agent which modulates MP protein activity or MP nucleic acid expression such that a cell associated activity is altered relative to this same activity in the absence of the agent. In a preferred embodiment, the cell is modulated for one or more C. glutamicum amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathways, such that the yields or rate of production of a desired fine chemical by this microorganism is improved. The agent which modulates MP protein activity can be an agent which stimulates MP protein activity or MP nucleic acid expression.
Examples of agents which stimulate MP protein activity or MP nucleic acid expression include small molecules, active MP proteins, and nucleic acids encoding MP proteins that have been introduced into the cell. Examples of agents which inhibit MP activity or expression include small molecules, and antisense MP nucleic acid molecules.
Another aspect of the invention pertains to methods for modulating yields of a desired compound from a cell, involving the introduction of a wild-type or mutant MP gene into a cell, either maintained on a separate plasmid or integrated into the genome of the host cell. If integrated into the genome, such integration can be random, or it can take place by homologous recombination such that the native gene is replaced by the introduced copy, causing the production of the desired compound from the cell to be modulated. In a preferred embodiment, said yields are increased. In another preferred embodiment, said chemical is a fine chemical. In a particularly preferred embodiment, said fine chemical is an amino acid. In especially preferred embodiments, said amino acid is L-lysine.
Detailed Description of the Invention
The present invention provides MP nucleic acid and protein molecules which are involved in the metabolism of certain fine chemicals in Corynebacterium glutamicum, including amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, and trehalose. The molecules of the invention may be utilized in the modulation of production of fine chemicals from microorganisms, such as C. glutamicum, either directly (e.g., where modulation of the activity of a lysine biosynthesis protein has a direct impact on the production or efficiency of production of lysine from that organism), or may have an indirect impact which nonetheless results in an increase of yield or efficiency of production of the desired compound (e.g., where modulation of the activity of a nucleotide biosynthesis protein has an impact on the production of an organic acid or a fatty acid from the bacterium, perhaps due to improved growth or an increased supply of necessary co-factors, energy compounds, or precursor molecules).
Aspects of the invention are further explicated below.
I. Fine Chemicals
The term 'fine chemical' is art-recognized and includes molecules produced by an organism which have applications in various industries, such as, but not limited to, the pharmaceutical, agriculture, and cosmetics industries. Such compounds include organic acids, such as tartaric acid, itaconic acid, and diaminopimelic acid, both proteinogenic and non-proteinogenic amino acids, purine and pyrimidine bases, nucleosides, and nucleotides (as described e.g. in Kuninaka, A. (1996) Nucleotides and related compounds, p. 561-612, in Biotechnology vol. 6, Rehm et al., eds. VCH: Weinheim, and references contained therein), lipids, both saturated and unsaturated fatty acids (e.g., arachidonic acid), diols (e.g., propane diol and butane diol), carbohydrates (e.g., hyaluronic acid and trehalose), aromatic compounds (e.g., aromatic amines, vanillin, and indigo), vitamins and cofactors (as described in Ullmann's Encyclopedia of Industrial Chemistry, vol. A27, "Vitamins", p. 443-613 (1996) VCH: Weinheim and references therein; and Ong, Niki, E. Packer, L. (1995) "Nutrition, Lipids, Health, and Disease" Proceedings of the UNESCO/Confederation of Scientific and Technological Associations in Malaysia, and the Society for Free Radical Research Asia, held Sept. 1-3, 1994 at Penang, Malaysia, AOCS Press, (1995)), enzymes, polyketides (Cane et al. (1998) Science 282: 63-68), and all other chemicals described in Gutcho (1983) Chemicals by Fermentation, Noyes Data Corporation, ISBN: 0818805086 and references therein. The metabolism and uses of certain of these fine chemicals are further explicated below.
A. Amino Acid Metabolism and Uses
Amino acids comprise the basic structural units of all proteins, and as such are essential for normal cellular functioning in all organisms. The term "amino acid" is art-recognized. The proteinogenic amino acids, of which there are 20 species, serve as structural units for proteins, in which they are linked by peptide bonds, while the nonproteinogenic amino acids (hundreds of which are known) are not normally found in proteins (see Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 57-97 VCH: Weinheim (1985)). Amino acids may be in the D- or L- optical configuration, though L-amino acids are generally the only type found in naturally-occurring proteins.
Biosynthetic and degradative pathways of each of the 20 proteinogenic amino acids have been well characterized in both prokaryotic and eukaryotic cells (see, for example, Stryer, L. Biochemistry, 3rd edition, pages 578-590 (1988)). The 'essential' amino acids (histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine), so named because they are generally a nutritional requirement due to the complexity of their biosyntheses, are readily converted by simple biosynthetic pathways to the remaining 11 'nonessential' amino acids (alanine, arginine, asparagine, aspartate, cysteine, glutamate, glutamine, glycine, proline, serine, and tyrosine). Higher animals do retain the ability to synthesize some of these amino acids, but the essential amino acids must be supplied from the diet in order for normal protein synthesis to occur.
Aside from their function in protein biosynthesis, these amino acids are interesting chemicals in their own right, and many have been found to have various applications in the food, feed, chemical, cosmetics, agriculture, and pharmaceutical industries. Lysine is an important amino acid in the nutrition not only of humans, but also of monogastric animals such as poultry and swine. Glutamate is most commonly used as a flavor additive (mono-sodium glutamate, MSG) and is widely used throughout the food industry, as are aspartate, phenylalanine, glycine, and cysteine. Glycine, L-methionine and tryptophan are all utilized in the pharmaceutical industry. Glutamine, valine, leucine, isoleucine, histidine, arginine, proline, serine and alanine are of use in both the pharmaceutical and cosmetics industries. Threonine, tryptophan, and D/L-methionine are common feed additives (Leuchtenberger, W. (1996) Amino acids: technical production and use, p. 466-502 in Rehm et al. (eds.) Biotechnology vol. 6, chapter 14a, VCH: Weinheim). Additionally, these amino acids have been found to be useful as precursors for the synthesis of synthetic amino acids and proteins, such as N-acetylcysteine, S-carboxymethyl-L-cysteine, (S)-5-hydroxytryptophan, and others described in Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 57-97, VCH: Weinheim, 1985.
The biosynthesis of these natural amino acids in organisms capable of producing them, such as bacteria, has been well characterized (for review of bacterial amino acid biosynthesis and regulation thereof, see Umbarger, H.E. (1978) Ann. Rev. Biochem. 47: 533-606). Glutamate is synthesized by the reductive amination of α-ketoglutarate, an intermediate in the citric acid cycle. Glutamine, proline, and arginine are each subsequently produced from glutamate. The biosynthesis of serine is a three-step process beginning with 3-phosphoglycerate (an intermediate in glycolysis), and resulting in this amino acid after oxidation, transamination, and hydrolysis steps. Both cysteine and glycine are produced from serine; the former by the condensation of homocysteine with serine, and the latter by the transferal of the side-chain β-carbon atom to tetrahydrofolate, in a reaction catalyzed by serine transhydroxymethylase. Phenylalanine and tyrosine are synthesized from the glycolytic and pentose phosphate pathway precursors erythrose 4-phosphate and phosphoenolpyruvate in a 9-step biosynthetic pathway that differs only at the final two steps after synthesis of prephenate. Tryptophan is also produced from these two initial molecules, but its synthesis is an 11-step pathway. Tyrosine may also be synthesized from phenylalanine, in a reaction catalyzed by phenylalanine hydroxylase. Alanine, valine, and leucine are all biosynthetic products of pyruvate, the final product of glycolysis. Aspartate is formed from oxaloacetate, an intermediate of the citric acid cycle. Asparagine, methionine, threonine, and lysine are each produced by the conversion of aspartate. Isoleucine is formed from threonine. A complex 9-step pathway results in the production of histidine from 5-phosphoribosyl-1-pyrophosphate, an activated sugar.
Amino acids in excess of the protein synthesis needs of the cell cannot be stored, and are instead degraded to provide intermediates for the major metabolic pathways of the cell (for review see Stryer, L. Biochemistry 3rd ed. Ch. 21 "Amino Acid Degradation and the Urea Cycle" p. 495-516 (1988)). Although the cell is able to convert unwanted amino acids into useful metabolic intermediates, amino acid production is costly in terms of energy, precursor molecules, and the enzymes necessary to synthesize them. Thus it is not surprising that amino acid biosynthesis is regulated by feedback inhibition, in which the presence of a particular amino acid serves to slow or entirely stop its own production (for overview of feedback mechanisms in amino acid biosynthetic pathways, see Stryer, L. Biochemistry, 3rd ed. Ch. 24: "Biosynthesis of Amino Acids and Heme" p. 575-600 (1988)). Thus, the output of any particular amino acid is limited by the amount of that amino acid present in the cell.
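The effect of feedback inhibition on the attainable level of an amino acid can be illustrated with a toy calculation. The sketch below is not taken from the specification: it assumes a simple non-competitive inhibition term, first-order consumption of the product, and hypothetical parameter names (v_max, k_i, k_deg, hill) chosen only for illustration.

```python
def steady_state_level(v_max, k_i, k_deg, hill=1.0, dt=0.01, steps=20000):
    """Integrate dP/dt = v_max / (1 + (P / k_i)**hill) - k_deg * P by Euler steps.

    Assumed toy model (not from the specification):
      v_max  maximal synthesis rate of the amino acid P
      k_i    inhibition constant; float("inf") models a feedback-resistant enzyme
      k_deg  first-order consumption/degradation rate constant
      hill   cooperativity of the inhibition
    Returns the level of P after steps * dt time units.
    """
    p = 0.0
    for _ in range(steps):
        synthesis = v_max / (1.0 + (p / k_i) ** hill)
        p += dt * (synthesis - k_deg * p)
    return p


if __name__ == "__main__":
    inhibited = steady_state_level(v_max=1.0, k_i=1.0, k_deg=0.1)
    deregulated = steady_state_level(v_max=1.0, k_i=float("inf"), k_deg=0.1)
    print(f"with feedback inhibition:    {inhibited:.3f}")
    print(f"without feedback inhibition: {deregulated:.3f}")
```

Under these assumed parameters the feedback-inhibited pathway settles at a markedly lower product level than the deregulated one, which is the qualitative point made in the text.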
B. Vitamin, Cofactor, and Nutraceutical Metabolism and Uses
Vitamins, cofactors, and nutraceuticals comprise another group of molecules which the higher animals have lost the ability to synthesize and so must ingest, although they are readily synthesized by other organisms, such as bacteria. These molecules are either bioactive substances themselves, or are precursors of biologically active substances which may serve as electron carriers or intermediates in a variety of metabolic pathways. Aside from their nutritive value, these compounds also have significant industrial value as coloring agents, antioxidants, and catalysts or other processing aids. (For an overview of the structure, activity, and industrial applications of these compounds, see, for example, Ullmann's Encyclopedia of Industrial Chemistry, "Vitamins" vol. A27, p. 443-613, VCH: Weinheim, 1996.) The term "vitamin" is art-recognized, and includes nutrients which are required by an organism for normal functioning, but which that organism cannot synthesize by itself. The group of vitamins may encompass cofactors and nutraceutical compounds. The language "cofactor" includes nonproteinaceous compounds required for a normal enzymatic activity to occur. Such compounds may be organic or inorganic; the cofactor molecules of the invention are preferably organic. The term "nutraceutical" includes dietary supplements having health benefits in plants and animals, particularly humans. Examples of such molecules are vitamins, antioxidants, and also certain lipids (e.g., polyunsaturated fatty acids).
The biosynthesis of these molecules in organisms capable of producing them, such as bacteria, has been largely characterized (Ullmann's Encyclopedia of Industrial Chemistry, "Vitamins" vol. A27, p. 443-613, VCH: Weinheim, 1996; Michal, G. (1999) Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley & Sons; Ong, Niki, E. Packer, L. (1995) "Nutrition, Lipids, Health, and Disease" Proceedings of the UNESCO/Confederation of Scientific and Technological Associations in Malaysia, and the Society for Free Radical Research Asia, held Sept. 1-3, 1994 at Penang, Malaysia, AOCS Press: Champaign, IL X, 374 S).
Thiamin (vitamin B1) is produced by the chemical coupling of pyrimidine and thiazole moieties. Riboflavin (vitamin B2) is synthesized from guanosine-5'-triphosphate (GTP) and ribose-5'-phosphate. Riboflavin, in turn, is utilized for the synthesis of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD). The family of compounds collectively termed 'vitamin B6' (e.g., pyridoxine, pyridoxamine, pyridoxal-5'-phosphate, and the commercially used pyridoxine hydrochloride) are all derivatives of the common structural unit, 5-hydroxy-6-methylpyridine. Pantothenate (pantothenic acid, (R)-(+)-N-(2,4-dihydroxy-3,3-dimethyl-1-oxobutyl)-β-alanine) can be produced either by chemical synthesis or by fermentation. The final steps in pantothenate biosynthesis consist of the ATP-driven condensation of β-alanine and pantoic acid. The enzymes responsible for the biosynthesis steps for the conversion to pantoic acid, to β-alanine and for the condensation to pantothenic acid are known. The metabolically active form of pantothenate is Coenzyme A, for which the biosynthesis proceeds in 5 enzymatic steps. Pantothenate, pyridoxal-5'-phosphate, cysteine and ATP are the precursors of Coenzyme A. These enzymes not only catalyze the formation of pantothenate, but also the production of (R)-pantoic acid, (R)-pantolactone, panthenol (provitamin B5), pantetheine (and its derivatives) and coenzyme A.
Biotin biosynthesis from the precursor molecule pimeloyl-CoA in microorganisms has been studied in detail and several of the genes involved have been identified. Many of the corresponding proteins have been found to also be involved in Fe-cluster synthesis and are members of the nifS class of proteins. Lipoic acid is derived from octanoic acid, and serves as a coenzyme in energy metabolism, where it becomes part of the pyruvate dehydrogenase complex and the α-ketoglutarate dehydrogenase complex. The folates are a group of substances which are all derivatives of folic acid, which in turn is derived from L-glutamic acid, p-amino-benzoic acid and 6-methylpterin. The biosynthesis of folic acid and its derivatives, starting from the metabolism intermediates guanosine-5'-triphosphate (GTP), L-glutamic acid and p-amino-benzoic acid, has been studied in detail in certain microorganisms.
Corrinoids (such as the cobalamines and particularly vitamin B12) and porphyrins belong to a group of chemicals characterized by a tetrapyrrole ring system. The biosynthesis of vitamin B12 is sufficiently complex that it has not yet been completely characterized, but many of the enzymes and substrates involved are now known. Nicotinic acid (nicotinate) and nicotinamide are pyridine derivatives which are also termed 'niacin'. Niacin is the precursor of the important coenzymes NAD (nicotinamide adenine dinucleotide) and NADP (nicotinamide adenine dinucleotide phosphate) and their reduced forms.
The large-scale production of these compounds has largely relied on cell-free chemical syntheses, though some of these chemicals, such as riboflavin, Vitamin B6, pantothenate, and biotin, have also been produced by large-scale culture of microorganisms. Only Vitamin B12 is produced solely by fermentation, due to the complexity of its synthesis. In vitro methodologies require significant inputs of materials and time, often at great cost.
C. Purine, Pyrimidine, Nucleoside and Nucleotide Metabolism and Uses
Purine and pyrimidine metabolism genes and their corresponding proteins are important targets for the therapy of tumor diseases and viral infections. The language "purine" or "pyrimidine" includes the nitrogenous bases which are constituents of nucleic acids, co-enzymes, and nucleotides. The term "nucleotide" includes the basic structural units of nucleic acid molecules, which are comprised of a nitrogenous base, a pentose sugar (in the case of RNA, the sugar is ribose; in the case of DNA, the sugar is D-deoxyribose), and phosphoric acid. The language "nucleoside" includes molecules which serve as precursors to nucleotides, but which are lacking the phosphoric acid moiety that nucleotides possess. By inhibiting the biosynthesis of these molecules, or their mobilization to form nucleic acid molecules, it is possible to inhibit RNA and DNA synthesis; by inhibiting this activity in a fashion targeted to cancerous cells, the ability of tumor cells to divide and replicate may be inhibited. Additionally, there are nucleotides which do not form nucleic acid molecules, but rather serve as energy stores (e.g., AMP) or as coenzymes (e.g., FAD and NAD).
Several publications have described the use of these chemicals for these medical indications, by influencing purine and/or pyrimidine metabolism (e.g., Christopherson, R.I. and Lyons, S.D. (1990) "Potent inhibitors of de novo pyrimidine and purine biosynthesis as chemotherapeutic agents." Med. Res. Reviews 10: 505-548). Studies of enzymes involved in purine and pyrimidine metabolism have been focused on the development of new drugs which can be used, for example, as immunosuppressants or anti-proliferants (Smith, (1995) "Enzymes in nucleotide synthesis." Curr. Opin. Struct. Biol. 5: 752-757; (1995) Biochem Soc. Transact. 23: 877-902). However, purine and pyrimidine bases, nucleosides and nucleotides have other utilities: as intermediates in the biosynthesis of several fine chemicals (e.g., thiamine, S-adenosyl-methionine, folates, or riboflavin), as energy carriers for the cell (e.g., ATP or GTP), and as chemicals themselves, commonly used as flavor enhancers (e.g., IMP or GMP) or for several medicinal applications (see, for example, Kuninaka, A. (1996) Nucleotides and Related Compounds in Biotechnology vol. 6, Rehm et al., eds. VCH: Weinheim, p. 561-612). Also, enzymes involved in purine, pyrimidine, nucleoside, or nucleotide metabolism are increasingly serving as targets against which chemicals for crop protection, including fungicides, herbicides and insecticides, are developed.
The metabolism of these compounds in bacteria has been characterized (for reviews see, for example, Zalkin, H. and Dixon, J.E. (1992) "de novo purine nucleotide biosynthesis", in: Progress in Nucleic Acid Research and Molecular Biology, vol. 42, Academic Press, p. 259-287; and Michal, G. (1999) "Nucleotides and Nucleosides", Chapter 8 in: Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, Wiley: New York). Purine metabolism has been the subject of intensive research, and is essential to the normal functioning of the cell. Impaired purine metabolism in higher animals can cause severe disease, such as gout. Purine nucleotides are synthesized from ribose-5-phosphate in a series of steps through the intermediate compound inosine-5'-phosphate (IMP), resulting in the production of guanosine-5'-monophosphate (GMP) or adenosine-5'-monophosphate (AMP), from which the triphosphate forms utilized as nucleotides are readily formed. These compounds are also utilized as energy stores, so their degradation provides energy for many different biochemical processes in the cell.
Pyrimidine biosynthesis proceeds by the formation of uridine-5'-monophosphate (UMP) from ribose-5-phosphate. UMP, in turn, is converted to cytidine-5'-triphosphate (CTP).
The deoxy- forms of all of these nucleotides are produced in a one step reduction reaction from the diphosphate ribose form of the nucleotide to the diphosphate deoxyribose form of the nucleotide. Upon phosphorylation, these molecules are able to participate in DNA synthesis.
D. Trehalose Metabolism and Uses
Trehalose consists of two glucose molecules, bound in α,α-1,1 linkage. It is commonly used in the food industry as a sweetener, an additive for dried or frozen foods, and in beverages. However, it also has applications in the pharmaceutical, cosmetics and biotechnology industries (see, for example, Nishimoto et al., (1998) U.S. Patent No. 5,759,610; Singer, M.A. and Lindquist, S. (1998) Trends Biotech. 16: 460-467; Paiva, C.L.A. and Panek, A.D. (1996) Biotech. Ann. Rev. 2: 293-314; and Shiosaka, M. (1997) J. Japan 172: 97-102). Trehalose is produced by enzymes from many microorganisms and is naturally released into the surrounding medium, from which it can be collected using methods known in the art.
II. Elements and Methods of the Invention
The present invention is based, at least in part, on the discovery of novel molecules, referred to herein as MP nucleic acid and protein molecules, which play a role in or function in one or more cellular metabolic pathways. In one embodiment, the MP molecules catalyze an enzymatic reaction involving one or more amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathways. In a preferred embodiment, the activity of the MP molecules of the present invention in one or more C. glutamicum metabolic pathways for amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides or trehalose has an impact on the production of a desired fine chemical by this organism. In a particularly preferred embodiment, the MP molecules of the invention are modulated in activity, such that the C. glutamicum metabolic pathways in which the MP proteins of the invention are involved are modulated in efficiency or output, which either directly or indirectly modulates the production or efficiency of production of a desired fine chemical by C. glutamicum.
The language "MP protein" or "MP polypeptide" includes proteins which play a role in, e.g., catalyze an enzymatic reaction in, one or more amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside or trehalose metabolic pathways.
Examples of MP proteins include those encoded by the MP genes set forth in Table 1 and by the odd-numbered SEQ ID NOs. The terms "MP gene" or "MP nucleic acid sequence" include nucleic acid sequences encoding an MP protein, which consist of a coding region and also corresponding untranslated 5' and 3' sequence regions.
Examples of MP genes include those set forth in Table 1. The terms "production" or "productivity" are art-recognized and include the concentration of the fermentation product (for example, the desired fine chemical) formed within a given time and a given fermentation volume (e.g., kg product per hour per liter). The term "efficiency of production" includes the time required for a particular level of production to be achieved (for example, how long it takes for the cell to attain a particular rate of output of a fine chemical). The term "yield" or "product/carbon yield" is art-recognized and includes the efficiency of the conversion of the carbon source into the product (i.e., fine chemical). This is generally written as, for example, kg product per kg carbon source.
By increasing the yield or production of the compound, the quantity of recovered molecules, or of useful recovered molecules of that compound in a given amount of culture over a given amount of time is increased. The terms "biosynthesis" or a "biosynthetic pathway" are art-recognized and include the synthesis of a compound, preferably an organic compound, by a cell from intermediate compounds in what may be a multistep and highly regulated process. The terms "degradation" or a "degradation pathway" are art-recognized and include the breakdown of a compound, preferably an organic compound, by a cell to degradation products (generally speaking, smaller or less complex molecules) in what may be a multistep and highly regulated process. The language "metabolism" is art-recognized and includes the totality of the biochemical reactions that take place in an organism. The metabolism of a particular compound, then (e.g., the metabolism of an amino acid such as glycine), comprises the overall biosynthetic, modification, and degradation pathways in the cell related to this compound.
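The units given above for "productivity" (kg product per hour per liter) and "yield" (kg product per kg carbon source) can be made concrete with a short calculation. The following is a minimal sketch; the function names and the example figures are hypothetical and are not data from the specification.

```python
def volumetric_productivity(product_kg, hours, volume_l):
    """Return productivity in kg product per hour per liter."""
    if hours <= 0 or volume_l <= 0:
        raise ValueError("hours and volume_l must be positive")
    return product_kg / (hours * volume_l)


def carbon_yield(product_kg, carbon_source_kg):
    """Return product/carbon yield in kg product per kg carbon source."""
    if carbon_source_kg <= 0:
        raise ValueError("carbon_source_kg must be positive")
    return product_kg / carbon_source_kg


if __name__ == "__main__":
    # Hypothetical run: 50 kg product from 125 kg carbon source,
    # formed in 25 hours in a 1000 liter fermentation volume.
    print(volumetric_productivity(50.0, 25.0, 1000.0), "kg/(h*L)")
    print(carbon_yield(50.0, 125.0), "kg product per kg carbon source")
```

The two quantities are independent: a strain can improve its carbon yield without forming product any faster, and vice versa, which is why the text defines them separately.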
In another embodiment, the MP molecules of the invention are capable of modulating the production of a desired molecule, such as a fine chemical, in a microorganism such as C. glutamicum. Using recombinant genetic techniques, one or more of the biosynthetic or degradative enzymes of the invention for amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, or trehalose may be manipulated such that its function is modulated. For example, a biosynthetic enzyme may be improved in efficiency, or its allosteric control region destroyed such that feedback inhibition of production of the compound is prevented. Similarly, a degradative enzyme may be deleted or modified by substitution, deletion, or addition such that its degradative activity is lessened for the desired compound without impairing the viability of the cell. In each case, the overall yield or rate of production of one of these desired fine chemicals may be increased.
It is also possible that such alterations in the protein and nucleotide molecules of the invention may improve the production of other fine chemicals besides the amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, and trehalose.
Metabolism of any one compound is necessarily intertwined with other biosynthetic and degradative pathways within the cell, and necessary cofactors, intermediates, or substrates in one pathway are likely supplied or limited by another such pathway. Therefore, by modulating the activity of one or more of the proteins of the invention, the production or efficiency of activity of another fine chemical biosynthetic or degradative pathway may be impacted. For example, amino acids serve as the structural units of all proteins, yet may be present intracellularly in levels which are limiting for protein synthesis; therefore, by increasing the efficiency of production or the yields of one or more amino acids within the cell, proteins, such as biosynthetic or degradative proteins, may be more readily synthesized. Likewise, an alteration in a metabolic pathway enzyme such that a particular side reaction becomes more or less favored may result in the over- or under-production of one or more compounds which are utilized as intermediates or substrates for the production of a desired fine chemical.
The isolated nucleic acid sequences of the invention are contained within the genome of a Corynebacterium glutamicum strain available through the American Type Culture Collection, given designation ATCC 13032. The nucleotide sequence of the isolated C. glutamicum MP DNAs and the predicted amino acid sequences of the C. glutamicum MP proteins are shown in the Sequence Listing as odd-numbered SEQ ID NOs and even-numbered SEQ ID NOs, respectively. Computational analyses were performed which classified and/or identified these nucleotide sequences as sequences which encode metabolic pathway proteins.
The present invention also pertains to proteins which have an amino acid sequence which is substantially homologous to an amino acid sequence of the invention (e.g., the sequence of an even-numbered SEQ ID NO of the Sequence Listing). As used herein, a protein which has an amino acid sequence which is substantially homologous to a selected amino acid sequence is at least about 50% homologous to the selected amino acid sequence, e.g., the entire selected amino acid sequence. A protein which has an amino acid sequence which is substantially homologous to a selected amino acid sequence can also be at least about 50-60%, preferably at least about 60-70%, and more preferably at least about 70-80%, 80-90%, or 90-95%, and most preferably at least about 96%, 97%, 98%, 99% or more homologous to the selected amino acid sequence.
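One common way to put a number on such percentages is percent identity over an alignment. The sketch below is only an illustration and rests on several assumptions not stated in this passage: the two sequences have already been aligned to equal length by some external tool, gaps are written as "-", gap columns count as mismatches, and the denominator is the full alignment length. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes pre-aligned, equal-length strings with '-' as the gap character.
    Columns containing a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, minimum_percent=50.0):
    """True if the aligned pair reaches the chosen minimum percentage."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    # Hypothetical ten-residue fragments differing at one position.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
    print(meets_threshold("MKTAYIAKQR", "MKTAYLAKQR", minimum_percent=96.0))
```

Different alignment programs and different denominators (shorter sequence, longer sequence, or ungapped columns only) give different percentages, so any threshold comparison should state which convention was used.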
The MP protein or a biologically active portion or fragment thereof of the invention can catalyze an enzymatic reaction in one or more amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathways, or have one or more of the activities set forth in Table 1.
Various aspects of the invention are described in further detail in the following subsections:
A. Isolated Nucleic Acid Molecules
One aspect of the invention pertains to isolated nucleic acid molecules that encode MP polypeptides or biologically active portions thereof, as well as nucleic acid fragments sufficient for use as hybridization probes or primers for the identification or amplification of MP-encoding nucleic acid (e.g., MP DNA). As used herein, the term "nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. This term also encompasses untranslated sequence located at both the 3' and 5' ends of the coding region of the gene: at least about 100 nucleotides of sequence upstream from the 5' end of the coding region and at least about nucleotides of sequence downstream from the 3' end of the coding region of the gene.
The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. An "isolated" nucleic acid molecule is one which is separated from other nucleic acid molecules which are present in the natural source of the nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated MP nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived (e.g., a C. glutamicum cell). Moreover, an "isolated" nucleic acid molecule, such as a DNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.
A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having a nucleotide sequence of an odd-numbered SEQ ID NO of the Sequence Listing, or a portion thereof, can be isolated using standard molecular biology techniques and the sequence information provided herein. For example, a C. glutamicum MP DNA can be isolated from a C. glutamicum library using all or a portion of one of the odd-numbered SEQ ID NO sequences of the Sequence Listing as a hybridization probe and standard hybridization techniques (e.g., as described in Sambrook, Fritsch, E. and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). Moreover, a nucleic acid molecule encompassing all or a portion of one of the nucleic acid sequences of the invention (e.g., an odd-numbered SEQ ID NO: of the Sequence Listing) can be isolated by the polymerase chain reaction using oligonucleotide primers designed based upon this same sequence. For example, mRNA can be isolated from normal endothelial cells (e.g., by the guanidinium-thiocyanate extraction procedure of Chirgwin et al. (1979) Biochemistry 18: 5294-5299) and cDNA can be prepared using reverse transcriptase (e.g., Moloney MLV reverse transcriptase, available from Gibco/BRL, Bethesda, MD; or AMV reverse transcriptase, available from Seikagaku America, Inc., St. Petersburg, FL). Synthetic oligonucleotide primers for polymerase chain reaction amplification can be designed based upon one of the nucleotide sequences shown in the Sequence Listing. A nucleic acid of the invention can be amplified using cDNA or, alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to an MP nucleotide sequence can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.
In a preferred embodiment, an isolated nucleic acid molecule of the invention comprises one of the nucleotide sequences shown in the Sequence Listing. The nucleic acid sequences of the invention, as set forth in the Sequence Listing, correspond to the Corynebacterium glutamicum MP DNAs of the invention. This DNA comprises sequences encoding MP proteins (i.e., the "coding region", indicated in each odd-numbered SEQ ID NO: sequence in the Sequence Listing), as well as 5' untranslated sequences and 3' untranslated sequences, also indicated in each odd-numbered SEQ ID NO: in the Sequence Listing. Alternatively, the nucleic acid molecule can comprise only the coding region of any of the nucleic acid sequences of the Sequence Listing.
For the purposes of this application, it will be understood that each of the nucleic acid and amino acid sequences set forth in the Sequence Listing has an identifying RXA, RXN, RXS, or RXC number having the designation "RXA", "RXN", "RXS", or "RXC" followed by 5 digits (i.e., RXA00007, RXN00023, RXS00116, or RXC00128). Each of the nucleic acid sequences comprises up to three parts: a 5' upstream region, a coding region, and a downstream region. Each of these three regions is identified by the same RXA, RXN, RXS, or RXC designation to eliminate confusion. The recitation "one of the odd-numbered sequences of the Sequence Listing", then, refers to any of the nucleic acid sequences in the Sequence Listing, which may also be distinguished by their differing RXA, RXN, RXS, or RXC designations. The coding region of each of these sequences is translated into a corresponding amino acid sequence, which is also set forth in the Sequence Listing, as an even-numbered SEQ ID NO: immediately following the corresponding nucleic acid sequence. For example, the coding region for RXA02229 is set forth in SEQ ID NO:1, while the amino acid sequence which it encodes is set forth as SEQ ID NO:2. The sequences of the nucleic acid molecules of the invention are identified by the same RXA, RXN, RXS, or RXC designations as the amino acid molecules which they encode, such that they can be readily correlated. For example, the amino acid sequences designated RXA02229, RXN00351, RXS02970, and RXC02390 are translations of the coding regions of the nucleotide sequences of nucleic acid molecules RXA02229, RXN00351, RXS02970, and RXC02390, respectively. The correspondence between the RXA, RXN, RXS, and RXC nucleotide and amino acid sequences of the invention and their assigned SEQ ID NOs is set forth in Table 1.
Several of the genes of the invention are "F-designated genes". An F-designated gene includes those genes set forth in Table 1 which have an "F" in front of the RXA, RXN, RXS, or RXC designation. For example, SEQ ID NO:5, designated, as indicated on Table 1, as "F RXA01009", is an F-designated gene, as are SEQ ID NOs: 73, 75, and 77 (designated on Table 1 as "F RXA00007", "F RXA00364", and "F RXA00367", respectively).
In one embodiment, the nucleic acid molecules of the present invention are not intended to include those C. glutamicum sequences compiled in Table 2. In the case of the dapD gene, a sequence for this gene was published in Wehrmann, et al. (1998) J. Bacteriol. 180(12): 3159-3165. However, the sequence obtained by the inventors of the present application is significantly longer than the published version. It is believed that the published version relied on an incorrect start codon, and thus represents only a fragment of the actual coding region.
In another preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule which is a complement of one of the nucleotide sequences of the invention (e.g., a sequence of an odd-numbered SEQ ID NO: of the Sequence Listing), or a portion thereof. A nucleic acid molecule which is complementary to one of the nucleotide sequences of the invention is one which is sufficiently complementary to one of the nucleotide sequences shown in the Sequence Listing (e.g., the sequence of an odd-numbered SEQ ID NO:) such that it can hybridize to one of the nucleotide sequences of the invention, thereby forming a stable duplex.
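As a rough illustration of what "complement" means at the sequence level, the following Python sketch computes the reverse complement of a DNA strand and checks whether two strands are exact complements. It is a simplification: the text above only requires that strands be "sufficiently complementary" to form a stable duplex, which this exact-match check does not model. The function names and the example sequence are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_exact_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(reverse_complement("ATGC"))           # GCAT
print(is_exact_complement("ATGC", "GCAT"))  # True
```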
In still another preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleotide sequence which is at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more homologous to a nucleotide sequence of the invention (e.g., a sequence of an odd-numbered SEQ ID NO: of the Sequence Listing), or a portion thereof. Ranges and identity values intermediate to the above-recited ranges (e.g., 70-90% identical or 80-95% identical) are also intended to be encompassed by the present invention. For example, ranges of identity values using a combination of any of the above values recited as upper and/or lower limits are intended to be included. In an additional preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, to one of the nucleotide sequences of the invention, or a portion thereof.
Moreover, the nucleic acid molecule of the invention can comprise only a portion of the coding region of the sequence of one of the odd-numbered SEQ ID NOs of the Sequence Listing, for example a fragment which can be used as a probe or primer or a fragment encoding a biologically active portion of an MP protein. The nucleotide sequences determined from the cloning of the MP genes from C. glutamicum allow for the generation of probes and primers designed for use in identifying and/or cloning MP homologues in other cell types and organisms, as well as MP homologues from other Corynebacteria or related species. The probe/primer typically comprises substantially purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, and more preferably about 40, 50 or 75, consecutive nucleotides of a sense strand of one of the nucleotide sequences of the invention (e.g., a sequence of one of the odd-numbered SEQ ID NOs of the Sequence Listing), an anti-sense sequence of one of these sequences, or naturally occurring mutants thereof. Primers based on a nucleotide sequence of the invention can be used in PCR reactions to clone MP homologues.
Probes based on the MP nucleotide sequences can be used to detect transcripts or genomic sequences encoding the same or homologous proteins. In preferred embodiments, the probe further comprises a label group attached thereto, e.g., the label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme cofactor. Such probes can be used as a part of a diagnostic test kit for identifying cells which misexpress an MP protein, such as by measuring a level of an MP-encoding nucleic acid in a sample of cells from a subject, e.g., detecting MP mRNA levels or determining whether a genomic MP gene has been mutated or deleted.
In one embodiment, the nucleic acid molecule of the invention encodes a protein or portion thereof which includes an amino acid sequence which is sufficiently homologous to an amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO of the Sequence Listing) such that the protein or portion thereof maintains the ability to catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway. As used herein, the language "sufficiently homologous" refers to proteins or portions thereof which have amino acid sequences which include a minimum number of identical or equivalent (e.g., an amino acid residue which has a similar side chain as an amino acid residue in a sequence of one of the even-numbered SEQ ID NOs of the Sequence Listing) amino acid residues to an amino acid sequence of the invention such that the protein or portion thereof is able to catalyze an enzymatic reaction in a C. glutamicum amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside or trehalose metabolic pathway. Protein members of such metabolic pathways, as described herein, function to catalyze the biosynthesis or degradation of one or more of: amino acids, vitamins, cofactors, nutraceuticals, nucleotides, nucleosides, or trehalose. Examples of such activities are also described herein. Thus, "the function of an MP protein" contributes to the overall functioning of one or more such metabolic pathways and contributes, either directly or indirectly, to the yield, production, and/or efficiency of production of one or more fine chemicals. Examples of MP protein activities are set forth in Table 1.
In another embodiment, the protein is at least about 50-60%, preferably at least about 60-70%, and more preferably at least about 70-80%, 80-90%, 90-95%, and most preferably at least about 96%, 97%, 98%, 99% or more homologous to an entire amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO: of the Sequence Listing).
Portions of proteins encoded by the MP nucleic acid molecules of the invention are preferably biologically active portions of one of the MP proteins. As used herein, the term "biologically active portion of an MP protein" is intended to include a portion, e.g., a domain/motif, of an MP protein that catalyzes an enzymatic reaction in one or more C. glutamicum amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathways, or has an activity as set forth in Table 1. To determine whether an MP protein or a biologically active portion thereof can catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway, an assay of enzymatic activity may be performed. Such assay methods are well known to those of ordinary skill in the art, as detailed in Example 8 of the Exemplification.
Additional nucleic acid fragments encoding biologically active portions of an MP protein can be prepared by isolating a portion of one of the amino acid sequences of the invention (e.g., a sequence of an even-numbered SEQ ID NO: of the Sequence Listing), expressing the encoded portion of the MP protein or peptide (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the MP protein or peptide.
The invention further encompasses nucleic acid molecules that differ from one of the nucleotide sequences of the invention (e.g., a sequence of an odd-numbered SEQ ID NO: of the Sequence Listing) (and portions thereof) due to degeneracy of the genetic code and thus encode the same MP protein as that encoded by the nucleotide sequences of the invention. In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence shown in the Sequence Listing (e.g., an even-numbered SEQ ID NO:). In a still further embodiment, the nucleic acid molecule of the invention encodes a full length C. glutamicum protein which is substantially homologous to an amino acid sequence of the invention (encoded by an open reading frame shown in an odd-numbered SEQ ID NO: of the Sequence Listing).
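Degeneracy of the genetic code can be made concrete with a short Python sketch: two different nucleotide sequences are translated and shown to encode the same peptide. The codon table below is deliberately partial (only the codons used in the example) and the two sequences are invented for illustration; neither is taken from the Sequence Listing.

```python
# Partial standard codon table: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M",              # methionine (start)
    "GCT": "A", "GCC": "A",  # alanine, two synonymous codons
    "AAA": "K", "AAG": "K",  # lysine, two synonymous codons
    "TAA": "*", "TGA": "*",  # stop codons
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence codon by codon using CODON_TABLE."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


seq_a = "ATGGCTAAATAA"  # hypothetical sequence
seq_b = "ATGGCCAAGTGA"  # differs from seq_a at three positions
print(translate(seq_a) == translate(seq_b))  # True: same encoded peptide
```

A full implementation would use the complete 64-codon table; the partial table raises `KeyError` for any codon it does not list.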
It will be understood by one of ordinary skill in the art that in one embodiment the sequences of the invention are not meant to include the sequences of the prior art, such as those Genbank sequences set forth in Tables 2 or 4 which were available prior to the present invention. In one embodiment, the invention includes nucleotide and amino acid sequences having a percent identity to a nucleotide or amino acid sequence of the invention which is greater than that of a sequence of the prior art (e.g., a Genbank sequence (or the protein encoded by such a sequence) set forth in Tables 2 or 4). For example, the invention includes a nucleotide sequence which is greater than and/or at least 40% identical to the nucleotide sequence designated RXA00115 (SEQ ID NO:185), a nucleotide sequence which is greater than and/or at least identical to the nucleotide sequence designated RXA00131 (SEQ ID NO:991), and a nucleotide sequence which is greater than and/or at least 39% identical to the nucleotide sequence designated RXA00219 (SEQ ID NO:345). One of ordinary skill in the art would be able to calculate the lower threshold of percent identity for any given sequence of the invention by examining the GAP-calculated percent identity scores set forth in Table 4 for each of the three top hits for the given sequence, and by subtracting the highest GAP-calculated percent identity from 100 percent. One of ordinary skill in the art will also appreciate that nucleic acid and amino acid sequences having percent identities greater than the lower threshold so calculated (e.g., at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more identical) are also encompassed by the invention.
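The threshold calculation described above (take the three top hits for a sequence, find the highest GAP-calculated percent identity, and subtract it from 100 percent) is simple arithmetic. A minimal Python sketch follows; the function name and the input values are hypothetical and are not taken from Table 4.

```python
from typing import Sequence


def lower_identity_threshold(top_hit_identities: Sequence[float]) -> float:
    """Subtract the highest percent identity among the top hits from 100,
    as the passage above describes."""
    if not top_hit_identities:
        raise ValueError("at least one percent identity score is required")
    for score in top_hit_identities:
        if not 0.0 <= score <= 100.0:
            raise ValueError("percent identity must lie between 0 and 100")
    return 100.0 - max(top_hit_identities)


# Hypothetical scores for the three top hits of one sequence.
print(lower_identity_threshold([60.0, 52.0, 47.0]))  # 40.0
```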
In addition to the C. glutamicum MP nucleotide sequences set forth in the Sequence Listing as odd-numbered SEQ ID NOs, it will be appreciated by one of ordinary skill in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequences of MP proteins may exist within a population (e.g., the C. glutamicum population). Such genetic polymorphism in the MP gene may exist among individuals within a population due to natural variation. As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules comprising an open reading frame encoding an MP protein, preferably a C. glutamicum MP protein. Such natural variations can typically result in 1-5% variance in the nucleotide sequence of the MP gene. Any and all such nucleotide variations and resulting amino acid polymorphisms in MP that are the result of natural variation and that do not alter the functional activity of MP proteins are intended to be within the scope of the invention.
Nucleic acid molecules corresponding to natural variants and non-C. glutamicum homologues of the C. glutamicum MP DNA of the invention can be isolated based on their homology to the C. glutamicum MP nucleic acid disclosed herein using the C. glutamicum DNA, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions. Accordingly, in another embodiment, an isolated nucleic acid molecule of the invention is at least nucleotides in length and hybridizes under stringent conditions to the nucleic acid molecule comprising a nucleotide sequence of an odd-numbered SEQ ID NO: of the Sequence Listing. In other embodiments, the nucleic acid is at least 30, 50, 100, 250 or more nucleotides in length. As used herein, the term "hybridizes under stringent conditions" is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60% homologous to each other typically remain hybridized to each other. Preferably, the conditions are such that sequences at least about 65%, more preferably at least about 70%, and even more preferably at least about 75% or more homologous to each other typically remain hybridized to each other. Such stringent conditions are known to one of ordinary skill in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
A preferred, non-limiting example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to a nucleotide sequence of the invention corresponds to a naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein). In one embodiment, the nucleic acid encodes a natural C. glutamicum MP protein.
In addition to naturally-occurring variants of the MP sequence that may exist in the population, one of ordinary skill in the art will further appreciate that changes can be introduced by mutation into a nucleotide sequence of the invention, thereby leading to changes in the amino acid sequence of the encoded MP protein, without altering the functional ability of the MP protein. For example, nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid residues can be made in a nucleotide sequence of the invention. A "non-essential" amino acid residue is a residue that can be altered from the wild-type sequence of one of the MP proteins (e.g., an even-numbered SEQ ID NO: of the Sequence Listing) without altering the activity of said MP protein, whereas an "essential" amino acid residue is required for MP protein activity.
Other amino acid residues, however (e.g., those that are not conserved or only semi-conserved in the domain having MP activity), may not be essential for activity and thus are likely to be amenable to alteration without altering MP activity.
Accordingly, another aspect of the invention pertains to nucleic acid molecules encoding MP proteins that contain changes in amino acid residues that are not essential for MP activity. Such MP proteins differ in amino acid sequence from a sequence of an even-numbered SEQ ID NO: of the Sequence Listing yet retain at least one of the MP activities described herein. In one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 50% homologous to an amino acid sequence of the invention and is capable of catalyzing an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway, or has one or more activities set forth in Table 1. Preferably, the protein encoded by the nucleic acid molecule is at least about 50-60% homologous to the amino acid sequence of one of the even-numbered SEQ ID NOs of the Sequence Listing, more preferably at least about 60-70% homologous to one of these sequences, even more preferably at least about 80-90%, 90-95% homologous to one of these sequences, and most preferably at least about 96%, 97%, 98%, or 99% homologous to one of the amino acid sequences of the invention.
To determine the percent homology of two amino acid sequences (e.g., one of the amino acid sequences of the invention and a mutant form thereof) or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of one protein or nucleic acid for optimal alignment with the other protein or nucleic acid). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in one sequence (e.g., one of the amino acid sequences of the invention) is occupied by the same amino acid residue or nucleotide as the corresponding position in the other sequence (e.g., a mutant form of the amino acid sequence), then the molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid "identity"). The percent homology between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % homology = number of identical positions/total number of positions x 100).
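The percent homology formula above can be sketched in Python for two sequences that have already been aligned. The sketch assumes that the alignment step has been done elsewhere, that gaps are written as "-", and that every aligned column counts toward the total number of positions; the text does not specify how gap columns are counted, so that last point is an assumption. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    identical positions / total aligned positions x 100
    A column containing a gap never counts as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


print(percent_identity("ACGT", "ACGA"))    # 75.0
print(percent_identity("MK-LV", "MKALV"))  # 80.0
```

Alignment programs may normalize differently (for example, by the length of the shorter sequence), so values from this sketch are not necessarily comparable to GAP-calculated scores.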
An isolated nucleic acid molecule encoding an MP protein homologous to a protein sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO: of the Sequence Listing) can be created by introducing one or more nucleotide substitutions, additions or deletions into a nucleotide sequence of the invention such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein. Mutations can be introduced into one of the nucleotide sequences of the invention by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in an MP protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of an MP coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for an MP activity described herein to identify mutants that retain MP activity.
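The side chain families listed above can be encoded as sets to test whether a given substitution is conservative in the sense used here, i.e., whether both residues belong to at least one common family. The Python sketch below uses standard one-letter amino acid codes; the data structure and function name are illustrative choices, not part of the specification. Note that the families overlap (for example, threonine is both uncharged polar and beta-branched), so a residue pair is checked against every family.

```python
# Side chain families as listed in the text, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if both residues share at least one side chain family."""
    original, replacement = original.upper(), replacement.upper()
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


print(is_conservative_substitution("K", "R"))  # True  (both basic)
print(is_conservative_substitution("K", "D"))  # False (basic vs. acidic)
print(is_conservative_substitution("T", "V"))  # True  (both beta-branched)
```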
Following mutagenesis of the nucleotide sequence of one of the odd-numbered SEQ ID NOs of the Sequence Listing, the encoded protein can be expressed recombinantly and the activity of the protein can be determined using, for example, assays described herein (see Example 8 of the Exemplification).
In addition to the nucleic acid molecules encoding MP proteins described above, another aspect of the invention pertains to isolated nucleic acid molecules which are antisense thereto. An "antisense" nucleic acid comprises a nucleotide sequence which is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded DNA molecule or complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to an entire MP coding strand, or to only a portion thereof. In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a nucleotide sequence encoding an MP protein. The term "coding region" refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues (e.g., the entire coding region of SEQ ID NO:1 (RXA02229) comprises nucleotides 1 to 825). In another embodiment, the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence encoding MP. The term "noncoding region" refers to 5' and 3' sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
Given the coding strand sequences encoding MP disclosed herein (e.g., the sequences set forth as odd-numbered SEQ ID NOs in the Sequence Listing), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of MP mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of MP mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of MP mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides which can be used to generate the antisense nucleic acid include fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, wybutoxosine, pseudouracil, queosine, 2-thiocytosine, methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).
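The design of an antisense oligonucleotide by Watson and Crick base pairing, as described above, can be sketched as follows: a window of the coding strand (for instance around the translation start site) is selected and its reverse complement is taken. The Python sketch works on DNA letters only and ignores modified nucleotides, duplex stability, and secondary structure; the function name, the example sequence, and the window coordinates are all hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Return an oligonucleotide (5'->3') antisense to the window
    coding_strand[start:start + length] (0-based start)."""
    coding_strand = coding_strand.upper()
    if length <= 0:
        raise ValueError("length must be positive")
    if start < 0 or start + length > len(coding_strand):
        raise ValueError("window falls outside the sequence")
    window = coding_strand[start:start + length]
    return window.translate(_COMPLEMENT)[::-1]


# Hypothetical coding strand with an ATG start codon at index 2.
print(antisense_oligo("GGATGAAACCC", start=2, length=6))  # TTTCAT
```

An oligonucleotide of practical length (the text mentions about 5 to 50 nucleotides) would be obtained by choosing a correspondingly larger window.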
The antisense nucleic acid molecules of the invention are typically administered to a cell or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding an MP protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule which binds to DNA duplexes, through specific interactions in the major groove of the double helix. The antisense molecule can be modified such that it specifically binds to a receptor or an antigen expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecule to a peptide or an antibody which binds to a cell surface receptor or antigen. The antisense nucleic acid molecule can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong prokaryotic, viral, or eukaryotic promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).
In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave MP mRNA transcripts to thereby inhibit translation of MP mRNA.
A ribozyme having specificity for an MP-encoding nucleic acid can be designed based upon the nucleotide sequence of an MP DNA disclosed herein (i.e., SEQ ID NO:1 (RXA02229)). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in an MP-encoding mRNA. See, e.g., Cech et al. U.S. Patent No. 4,987,071 and Cech et al. U.S. Patent No. 5,116,742. Alternatively, MP mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J.W. (1993) Science 261:1411-1418.
Alternatively, MP gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of an MP nucleotide sequence (e.g., an MP promoter and/or enhancers) to form triple helical structures that prevent transcription of an MP gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L.J. (1992) Bioessays 14(12):807-15.
B. Recombinant Expression Vectors and Host Cells
Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding an MP protein (or a portion thereof). As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably as the plasmid is the most commonly used form of vector.
However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, repressor binding sites, activator binding sites, enhancers and other expression control elements (e.g., terminators, polyadenylation signals, or other elements of mRNA secondary structure). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).
Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells. Preferred regulatory sequences are, for example, promoters such as cos-, tac-, trp-, tet-, trp-tet-, lpp-, lac-, lpp-lac-, lacIq-, T7-, T5-, T3-, gal-, trc-, ara-, SP6-, amy, SPO2, λ-PR- or λ-PL, which are used preferably in bacteria.
Additional regulatory sequences are, for example, promoters from yeasts and fungi, such as ADC1, MFα, AC, P-60, CYC1, GAPDH, TEF, rp28, ADH, promoters from plants such as CaMV/35S, SSU, OCS, lib4, usp, STLS1, B33, nos or ubiquitin- or phaseolin-promoters. It is also possible to use artificial promoters. It will be appreciated by one of ordinary skill in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., MP proteins, mutant forms of MP proteins, fusion proteins, etc.).
The recombinant expression vectors of the invention can be designed for expression of MP proteins in prokaryotic or eukaryotic cells. For example, MP genes can be expressed in bacterial cells such as C. glutamicum, insect cells (using baculovirus expression vectors), yeast and other fungal cells (see Romanos, M.A. et al. (1992) "Foreign gene expression in yeast: a review", Yeast 8: 423-488; van den Hondel, C.A.M.J.J. et al. (1991) "Heterologous gene expression in filamentous fungi" in: More Gene Manipulations in Fungi, J.W. Bennet & L.L. Lasure, eds., p. 396-428: Academic Press: San Diego; and van den Hondel, C.A.M.J.J. & Punt, P.J. (1991) "Gene transfer systems and vector development for filamentous fungi", in: Applied Molecular Genetics of Fungi, Peberdy, J.F. et al., eds., p. 1-28, Cambridge University Press: Cambridge), algae and multicellular plant cells (see Schmidt, R. and Willmitzer, L. (1988) "High efficiency Agrobacterium tumefaciens-mediated transformation of Arabidopsis thaliana leaf and cotyledon explants" Plant Cell Rep.: 583-586), or mammalian cells.
Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
Expression of proteins in prokaryotes is most often carried out with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein but also to the C-terminus or fused within suitable regions in the proteins. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Enzymes suitable for this cleavage, each with its cognate recognition sequence, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D.B. and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. In one embodiment, the coding sequence of the MP protein is cloned into a pGEX expression vector to create a vector encoding a fusion protein comprising, from the N-terminus to the C-terminus, GST-thrombin cleavage site-X protein. The fusion protein can be purified by affinity chromatography using glutathione-agarose resin.
Recombinant MP protein unfused to GST can be recovered by cleavage of the fusion protein with thrombin.
Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., (1988) Gene 69:301-315), pLG338, pACYC184, pBR322, pUC18, pUC19, pKC30, pRep4, pHS1, pHS2, pPLc236, pMBL24, pLG200, pUR290, pIN-III113-B1, λgt11, pBdC1, and pET11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, California (1990) 60-89; and Pouwels et al., eds. (1985) Cloning Vectors. Elsevier: New York ISBN 0 444 904018).
Target gene expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene expression from the pET11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5 promoter. For transformation of other varieties of bacteria, appropriate vectors may be selected. For example, the plasmids pIJ101, pIJ364, pIJ702 and pIJ361 are known to be useful in transforming Streptomyces, while plasmids pUB110, pC194, or pBD214 are suited for transformation of Bacillus species. Several plasmids of use in the transfer of genetic information into Corynebacterium include pHM1519, pBL1, pSA77, or pAJ667 (Pouwels et al., eds. (1985) Cloning Vectors. Elsevier: New York ISBN 0 444 904018).
One strategy to maximize recombinant protein expression is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another strategy is to alter the sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in the bacterium chosen for expression, such as C. glutamicum (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
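The codon-adaptation strategy described above can be illustrated with a minimal sketch. It assumes the standard genetic code and derives the "preferred" codon for each amino acid by counting codons in a caller-supplied set of reference coding sequences from the expression host. The function names (`preferred_codons`, `recode`) and the toy sequences are hypothetical and are not taken from the specification or from Wada et al.; no real C. glutamicum codon usage data is included.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_codons(cds):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def preferred_codons(reference_cds):
    """Pick, per amino acid, the codon seen most often in the reference genes."""
    counts = Counter(c for cds in reference_cds for c in split_codons(cds))
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def recode(cds, best):
    """Replace every codon by the host-preferred synonymous codon."""
    return "".join(best[CODON_TABLE[c]] for c in split_codons(cds))


# Hypothetical toy data: one reference gene and one gene to be adapted.
host_preference = preferred_codons(["ATGGCCGCCAAGTAA"])
adapted = recode("ATGGCTGCAAAATGA", host_preference)
```

The recoded sequence encodes the same protein as the input, which can be confirmed by comparing `translate` on both. A practical design would also weigh factors this sketch ignores, such as restriction sites and mRNA secondary structure.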
In another embodiment, the MP protein expression vector is a yeast expression vector. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari, et al., (1987) Embo J. 6:229-234), 2µ, pAG-1, Yep6, Yep13, pEMBLYe23, pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, CA). Vectors and methods for the construction of vectors appropriate for use in other fungi, such as the filamentous fungi, include those detailed in: van den Hondel, C.A.M.J.J. & Punt, P.J. (1991) "Gene transfer systems and vector development for filamentous fungi", in: Applied Molecular Genetics of Fungi, J.F. Peberdy, et al., eds., p. 1-28, Cambridge University Press: Cambridge, and Pouwels et al., eds. (1985) Cloning Vectors. Elsevier: New York (ISBN 0 444 904018).
Alternatively, the MP proteins of the invention can be expressed in insect cells using baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow and Summers (1989) Virology 170:31-39).
In another embodiment, the MP proteins of the invention may be expressed in unicellular plant cells (such as algae) or in plant cells from higher plants (e.g., the spermatophytes, such as crop plants). Examples of plant expression vectors include those detailed in: Becker, Kemper, Schell, J. and Masterson, R. (1992) "New plant binary vectors with selectable markers located proximal to the left border", Plant Mol. Biol. 20: 1195-1197; and Bevan, M.W. (1984) "Binary Agrobacterium vectors for plant transformation", Nucl. Acid. Res. 12: 8711-8721, and include pLGV23, pGHlac+, pBIN19, pAK2004, and pDH51 (Pouwels et al., eds. (1985) Cloning Vectors. Elsevier: New York ISBN 0 444 904018).
In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195). When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.
In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) PNAS 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).
The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression (by transcription of the DNA molecule) of an RNA molecule which is antisense to MP mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which direct constitutive, tissue specific or cell type specific expression of antisense RNA.
The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the control of a high efficiency regulatory region, the activity of which can be determined by the cell type into which the vector is introduced. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1(1) 1986.
Another aspect of the invention pertains to host cells into which a recombinant expression vector of the invention has been introduced. The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
A host cell can be any prokaryotic or eukaryotic cell. For example, an MP protein can be expressed in bacterial cells such as C. glutamicum, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those of ordinary skill in the art. Microorganisms related to Corynebacterium glutamicum which may be conveniently used as host cells for the nucleic acid and protein molecules of the invention are set forth in Table 3.
Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation", "transfection", "conjugation" and "transduction" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., linear DNA or RNA (e.g., a linearized vector or a gene construct alone without a vector) or nucleic acid in the form of a vector (e.g., a plasmid, phage, phasmid, phagemid, transposon or other DNA)) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, natural competence, chemical-mediated transfer, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory manuals.
For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding an MP protein or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).
To create a homologous recombinant microorganism, a vector is prepared which contains at least a portion of an MP gene into which a deletion, addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the MP gene. Preferably, this MP gene is a Corynebacterium glutamicum MP gene, but it can be a homologue from a related bacterium or even from a mammalian, yeast, or insect source. In a preferred embodiment, the vector is designed such that, upon homologous recombination, the endogenous MP gene is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out" vector). Alternatively, the vector can be designed such that, upon homologous recombination, the endogenous MP gene is mutated or otherwise altered but still encodes functional protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of the endogenous MP protein). In the homologous recombination vector, the altered portion of the MP gene is flanked at its 5' and 3' ends by additional nucleic acid of the MP gene to allow for homologous recombination to occur between the exogenous MP gene carried by the vector and an endogenous MP gene in a microorganism. The additional flanking MP nucleic acid is of sufficient length for successful homologous recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both at the 5' and 3' ends) are included in the vector (see Thomas, K.R., and Capecchi, M.R. (1987) Cell 51: 503 for a description of homologous recombination vectors). The vector is introduced into a microorganism (e.g., by electroporation) and cells in which the introduced MP gene has homologously recombined with the endogenous MP gene are selected, using art-known techniques.
In another embodiment, recombinant microorganisms can be produced which contain selected systems which allow for regulated expression of the introduced gene.
For example, inclusion of an MP gene on a vector placing it under control of the lac operon permits expression of the MP gene only in the presence of IPTG. Such regulatory systems are well known in the art.
In another embodiment, an endogenous MP gene in a host cell is disrupted (e.g., by homologous recombination or other genetic means known in the art) such that expression of its protein product does not occur. In another embodiment, an endogenous or introduced MP gene in a host cell has been altered by one or more point mutations, deletions, or inversions, but still encodes a functional MP protein. In still another embodiment, one or more of the regulatory regions (e.g., a promoter, repressor, or inducer) of an MP gene in a microorganism has been altered (e.g., by deletion, truncation, inversion, or point mutation) such that the expression of the MP gene is modulated. One of ordinary skill in the art will appreciate that host cells containing more than one of the described MP gene and protein modifications may be readily produced using the methods of the invention, and are meant to be included in the present invention.
A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) an MP protein. Accordingly, the invention further provides methods for producing MP proteins using the host cells of the invention. In one embodiment, the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding an MP protein has been introduced, or into whose genome a gene encoding a wild-type or altered MP protein has been introduced) in a suitable medium until MP protein is produced. In another embodiment, the method further comprises isolating MP proteins from the medium or the host cell.
C. Isolated MP Proteins
Another aspect of the invention pertains to isolated MP proteins, and biologically active portions thereof. An "isolated" or "purified" protein or biologically active portion thereof is substantially free of cellular material when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized.
The language "substantially free of cellular material" includes preparations of MP protein in which the protein is separated from cellular components of the cells in which it is naturally or recombinantly produced. In one embodiment, the language "substantially free of cellular material" includes preparations of MP protein having less than about 30% (by dry weight) of non-MP protein (also referred to herein as a "contaminating protein"), more preferably less than about 20% of non-MP protein, still more preferably less than about 10% of non-MP protein, and most preferably less than about 5% non-MP protein. When the MP protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The language "substantially free of chemical precursors or other chemicals" includes preparations of MP protein in which the protein is separated from chemical precursors or other chemicals which are involved in the synthesis of the protein. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of MP protein having less than about 30% (by dry weight) of chemical precursors or non-MP chemicals, more preferably less than about 20% chemical precursors or non-MP chemicals, still more preferably less than about 10% chemical precursors or non-MP chemicals, and most preferably less than about 5% chemical precursors or non-MP chemicals. In preferred embodiments, isolated proteins or biologically active portions thereof lack contaminating proteins from the same organism from which the MP protein is derived. Typically, such proteins are produced by recombinant expression of, for example, a C. glutamicum MP protein in a microorganism such as C. glutamicum.
An isolated MP protein or a portion thereof of the invention can catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway, or has one or more of the activities set forth in Table 1. In preferred embodiments, the protein or portion thereof comprises an amino acid sequence which is sufficiently homologous to an amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO: of the Sequence Listing) such that the protein or portion thereof maintains the ability to catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway. The portion of the protein is preferably a biologically active portion as described herein. In another preferred embodiment, an MP protein of the invention has an amino acid sequence set forth as an even-numbered SEQ ID NO: of the Sequence Listing. In yet another preferred embodiment, the MP protein has an amino acid sequence which is encoded by a nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, to a nucleotide sequence of the invention (e.g., a sequence of an odd-numbered SEQ ID NO: of the Sequence Listing). In still another preferred embodiment, the MP protein has an amino acid sequence which is encoded by a nucleotide sequence that is at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more homologous to one of the nucleic acid sequences of the invention, or a portion thereof.
Ranges and identity values intermediate to the above-recited values (e.g., 70-90% identical or 80-95% identical) are also intended to be encompassed by the present invention. For example, ranges of identity values using a combination of any of the above values recited as upper and/or lower limits are intended to be included. The preferred MP proteins of the present invention also preferably possess at least one of the MP activities described herein. For example, a preferred MP protein of the present invention includes an amino acid sequence encoded by a nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, to a nucleotide sequence of the invention, and which can catalyze an enzymatic reaction in an amino acid, vitamin, cofactor, nutraceutical, nucleotide, nucleoside, or trehalose metabolic pathway, or which has one or more of the activities set forth in Table 1.
In other embodiments, the MP protein is substantially homologous to an amino acid sequence of the invention (e.g., a sequence of an even-numbered SEQ ID NO: of the Sequence Listing) and retains the functional activity of the protein of one of the amino acid sequences of the invention yet differs in amino acid sequence due to natural variation or mutagenesis, as described in detail in subsection I above. Accordingly, in another embodiment, the MP protein is a protein which comprises an amino acid sequence which is at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more homologous to an entire amino acid sequence of the invention and which has at least one of the MP activities described herein. Ranges and identity values intermediate to the above-recited values (e.g., 70-90% identical or 80-95% identical) are also intended to be encompassed by the present invention. For example, ranges of identity values using a combination of any of the above values recited as upper and/or lower limits are intended to be included. In another embodiment, the invention pertains to a full length C. glutamicum protein which is substantially homologous to an entire amino acid sequence of the invention.
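As an illustration of how a percent-identity threshold of the kind recited above might be checked, the following is a minimal sketch. It assumes the two sequences have already been aligned by some external method and are supplied as equal-length strings with "-" marking gaps. It computes identity as identical positions divided by aligned positions, times 100. The function names and the example peptides are hypothetical; the sketch is not the comparison algorithm the specification relies on.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns of two pre-aligned sequences.

    Columns in which both sequences carry a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)


def meets_threshold(aligned_a, aligned_b, minimum_percent=50.0):
    """True if the pair is at least `minimum_percent` identical."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


# Hypothetical aligned peptides: 4 identical columns out of 6.
identity = percent_identity("MKT-LV", "MKTALI")
```

Because any of the recited values can serve as a lower limit, the threshold is a parameter, so `meets_threshold(a, b, 70.0)` and `meets_threshold(a, b, 95.0)` test different embodiments with the same routine.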
Biologically active portions of an MP protein include peptides comprising amino acid sequences derived from the amino acid sequence of an MP protein, e.g., an amino acid sequence of an even-numbered SEQ ID NO: of the Sequence Listing or the amino acid sequence of a protein homologous to an MP protein, which include fewer amino acids than a full length MP protein or the full length protein which is homologous to an MP protein, and exhibit at least one activity of an MP protein. Typically, biologically active portions (peptides, e.g., peptides which are, for example, 5, 10, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) comprise a domain or motif with at least one activity of an MP protein. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the activities described herein. Preferably, the biologically active portions of an MP protein include one or more selected domains/motifs or portions thereof having biological activity.
MP proteins are preferably produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the protein is cloned into an expression vector (as described above), the expression vector is introduced into a host cell (as described above) and the MP protein is expressed in the host cell. The MP protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. As an alternative to recombinant expression, an MP protein, polypeptide, or peptide can be synthesized chemically using standard peptide synthesis techniques. Moreover, native MP protein can be isolated from cells (e.g., endothelial cells), for example using an anti-MP antibody, which can be produced by standard techniques utilizing an MP protein or fragment thereof of this invention.
The invention also provides MP chimeric or fusion proteins. As used herein, an MP "chimeric protein" or "fusion protein" comprises an MP polypeptide operatively linked to a non-MP polypeptide. An "MP polypeptide" refers to a polypeptide having an amino acid sequence corresponding to MP, whereas a "non-MP polypeptide" refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the MP protein, e.g., a protein which is different from the MP protein and which is derived from the same or a different organism. Within the fusion protein, the term "operatively linked" is intended to indicate that the MP polypeptide and the non-MP polypeptide are fused in-frame to each other. The non-MP polypeptide can be fused to the N-terminus or C-terminus of the MP polypeptide. For example, in one embodiment the fusion protein is a GST-MP fusion protein in which the MP sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant MP proteins. In another embodiment, the fusion protein is an MP protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of an MP protein can be increased through use of a heterologous signal sequence.
Preferably, an MP chimeric or fusion protein of the invention is produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences are ligated together in-frame in accordance with conventional techniques, for example by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel et al. John Wiley & Sons: 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). An MP-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the MP protein.
Homologues of the MP protein can be generated by mutagenesis, e.g., discrete point mutation or truncation of the MP protein. As used herein, the term "homologue" refers to a variant form of the MP protein which acts as an agonist or antagonist of the activity of the MP protein. An agonist of the MP protein can retain substantially the same, or a subset, of the biological activities of the MP protein. An antagonist of the MP protein can inhibit one or more of the activities of the naturally occurring form of the MP protein, by, for example, competitively binding to a downstream or upstream member of the MP cascade which includes the MP protein. Thus, the C. glutamicum MP protein and homologues thereof of the present invention may modulate the activity of one or more metabolic pathways in which MP proteins play a role in this microorganism.
In an alternative embodiment, homologues of the MP protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of the MP protein for MP protein agonist or antagonist activity. In one embodiment, a variegated library of MP variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of MP variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential MP sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of MP sequences therein.
There are a variety of methods which can be used to produce libraries of potential MP homologues from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential MP sequences. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang, S.A. (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
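To make concrete what "all of the sequences" encoded by a degenerate oligonucleotide means, the following minimal sketch enumerates them in silico. It assumes the degenerate sequence is written with the standard IUPAC nucleotide ambiguity codes; the function names and example oligonucleotides are hypothetical and are not taken from the cited references.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(oligo):
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    size = 1
    for symbol in oligo.upper():
        size *= len(IUPAC[symbol])
    return size


def expand_degenerate(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[symbol] for symbol in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical examples: one ambiguous position, and a fully randomized NNK codon.
variants = expand_degenerate("ATR")
nnk_size = library_size("NNK")
```

Checking `library_size` before calling `expand_degenerate` is advisable, since the number of encoded sequences grows multiplicatively with each ambiguous position.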
In addition, libraries of fragments of the MP protein coding sequence can be used to generate a variegated population of MP fragments for screening and subsequent selection of homologues of an MP protein. In one embodiment, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of an MP coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal, C-terminal and internal fragments of various sizes of the MP protein.
Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. Such techniques are adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of MP homologues. The most widely used techniques, which are amenable to high through-put analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify MP homologues (Arkin and Youvan (1992) PNAS 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).
In another embodiment, cell based assays can be exploited to analyze a variegated MP library, using methods well known in the art.
D. Uses and Methods of the Invention
The nucleic acid molecules, proteins, protein homologues, fusion proteins, primers, vectors, and host cells described herein can be used in one or more of the following methods: identification of C. glutamicum and related organisms; mapping of genomes of organisms related to C. glutamicum; identification and localization of C. glutamicum sequences of interest; evolutionary studies; determination of MP protein regions required for function; modulation of an MP protein activity; modulation of the activity of an MP pathway; and modulation of cellular production of a desired compound, such as a fine chemical.
The MP nucleic acid molecules of the invention have a variety of uses. First, they may be used to identify an organism as being Corynebacterium glutamicum or a close relative thereof. Also, they may be used to identify the presence of C. glutamicum or a relative thereof in a mixed population of microorganisms. The invention provides the nucleic acid sequences of a number of C. glutamicum genes; by probing the extracted genomic DNA of a culture of a unique or mixed population of microorganisms under stringent conditions with a probe spanning a region of a C. glutamicum gene which is unique to this organism, one can ascertain whether this organism is present.
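The idea of a probe spanning a region unique to one organism can be sketched computationally. The following minimal example, which is an assumption-laden illustration and not a method described in this specification, lists the k-mers of a target sequence that occur in none of a set of background sequences on either strand. Exact k-mer matching is a crude stand-in for hybridization under stringent conditions, and all sequences and names shown are hypothetical.

```python
from typing import Iterable, List, Set

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_probe_candidates(target: str, others: Iterable[str], k: int) -> List[str]:
    """List k-mers of target, in order, that are absent from both strands of every other sequence."""
    background: Set[str] = set()
    for other in others:
        background |= kmers(other, k)
        background |= kmers(reverse_complement(other), k)
    seen: Set[str] = set()
    candidates: List[str] = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in background and kmer not in seen:
            seen.add(kmer)
            candidates.append(kmer)
    return candidates


# Hypothetical toy sequences; real probes would be far longer than 4 nucleotides.
candidates = unique_probe_candidates("ATGCGTAC", ["ATGCAAAA"], k=4)
print(candidates)
```

In practice a probe would also have to tolerate or exclude near matches, which this exact-match sketch does not attempt.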
Although Corynebacterium glutamicum itself is not pathogenic to humans, it is related to species which are human pathogens, such as Corynebacterium diphtheriae.
Corynebacterium diphtheriae is the causative agent of diphtheria, a rapidly developing, acute, febrile infection which involves both local and systemic pathology. In this disease, a local lesion develops in the upper respiratory tract and involves necrotic injury to epithelial cells; the bacilli secrete toxin which is disseminated through this lesion to distal susceptible tissues of the body. Degenerative changes brought about by the inhibition of protein synthesis in these tissues, which include heart, muscle, peripheral nerves, adrenals, kidneys, liver and spleen, result in the systemic pathology of the disease. Diphtheria continues to have high incidence in many parts of the world, including Africa, Asia, Eastern Europe and the independent states of the former Soviet Union. An ongoing epidemic of diphtheria in the latter two regions has resulted in at least 5,000 deaths since 1990.
In one embodiment, the invention provides a method of identifying the presence or activity of Corynebacterium diphtheriae in a subject. This method includes detection of one or more of the nucleic acid or amino acid sequences of the invention (e.g., the sequences set forth as odd-numbered or even-numbered SEQ ID NOs, respectively, in the Sequence Listing) in a subject, thereby detecting the presence or activity of Corynebacterium diphtheriae in the subject. C. glutamicum and C. diphtheriae are related bacteria, and many of the nucleic acid and protein molecules in C. glutamicum are homologous to C. diphtheriae nucleic acid and protein molecules, and can therefore be used to detect C. diphtheriae in a subject.
The nucleic acid and protein molecules of the invention may also serve as markers for specific regions of the genome. This has utility not only in the mapping of the genome, but also for functional studies of C. glutamicum proteins. For example, to identify the region of the genome to which a particular C. glutamicum DNA-binding protein binds, the C. glutamicum genome could be digested, and the fragments incubated with the DNA-binding protein. Those which bind the protein may be additionally probed with the nucleic acid molecules of the invention, preferably with readily detectable labels; binding of such a nucleic acid molecule to the genome fragment enables the localization of the fragment to the genome map of C. glutamicum, and, when performed multiple times with different enzymes, facilitates a rapid determination of the nucleic acid sequence to which the protein binds. Further, the nucleic acid molecules of the invention may be sufficiently homologous to the sequences of related species such that these nucleic acid molecules may serve as markers for the construction of a genomic map in related bacteria, such as Brevibacterium lactofermentum.
The MP nucleic acid molecules of the invention are also useful for evolutionary and protein structural studies. The metabolic processes in which the molecules of the invention participate are utilized by a wide variety of prokaryotic and eukaryotic cells; by comparing the sequences of the nucleic acid molecules of the present invention to those encoding similar enzymes from other organisms, the evolutionary relatedness of the organisms can be assessed. Similarly, such a comparison permits an assessment of which regions of the sequence are conserved and which are not, which may aid in determining those regions of the protein which are essential for the functioning of the enzyme. This type of determination is of value for protein engineering studies and may give an indication of what the protein can tolerate in terms of mutagenesis without losing function.
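The assessment of conserved versus variable regions described above can be sketched as a per-column score over sequences that have already been aligned. This is a minimal illustration under stated assumptions: the alignment is supplied by some external tool, gaps are written as "-", and conservation is taken simply as the fraction of sequences sharing the most common residue in a column. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from typing import List


def column_conservation(alignment: List[str]) -> List[float]:
    """Fraction of sequences carrying the most common non-gap residue in each column."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores: List[float] = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != "-"]
        if not residues:
            scores.append(0.0)  # column consists only of gaps
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(alignment))
    return scores


def conserved_positions(alignment: List[str], threshold: float = 1.0) -> List[int]:
    """Zero-based column indices whose conservation score meets the threshold."""
    return [i for i, s in enumerate(column_conservation(alignment)) if s >= threshold]


# Hypothetical aligned fragments from three organisms.
toy_alignment = ["MKTAY", "MKSAY", "MRTAY"]
invariant = conserved_positions(toy_alignment)
print(invariant)
```

Columns that are invariant across distantly related organisms are candidates for functionally essential residues, whereas variable columns suggest positions that may tolerate mutagenesis.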
Manipulation of the MP nucleic acid molecules of the invention may result in the production of MP proteins having functional differences from the wild-type MP proteins. These proteins may be improved in efficiency or activity, may be present in greater numbers in the cell than is usual, or may be decreased in efficiency or activity.
The invention also provides methods for screening molecules which modulate the activity of an MP protein, either by interacting with the protein itself or a substrate or binding partner of the MP protein, or by modulating the transcription or translation of an MP nucleic acid molecule of the invention. In such methods, a microorganism expressing one or more MP proteins of the invention is contacted with one or more test compounds, and the effect of each test compound on the activity or level of expression of the MP protein is assessed.
When the desired fine chemical to be isolated from large-scale fermentative culture of C. glutamicum is an amino acid, a vitamin, a cofactor, a nutraceutical, a nucleotide, a nucleoside, or trehalose, modulation of the activity or efficiency of activity of one or more of the proteins of the invention by recombinant genetic mechanisms may directly impact the production of one of these fine chemicals. For example, in the case of an enzyme in a biosynthetic pathway for a desired amino acid, improvement in efficiency or activity of the enzyme (including the presence of multiple copies of the gene) should lead to an increased production or efficiency of production of that desired amino acid. In the case of an enzyme in a biosynthetic pathway for an amino acid whose synthesis is in competition with the synthesis of a desired amino acid, any decrease in the efficiency or activity of this enzyme (including deletion of the gene) should result in an increase in production or efficiency of production of the desired amino acid, due to decreased competition for intermediate compounds and/or energy. In the case of an enzyme in a degradation pathway for a desired amino acid, any decrease in efficiency or activity of the enzyme should result in a greater yield or efficiency of production of the desired product due to a decrease in its degradation. Lastly, mutagenesis of an enzyme involved in the biosynthesis of a desired amino acid such that this enzyme is no longer subject to feedback inhibition should result in increased yields or efficiency of production of the desired amino acid. The same should apply to the biosynthetic and degradative enzymes of the invention involved in the metabolism of vitamins, cofactors, nutraceuticals, nucleotides, nucleosides and trehalose.
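The qualitative reasoning above can be made concrete with a deliberately simple toy model. It is an assumption introduced here for illustration and is not a kinetic model from this specification: a fixed precursor flux is split between the desired and the competing pathway in proportion to their enzyme activities, and a fixed fraction of the product is lost to degradation. All parameter names and numbers are hypothetical.

```python
def desired_yield(
    precursor_flux: float,
    desired_activity: float,
    competing_activity: float,
    degraded_fraction: float = 0.0,
) -> float:
    """Toy yield: flux share of the desired pathway, minus the degraded fraction."""
    if desired_activity < 0 or competing_activity < 0:
        raise ValueError("enzyme activities must be non-negative")
    if not 0.0 <= degraded_fraction <= 1.0:
        raise ValueError("degraded_fraction must lie between 0 and 1")
    total_activity = desired_activity + competing_activity
    if total_activity == 0:
        return 0.0  # no enzyme draws on the precursor
    produced = precursor_flux * desired_activity / total_activity
    return produced * (1.0 - degraded_fraction)


# Hypothetical scenarios, arbitrary units.
baseline = desired_yield(100.0, 1.0, 1.0, degraded_fraction=0.2)
more_biosynthetic_enzyme = desired_yield(100.0, 3.0, 1.0, degraded_fraction=0.2)
competing_gene_deleted = desired_yield(100.0, 1.0, 0.0, degraded_fraction=0.2)
degradation_abolished = desired_yield(100.0, 1.0, 1.0, degraded_fraction=0.0)

print(baseline, more_biosynthetic_enzyme, competing_gene_deleted, degradation_abolished)
```

Under these assumptions, each of the first three strategies in the paragraph (raising biosynthetic activity, lowering competing activity, lowering degradation) increases the toy yield relative to the baseline. Feedback inhibition is not represented in this sketch.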
Similarly, when the desired fine chemical is not one of the aforementioned compounds, the modulation of activity of one of the proteins of the invention may still impact the yield and/or efficiency of production of the compound from large-scale culture of C. glutamicum. The metabolic pathways of any organism are closely interconnected; the intermediate used by one pathway is often supplied by a different pathway. Enzyme expression and function may be regulated based on the cellular levels of a compound from a different metabolic process, and the cellular levels of molecules necessary for basic growth, such as amino acids and nucleotides, may critically affect the viability of the microorganism in large-scale culture. Thus, modulation of an amino acid biosynthesis enzyme, for example, such that it is no longer responsive to feedback inhibition or such that it is improved in efficiency or turnover may result in increased cellular levels of one or more amino acids. In turn, this increased pool of amino acids provides not only an increased supply of molecules necessary for protein synthesis, but also of molecules which are utilized as intermediates and precursors in a number of other biosynthetic pathways. If a particular amino acid had been limiting in the cell, its increased production might increase the ability of the cell to perform numerous other metabolic reactions, as well as enabling the cell to more efficiently produce proteins of all kinds, possibly increasing the overall growth rate or survival ability of the cell in large scale culture. Increased viability improves the number of cells capable of producing the desired fine chemical in fermentative culture, thereby increasing the yield of this compound.
Similar processes are possible by the modulation of activity of a degradative enzyme of the invention such that the enzyme no longer catalyzes, or catalyzes less efficiently, the degradation of a cellular compound which is important for the biosynthesis of a desired compound, or which will enable the cell to grow and reproduce more efficiently in large-scale culture. It should be emphasized that optimizing the degradative activity or decreasing the biosynthetic activity of certain molecules of the invention may also have a beneficial effect on the production of certain fine chemicals from C. glutamicum. For example, by decreasing the efficiency of activity of a biosynthetic enzyme in a pathway which competes with the biosynthetic pathway of a desired compound for one or more intermediates, more of those intermediates should be available for conversion to the desired product. A similar situation may call for the improvement of degradative ability or efficiency of one or more proteins of the invention.
This aforementioned list of mutagenesis strategies for MP proteins to result in increased yields of a desired compound is not meant to be limiting; variations on these mutagenesis strategies will be readily apparent to one of ordinary skill in the art. By these mechanisms, the nucleic acid and protein molecules of the invention may be utilized to generate C. glutamicum or related strains of bacteria expressing mutated MP nucleic acid and protein molecules such that the yield, production, and/or efficiency of production of a desired compound is improved. This desired compound may be any natural product of C. glutamicum, which includes the final products of biosynthesis pathways and intermediates of naturally-occurring metabolic pathways, as well as molecules which do not naturally occur in the metabolism of C. glutamicum, but which are produced by a C. glutamicum strain of the invention.
This invention is further illustrated by the following examples which should not be construed as limiting. The contents of all references, patent applications, patents, published patent applications, Tables, and the sequence listing cited throughout this application are hereby incorporated by reference.
2007202378 25 May 2007

TABLE 1: Included Genes

Lysine biosynthesis

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
1             2           RXA02229        GR00653  2793      3617     DIAMINOPIMELATE EPIMERASE (EC 5.1.1.7)
3             4           RXS02970                                    ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11)
5             6           F RXA01009      GR00287  4714      5943     ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11)
7             8           RXC02390                                    MEMBRANE SPANNING PROTEIN INVOLVED IN LYSINE METABOLISM
9             10          RXC01796                                    MEMBRANE ASSOCIATED PROTEIN INVOLVED IN LYSINE METABOLISM
11            12          RXC01207                                    CYTOSOLIC PROTEIN INVOLVED IN METABOLISM OF LYSINE AND THREONINE
13            14          RXC00657                                    TRANSCRIPTIONAL REGULATOR INVOLVED IN LYSINE METABOLISM
15            16          RXC00552                                    CYTOSOLIC PROTEIN INVOLVED IN LYSINE METABOLISM

Trehalose

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
17            18          RXN00351        VV0135   37078     38532    ALPHA,ALPHA-TREHALOSE-PHOSPHATE SYNTHASE (UDP-FORMING) 56 KD SUBUNIT (EC 2.4.1.15)
19            20          F RXA00351      GR00066  1486      2931     ALPHA,ALPHA-TREHALOSE-PHOSPHATE SYNTHASE (UDP-FORMING) 56 KD SUBUNIT (EC 2.4.1.15)
21            22          RXA00873        GR00241  3         758      trehalose synthase (EC
23            24          RXA00891        GR00243  1005      4        trehalose synthase (EC

Lysine biosynthesis

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
25            26          RXA00534        GR00137  4758      3496     ASPARTOKINASE ALPHA AND BETA SUBUNITS (EC 2.7.2.4)
27            28          RXA00533        GR00137  3469      2438     ASPARTATE-SEMIALDEHYDE DEHYDROGENASE (EC 1.2.1.11)
29            30          RXA02843        GR00842  543       4        2,3,4,5-TETRAHYDROPYRIDINE-2-CARBOXYLATE N-SUCCINYLTRANSFERASE (EC 2.3.1.117)
31            32          RXA02022        GR00613  2063      3169     SUCCINYL-DIAMINOPIMELATE DESUCCINYLASE (EC 3.5.1.18)
33            34          RXA0044         GR00007  3458      4393     DIHYDRODIPICOLINATE SYNTHASE (EC 4.2.1.52)
35            36          RXA00863        GR00236  896       1639     DIHYDRODIPICOLINATE REDUCTASE (EC 1.3.1.26)
37            38          RXA00864        GR00236  1694      2443     probable 2,3-dihydrodipicolinate N-C6-lyase (cyclizing) (EC Corynebacterium glutamicum
39            40          RXA02843        GR00842  543       4        2,3,4,5-TETRAHYDROPYRIDINE-2-CARBOXYLATE N-SUCCINYLTRANSFERASE (EC 2.3.1.117)
41            42          RXN00355        VV0135   31980     30961    MESO-DIAMINOPIMELATE D-DEHYDROGENASE
43            44          F RXA00352      GR00068  861       4        MESO-DIAMINOPIMELATE D-DEHYDROGENASE (EC 1.4.1.16)
45            46          RXA00972        GR00274  3         1379     DIAMINOPIMELATE DECARBOXYLASE (EC 4.1.1.20)
47            48          RXA02653        GR00752  5237      7234     DIAMINOPIMELATE DECARBOXYLASE (EC 4.1.1.20)
49            50          RXA01393        GR00408  4249      3380     LYSINE EXPORT REGULATOR PROTEIN
51            52          RXA00241        GR00036  5443      6945     L-LYSINE TRANSPORT PROTEIN
53            54          RXA01394        GR00408  4320      5018     LYSINE EXPORTER PROTEIN
55            56          RXA00865        GR00236  2647      3549     DIHYDRODIPICOLINATE SYNTHASE (EC 4.2.1.52)
57            58          RXS02021                                    2,3,4,5-TETRAHYDROPYRIDINE-2-CARBOXYLATE N-SUCCINYLTRANSFERASE (EC 2.3.1.117)
59            60          RXS02157                                    ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11)
61            62          RXC00733                                    ABC TRANSPORTER ATP-BINDING PROTEIN INVOLVED IN LYSINE METABOLISM
63            64          RXC00861                                    PROTEIN INVOLVED IN LYSINE METABOLISM
65            66          RXC00866                                    ZN-DEPENDENT HYDROLASE INVOLVED IN LYSINE METABOLISM
67            68          RXC02095                                    ABC TRANSPORTER ATP-BINDING PROTEIN INVOLVED IN LYSINE METABOLISM
69            70          RXC03185                                    PROTEIN INVOLVED IN LYSINE METABOLISM

Glutamate and glutamine metabolism

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
71            72          RXN00367        VV0196   9744      14273    GLUTAMATE SYNTHASE [NADH] PRECURSOR (EC 1.4.1.14)
73            74          F RXA00007      GR00001  7107      8912     GLUTAMATE SYNTHASE (NADPH) LARGE CHAIN PRECURSOR (EC 1.4.1.13)
75            76          F RXA00364      GR00074  1296      4        GLUTAMATE SYNTHASE (NADPH) LARGE CHAIN PRECURSOR (EC 1.4.1.13)
77            78          F RXA00367      GR00075  1806      964      GLUTAMATE SYNTHASE (NADPH) LARGE CHAIN PRECURSOR (EC 1.4.1.13)
79            80          RXN00076        VV0154   2752      4122     GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
81            82          F RXA00075      GR00012  2757      3419     GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
83            84          RXN00198        VV0181   7916      7368     GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
85            86          F RXA00198      GR00031  2         283      GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
87            88          RXN00365        VV0196   14607     15233    GLUTAMATE SYNTHASE [NADPH] SMALL CHAIN (EC 1.4.1.13)
89            90          F RXA00365      GR00075  630       4        GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
91            92          RXA00356        GR00075  961       605      GLUTAMATE SYNTHASE (NADPH) SMALL CHAIN (EC 1.4.1.13)
93            94          RXA02072        GR00628  1259      2599     NADP-SPECIFIC GLUTAMATE DEHYDROGENASE (EC 1.4.1.4)
95            96          RXA00323        GR00057  3855      5192     GLUTAMINE SYNTHETASE (EC 6.3.1.2)
97            98          RXA00335        GR00057  19180     17750    GLUTAMINE SYNTHETASE (EC 6.3.1.2)
99            100         RXA00324        GR00057  5262      8396     GLUTAMATE-AMMONIA-LIGASE ADENYLYLTRANSFERASE (EC 2.7.7.42)
101           102         RXN03176        VV0332   2         862      GLUTAMINASE (EC 3.5.1.2)
103           104         F RXA02879      GR10017  2         862      GLUTAMINASE (EC 3.5.1.2)
105           106         RXA00278        GR00043  2612      1581     GLUTAMINE-BINDING PROTEIN PRECURSOR
107           108         RXA00727        GR00193  614       1525     GLUTAMINE-BINDING PERIPLASMIC PROTEIN PRECURSOR

Alanine and Aspartate and Asparagine metabolism

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
109           110         RXA02139        GR00639  6739      4901     ASPARAGINE SYNTHETASE (GLUTAMINE-HYDROLYZING) (EC 6.3.5.4)
111           112         RXN00116        WOI100   26974     25814    ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
113           114         F RXA00116      GR00018  510       4        ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
115           116         RXN00618        VV0135   10288     9182     ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
117           118         F RXA00618      GR00163  213       746      ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
119           120         F RXA00627      GR00164  854       1138     ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
121           122         RXA02550        GR00729  1585      275      ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
123           124         RXA02193        GR00645  1942      365      ASPARTATE AMMONIA-LYASE (EC 4.3.1.1)
125           126         RXA02432        GR00708  2669      1695     L-ASPARAGINASE (EC 3.5.1.1)
127           128         RXN03003        VV0138   680       6        ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
129           130         RXN00508        VV0086   4701      5783     ALANINE RACEMASE (EC 5.1.1.1)
131           132         RXN00636        VV0135   20972     19944    ALANINE RACEMASE, BIOSYNTHETIC (EC 5.1.1.1)

beta-Alanine metabolism

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
133           134         RXA02536        GR00726  8581      7826     BETA-UREIDOPROPIONASE (EC 3.5.1.6)
135           136         RXS00870                                    METHYLMALONATE-SEMIALDEHYDE DEHYDROGENASE (ACYLATING) (EC 1.2.1.27)
137           138         RXS02299                                    ASPARTATE 1-DECARBOXYLASE PRECURSOR (EC 4.1.1.11)

Glycine and serine metabolism

Nucleic Acid  Amino Acid  Identification  Contig   NT Start  NT Stop  Function
SEQ ID NO     SEQ ID NO   Code
139           140         RXA01561        GR00435  1113      2042     L-SERINE DEHYDRATASE (EC 4.2.1.13)
141           142         RXA01850        GR00525  481       1827     L-SERINE DEHYDRATASE (EC 4.2.1.13)
143           144         RXA00680        GR00156  7343      6042     SERINE HYDROXYMETHYLTRANSFERASE (EC 2.1.2.1)
145           146         RXA01821        GR00515  10253     9876     SARCOSINE OXIDASE (EC 1.5.3.1)
147           148         RXN02263        VV0202   11783     12160    SARCOSINE OXIDASE (EC 1.5.3.1)
149           150         F RXA02263      GR00654  33454     33813    SARCOSINE OXIDASE (EC 1.5.3.1)
151           152         RXA02176        GR00641  11454     12581    PHOSPHOSERINE AMINOTRANSFERASE (EC 2.6.1.52)
153           154         RXN02758        GR00766  5082      4648     PHOSPHOSERINE PHOSPHATASE (EC 3.1.3.3)
155           156         F RXA02479      GR00717  393       4        PHOSPHOSERINE PHOSPHATASE (EC 3.1.3.3)
157           158         F RXA02758      GR00766  5082      4648     PHOSPHOSERINE PHOSPHATASE (EC 3.1.3.3)
159           160         F RXA02759      GR00766  5330      5220     PHOSPHOSERINE PHOSPHATASE (EC 3.1.3.3)
161           162         RXA02501        GR00720  15041     13977    PHOSPHOSERINE PHOSPHATASE (EC 3.1.3.3)
163           164         RXN03105        VV0074   15857     15423    SARCOSINE OXIDASE (EC 1.5.3.1)
165           166         RXS01130                                    D-3-PHOSPHOGLYCERATE DEHYDROGENASE (EC 1.1.1.95)
167           168         RXS03112                                    D-3-PHOSPHOGLYCERATE DEHYDROGENASE (EC 1.1.1.95)

Threonine metabolism Nucleic Acid SEQ ID NO 169 171 173 175 177 179 181 183 Amino Acid SEQ ID NO 170 172 174 176 178 180 182 184 Identification Code RXN00969 F RXA00974 RXAOOS7o RXA00330 RXN00403 F RXA00403 RXCO 1207 RXCO01 52 Contig.
WV0149 GR00274 GR00273 GR00057 WV0086 GROO088 NT Start 12053 2623 161 12968 70041 723 NT Stop 13387 3015 1087 14410 68911 1832 Function HOMOSERINE DEHYDROGENASE (EC 1.1.1.3) HOMOSERINE DEHYDROGENASE (EC 1.1.1.3) HOMOSERINE KINASE (EC 2.7.1.39) THREONINE SYNTHASE (EC 4.2.99.2) HOMOSERINE O-ACETYLTRANSFERASE HOMOSERINE O-ACETYLTRANSFERASE (EC 2.3.1.11) CYTOSOLIC PROTEIN INVOLVED IN METABOLISM OF LYSINE AND
THREONINE
MEMBRANE ASSOCIATED PROTEIN INVOLVED IN THREONINE METABOLISM Metabolism of methionine and S-adenosyl methionine Nucleic Acid SEQ ID NO 185 187 189 191 193 195 197 199 201 203 205 207 209 211 213 215 217 219 221 Amin o Acid SEQ ID NO 188 190 192 194 196 198 200 202 204 206 208 210 212 214 216 218 220 222 Identification Code RXAOO 115 RXN00403 F RXA00403 RXS031 58 F RXA00254 RXA02 532 RXS03 159 F RXA02768 RXA00216 RXN00402 F RXA00402 RXA00405 RXA021 97 RXN02 198 F RXA02198 RXN03074 F RXA02906 RXN00132 F RXA001 32 GRODO017 W0086 GR00088 GROO038 GR00726 GR00770 GR00032 W0086 GRO0088 GR00089 GR00645 W0302 GR00646 W0042 GR10044 W01 24 GROOD20 NT Start 5359 70041 723 2404 3085 1919 16286 70787 3289 4552 9228 2483 2238 1142 3612 7728 NT Stop Function 4313 HOMOSERINE O-ACETYLTRANSFERASE (EC 2.3.1.31) 68911 HOMOSERINE O-ACETYLTRANSFERASE 1832 HOMOSERINE O-ACETYLTRANSFERASE (EC 2.3. 1.11) CYSTATHIONINE GAMMA-SYNT-ASE (EC 4.2.99.9) 1811 CYSTATHIONINE GAMMA-SYNTHASE (EC 4.2.99.9) 2039 CYSTATHIONINE GAMMA-SYNTHASE (EC 4.2.99.9) CYSTATHIONINE GAMMA-SYNT,{ASE (EC 4.2.99.9) 2521 CYSTATI-IONINE GAMMA-SYNTHASE (EC 4.2.99.9) 15297 5-methyltetrahydrofolate-homocysteine methyltransferase (methionine synthetase) 70188 O-ACETYLHOMOSERINE SULFHYDRYLASE (EC 4.2.991)/OAEYSRN SULFHYDRYLASE (EC 4.2.99.8) '91) 0-CEYLERN 576 0-ACETYIHOMOSERINE SULFHYDRkLASE (EC 4.2.99.10)1 /0-ACETYLSERINE SULFH-YDRYLASE (EC 4.2.99.8) 3801 0-ACETYIHOMOSERINE SULFHYDRY *LASE (EC 4 2 .99.10)I/OACETYLSERINE SULFHYDRYLASE (EC 4.2.99.8) 4025 5-METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METH-YLTRANSFERASE
(EC 2.1.1.13) 11726 5-METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METHYLTRANSFERASE
(EC 2. 1. 1.13) 6 5-METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METHYLTRANSFERASE
(EC 2.1.1.13) 1741 S-ADENOSYLMETHIONINE:2.DEMETHYLMENAQUINONE METHYLTRANSFERASE (EC 645 S.ADENOSYLMETH.IONINE:2-DEMETHYLMENAQUINONE METHYLTRANSFERASE (EC 2.1-.
5045 ADENOSYLHOMOCYSTEINASE (EC 3.3. 1. 1) 7624 ADENOSYLHOMOCYSTEINASE (EC 3.3.1.1) 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 223 225 227 229 231 233 235 237 239 Amino Acid SEC ID NO 224 226 228 230 232 234 236 238 240 Identification Code F RXA01371 RXN02085 F RXA02085 F RXA02086 RXN02648 F RXAO2648 F RXA02658 RXC02238 RXCO01 28 Conlig.
GR00398 GR00629 GROO629 GR00751 GR00752 Table I continued) NT Strt NTStop Function 2339 3634 ADENOSYLHOMOCYSTEINASE (EC 3.3. 1. 1) 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE-HOMOCYSENE METH-YLTRANSFEP.ASE (EC 2.1.1.14) 3496 5295 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATEHOMOCYSEN METHYLTRANSFEASE (EC 2.1.1.14) 5252 5731 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE-HOMOCYSENE METHYLTRANSFERASE (EC 2.1.1.14) 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATEHOMOCYSEN METHYLTRANSFERASE (EC 2.1.1.14) 5254 4730 5-METIIYLTETRAHYDROPTEROYLTRIGLUTAATE-HOMOCYSTEN METHYLTRANSFERASE (EC 2.1.1.14) 14764 15447 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE-HOMOCYSEN METHYLTRANSFERASE (EC 2.1. 1.14) PROTEIN INVOLVED IN METABOLISM OF S-ADENOSYLMETHIONINE,
PURINES
AND PANTOTHENATE EXPORTED PROTEIN INVOLVED IN METABOLISM OF PYRIDIMES
AND
ADENOSYLHOMOCYSTEINE
S-adenosyl methionine (SAM) Biosynthesis Nucleic Acid Amino Acid S7EQ -ID NO S;EQ ID NO 24 1 2 42- Identification Code Contig.
RXA02240 GR00654 NT Star 7160 NT Stop 8380 Function S-ADENOSYLMETHIONINE SYNTHETASE (EC 2.5.1.6) Cysteine metabolism Nucleic Acid SEQ -IDNO 243 245 247 249 251 253 255 Amino Acid S9EQ I D NO 244 246 248 250 252 254 256 Idenification Code RXA00780 RXA00779 RXN00402 F RXA0O.402 RXS0O405 RXCO01 64 RXCOI 191 Contig GR00206 GR00206 VV0086 GROO088 NT Star 1689 550 70787 NT Stop 2234 1482 70188 576 Function SERINE ACETYLTRANSFERASE (EC 2.3.1.30) CYSTEINE SYNTHASE (EC 4.2.99.8)
EYSRN
O.ACIETYLHOMOSERINE SULFHYDRYLASE (EC 4.2.99.10)1 /0-ACELSRN SULFH-YORYLASE (EC 4.2.99.8) O-ACETYLHOMOSERINE SULFH-YDRYLASE (EC 4.2.99. 10) 1 0-ACETYLS
ERINE
SULFHYDRYLASE (EC 4.2.99.8) O-ACETYLHOMOSERINE SULFHYDRYLASE (EC 4.2.99.10) 1 0-ACETYLSERINE SULFHYDRYLASE (EC 4.2.99.8) ABC TRANSPORTER ATP.BINDING PROTEIN INVOLVED IN CYSTEINE
METABOLISM
ABC TRANSPORTER ATP-BINDING PROTEIN INVOLVED IN CYSTEINE
METABOLISM
2007202378 25 May 2007 Valine, leucine and isoleucine Nucleic Acid Amino Acid Identification Code Contig.
SEQ ID NO SEQ ID NO 257 258 RXA02646 GR0075 259 260 RXA00766 GR0O20 261 262 RXN01690 W0246 263 264 F RXA01690 GR0047 265 266 RXN01026 W0143 267 268 F RXAO 1026 GR0029 269 270 RXN01 127 W0157 271 272 F RXA01 132 GR0031 273 274 RXN00536 W0219 275 276 F RXA00536 GROO13 277 278 RXN02965 W0143 279 280 R.XN01929 WV0127 Table 1 (continued) NT Start NT Stop Function 1i 14 14 7 3856 5091 1 296 1248 9171 4491 1349 6128 6128 7711 47590 2766 15584 1075 2588 4249 196 196 7513 1602 3472 1651 7498 7360 7121 48402 1960 14643 1530 THREONINE DEHYDRATASE BIOSYVNTHETIC (C 4.2.1.16) BRANCHED-CHAIN AMINO ACID AMINOTRANSFERASE (EC 2.6.1.42) BRANCHED-CHAIN AMINO ACID AMINOTRANSFERASE. (EC 2.6.1.42) BRANCHED-CHAIN AMINO ACID AMINOTRANSFERASE, (EC 2.6. 1.42) 3-ISOPROPYLMALATE DIEHYDRATASE LARGE SUBUNIT (C 4.2.1.33) 3-ISOPROPYLMALATE DEHYDRATASE LARGE SUBUNIT (EC 4.2.1.33) 3-ISOPROPYLMAL-ATE DEHYDROGENASE (EC 1.1.1.85) 3-ISOPROPYLMALATE DIEHYDROGENASE (EC 1. 1.1.85) 2-ISOPROPYLMALATE SYNTHASE (EC 4.1.3.12) 2-ISOPROPYLMAI.ATE SYNTHASE (CO 4.1.3.1) 3-ISOPROPYLMALATE DEHYDRATASE SMALL SUBUNIT (EC, 4.2.1.33) 3-METHYL-2-OXOBUTANOATE HYDROXYMETHYLTRANSFERASE (EC 2.1.2.11) DECARBOXYLASE (EC 4.1.1.44) 3-METHYL-2-OXOBUTANOATE HYDROXYMETHYLTRANSFEPASE (C 2.1.2.11) 4"-MYCAROSYL ISOVALIERYL-COA TRANSFERASE (EC, KETOL-ACID REDUCTOISOMERASE (C 1.1.1.86) KETOL-ACID REDUCTOISOMERASE (EC 1.1.1.86) F RXA01929 RXN0 1420 RXSOI 145 F RXAOI1145 GR00555 W0O122 GR0032 1 Arginine and praline metabolism Enzymes of proline biosynthesis: Nucleic Acid SEQ ID NO 259 291 293 295 297 299 301 303 305 Amino Acid SEQ I0 NO 290 292 294 296 298 300 302 304 306 Identification Code Contig NT Starl NT Stop Function RXA02 375 RXN02382 F RXA02 378 F RXA02382 RXA02499 RX502 157 RX S02 262 RXS02970 F RXA01009 GR00689 1449 223 GLUTAMATE 5-KINASE (C 2.7.2.11 W0213 5162 3867 GAMMA-GLUTAMYL PHOSPHATE REOUCTASE (GPR) (C 1.2.1.41) GR00690 624 16 GAMMA-GLUTAMYL PHOSPHATE REDUCTASE (GPR) (C 1.2.1.41) GR00691 2493 1894 GAMMA-GLUTAMYL PHOSPHATE 
REOUCTASE (GPR) (C 1.2.1.4 1) GR00720 11883 12692 PYRROLINE-5-CARBOXYLATE REDUCTASE (C 1.5.1.2) ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11) ORNITHINE CYCLODEAMINASE (C 4.3.1.12) ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11) GR00287 4714 5943 ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6. 1.11) 2007202378 25 May 2007 Table I (continued) Enzymes of proline degradation: Nucleic Acid SEQ ID NO 307 Amino Acid S-EQ I1D NO 308 IdentIficatIon Code Contig NT Start NT Stop Function RXN00023 309 310 F RXA00023 311 312 F RXA02264 W0127 68158 64703 PROLINE DEHYDROGENASE (EC 1.5.99.8) /DELTA-i1- CARBOXYLATE DEHYDROGENASE (EC 1.5.1.12) GR00003 2 454 PROLINE DEHYDROGENASE (EC 1.5.99.8) /DELTA.1 CARBOXYLATE DEHYDROGENASE (EC 1.5.1.12) GR00660 3028 5 PROLINE DEHYDROGENASE (EC 1.5.99.8) DELTA-i.- CARBOXYLATE DEHYDROGENASE (EC 1.5.1.12) PROTEIN INVOLVED IN PROLINE METABOLISM 313 314 RXC02498 Synthesis of 3-Hydoxy-proline: Nucleic Acid Amino Acid Identification Code Contig SEQ ID NO TEQ ID NO 315 316 RXA01491 GRO04: NT Start NT Stop Function 23 5337 4687 DNA FOR L-PROLINE 3-HYDROXYLASE. COMPLETE CDS Enzymes of ornithine, arginine and spermidine metabolism: Nucleic Acid SEQ ID NO 3W7 Amino Acid SEQO I D NO 318 Identification Code RXA021 55 RXA02 156 RXN02153 F RXA02153 RXA02 154 RXA021 57 RXS02970 F RXAOI1009 RXA02 158 RXA02 160 RXN02 162 F RXA02161 F RXA02162 RXA02 262 RXA00219 RXA0 1508 RXA0 1757 RXA02 159 RXN02 154 RXS00 147 RXS00905 RXS00906 Contl NT Start NT Stop Function GRO0640 1913 3076 GLUTAMATE N-ACETYLTRANSFE RASE (EC 2.3.1.35) /AMINO.ACID ACETYLTRANSFERASE (EC 2.3.1.1) GR00640 3125 4075 ACETYLGLUTAMATE KINASE (EC 2.7.2.'8) W0122 14106 13327 N-ACETYL-GAMMA.GLUTAMYL.PHOSPHATE REDUCTASE (EC 1.2.1.38) GR00640 757 1536 N.ACETYLGLUTAMATE-5-SEMIALDEHYDE
DEHYDROGENASE
GR00640 1536 1826 N-ACETYLGLUTAMATE-5SEMIALDEHiYDE
DEHYDROGENASE
GR00640 4079 5251 ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11) ACETYLORNITHINE AMINOTRANSFERIASE (EC 2.6.1.11) GR00287 4714 5943 ACETYLORNITHINE AMINOTRANSFE RASE (EC 2.6.1.11) GR00640 5268 6224 ORNITHINE CARBAMOYLTRANSFERASE (EC 2.1.3.3) GR00640 6914 8116 ARGININOSUCCINATE SYNTHASE (EC6.3.4.5) \iV0122 6683 5253 ARGININOSUCCINATE LYASE (EC 4.3.2.11) GR00640 8180 8962 ARGININOSUCCINATE LYASE (EC 4.3.2. 1) GR00640 8949 9611 ARGININOSUCCINATE LYASE (EC 4.3.2. 1) GR00654 32291 33,436 ORNITHINE CYCLODEAMINASE (EC 4.3.1.12) GR00032 19289 20230 SPERMIDINE SYNTHASE (EC 2.5.1.16) GR00424 12652 14190 SPERMIDINE SYNTHASE (EC 2.5.1.16) GR00498 2942 2142 PUTRESCINE OXIDASE (EC 1.4.3.10) GR00640 6231 6743 ARGININE HYDROXIMATE RESISTANCE PROTEIN WVOt 22 13327 13037 N.ACETYL.GAMMA-GLUTAMYL.PHOSPHATE REDUCTASE (EC 1.2.1.38) CARBAMOYL-PHOSPHATE SYNfHASE SMALL CHAIN (EC 6.3.5.5) N-ACYL-L-AMINO ACID AMIDOHYOROLASE (EC 3.5.1.14) N-ACYL-L-AMINO ACID AMIDOHYDROLASE (EC 3.5.1.14) 2007202378 25 May 2007 Table I (continued) Nucleic Acid Amino Acid Identification Code Contig. 
NT Start NT Stop Function SEQ ID NO SEQ ID NO 361 362 RXS00907 N-ACYL-L-AMINO ACID AMIDOHYDROLASE (EC 3.5.1.14) 363 364 RXS02001 N-ACYL-L-AMINO ACID AMIDOHYDROLASE (EC 3.5.1.14) 365 366 RXS02101 N-ACYL-L-AMINO ACID AMIDOHYDROLASE (EC 3.5.1.14) 367 368 RXS02234 CARBAMOYL-PHOSPHATE SYNTHASE LARGE CHAIN (EC 6.3.5.5) 369 370 F RXA02234 CR00654 1 3198 CARBAMOYL-PHOSPHATE SYNTHASE LARGE CHAIN (EC 6.3.5.5) 371 372 RXS02565 N-ACYL-L-AMINO ACID AMIOOHYDROLASE (EC 3.5.1.14) 373 374 RXS02937 N-ACYL-L-AMINO ACID AMIDOHYDROLASE (EC 3.5.1.14) Histidine, metabolism Nucleic Acid Amino Acid Identification Code Contig NT Start NT Stop Function SEQ ID NO SEQ ID NO 375 376 RXA02 194 GR00645 2897 .2055 ATP PHOSPHORIBOSYLTRANSFERASE (EQ 2.4.2.17) 377 378 RXA02195 CR00645 3186 2917 PHOSPHORIBOSYL-ATP PYROPHOSPHOHYDROLASE (EQ 3.6.1.31) 379 380 RXA01097 GR00306 4726 4373 PHOSPHORIBOSYL-AMP QYCLOHYDROLASE (EQ 3.5.4.19) 381 382 R.XAO 1100 GR00306 7072 6335 PHOSPHORIBOSYLFORMIMINO-5-AMINOIMIDAZOLE
QARBOXAMIDE
RIBOTIDE ISOMERASE (EC 5.3.1.16)
383 384 RXA01101 GR00305 7726 7094 AMIDOTRANSFERASE HISH (EC
385 386 RXN01657 VV0010 39950 39351 AMIDOTRANSFERASE HISH (EC
387 388 F RXA01657 GR00460 2444 2944 AMIDOTRANSFERASE HISH (EC
389 390 RXA01098 GR00306 5499 4726 HISF PROTEIN
391 392 RXN01104 VV0059 7037 6432 IMIDAZOLEGLYCEROL-PHOSPHATE DEHYDRATASE (EC 4.2.1.19)
393 394 F RXA01104 GR00306 10927 10322 IMIDAZOLEGLYCEROL-PHOSPHATE DEHYDRATASE (EC 4.2.1.19) / HISTIDINOL-PHOSPHATASE (EC 3.1.3.15)
395 396 RXN00446 VV0112 24181 23318 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
397 398 F RXA00446 GR00108 4 525 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
399 400 RXA01105 GR00306 12044 10947 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
401 402 RXA01106 GR00306 13378 12053 HISTIDINOL DEHYDROGENASE (EC 1.1.1.23)
403 404 RXC00930 PROTEIN INVOLVED IN HISTIDINE METABOLISM
405 406 RXC01096 PROTEIN INVOLVED IN HISTIDINE METABOLISM
407 408 RXC01656 PROTEIN INVOLVED IN HISTIDINE METABOLISM
409 410 RXC01158 MEMBRANE SPANNING PROTEIN INVOLVED IN HISTIDINE METABOLISM

Metabolism of aromatic amino acids
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
411 412 RXA02458 GR00712 3056 4345 3-PHOSPHOSHIKIMATE 1-CARBOXYVINYLTRANSFERASE (EC 2.5.1.19)
413 414 RXA02790 GR00777 5806 6948 4-AMINO-4-DEOXYCHORISMATE LYASE (EC
415 416 RXN00954 VV0247 3197 2577 ANTHRANILATE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.18)
417 418 F RXA00954 GR00263 3 590 ANTHRANILATE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.18)
419 420 RXN00957 VV0208 1211 2764 ANTHRANILATE SYNTHASE COMPONENT I (EC 4.1.3.27)
421 422 F RXA00957 GR00264 3 1130 ANTHRANILATE SYNTHASE COMPONENT I (EC 4.1.3.27)
423 424 RXA02687 GR00754 11306 12250 CHORISMATE MUTASE (EC 5.4.99.5) / PREPHENATE DEHYDRATASE (EC 4.2.1.51)
425 426 RXN01698 VV0134 11507 12736 CHORISMATE SYNTHASE (EC 4.6.1.4)
427 428 F RXA01698 GR00477 2 991 CHORISMATE SYNTHASE (EC 4.6.1.4)
429 430 RXA01095 GR00305 3603 2821 INDOLE-3-GLYCEROL PHOSPHATE SYNTHASE (EC 4.1.1.48)
431 432 RXA00955 GR00263 586 2007 INDOLE-3-GLYCEROL PHOSPHATE SYNTHASE (EC 4.1.1.48) / N-(5'-PHOSPHORIBOSYL)ANTHRANILATE ISOMERASE (EC 5.3.1.24)
433 434 RXA02814 GR00795 598 128 ISOCHORISMATE MUTASE
435 436 RXA00229 GR00033 1715 936 SHIKIMATE 5-DEHYDROGENASE (EC 1.1.1.25)
437 438 RXA02093 GR00629 12444 13247 SHIKIMATE 5-DEHYDROGENASE (EC 1.1.1.25)
439 440 RXA02791 GR00777 6968 7795 SHIKIMATE 5-DEHYDROGENASE (EC 1.1.1.25)
441 442 RXA01699 GR00477 984 1553 SHIKIMATE KINASE (EC 2.7.1.71)
443 444 RXA00952 GR00262 97 936 TRYPTOPHAN SYNTHASE ALPHA CHAIN (EC 4.2.1.20)
445 446 RXN00956 VV0247 1140 4 TRYPTOPHAN SYNTHASE BETA CHAIN (EC 4.2.1.20)
447 448 F RXA00956 GR00263 2027 3157 TRYPTOPHAN SYNTHASE BETA CHAIN (EC 4.2.1.20)
449 450 RXA00064 GR00010 2499 3776 TYROSINE AMINOTRANSFERASE (EC 2.6.1.5)
451 452 RXN00448 VV0112 33959 32940 PREPHENATE DEHYDROGENASE (EC 1.3.1.12)
453 454 F RXA00448 GR00109 3 668 PREPHENATE DEHYDROGENASE (EC 1.3.1.12)
455 456 F RXA00452 GR00110 854 1099 PREPHENATE DEHYDROGENASE (EC 1.3.1.12)
457 458 RXA00584 GR00156 11384 10260 PHOSPHO-2-DEHYDRO-3-DEOXYHEPTONATE ALDOLASE (EC 4.1.2.15)
459 460 RXA00579 GR00156 5946 4087 PARA-AMINOBENZOATE SYNTHASE COMPONENT I (EC
461 462 RXA00958 GR00264 1130 1753 PARA-AMINOBENZOATE SYNTHASE GLUTAMINE AMIDOTRANSFERASE COMPONENT II (EC / ANTHRANILATE SYNTHASE COMPONENT II (EC 4.1.3.27)
463 464 RXN03007 VV0208 3410 3778 ANTHRANILATE SYNTHASE COMPONENT II (EC 4.1.3.27)
465 466 RXN02918 VV0086 25447 25887 TRYPTOPHAN SYNTHASE BETA CHAIN (EC 4.2.1.20)
467 468 RXN01116 VV0182 7497 6886 3-OXOADIPATE COA-TRANSFERASE SUBUNIT B (EC 2.8.3.6)
469 470 RXN01115 VV0182 10347 11099 3-OXOADIPATE ENOL-LACTONE HYDROLASE (EC 3.1.1.24) / 4-CARBOXYMUCONOLACTONE
471 472 RXS00116 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
473 474 F RXA00116 GR00018 510 4 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
475 476 RXS00391 O-SUCCINYLBENZOIC ACID--COA LIGASE (EC 6.2.1.26)
477 478 RXS00393 1,4-DIHYDROXY-2-NAPHTHOATE OCTAPRENYLTRANSFERASE (EC
479 480 F RXA00393 GR00086 4030 4911 1,4-DIHYDROXY-2-NAPHTHOATE OCTAPRENYLTRANSFERASE (EC
481 482 RXS00446 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
483 484 F RXA00446 GR00108 4 525 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
485 486 RXS00618 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
487 488 F RXA00618 GR00163 213 746 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
489 490 F RXA00627 GR00164 854 1138 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
491 492 RXS01105 HISTIDINOL-PHOSPHATE AMINOTRANSFERASE (EC 2.6.1.9)
493 494 RXS02315 2-SUCCINYL-6-HYDROXY-2,4-CYCLOHEXADIENE-1-CARBOXYLATE SYNTHASE / 2-OXOGLUTARATE DECARBOXYLASE (EC 4.1.1.71)
495 496 RXS02550 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
497 498 RXS02319 NAPHTHOATE SYNTHASE (EC 4.1.3.36)
499 500 RXS02908 O-SUCCINYLBENZOIC ACID--COA LIGASE (EC 6.2.1.26)
501 502 RXS03003 ASPARTATE AMINOTRANSFERASE (EC 2.6.1.1)
503 504 RXS03026 3-DEHYDROQUINATE DEHYDRATASE (EC 4.2.1.
505 506 RXS03074 S-ADENOSYLMETHIONINE:2-DEMETHYLMENAQUINONE METHYLTRANSFERASE (EC
507 508 RXC01434 MEMBRANE SPANNING PROTEIN INVOLVED IN METABOLISM OF AROMATIC AMINO ACIDS AND RIBOFLAVIN
509 510 RXC02080 MEMBRANE SPANNING PROTEIN INVOLVED IN METABOLISM OF AROMATIC AMINO ACIDS
511 512 RXC02789 CYTOSOLIC PROTEIN INVOLVED IN METABOLISM OF AROMATIC AMINO ACIDS
513 514 RXC02295 MEMBRANE SPANNING PROTEIN INVOLVED IN METABOLISM OF AROMATIC AMINO ACIDS

Aminobutyrate metabolism
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
515 516 RXN03063 VV0035 666 1697 4-aminobutyrate aminotransferase (EC 2.6.1.19)
517 518 RXN02970 VV0021 4714 6081 ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11)
519 520 F RXA01009 GR00287 4714 5943 ACETYLORNITHINE AMINOTRANSFERASE (EC 2.6.1.11)

Vitamins, vitamin-like substances (cofactors), nutraceuticals

Thiamine metabolism
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
521 522 RXA01551 GR00431 2945 4819 THIAMIN BIOSYNTHESIS PROTEIN 1
523 524 RXA01019 GR00291 6 995 THIAMIN-MONOPHOSPHATE KINASE (EC 2.7.4.16)
525 526 RXA01352 GR00393 609 4 THIAMIN-PHOSPHATE PYROPHOSPHORYLASE (EC 2.5.1.3)
527 528 RXA01381 GR00403 3206 2286 THIF PROTEIN
529 530 RXA01360 GR00394 162 4 THIG PROTEIN
531 532 RXA01361 GR00394 983 378 THIG PROTEIN
533 534 RXA01208 GR00348 229 1032 HYDROXYETHYLTHIAZOLE KINASE (EC 2.7.1.50)
535 536 RXA00838 GR00227 1532 633 APBA PROTEIN
537 538 RXA02400 GR00699 1988 2557 THIAMIN BIOSYNTHESIS PROTEIN X
539 540 RXN01209 VV0270 1019 2446 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
541 542 F RXA01209 GR00348 1019 2446 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
543 544 RXN01413 VV0050 27306 27905 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
545 546 RXN01617 VV0050 22187 22858 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
547 548 F RXA01617 GR00451 2 616 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
549 550 RXS01807 PYRIDOXINE KINASE (EC 2.7.1.35)
551 552 RXC01021 CYTOSOLIC KINASE INVOLVED IN METABOLISM OF SUGARS AND THIAMIN
2007202378 25 May 2007 Table I (continued) Riboflavin metabolism Nucleic Acid SEQ ID NO 553 555 557 559 561 563 565 567 569 571 573 575 577 579 581 583 585 587 589 591 593 Amino Acid SEQ ID NO 554- 556 558 560 562 564 566 568 570 572 574 576 578 580 582 584 586 588 590 592 594 Identification Code RXN02246 F RXA02246 RXA02247 RXN02248 F RXA02248 RXN02249 F RXA02249 RXA02250 RXA0 1489 RXA021135 RXA0 1489 RXNO017 12 F RXAO 1712 RXN02384 RXN0 1560 RXN00667 RXCO1711 RXC02380 F RXA02380 RXC02921 RXCO 1434 Contig NT Star NT Sop Function W01I30 4388 5371 diaminohydroxyphosphoribosylaminopyrimidine deaminase (EC 3.5.4.26) 6 -(5-phosphoribosylamino)uracl reductase (EC 1. 1. 1.1 93) GR00654 14299 15282 RiBG PROTEIN riboflavin-specific deaminase GR00654 15286 15918 RIBOFLAVIN SYNTHASE ALPHA CHAIN (EC 2.5.1.9) WV0130 6021 7286 GTP CYCLOHYDROLASE 11 (EC 3.5.4.25) 3 4 -DIHYOROXY-2-BUTANONE 4- PHOSPHATE SYNTHASE GR00654 15932 17197 RIBA PROTEIN.- GTP cyclohydrolase 11 [EC:3.5.4.25) W0130 7301 7777 6 7 -DIMETHYL8RIBITYLLUMAZINE SYNTHASE (EC 2.5.1.9) GR00654 17212 17688 RIBH PROTEIN 6 ,-dimethy-.rbityIlumazine synthase (dmrl synthase, lumazine synthase, riboflavin synthase beta chain) IEC:2.5.1.9) GR00654 17778 18356 RIBX PROTEIN GR00423 3410 2388 RIBOFLAVIN KiNASE (EC 2.7.1.26) FMN AOE NYLYLTRANSF ERASE (EC 2.7.7.2) GR00639 2809 1736 NICOTINATE.NUCLEOTIDE*.DIMETHYLBENZIMIDAZOLE PHOSPHORIBOSYLTRANSFEFRASE (EC 2.4.2.21) GR00423 3410 2388 RIBOFLAVIN KiNASE (EC 2.7.1.26) FMN ADENYLYLTRANSFERASE
(EC
2.7.7.2) W0191 8993 8298 RIBOFLAVIN-SPECIFIC DEAMINASE (EC GR00484 2652 2152 RIBOFLAVIN-SPECIFIC DEAMINASE (EC W0213 1386 679 ALPHA.RIBAZOLE.5PHOSPHATE PHOSPHATASE (EC W0319 767 438 RIBOFLAVIN-SPECIFIC DEAMINASE (EC W0109 1363 350 DRAP DEAMINASE MEMBRANE SPANNING PROTEIN INVOLVED IN RIBOFLAVIN METABOLISM PROTEIN INVOLVED IN RIBOFLAVIN METABOLISM GR00691 709 56 Predicted nucleotidyitransterases CYTOSOLIC PROTEIN INVOLVED IN METABOLISM OF RIBOFLAVIN AND
LIPIDS
MEMBRANE SPANNING PROTEIN 19t0OLVED IN METABOLISM OF AROMATIC AMINO ACIDS AND RIBOFLAVIN I.
Vitamin 136 metabolism Nucleic Acid Amino Acid SEQ 0 NO SEQ ID NO 595 596 Identification Code Cont RXA0 1807 GR00509 NT Start 7868 NT Stop 7077 Function PYRIDOXINE KINASE (EC 2.7.1.35), pyridoxal/pyridox ine/pyridoxa mine kinase 2007202378 25 May 2007 Table I (continued) Nicotinate (nicotinic acid), nicotinamide, NAD and NADP Nucleic Acid SEQ ID NO 597 599 601 603 Amino Acid Identification Code SEQ ID NO 598 RXN02754 600 F RXA02405 602 F RXA02754 604 RXA02112 Contig NT Start NT Stop Function WV00814 22564 23901 NICOTINATE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.111) GROO701 774 4 NICOTINATE PI-OSPHORIBOSYLTRANSFERASE (C 2.4.2.11) GR00766 3 488 NICOTINATE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.11) GR00632 5600 6436 NICOTINATE-NUCLEOTIOE PYROPH-OSPHORYLASE (CARBOXYLATING) (EC 2.4.2.19) GR00632 4310 5593 QUINOLINATE SYNTHETASE A 605 606 RXA021 11 NAD Biosynthesis Nucleic Acid SEQ ID NO 607 609 Amino Acid SEQ ID NO 608 610 Identification Code Contig. NT Start NT Stop Function RXA01073 R.XN02754 GROO300 1274 2104 NH(3)-DEPENOENT NAD(+) SYNTHETASE (EC 6.3.5.1) WV0084 22564 23901 NICOTINATE PHOSPH-ORIBOSYLTRANSFERASE (EC 2.4.2.11) Pantothenate and Coenzyme A (CoA) biosynthesis Nucleic Acid SEQ ID NO 611 613 615 Amino Acid SEQ ID NO 612 614 616 Identification Code Contig NT Start NT Stop Function RXA02299 RXA01928 RXN0 1929 F RXA01929 RXA01521 RXS01 145 F RXA01 145 RXA02239 RXA0058 1 R.XS00838 RXC02238 GR00662 10452 10859 ASPARTATE 1-DECARBOXYLASE PRECURSOR (EC 4.1.1.11) GR00555 1957 1121 PANTOATE-BETA-ALANINE LIGASE (EC 6.3.2.1) WV0127 47590 48402 3-METHYL-2-OXOBUTANOATE HYDROXYMETHYLTRANSFERASE (EC 2.1.2.11) DECARBOXYLASE (EC 4.1.1.44):, GR00555 2766 1960 3-METHYL-2-OXOBUTANOATE HYDROXYMETHYLTRANSFERASE (EC 2.1.2.11) GR00424 25167 25964 PANTOATE--BETA-ALANINE LIGASE (EC 6.3.2.1) KETOL-ACID REDUCTOISOMERASE (EC 1.1.1.86) GR00321 '1075 1530 KETOL-ACID REDUCTOISOMERASE (EC 1.1.1.86) GR00654 5784 7049 DNAIPANTOTHENATE METABOLISM FLAVOPROTEIN GR00156 7572 8540 PANTOTHENATE KINASE 
(EC 2.7.1.33) 2-DEHYDROPANTOATE 2-REOUCTASE (EC 1.1.1.169) PROTEIN INVOLVED IN METABOLISM OF S-ADENOSYLMETHIONINE, PURINES AND PANTOTHENATE Blotin metabolism Nucleic Acid SEQ ID NO 633 Amino Acid SEQ ID NO 634 Identification Code Contig NT Start NT Stop Function RXN03058 RX008 W0028 8272 8754 BIOTIN SYNTHESIS PROTEIN BlOC 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 635 637 639 Amino Acid SEQ ID NO 636 638 640 Identification Code F RXA02903 RXA001 66 RXA00633 RXA00632 RXA00295 RXA00223 RXN00262 F RXA00262 RXN00435 F RXA00435 F RXA02801 RXA02516 RXA02517 GR 10040 GR00025 GR00166 GR00 166 GR00047 GR00032 W01 23 GR00040 VVOI 12 GROD100 GR00782 GR00723 GR00723 Table 1 (continued) NT Start NT Stop Function 11532 3650 3556 2281 3407 23967 16681 79 10037 3563 438 1724 2989 12014 4309 2288 1610 4408 22879 15608 897 11209 2949 4 2986 3435 BIOTIN SYNTHESIS PROTEIN BlOC BIOTIN SYNTHESIS PROTEIN BlOC ADENOSYLMETHIONINE.8.AMINO.7.OXONONANOATE
AMINOTRANSFERASE
(EC 2.6.1.62) DETHIOBIOTIN SYNTHETASE (EC 6.3.3.3) BIOTIN SYNTHASE (EC 2.8.1.6) NIFS PROTEIN NIFS PROTEIN NIFS PROTEIN NIFS PROTEIN NIFS PROTEIN NIFS PROTEIN NIFS PROTEIN NIFU PROTEIN Lipoic Acid Nucleic Acid SEQ ID NO 661 663 665 667 Amino Acid SEQ ID NO 662 664 666 668 Identification Code RX.AO 1747 RXA01 746 RXA02 106 RX SO 1183 Contig.
GR00495 GR00495 GR00632 NT Start NT Stop Function 669 670 671 672 RXS01 260 RX50 1261 3549 LIPOIC ACID SYNTHETASE 2366 LIPOATE-PROTEIN LICASE 8 (EC 1527 LIPOATE-PROTEIN LIGASE A (EC 0IHYDROLIPOAMIDE SUCCINYLTRANSFERASE COMPONENT (E2) OF 2.
OXOGLUTARATE DEHYDROGENASE COMPLEX (EC 2.3.1.61) LIPOAMIDE DEHYDROCENASE COMPONENT (E3) OF BRANCHED-CHAIN ALPHA-KETO ACID DEHYDROGENASE COMPLEX (EC 1.8.1.4) LIPOAMIDE DEHYDROCENASE COMPONENT (E3) OF BRANCHED-CHAIN ALPHA-KETO ACID DEHYDROGENASE COMPLEX (EC 1.8.1.4) Folate biosynthesis Nucleic Acid SEQ ID NO 673 675 677 679 681 683 685 Amino Acid SEQ ID NO 674 676 678 680 682 684 686 Identification Code RXA0271 7 RXN02027 F RXA02027 RXAOO 106 RXN01 321 F RXA0 1321 RXA00461 RXA0 1514 RXAO 1516 Conli.
GR00758 W0296 GR00616 GR00014 WV0082 GR00384 GR001 16 GR00424 GR00424 NT Start NT Stop Function 18281 17400 503 1003 500 6 17469 17924 8868 9788 23 559 428 1279 20922 21509 22360 22749 5.1 0-METHYLENETETRAHYDROFOLATE REDUCTASE (EC 1.7.99.5) 5.FORMYLTETRAHYDROFOLATE CYCLO-LIGASE (EC 6.3.3.2) 5-FORMYLTETRAHYOROFOLATE CYCLO-LIGASE (EC 6.3.3.2) DIHYDROFOLATE REDUCTASE (EC 1.5.1.3) FORMYLTETRAHYDROFOLATE DEFORMYLASE (EC 3.5. 1.10) FMORMYLTETRAHYDROFOLATEDEFORMYLASE (EC 3.5.1.10) METHYENETTRAHDROFLATEDEHYROGENASE (EC 1.5.1.5)1 METHENYLTETRAHYDROFDLATE C YCLOHYDROLASE (EC 3.5.4.9) CTP CYCLOHYDROLASE I (EC 3.5.4,16) DIHYDRONEOPTERIN ALDOLASE (EC 4.1.2.25) 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 691 693 695 697 699 701 703 Amino Acid SEQ ID NO 692 694 696 698 700 702 704 Identification Code RXA01515 RXA02024 RXAO 1 06 RXA00989 RXAO 1517 RXA00579 RXA00958 RXA02790 RXA001 06 RXN021 98 F RXA02198 RXN02085 F RXA02085 F RXA02086 RXN02648 F RXA02648 F RXA02658 RXS021 97 RXCO0988 RXCOI518 RXCO1 942 Contig GR00424 GR00513 GROO014 GR00280 GROO424 GROO 156 GR00264 GR00777 GROO014 W0302 GR00646 W0126 GRO0629 GRO0629 GR00751 GR00752 Table i (continued) NT Start NT Stop Function 2 1513 4026 17469 2903 22752 5946 1130 5806 17469 9228 2483 8483 3496 5252 5254 14764 22364 4784 17924 1371 23228 4087 1753 6948 17924 11726 6 10717 5295 5731 4730 15447 DIHYDROPTEROATE SYNTHASE (EC 2.5.1.15) DIHYDROPTEROATE SYNTHASE (EC 2.5.1.15) DIHYOROFOLATE REDUCTASE (EC 1.5.1.3) FOLYLPOLYGLUTAMATE SYNTHASE (EC 6.3.2.17) 2 -AMINO.4.HYDROXY-6-HYDROX YMETHYLDI HYDROPTE RI DINE PYROPHOSPHOKINASE (EC 2.7.6.3) PARA-AMINOBENZOATE SYNTHASE COMPONENT I (EC PARA-AMINOBENZOATE SYNTHASE GLUTAMINE AMIDOTRANSFERASE COMPONENT 11 (EC /ANTHRANILATE SYNTHASE COMPONENT 11 (EC 4.1.3.27) 4-AMINO-4DEOXYCHORISMATE LYASE (EC DIHYDROFOLATE REDUCTASE (EC 1.5.1.3) 5-METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METHYLTRANSFERASE
(EC 2. 1. 1.13) 5-METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METHYLTRANSFERASE
(EC 2.1.1.13)
METHYLTRANSFERASE
METHYLTRANSFERASE (EC 2.1A1.14) 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) 5
-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE-HOMOCYSTEINE
METHYLTRANSFERASE (EC 2.1.1.14) 5-METHYLTETRAHYROPTEROYLTRIGLTAATE-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) 5-METHYLTETRAHYDROPTEROYLTRIGLUTAATE-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) 5.METHYLTETRAHYDROFOLATE..HOMOCYSTEINE
METHYLTRANSFERASE
(EC 2.1.1.13) PROTEIN INVOLVED IN FOLATE MgT 6OLISM MEMBRANE SPANNING PROTEIN INVOLVED IN FOLATE METABOLISM ATP-BINDING PROTEIN INVOLVED IN FOLATE METABOLISM Molybdopterin Metabolism Nucleic Acid SEQ ID NO 733 735 737 739 741 743 745 747 Amino Acid SEQ ID NO 734 736 738 740 742 744 746 748 Identification Code RXN02802 F RXA02802 F RXA00438 RXN00437 F RXA00437 RXN00439 F RXA00439 F RXAO0442 Conig NT Start NT Stop W0112 17369 16299 GR00783 7 474 GROO103 362 796 W0112 17824 17369 GROO103 3 362 W0112 18742 18275 GROO104 2 196 GROO105 830 1087 Function MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN MOLYBDOPTERIN BIOSYNTHESIS MOEB PROTEIN MOLYBDOPTERIN (MPT) CONVERTING FACTOR, SUBUNIT 2 MOLYBDOPTERIN (MPT) CONVERTING FACTOR, SUBUNIT 2 MOLYBDOPTERIN CO-FACTOR SYNTHESIS PROTEIN MOLYBDOPTERIN CO-FACTOR SYNTHESIS PROTEIN MOLYBDOPTERIN CO-FACTOR SYNTHESIS PROTEIN 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 749 751 753 755 Amino Acid S EQ D NO 750 752 754 756 Identification Code RXA00440 RXN0044 1 F RXA00441 RX N02085 Cog.
Table 1 (continued) NT Start NT Stop Function 757 758 F RXA02085 759 760 F RXA02086 RXN02648 F RXA02648 763 764 765 766 F R.XA02658 RXA01516 RXA0 1515 RXA02024 RXAO 1719 RXA01 720 RXS 032 23 F RXAO 1970 RXA02629 RXA0231 8 RYAO 1517 RXN01 304 RX S025 56 RX502560 GROOU104 196 654 MOLYBDENUM COFACTOR BIOSYNTHESIS PROTEIN CB W01 12 19942 18779 MOLYBOOPTERIN CO-FACTOR SYNTHESIS PROTEIN GROO105 2 793 MOLYBDOPTERIN CO-FACTOR SYNTHESIS PROTEIN 5-METHYLTETRAHY0ROPTEROYLTRIGLUTAMTE.HOMOCYSTEINE METHYLTPRANSFERASE (EC 2.1.1.14) GR00629 3496 5295 5.METHYLTETRAHYROPTEROYLTRIGLUTAATE..HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) GR00629 5252 5731 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE.-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) 5-METHYLTETRAHYDROPTEROYLTRIGLUTAMATE.HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) GR00751 5254 4730 5-METHYLTETRAHYROPTEROYLTRIGLUTAATE.-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) GR00752 14764 15447 S.ETHYLTETRAHYDROPTEROYLTRIGLUTAATE-HOMOCYSTEINE METHYLTRANSFERASE (EC 2.1.1.14) GR00424 22360 22749 DIHYORONEOPTERIN ALOOLASE (EC 4.1.2.25) GR00424 21513 22364 DIHYOROPTEROATE SYNTHASE (EC 2.5.1.15) GR00613 4026 4784 DIHYDROPTEROATE SYNTHASE (EC 2.5.1.15) GR00488 1264 704 MOLYBDOPTERIN-GUANINE DINUCLEOTIDE BIOSYNTHESIS PROTEIN A GR00488 2476 1268 MOLYBDOPTERIN BIOSYNTHESIS MOEA PROTEIN MOLYBDOPTERIN BIOSYNTHESIS MOEA PROTEIN GR00568 2 1207 MOLYBDOPTERIN BIOSYNTHESIS MOEA PROTEIN GR00748 1274 690 MOLYBDOPTERIN BIOSYNTHESIS CNX1 PROTEIN GR00665 9684 9962 (090909) plerin-4a-carbinolamine dehydratase [Synechocystls sp.) GR00424 22752 23228 2
.AMINO.
4 -HYDROXY-6HYDROXYMETHYLDIHYDROPTE RI DINE PYROPHOSPHOKINASE (EC 2.7.6.3) W01148 4449 4934 MOLYBDOPTERIN BIOSYNTHESIS MOG PROTEIN FLAVOHEMOPROTEIN DIHYDROPTERIDINE REDUCTASE (EC 1.6.99.7) OXYGEN-IN SENSITIVE NAO(P)H NITROREDUCTASE (EC 1..)I DtH-YDROPTERI DINE REOUCTASE (ECl..6.99.7) Vitamin 13 12 porphyrins and heme metabolism Nucleic Acid SEQ ID NO 793 795 797 799 801 803 805 807 809 811 Amino Acid S;EQ I D NO 794 796 798 800 802 804 806 808 810 812 Identification Code Contlg. NT Start NT Stop FunctIon RXA00382 RXA00 156 RXA00624 RXA00306 RXA00884 RXN02503 F RXA02503 RXA00 377 RXN02504 F RXA02504 GRO0082 GRO0023 GRO0163 GROO51 GR00242 WOG07 GR00720 GROO081 W0007 GR00720 2752 10509 7910 2206 10137 22456 16906 1427 22805 17379 1451 9400 8596 1274 11276 22854 17340 306 23362 17816 GLUTAMATE-1-SEMIALDEHYDE 2.1 -AMINOMUTASE (EC 5.4.3.8) FERROCHELATASE (EC 4.99.1.1) FERROCHELATASE (EC 4.99. 1. 1) HEMK PROTEIN OXYGEN-INDEPENDENT COPROPORPHYRINOGEN III OXIDASE (EC PORPHOBILINOGEN DEAMINASE (EC 4.3.1.8) PORPHOBILINOGEN DEAMINASE (EC, 4.3.1.8) UROPORPHYRINOGEN DECARBOXYLASE (EC 4.1.1.37) PORPHOBILINOGEN DEAMINASE (EC 4.3.1.8) PORPHOB1LINOGEN DEAMINASE (EC 4.3.1.8) 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 813 815 817 8 19 Amino Acid SEQ I0 NO 814 816 818 820 Identification Code RXN01 162 F RXAO11162 RXA0 1692 RXN00371 821 822 823 824 F RXAOO37I1 F RXA00374 RXN00383 F RXA00376 F RXA00383 RXA01253 RX.A02134 RXA02 135 RXA02 136 RXN031 14 RXN0 1810 RXS03205 F RXA003O6 RXCO 1715 C22!±l W0088 GR00330 GR00474 W0226 GR00078 GROO079 W0223 GRO0081 GRO0082 GR00365 GR00639 GR00639 GR00639 WV0088 WV0082 Table I (continued) NT Start NT Slop Function 1849 524 PRECORRIN-15Y METHYLASE (EC 1248 4 PRECORRIN-6Y METHYLASE (EC 1498 749 UROPORPHYRIN-111 C-METHYLTRANSFERASE (EC 2.1.1.107) 4180 5973 UROPORPH-YRIN-111 0-METHYLTRANSFERASE (EC 2.1.1.107)1 UROPORPHYRINOGEN-111 SYNTHASE (EC 4.2.1.75) 929 6 UROPORPHYRIN-111 C-METHYLTRANSFERASE (EC 2.1.1.107) UROPORPHYRINOGEN-111 SYNTH-ASE (EC 
4.2.1.75) 1102 371 UROPORPHYRIN-111 C-METHYLTRANSFERASE (EC 2.1.1.107)1/ UROPORPHYRINOGEN-111 SYNTHASE (EC 4.2.1.75) 4206 2863 PROTOPORPHYRINOGEN OXIDASE (EC 1.3.3.4) 287 6 PROTOPORPHYRINOGEN OXIDASE (EC 1.3.3.4) 3876 2863 PROTOPORPHYRINOGEN OXIDASE (EC 1.3.3.4) 2536 1787 COBYRIC ACID SYNTHASE 1721 801 COBALAMIN (5'-PHOSPHATE) SYNTHASE 2809 1736 NICOTINATE-NUCLEOTIDE..DIMETHYLBENZIMIDAZOLE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.2 1) 3362 2841 COBINAMIDE KINASE COBINAMIDE PHOSPHATE GUANYLYLTRANSFERASE 1 552 COBG PROTEIN (EC 1739 663 HEMIN-BINDING PERIPLASMIC. PROTEIN HMUT PRECURSOR HEMK PROTEIN HEMK PROTEIN CYTOSOLIC PROTEIN INVOLVED IN PORPHYRIN METABOLISM Vitamin C precursors Nucleic Acid SEQ ID NO 849 Amino Acid SEQ ID NO 850 Identification Code RXN00420 F RXA00420 F RXA00426 RXN 00708 F R.XA00708 RXA02373 RXS00389 RXS0O41 9 RXCO04 16 RXC02206 Contig.
VVOI 112 GR00096 GR00097 wo0005 GROO185 GR00688 NT Start NT Stop Fun clion 2511 2 1737 4678 2030 1540 L-GULONOLACTONE OXIDASE (EC 1.1.3.8) L-GULONOLACTONE OXIDASE (EC 4.1.3.8) L-GULONOLACTONE OXIDASE (EC 1138 2,5-DIKETO-D-GLUCONIC ACID REDUCTASE (EC 1. 2.5-DIKETO-D-GLUCONIC ACID REDUCTASE (EC 111- 2,5-DIKETO.D-GLUCONIC ACID REDUCTASE (EC 1. 1.1l.-) oxoglutarate semlidehyde dehydrogenaie (EC 1.2.1l..) ACETOACETYL-COA REDUCTASE (EC 1.1. 1.36) MEMBRANE SPANNING PROTEININVOLVED IN METABOLISM OF VITAMIN C PRECURSORS OXIDOREDUCTASE INVOLVED IN METABOLISM OF VITAMIN C PRECURSORS Vitamin K2 Nucleic Acid SEQ ID NO 869 Amino Acid SEQ ID NO 8-0 Identification Code Contig RXS 03074 NT Start NT Stop Functlon S-ADENOSYLMETHIONINE:2DEMETHYLMENAQUINONE METHYLTRANSFERASE (EC 2.1.- 2007202378 25 May 2007 Nucleic Acid SEQ ID NO 871 873 Amino Acid Identification Code Cot.
S§EQ I1D NO 872 F RXA02906 GR10044 Table I (continued) NT Start NT Stop Function 1142 874 RXA02315 RXA0231 9 RXS00393 F RXAD0393 RXA00391 RXS02908 GR00665 8011 GR00665 9977 GRO0086 4030 GR00086 2031 645 S-ADENOSYLIVETHION INE:2-OEM ETHYLIVENAQUI NONE METHYLTRANSFERASE (EC 6383 2 -SUCCINYL-6.HYDROXY-.4-CYCLOHEXADIENE-1
-CARBOXYLATE
SYNTHASE /2-OXOGLUTARATE DECARBOXYLASE (EC 4.1.1.71) 10933 NAPHTHOATE SYNTHASE (EC 4.1.3.36) I ,4-DIHYDROXY-2-NAPHTHOATE OCTAPRENYLTRANSFEMASE (EC 4911 1.4-DIHYDROXY.2-NAPHTHOATE OCTAPRENYLTRANSFERASE (EC 2750 O-SUCCINYLBENZOIC ACID-COA LIGASE (EC 6.2.1.26) O-SIJCCINYLBENZOIC ACID-COA LIGASE (EC 6.2.1.26) Ubiquinone biosynthesis Nucleic Acid SEQ ID NO 885 887 889 891 Amino Acid SEQ I1D NO 886 888 690 892 Identification Code Cont NT Start NT Stop Function.
RXA00997 RXA02189 RXA0231 1 RXN0291 2 RXS00998 GR00283 GR00642 GROO665 W01 35 2389 986 3073 13299 1608 249 2384 12547 3-DEMETHYLUBIQUINONE-9 3-METH-YLTRANSFERASE (EC 2.1.1.64) 3-DEMETHYLUBIQUINONE-9 3-METHYLTRANSFERASE (EC 2.1.1.64) 3-DEMETHYLUBIQUINONE-9 3-METHYLTRANSFE RASE (EC 2.1.1.64) U131QUI NON E/MIENAQUINONE BIOSYNTHESIS METHLYTRANSFERASE
UBIE
(EC 2. 1. COMA OPERON PROTEIN 2 893 894 Purines and Pyrimidines and other Nucleotides Regulation of purine and pyrimidine, biosynthesis pathways Purine metabolism Purine Biosynthesis Nucleic Acid SEQ 0 NO 895 897 899 901 903 905 907 Amino Acid S EQ 1D NO 896 898 900 902 904 906 908 Identification Code Contig.
NT Start NT Stop Function RXA01 215 RXN00558 F RXA00558 RXN00626 F RXA00629 F RXA00626 RXA02623 GR00352 1187 213 RIBOSE-PHOSPHATE PYROPHOSPHOKINASE. PRPP synthetase (EC 2.7.6. 1) WO0103 8235 9581 AMIDOPHOSPHORIBOSYLTR.ANSFERASE (EC 2.4.2.14) GROO1 48 61 501 AMIDOPHOSPHORIBOSYLTRANSFEPASE (EC 2.4.2.14) WV0135 11624 10362 PHOSPHORIBOSYLAMINE.-GLYCINE LIGASE (EC 6.3.4.13) GR00165 1450 1713 PHOSPHORIBOSYLAMINE-GLYCINE LIGASE (EC 6.3.4.13) GR00164 1 780 PHOSPHORIBOSYLAMINE.-GLYCINE LIGASE, GARS (EC 6.3.4.13) GR00746 4875 4285 PHOSPHORIBOSYLAMINE-GLYCINE LIGASE (EC 6.3.4.13)1/ PHOSPHORI8OSYLFORMYLGLYCINA1IDINE CYCLO.LIGASE (EC 6.3.3.1) 1 PHOSPHORIBOSYLGLYCINAMIDE FORMYLTRANSFE RASE (EC 2.1.2.2) GR00418 10277 9054 PHOSPHORIBOSYLGLYCINAMIDE FORMYLTRANSFERASE 2 (EC 909 910 RXA01442 2007202378 25 May 2007 Table 1 (continued) NT Start NT Stop Function Nucleic Acid SEQ ID NO 911 913 915 917 919 921 923 925 927 929 931 933 935 937 939 941 943 Amino Acd SEQ I0 NO 914 916 918 920 922 924 926 928 930 932 934 936 938 940 942 944 Identifir-ation Code RXN00537 F RXA02805 F RXA00537 F RXAOO56I RXA00541 RXA00620 RXN00770 F RXA00557 F RX~A00770 RXN02345 F RXA02345 RXN02350 F RXA02346 F RXA02350 RXA0 1087 RXA0061 9 RXA02622 Contig.
GR00786 GROC 138 GROO150 GR00139 GROO 163 W0103 GROO 147 GR00204 WV0078 GR00676 W00O78 GR00677 GR00678 GR00304 GROO 163 GR00746 3351 54 23 2 2269 3049 9614 15 7809 4788 1534 8369 127 11120 498 793 4274 5636 638 697 280 2937 3939 10783 818 7495 5984 725 8863 5 911 1373 2220 2715 PHOSPHORIBOSYLFORMYLGLYCINAMIDINE SYNTHASE (EC 6.3.5.3) PHOSPHORIBOSYLFORMYLGLYCINAMIDINE SYNTHASE (EC 6.3.5.3) PHOSPHORIBOSYLFORMYLGLYCINAMIOINE SYNTHASE (EC 6.3.5.3) PHOSPHORIBOSYLFORMYLGLYCINAMIDINE SYNTR-ASE (EC 6.3.5.3) PHOSPHORIBOSYLFORMYLGLYCINAMIDINE SYNTI-ASE (EC 6.3.5.3) PHOSPHORIBOSYLAMIOOIMIDAZOLE-SUCCINOCARBOXAMIDE SYNTHASE (EC 6.3.2,6) PHOSPH-ORIBOSYLFORMYLGLYCINAMIDINE CYCLO-LIGASE (EC 6.3.3.1) PHOSPHORIBOSYLFORMYLGLYCINAMIDINE CYCLO-LIGASE (EC 6.3.3.1) PHOSPHORIBOSYLFORMYLGLYCINAMIDINE CYCLO-LIGASE (EC 6.3.3.1) PHOSPHORIBOSYLAMINOIMIDAzOLE CARBOXYLASE ATPASE SUBUNIT (EC 4.1.1.21) PHOSPHORIBOSYLAMINOIMIDAZOLE CARBOXYLASE ATPASE SUBUNIT (EC 4.1.1.21) PHOSPHORIBOSYLAMINOIMIDAZOLE CARBOXYLASE CATALYTIC SUBUNIT (EC 4.11A.21) PHOSPHORIBOSYLAMINOIMIDAZOLE CARBOXYLASE CATALYTIC SUBUNIT (EC 4.1.1.21) PHOSPHORIBOSYLAMINOIMIDAZOLE CARBOXYLASE CATALYTIC SUBUNIT (EC 4.1.1.2 1) PHOSPHORIBOSYLAMINOIMIDAZOLE CARBOXYLASE (EC 4.1.1.21) ADENYLOSUCCINATE LYASE (EC 4.3.2.2) PHOSPHORIBOSYLAMINOIMIDAZOLECARBOXAMIDE FORMYLTRANSFERASE (EC 2.11.2.3) IMP CYCLOHYDROLASE (EC 3.5.4.10) GMVP, GDP, AMP and ADP synthesis, from inosine-5'-monophosphate (IMP) Nucleic Acid Amino Acid Identification Code Contig NT Start NT Stop Function SEQ I0 NO SEQ ID NO 945 946 RXN00488 W0086 19066 20583 INOSINE-5'-MONOPHOSPHATE DEHN01ROGENASE (EC 1.1.1.205) 947 948 F RXA00492 GR00122 1171 1644 INOSINE-5-MONOPHOSPHATE OEHYDROGENASE (EC 1.1.1.205) 949 950 F RXA00488 GR00121 1 534 INOSINE-5'-MONOPHOSPHATEDCEHYDROGENASE (EC 1.1.1.205) 951 952 RXA02469 GR00715 1927 497 INOSINE-5'-MONOPHOSPHATE DEHYQROGENASE (EC 1.1.1.205) 953 954 RXN00487 W0086 23734 25302 GMP SYNTHASE (GLUTAMINE-HYDROLYZING] (EC 6.3.5.2) 955 956 F RXA00487 
GR00120 712 2097 GMP SYNTHASE (EC 6.3.4.1)
957 958 RXA02237 GR00654 4577 5146 GUANYLATE KINASE (EC 2.7.4.8)
959 960 RXA01446 GR00418 17765 16476 ADENYLOSUCCINATE SYNTHETASE (EC 6.3.4.4)
961 962 RXA00619 GR00163 793 2220 ADENYLOSUCCINATE LYASE (EC 4.3.2.2)
963 964 RXA00688 GR00179 10443 10985 ADENYLATE KINASE (EC 2.7.4.3)
965 966 RXA00266 GR00040 3769 3362 NUCLEOSIDE DIPHOSPHATE KINASE (EC 2.7.4.6)

GMP/AMP degrading activities
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
967 968 RXA00489 GR00121 654 1775 GMP REDUCTASE (EC 1.6.6.8)
969 970 RXN02281 VV0152 1893 3323 AMP NUCLEOSIDASE (EC 3.2.2.4)
971 972 F RXA02281 GR00659 1101 34 AMP NUCLEOSIDASE (EC 3.2.2.4)

Pyrimidine metabolism

Pyrimidine biosynthesis de novo:
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
973 974 RXA00147 GR00022 9722 10900 CARBAMOYL-PHOSPHATE SYNTHASE SMALL CHAIN (EC 6.3.5.5)
975 976 RXA00145 GR00022 7258 8193 ASPARTATE CARBAMOYLTRANSFERASE CATALYTIC CHAIN (EC 2.1.3.2)
977 978 RXA00146 GR00022 8249 9589 DIHYDROOROTASE (EC 3.5.2.3)
979 980 RXA02208 GR00647 2 1003 DIHYDROOROTATE DEHYDROGENASE (EC 1.3.3.1)
981 982 RXA01660 GR00462 591 1142 OROTATE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.
983 984 RXA02235 GR00654 3207 4040 OROTIDINE 5'-PHOSPHATE DECARBOXYLASE (EC 4.1.1.23)
985 986 RXN01892 VV0150 3020 3748 URIDYLATE KINASE (EC
987 988 F RXA01892 GR00542 47 775 URIDYLATE KINASE (EC
989 990 RXA00105 GR00014 16672 17346 THYMIDYLATE SYNTHASE (EC 2.1.1.45)
991 992 RXA00131 GR00020 7621 7013 THYMIDYLATE KINASE (EC 2.7.4.9)
993 994 RXA00266 GR00040 3769 3362 NUCLEOSIDE DIPHOSPHATE KINASE (EC 2.7.4.6)
995 996 RXA00718 GR00188 4576 5283 CYTIDYLATE KINASE (EC 2.7.4.14)
997 998 RXA01599 GR00447 8780 10441 CTP SYNTHASE (EC 6.3.4.2)
999 1000 RXN02234 VV0134 24708 28046 CARBAMOYL-PHOSPHATE SYNTHASE LARGE CHAIN (EC 6.3.5.5)
1001 1002 F RXA02234 GR00654 1 3198 CARBAMOYL-PHOSPHATE SYNTHASE LARGE CHAIN (EC 6.3.5.5)
1003 1004 RXN00450 VV0112 34491 34814 CYTOSINE DEAMINASE (EC 3.5.4.1)
1005 1006 F RXA00450 GR00110 322 5 CYTOSINE DEAMINASE (EC 3.5.4.1)
1007 1008 RXN02272 VV0020 15566 16810 CYTOSINE DEAMINASE (EC 3.5.4.1)
1009 1010 F RXA02272 GR00655 6691 7935 CREATININE DEAMINASE (EC 3.5.4.21)
1011 1012 RXN03004 VV0237 1862 2341 DEOXYCYTIDINE TRIPHOSPHATE DEAMINASE (EC 3.5.4.13)
1013 1014 RXN03137 VV0129 9680 9579 THYMIDYLATE SYNTHASE (EC 2.1.1.45)
1015 1016 RXN03171 VV0328 568 1080 URACIL PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.9)
1017 1018 F RXA02857 GR10003 570 1082 URACIL PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.9)

Purine and pyrimidine base, nucleoside and nucleotide salvage, interconversion, reduction and degradation:

Purines:
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
1019 1020 RXA02771 GR00772 1329 1883 ADENINE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.7)
1021 1022 RXA01512 GR00424 17633 18232 HYPOXANTHINE-GUANINE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.8)
1023 1024 RXA02031 GR00618 3820 3347 XANTHINE-GUANINE PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.22)
1025 1026 RXA00981 GR00276 3388 4017 GTP PYROPHOSPHOKINASE (EC 2.7.6.5)
1027 1028 RXN02772 VV0171 2045 1011 GUANOSINE-3',5'-BIS(DIPHOSPHATE) 3'-PYROPHOSPHOHYDROLASE (EC 3.1.7.2)
1029 1030 F RXA02772 GR00772 1962 2741 GUANOSINE-3',5'-BIS(DIPHOSPHATE) 3'-PYROPHOSPHOHYDROLASE (EC 3.1.7.2)
1031 1032 F RXA02773 GR00772 2741 2902 GUANOSINE-3',5'-BIS(DIPHOSPHATE) 3'-PYROPHOSPHOHYDROLASE (EC 3.1.7.2)
1033 1034 RXA01835 GR00517 3147 3677 GUANOSINE-3',5'-BIS(DIPHOSPHATE) 3'-PYROPHOSPHOHYDROLASE (EC 3.1.7.2)
1035 1036 RXA01483 GR00422 19511 18240 DEOXYGUANOSINETRIPHOSPHATE TRIPHOSPHOHYDROLASE (EC 3.1.5.1)
1037 1038 RXN01027 VV0143 5761 6768 DIADENOSINE 5',5'''-P1,P4-TETRAPHOSPHATE HYDROLASE (EC 3.6.1.17)
1039 1040 F RXA01024 GR00293 661 5 DIADENOSINE 5',5'''-P1,P4-TETRAPHOSPHATE HYDROLASE (EC 3.6.1.17)
1041 1042 F RXA01027 GR00294 2580 2347 DIADENOSINE 5',5'''-P1,P4-TETRAPHOSPHATE HYDROLASE (EC 3.6.1.17)
1043 1044 RXA01528 GR00425 5653 5126 DIADENOSINE 5',5'''-P1,P4-TETRAPHOSPHATE HYDROLASE (EC 3.6.1.17)
1045 1046 RXA00072 GR00012 446 6 PHOSPHOADENOSINE PHOSPHOSULFATE REDUCTASE (EC 1.8.99.4)
1047 1048 RXA01878 GR00537 1239 2117 DIMETHYLADENOSINE TRANSFERASE (EC 2.1.
1049 1050 RXN02281 VV0152 1893 3323 AMP NUCLEOSIDASE (EC 3.2.2.4)
1051 1052 F RXA02281 GR00659 1101 34 AMP NUCLEOSIDASE (EC 3.2.2.4)
1053 1054 RXN01240 VV0090 30442 29420 GTP PYROPHOSPHOKINASE (EC 2.7.6.5)
1055 1056 RXN02008 VV0171 1138 5 GUANOSINE-3',5'-BIS(DIPHOSPHATE) 3'-PYROPHOSPHOHYDROLASE (EC 3.1.7.2)

Pyrimidine and purine metabolism:
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
1057 1058 RXN01940 VV0120 10268 9333 INOSINE-URIDINE PREFERRING NUCLEOSIDE HYDROLASE (EC 3.2.2.1)
1059 1060 F RXA01940 GR00557 3 581 INOSINE-URIDINE PREFERRING NUCLEOSIDE HYDROLASE (EC 3.2.2.1)
1061 1062 RXA02559 GR00731 5418 6320 INOSINE-URIDINE PREFERRING NUCLEOSIDE HYDROLASE (EC 3.2.2.1)
1063 1064 RXA02497 GR00720 10059 10985 EXOPOLYPHOSPHATASE (EC 3.6.1.11)
1065 1066 RXN01079 VV0084 38084 35982 RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE ALPHA CHAIN (EC 1.17.4.1)
1067 1068 F RXA01079 GR00301 693 4 RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE ALPHA CHAIN (EC 1.17.4.1)
1069 1070 F RXA01084 GR00302 3402 2062 RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE ALPHA CHAIN (EC 1.17.4.1)
1071 1072 RXN01920 VV0084 32843 31842 RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE 2 BETA CHAIN (EC 1.17.4.1)
1073 1074 F RXA01920 GR00550 1321 908 RIBONUCLEOTIDE REDUCTASE SUBUNIT R2F
1075 1076 RXA01080 GR00301 1240 797 NRDI PROTEIN
1077 1078 RXA00867 GR00237 1 627 POLYRIBONUCLEOTIDE NUCLEOTIDYLTRANSFERASE (EC 2.7.7.8)
1079 1080 RXA01416 GR00413 2 631 POLYRIBONUCLEOTIDE NUCLEOTIDYLTRANSFERASE (EC 2.7.7.8)
1081 1082 RXA01486 GR00423 660 4 POLYRIBONUCLEOTIDE NUCLEOTIDYLTRANSFERASE (EC 2.7.7.8)
1083 1084 RXA01678 GR00467 7162 7689 2',3'-CYCLIC-NUCLEOTIDE 2'-PHOSPHODIESTERASE (EC 3.1.4.16)
1085 1086 RXA01679 GR00467 7729 8964 2',3'-CYCLIC-NUCLEOTIDE 2'-PHOSPHODIESTERASE (EC 3.1.4.16)
1087 1088 RXN01488 VV0139 39842 40789 INOSINE-URIDINE PREFERRING NUCLEOSIDE HYDROLASE (EC 3.2.2.1)
1089 1090 RXC00540 CYTOSOLIC PROTEIN INVOLVED IN PURINE METABOLISM
1091 1092 RXC00560 PROTEIN INVOLVED IN PURINE METABOLISM
1093 1094 RXC01088 CYTOSOLIC PROTEIN INVOLVED IN PURINE METABOLISM
1095 1096 RXC02624 MEMBRANE SPANNING PROTEIN INVOLVED IN PURINE METABOLISM
1097 1098 RXC02665 PROTEIN INVOLVED IN PURINE METABOLISM
1099 1100 RXC02770 LIPOPROTEIN INVOLVED IN PURINE METABOLISM
1101 1102 RXC02238 PROTEIN INVOLVED IN METABOLISM OF S-ADENOSYLMETHIONINE, PURINES AND PANTOTHENATE
1103 1104 RXC01946 ABC TRANSPORTER ATP-BINDING PROTEIN INVOLVED IN PURINE METABOLISM

Pyrimidines:
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
1105 1106 RXN03171 VV0328 568 1080 URACIL PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.9)
1107 1108 F RXA02857 GR10003 570 1082 URACIL PHOSPHORIBOSYLTRANSFERASE (EC 2.4.2.9)
1109 1110 RXN00450 VV0112 34491 34814 CYTOSINE DEAMINASE (EC 3.5.4.1)
1111 1112 F RXA00450 GR00110 322 5 CYTOSINE DEAMINASE (EC 3.5.4.1)
1113 1114 RXA00465 GR00117 337 828 CYTOSINE DEAMINASE (EC 3.5.4.1)
1115 1116 RXA00717 GR00188 3617 4576 RIBOSOMAL LARGE SUBUNIT PSEUDOURIDINE SYNTHASE B (EC 4.2.1.70)
1117 1118 RXA01894 GR00542 1622 2476 PHOSPHATIDATE CYTIDYLYLTRANSFERASE (EC 2.7.7.41)
1119 1120 RXA02536 GR00726 8581 7826 BETA-UREIDOPROPIONASE (EC 3.5.1.6)
1121 1122 RXN01209 VV0270 1019 2446 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
1123 1124 F RXA01209 GR00348 1019 2446 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
1125 1126 RXN01617 VV0050 22187 22858 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
1127 1128 F RXA01617 GR00451 2 616 PHOSPHOMETHYLPYRIMIDINE KINASE (EC 2.7.4.7)
1129 1130 RXC01600 CYTOSOLIC PROTEIN INVOLVED IN PYRIMIDINE METABOLISM
1131 1132 RXC01622 CYTOSOLIC PROTEIN INVOLVED IN PYRIMIDINE METABOLISM
1133 1134 RXC00128 EXPORTED PROTEIN INVOLVED IN METABOLISM OF PYRIMIDINES AND ADENOSYLHOMOCYSTEINE
1135 1136 RXC01709 CYTOSOLIC PROTEIN INVOLVED IN PYRIMIDINE METABOLISM
1137 1138 RXC02207 EXPORTED PROTEIN INVOLVED IN PYRIMIDINE METABOLISM

Sugars

Trehalose
Nucleic Acid SEQ ID NO, Amino Acid SEQ ID NO, Identification Code, Contig, NT Start, NT Stop, Function
1139 1140 RXA00347 GR00065 246 1013 TREHALOSE-PHOSPHATASE (EC 3.1.3.12)
1141 1142 RXN01239 VV0090 32921 30489 maltooligosyltrehalose synthase
1143 1144 F RXA01239 GR00358 5147 7579 maltooligosyltrehalose synthase
1145 1146 RXA02645 GR00751 714 2543 maltooligosyltrehalose trehalohydrolase
1147 1148 RXN02355 VV0051 735 4 TREHALOSE/MALTOSE BINDING PROTEIN
1149 1150 RXN02909 VV0135 38532 39017 Hypothetical Trehalose-Binding Protein
1151 1152 RXS00349 Hypothetical Trehalose Transport Protein
1153 1154 RXS03183 TREHALOSE/MALTOSE BINDING PROTEIN
1155 1156 RXC00874 TRANSMEMBRANE PROTEIN INVOLVED IN TREHALOSE METABOLISM

TABLE 2
Excluded Genes
GenBank™ Accession No., Gene Name, Gene Function, Reference
A09073 | ppg | Phosphoenolpyruvate carboxylase | Bachmann, B. et al. "DNA fragment coding for phosphoenolpyruvate carboxylase, recombinant DNA carrying said fragment, strains carrying the recombinant DNA and method for producing L-amino acids using said strains," Patent: EP 0358940-A 3 03/21/90
A45579, A45581, A45583, A45585, A45587 | | Threonine dehydratase | Moeckel, B. et al. "Production of L-isoleucine by means of recombinant micro-organisms with deregulated threonine dehydratase," Patent: WO 9519442-A 5 07/20/95
AB003132 | murC; ftsQ; ftsZ | | Kobayashi, M. et al. "Cloning, sequencing, and characterization of the ftsZ gene from coryneform bacteria," Biochem. Biophys. Res. Commun., (1997)
AB015023 | murC; ftsQ | | Wachi, M. et al. "A murC gene from Coryneform bacteria," Appl. Microbiol., 51(2):223-228 (1999)
AB018530 | dtsR | | Kimura, E. et al. "Molecular cloning of a novel gene, dtsR, which rescues the detergent sensitivity of a mutant derived from Brevibacterium lactofermentum," Biosci. Biotechnol. Biochem., 60(10):1565-1570 (1996)
AB018531 | dtsR1; dtsR2
AB020624 | murI | D-glutamate racemase
AB023377 | tkt | transketolase
AB024708 | gltB; gltD | Glutamine 2-oxoglutarate aminotransferase large and small subunits
AB025424 | acn | aconitase
AB027714 | rep | Replication protein
AB027715 | rep; aad | Replication protein; aminoglycoside adenyltransferase
AF005242 | argC | N-acetylglutamate-5-semialdehyde dehydrogenase
AF005635 | glnA | Glutamine
AF030405 | hisF | cyclase
AF030520 | argG | Argininosuccinate synthetase
AF031518 | argF | Ornithine carbamoyltransferase
AF036932 | aroD | 3-dehydroquinate dehydratase
AF038651 | dciAE; apt; rel | Dipeptide-binding protein; adenine phosphoribosyltransferase; GTP | Wehmeier, L. et al. "The role of the Corynebacterium glutamicum rel gene in (p)ppGpp metabolism," Microbiology, 144:1853-1862 (1998)
AF041436 | argR | Arginine repressor
AF045998 | impA | Inositol monophosphate phosphatase
AF048764 | argH | Argininosuccinate lyase
AF049897 | argC; argJ; argB; argD; argF; argR; argG; argH | N-acetylglutamylphosphate reductase; ornithine acetyltransferase; N-acetylglutamate kinase; acetylornithine transaminase; ornithine carbamoyltransferase; arginine repressor; argininosuccinate synthase; argininosuccinate lyase
AF050109 | inhA | Enoyl-acyl carrier protein reductase
AF050166 | hisG | ATP phosphoribosyltransferase
1846 | hisA | Phosphoribosylformimino-5-amino-1-phosphoribosyl-4-imidazolecarboxamide isomerase
AF052652 | metA | Homoserine O-acetyltransferase | Park, S. et al. "Isolation and analysis of metA, a methionine biosynthetic gene encoding homoserine acetyltransferase in Corynebacterium glutamicum," Mol., 8(3):286-294 (1998)
AF053071 | aroB | Dehydroquinate synthetase
AF060558 | hisH | Glutamine amidotransferase
AF086704 | hisE | Phosphoribosyl-ATP-pyrophosphohydrolase
AF114233 | aroA | 5-enolpyruvylshikimate 3-phosphate synthase
AF116184 | panD | L-aspartate-alpha-decarboxylase precursor | Dusch, N. et al. "Expression of the Corynebacterium glutamicum panD gene encoding L-aspartate-alpha-decarboxylase leads to pantothenate overproduction in Escherichia coli," Appl. Environ. Microbiol., 65(4):1530- (1999)
AF124518 | aroD; aroE | 3-dehydroquinase; shikimate
AF124600 | aroC; aroK; aroB; pepQ | Chorismate synthase; shikimate kinase; 3-dehydroquinate synthase; putative cytoplasmic peptidase
AF145897 | inhA
AF145898
AJ001436 | ectP | Transport of ectoine, glycine betaine, proline | Peter, H. et al. "Corynebacterium glutamicum is equipped with four secondary carriers for compatible solutes: Identification, sequencing, and characterization of the proline/ectoine uptake system, ProP, and the ectoine/proline/glycine betaine carrier, EctP," J. Bacteriol., 180(22):6005-6012 (1998)
AJ004934 | dapD | Tetrahydrodipicolinate succinylase (incomplete) | Wehrmann, A. et al. "Different modes of diaminopimelate synthesis and their role in cell wall integrity: A study with Corynebacterium glutamicum," J. Bacteriol., 180(12):3159-3165 (1998)
AJ007732 | ppc; secG; amt; ocd; soxA | Phosphoenolpyruvate-carboxylase; high affinity ammonium uptake protein; putative ornithine-cyclodecarboxylase; sarcosine oxidase
AJ010319 | ftsY, glnB, glnD; srp; amtP | Involved in cell division; PII protein; uridylyltransferase (uridylyl-removing enzyme); signal recognition particle; low affinity ammonium uptake protein | Jakoby, M. et al. "Nitrogen regulation in Corynebacterium glutamicum; Isolation of genes involved in biochemical characterization of corresponding proteins," FEMS Microbiol., 173(2):303-310 (1999)
AJ132968 | cat | Chloramphenicol acetyl transferase
AJ224946 | mqo | L-malate: quinone oxidoreductase
AJ238250 | ndh | NADH dehydrogenase
AJ237Q03 | porA | Porin | Lichtinger, T. et al. "Biochemical and biophysical characterization of the cell wall porin of Corynebacterium glutamicum: The channel is formed by a low molecular mass polypeptide," Biochemistry, (1998)
D17429 | | Transposable element IS31831 | Vertes et al. "Isolation and characterization of IS31831, a transposable element from Corynebacterium glutamicum," Mol. Microbiol., 11(4):739-746 (1994)
D84102 | odhA | 2-oxoglutarate dehydrogenase | Usuda, Y. et al. "Molecular cloning of the Corynebacterium glutamicum (Brevibacterium lactofermentum AJ12036) odhA gene encoding a novel type of 2-oxoglutarate dehydrogenase," Microbiology, 142:3347-3354 (1996)
E01358 | hdh; hk | Homoserine dehydrogenase; homoserine kinase
E01359 | | Upstream of the start codon of homoserine kinase gene
| | Tryptophan operon
E01376 | trpL; trpE | Leader peptide; anthranilate synthase | Matsui, K. et al. "Tryptophan operon, peptide and protein coded thereby, utilization of tryptophan operon gene expression and production of tryptophan," Patent: JP 1987244382-A 1 10/24/87
E01377 | | Promoter and operator regions of tryptophan operon | Matsui, K. et al. "Tryptophan operon, peptide and protein coded thereby, utilization of tryptophan operon gene expression and production of tryptophan," Patent: JP 1987244382-A 1 10/24/87
E03937 | | Biotin-synthase | Hatakeyama, K. et al. "DNA fragment containing gene capable of coding biotin synthetase and its utilization," Patent: JP 1992278088-A 1 10/02/92
E04040 | | Diamino pelargonic acid aminotransferase | Kohama, K. et al. "Gene coding diaminopelargonic acid aminotransferase and desthiobiotin synthetase and its utilization," Patent: JP 1992330284-A 1
E04041 | | Desthiobiotin synthetase | Kohama, K. et al. "Gene coding diaminopelargonic acid aminotransferase and desthiobiotin synthetase and its utilization," Patent: JP 1992330284-A 1 11/18/92
E04307 | | Flavum aspartase | Kurusu, Y. et al. "Gene DNA coding aspartase and utilization thereof," Patent: JP 1993030977-A 1 02/09/93
| | | Katsumata, R. et al. "Gene manifestation controlling DNA," Patent: JP 1993056782-A 3 03/09/93
E0437 | | Isocitric acid lyase N-terminal fragment | Katsumata, R. et al. "Gene manifestation controlling DNA," Patent: JP 1993056782-A 3 03/09/93
E04484 | | Prephenate dehydratase | Sotouchi, N. et al. "Production of L-phenylalanine by fermentation," Patent: JP 1993076352-A 2 03/30/93
108 | | Aspartokinase | Fugono, N. et al. "Gene DNA coding Aspartokinase and its use," Patent: JP 1993184366-A 1 07/27/93
| | Dihydro-dipicolinate synthetase | Hatakeyama, K. et al. "Gene DNA coding dihydrodipicolinic acid synthetase and its use," Patent: JP 1993184371-A 1 07/27/93
E 0U 7 7 | | Diaminopimelic acid dehydrogenase | Kobayashi, M. et al. "Gene DNA coding Diaminopimelic acid dehydrogenase and its use," Patent: JP 1993284970-A 1 11/02/93
20577 | | Threonine synthase | Kohama, K. et al. "Gene DNA coding threonine synthase and its use," Patent: JP 1993284972-A 1 11/02/93
E06110 | | Prephenate dehydratase | Kikuchi, T. et al. "Production of L-phenylalanine by fermentation method," Patent: JP 1993344881-A 1 12/27/93
E06111 | | Mutated Prephenate dehydratase | Kikuchi, T. et al. "Production of L-phenylalanine by fermentation method," Patent: JP 1993344881-A 1 12/27/93
E06146 | | Acetohydroxy acid synthetase | Inui, M. et al. "Gene capable of coding Acetohydroxy acid synthetase and its use," Patent: JP 1993344893-A 1 12/27/93
| | Aspartokinase | Sugimoto, M. et al. "Mutant aspartokinase gene," Patent: JP 1994062866-A 1 03/08/94
E06826 | | Mutated aspartokinase alpha subunit | Sugimoto, M. et al. "Mutant aspartokinase gene," Patent: JP 1994062866-A 1 03/08/94
E06827 | | Mutated aspartokinase alpha subunit | Sugimoto, M. et al. "Mutant aspartokinase gene," Patent: JP 1994062866-A 1
E07701 | secY | | Honno, N. et al. "Gene DNA participating in integration of membraneous to membrane," Patent: JP 1994169780-A 1 06/21/94
E08177 | | Aspartokinase | Sato, Y. et al. "Genetic DNA capable of coding Aspartokinase released from feedback inhibition and its utilization," Patent: JP 1994261766-A 1 09/20/94
E08178, E08179, E08180, E08181, E08182 | | Feedback inhibition-released Aspartokinase | Sato, Y. et al. "Genetic DNA capable of coding Aspartokinase released from feedback inhibition and its utilization," Patent: JP 1994261766-A 1 09/20/94
E08232 | | Acetohydroxy-acid isomeroreductase | Inui, M. et al. "Gene DNA coding acetohydroxy acid isomeroreductase," Patent: JP 1994277067-A 1 10/04/94
E08234 | secE | | Asai, Y. et al. "Gene DNA coding for translocation machinery of protein," Patent: JP 1994277073-A 1 10/04/94
E08643 | | FT aminotransferase and desthiobiotin promoter region | Hatakeyama, K. et al. "DNA fragment having promoter function in coryneform bacterium," Patent: JP 1995031476-A 1 02/03/95
E08646 | | Biotin synthetase | Hatakeyama, K. et al. "DNA fragment having promoter function in coryneform bacterium," Patent: JP 1995031476-A 1 02/03/95
E08649 | | Aspartase | Kohama, K. et al. "DNA fragment having promoter function in coryneform bacterium," Patent: JP 1995031478-A 1 02/03/95
E08900 | | Dihydrodipicolinate reductase | Madori, M. et al. "DNA fragment containing gene coding Dihydrodipicolinate reductase and utilization thereof," Patent: JP 1995075578-A 1 03/20/95
E08901 | | Diaminopimelic acid decarboxylase | Madori, M. et al. "DNA fragment containing gene coding Diaminopimelic acid decarboxylase and utilization thereof," Patent: JP 1995075579-A 1 03/20/95
E12594 | | Serine hydroxymethyltransferase | Hatakeyama, K. et al. "Production of L-tryptophan," Patent: JP 1997028391-A 1 02/04/97
E12760, E12759, E12758 | | transposase | Moriya, M. et al. "Amplification of gene using artificial transposon," Patent: JP 1997070291-A 03/18/97
E12764 | | Arginyl-tRNA synthetase; diaminopimelic decarboxylase | Moriya, M. et al. "Amplification of gene using artificial transposon," Patent: JP 1997070291-A 03/18/97
E12767 | | Dihydrodipicolinic acid synthetase | Moriya, M. et al. "Amplification of gene using artificial transposon," Patent: JP 1997070291-A 03/18/97
E12770 | | aspartokinase | Moriya, M. et al. "Amplification of gene using artificial transposon," Patent: JP 1997070291-A 03/18/97
E12773 | | Dihydrodipicolinic acid reductase | Moriya, M. et al. "Amplification of gene using artificial transposon," Patent: JP 1997070291-A 03/18/97
E13655 | | Glucose-6-phosphate dehydrogenase | Hatakeyama, K. et al. "Glucose-6-phosphate dehydrogenase and DNA capable of coding the same," Patent: JP 1997224661-A 1 09/02/97
L01508 | IlvA | Threonine dehydratase | Moeckel, B. et al. "Functional and structural analysis of the threonine dehydratase of Corynebacterium glutamicum," J. Bacteriol., 174:8065-8072
L07603 | EC 4.2.1.15 | 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase | Chen, C. et al. "The cloning and nucleotide sequence of Corynebacterium glutamicum 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase gene," Microbiol. Lett., 107:223-230 (1993)
L09232 | IlvB; ilvN; ilvC | Acetohydroxy acid synthase large subunit; Acetohydroxy acid synthase small subunit; Acetohydroxy acid isomeroreductase | Keilhauer, C. et al. "Isoleucine synthesis in Corynebacterium glutamicum: molecular analysis of the ilvB-ilvN-ilvC operon," J. Bacteriol., 175(17):5595-5603 (1993)
L18874 | PtsM | Phosphoenolpyruvate sugar phosphotransferase | Fouet, A. et al. "Bacillus subtilis sucrose-specific enzyme II of the phosphotransferase system: expression in Escherichia coli and homology to enzymes II from enteric bacteria," PNAS USA, 84(24):8773-8777 (1987); Lee, J.K. et al. "Nucleotide sequence of the gene encoding the Corynebacterium glutamicum mannose enzyme II and analyses of the deduced protein sequence," FEMS Microbiol. Lett., 119(1-2):137-145 (1994)
L27123 | aceB | Malate synthase | Lee, H-S. et al. "Molecular characterization of aceB, a gene encoding malate synthase in Corynebacterium glutamicum," J. Microbiol. Biotechnol., (1994)
L27126 | | Pyruvate kinase | Jetten, M. S. et al. "Structural and functional analysis of pyruvate kinase from Corynebacterium glutamicum," Appl. Environ. Microbiol., 60(7):2501-2507
L28760 | aceA | Isocitrate lyase
L35906 | dtxr | Diphtheria toxin repressor | Oguiza, J.A.
et al. "Molecular cloning, DNA sequence analysis, and characterization of the Corynebacterium diphtheriae dtxR from Brevibacterium lactofermentum," J. Bacteriol., 177(2):465-467 (1995)
M13774 | | Prephenate dehydratase | Follettie, M.T. et al. "Molecular cloning and nucleotide sequence of the Corynebacterium glutamicum pheA gene," J. Bacteriol., 167:695-702 (1986)
M16175 | | 5S rRNA | Park, Y-H. et al. "Phylogenetic analysis of the coryneform bacteria by 5S rRNA sequences," J. Bacteriol., 169:1801-1806 (1987)
M16663 | trpE | Anthranilate synthase, 5' end | Sano, K. et al. "Structure and function of the trp operon control regions of Brevibacterium lactofermentum, a glutamic-acid-producing bacterium," Gene, 52:191-200 (1987)
M16664 | trpA | Tryptophan synthase, 3' end | Sano, K. et al. "Structure and function of the trp operon control regions of Brevibacterium lactofermentum, a glutamic-acid-producing bacterium," Gene, (1987)
| | | O'Regan, M. et al. "Cloning and nucleotide sequence of the Phosphoenolpyruvate carboxylase-coding gene of Corynebacterium glutamicum ATCC13032," Gene, 77(2):237-251 (1989)
106 | | 23S rRNA gene insertion sequence | Roller, C. et al. "Gram-positive bacteria with a high DNA G+C content are characterized by a common insertion within their 23S rRNA genes," J. Gen. Microbiol., 138:1167-1175 (1992)
M85107, 108 | | 23S rRNA gene insertion sequence | Roller, C. et al. "Gram-positive bacteria with a high DNA G+C content are characterized by a common insertion within their 23S rRNA genes," J. Gen. Microbiol., 138:1167-1175 (1992)
| | Beta C-S lyase; branched-chain amino acid uptake carrier; hypothetical protein yhbw | Rossol, I. et al. "The Corynebacterium glutamicum aecD gene encodes a C-S lyase with alpha, beta-elimination activity that degrades aminoethylcysteine," J. Bacteriol., 174(9):2968-2977 (1992); Tauch, A. et al. "Isoleucine uptake in Corynebacterium glutamicum ATCC 13032 is directed by the brnQ gene product," [...]
S59299
| | Anthranilate phosphoribosyltransferase
| | Putative type II 5-cytosine methyltransferase; putative type II restriction endonuclease; putative type I or type III restriction endonuclease | Schafer, A. et al. "Cloning and characterization of a DNA region encoding a stress-sensitive restriction system from Corynebacterium glutamicum ATCC 13032 and analysis of its role in intergeneric conjugation with Escherichia coli," J. Bacteriol., 176 (1994); Schafer, A. et al. "The Corynebacterium glutamicum cglIM gene encoding a 5-cytosine [...] McrBC-deficient Escherichia coli strain," Gene, (1997)
U14965 | recA
| | | Ankri, S. et al. "Mutations in the Corynebacterium glutamicum proline biosynthetic pathway: A natural bypass of the proA step," J. Bacteriol., 178(15):4412-4419 (1996)
| | L-proline: NADPH 5-oxidoreductase | Ankri, S. et al. "Mutations in the Corynebacterium glutamicum proline biosynthetic pathway: A natural bypass of the proA step," J. Bacteriol., 178(15):4412-4419 (1996)
| obg; proB; unkdh | ?; gamma glutamyl kinase; similar to D-isomer specific 2-hydroxyacid dehydrogenases | Ankri, S. et al. "Mutations in the Corynebacterium glutamicum proline biosynthetic pathway: A natural bypass of the proA step," J. Bacteriol., 178(15):4412-4419 (1996)
U31281 | bioB | Biotin synthase | Serebriiskii, "Two new members of the bioB superfamily: Cloning, sequencing and expression of bioB genes of Methylobacillus flagellatum and Corynebacterium glutamicum," Gene, 175:15-22 (1996)
U35023 | thtR; accBC | Thiosulfate sulfurtransferase; acyl CoA carboxylase | Jager, W. et al. "A Corynebacterium glutamicum gene encoding a two-domain protein similar to biotin carboxylases and biotin-carboxyl-carrier proteins," Microbiol., 66(2):76-82 (1996)
U43535 | cmr | Multidrug resistance protein | Jager, W. et al. "A Corynebacterium glutamicum gene conferring multidrug resistance in the heterologous host Escherichia coli," J. Bacteriol., 179(7):2449-2451 (1997)
U43536 | clpB | Heat shock ATP-binding protein
U53587 | aphA-3 | 3'5''-aminoglycoside phosphotransferase
U89648 | | Corynebacterium glutamicum unidentified sequence involved in histidine biosynthesis, sequence
X04960 | trpA; trpB; trpC; trpD; trpE; trpG; trpL | Tryptophan operon | Matsui, K. et al. "Complete nucleotide and deduced amino acid sequences of the Brevibacterium lactofermentum tryptophan operon," Nucleic Acids Res., 14(24):10113-10114 (1986)
X07563 | lysA | DAP decarboxylase (meso-diaminopimelate decarboxylase, EC 4.1.1.20) | Yeh, P. et al. "Nucleic sequence of the lysA gene of Corynebacterium glutamicum and possible mechanisms for modulation of its expression," Mol.
X14234 | EC 4.1.1.31 | Phosphoenolpyruvate carboxylase | Eikmanns, B.J. et al. "The Phosphoenolpyruvate carboxylase gene of Corynebacterium glutamicum: Molecular cloning, nucleotide sequence, and expression," Mol. Gen. Genet., 218(2):330-339 (1989); Lepiniec, L. et al. "Sorghum Phosphoenolpyruvate carboxylase gene family: structure, function and molecular evolution," Plant. Mol. Biol., 21(3):487-502 (1993)
X17313 | fda | Fructose-bisphosphate aldolase | Von der Osten, C.H. et al. "Molecular cloning, nucleotide sequence and fine-structural analysis of the Corynebacterium glutamicum fda gene: structural comparison of C. glutamicum fructose-1,6-biphosphate aldolase to class I and II aldolases," Mol. Microbiol.,
X53993 | dapA | L-2,3-dihydrodipicolinate synthetase (EC 4.2.1.52) | Bonnassie, S. et al. "Nucleic sequence of the dapA gene from Corynebacterium glutamicum," Nucleic Acids Res., 18(21):6421 (1990)
X54223 | | AttB-related site | Cianciotto, N. et al. "DNA sequence homology between att B-related sites of Corynebacterium diphtheriae, Corynebacterium ulcerans, Corynebacterium glutamicum and the attP site of lambdacorynephage," FEMS. Microbiol., 66:299-302 (1990)
X54740 | argS; lysA | Arginyl-tRNA synthetase; Diaminopimelate decarboxylase | Marcel, T. et al. "Nucleotide sequence and organization of the upstream region of the Corynebacterium glutamicum lysA gene," Mol. Microbiol., 4(11):181- (1990)
X55994 | trpL; trpE | Putative leader peptide; anthranilate synthase component I | Heery, D.M. et al. "Nucleotide sequence of the Corynebacterium glutamicum trpE gene," Nucleic Acids Res., 18(23):7138 (1990)
| | | Han, K.S. et al. "The molecular structure of the Corynebacterium glutamicum threonine synthase gene," Mol. Microbiol., 4(10):1693-1702 (1990)
| | Attachment site | Cianciotto, N. et al. "DNA sequence homology between att B-related sites of Corynebacterium diphtheriae, Corynebacterium ulcerans, Corynebacterium glutamicum and the attP site of lambdacorynephage," FEMS. Microbiol. Lett., 66:299-302 (1990)
X57226 | lysC-alpha; lysC-beta; asd | Aspartokinase-alpha subunit; Aspartokinase-beta subunit; aspartate beta semialdehyde dehydrogenase | Kalinowski, J. et al. "Genetic and biochemical analysis of the Aspartokinase from Corynebacterium glutamicum," Mol. Microbiol., 5(5):1197-1204 (1991); Kalinowski, J. et al. "Aspartokinase genes lysC alpha and lysC beta overlap and are adjacent to the aspartate beta-semialdehyde dehydrogenase gene asd in Corynebacterium glutamicum," Mol. Gen. Genet., 224(3):317-324 (1990)
| | Glyceraldehyde-3-phosphate; phosphoglycerate kinase; triosephosphate isomerase | Eikmanns, B.J. "Identification, sequence analysis, and expression of a Corynebacterium glutamicum gene cluster encoding the three glycolytic enzymes glyceraldehyde-3-phosphate dehydrogenase, 3-phosphoglycerate kinase, and triosephosphate isomerase," J. Bacteriol., 174(19):6076-6086 (1992)
| | Glutamate dehydrogenase
| | L-lysine permease
| | Ps1 protein | [...] "[...] PS1, one of the two [...] proteins of Corynebacterium glutamicum [...] 85 complex," [...] (1992)
| | Citrate synthase | Eikmanns, B.J. et al. "[...] sequence, expression and transcriptional analysis of the Corynebacterium glutamicum gltA gene encoding citrate synthase," Microbiol., 140:1817-1828 (1994)
| | Dihydrodipicolinate reductase
| | Surface layer protein | Peyret, J.L. et al. "Characterization of the cspB gene encoding PS2, an ordered surface-layer protein in Corynebacterium glutamicum," Mol. Microbiol., 9(1):97-109 (1993)
X69104 | | IS3 related insertion element | Bonamy, C. et al. "Identification of IS1206, a Corynebacterium glutamicum IS3-related insertion sequence and phylogenetic analysis," Mol. Microbiol., 14(3):571-581 (1994)
X70959 | leuA | Isopropylmalate synthase | Patek, M. et al. "Leucine synthesis in Corynebacterium glutamicum: enzyme activities, structure of leuA, and effect of leuA inactivation on lysine synthesis," Appl. Environ. Microbiol., 60(1):133-140 (1994)
X71489 | icd | Isocitrate dehydrogenase (NADP+) | Eikmanns, B.J. et al. "Cloning sequence analysis, expression, and inactivation of the Corynebacterium glutamicum icd gene encoding isocitrate dehydrogenase and biochemical characterization of the enzyme," J. Bacteriol., (1995)
X72855 | GDHA | Glutamate dehydrogenase (NADP+)
X75083, X70584 | mtrA | 5-methyltryptophan resistance | Heery, D.M. et al. "A sequence from a tryptophan-hyperproducing strain of Corynebacterium glutamicum encoding resistance to 5-methyltryptophan," Biochem. Biophys. Res. Commun., 201(3):1255-1262 (1994)
X75085 | recA | | Fitzpatrick, R. et al. "Construction and characterization of recA mutant strains of Corynebacterium glutamicum and Brevibacterium lactofermentum," Appl. Microbiol. Biotechnol., 42(4):575-580 (1994)
X75504 | aceA; thiX | Partial Isocitrate lyase; | Reinscheid, D.J. et al. "Characterization of the isocitrate lyase gene from Corynebacterium glutamicum and biochemical analysis of the enzyme," 176(12):3474-3483 (1994)
X76875 | | ATPase beta-subunit | Ludwig, W. et al. "Phylogenetic relationships of bacteria based on comparative sequence analysis of elongation factor Tu and ATP-synthase beta-subunit," Antonie Van Leeuwenhoek, 64:285-305 (1993)
X77034 | tuf | Elongation factor Tu | Ludwig, W. et al. "Phylogenetic relationships of bacteria based on comparative sequence analysis of elongation factor Tu and ATP-synthase beta-subunit," Antonie Van Leeuwenhoek, 64:285-305 (1993)
X77384 | recA | | Billman-Jacobe, H. "Nucleotide sequence of a recA gene from Corynebacterium glutamicum," DNA Seq., 4(6):403-404 (1994)
X78491 | aceB | Malate synthase | Reinscheid, D.J. et al. "Malate synthase from Corynebacterium glutamicum pta-ack operon encoding phosphotransacetylase: sequence analysis," 140:3099-3108 (1994)
X80629 | 16S rDNA | 16S ribosomal RNA | Rainey, F.A. et al. "Phylogenetic analysis of the genera Rhodococcus and Nocardia and evidence for the evolutionary origin of the genus Nocardia from within the radiation of Rhodococcus species," Microbiol., 141:523-528
X81191 | gluA; gluB; gluC; gluD | Glutamate uptake system | Kronemeyer, W. et al. "Structure of the gluABCD cluster encoding the glutamate uptake system of Corynebacterium glutamicum," J. Bacteriol., 177(5):1152-1158 (1995)
| | Succinyldiaminopimelate desuccinylase | Wehrmann, A. et al. "Analysis of different DNA fragments of Corynebacterium glutamicum complementing dapE of Escherichia coli," Microbiology, 40:3349-56 (1994)
X82061 | 16S rDNA | 16S ribosomal RNA | Ruimy, R. et al. "Phylogeny of the genus Corynebacterium deduced from analyses of small-subunit ribosomal DNA sequences," Int. J. Syst. Bacteriol., 45(4):740-746 (1995)
X82928 | asd; lysC | Aspartate-semialdehyde dehydrogenase; ? | Serebrijski, I. et al. "Multicopy suppression by asd gene and osmotic stress-dependent complementation by heterologous proA in proA mutants," J., 177(24):7255-7260 (1995)
X82929 | proA | Gamma-glutamyl phosphate reductase | Serebrijski, I. et al. "Multicopy suppression by asd gene and osmotic stress-dependent complementation by heterologous proA in proA mutants," J., 177(24):7255-7260 (1995)
X84257 | 16S rDNA | 16S ribosomal RNA | Pascual, C. et al. "Phylogenetic analysis of the genus Corynebacterium based on 16S rRNA gene sequences," Int. J. Syst. Bacteriol., 45(4):724-728 (1995)
X85965 | aroP; dapE | Aromatic amino acid permease; | Wehrmann et al. "Functional analysis of sequences adjacent to dapE of C. glutamicum reveals the presence of aroP, which encodes the aromatic amino acid transporter," J. Bacteriol., 177(20):5991-5993 (1995)
X86157 | argB; argC; argD; argF; argJ | Acetylglutamate kinase; N-acetyl-gamma-glutamyl-phosphate reductase; acetylornithine aminotransferase; ornithine carbamoyltransferase; glutamate N-acetyltransferase | Sakanyan, V. et al. "Genes and enzymes of the acetyl cycle of arginine biosynthesis in Corynebacterium glutamicum: enzyme evolution in the early steps of the arginine pathway," Microbiology, 142:99-108 (1996)
X89084 | pta; ackA | Phosphate acetyltransferase; acetate kinase | Reinscheid, D.J. et al. "Cloning, sequence analysis, expression and inactivation of the Corynebacterium glutamicum pta-ack operon encoding phosphotransacetylase and acetate kinase," Microbiology, 145:503-513 (1999)
X89850 | attB | Attachment site | Le Marrec, C. et al. "Genetic characterization of site-specific integration functions of phi AAU2 infecting "Arthrobacter aureus C70," J. Bacteriol., 178(7):1996-2004 (1996)
X90356 | | Promoter fragment F1 | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
X90357 | | Promoter fragment F2 | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
X90358 | | Promoter fragment F10 | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, (1996)
X90359 | | Promoter fragment F13 | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M.
et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Patek, M. et al. "Promoters from Corynebacterium glutamicum: cloning, molecular analysis and search for a consensus motif," Microbiology, 142:1297-1309 (1996)
| | | Siewe, K.M. et al. "Functional and genetic characterization of the (methyl) ammonium uptake carrier of Corynebacterium glutamicum," J. Biol. Chem., 271(10):5398-5403 (1996)
| | | Peter, H. et al. "Isolation, characterization, and expression of the Corynebacterium glutamicum betP gene, encoding the transport system for the compatible solute glycine betaine," J. Bacteriol., 178(17):5229-5234 (1996)
| | | Patek, M. et al. "Identification and transcriptional analysis of the dapB-ORF2-dapA-ORF4 operon of Corynebacterium glutamicum, encoding two enzymes involved in lysine synthesis," Biotechnol. Lett., 19:1113-1117 (1997)
| | | Vrljic, M. et al. "A new type of transporter with a new type of cellular function: L-lysine export from Corynebacterium glutamicum," Mol. Microbiol., 22(5):815-926 (1996)
[...] glutamicum and [...] itothenate (1999) [...] mne encoding [...] lactofermentum [...] -222 (1997) [...] nase (thrB) gene [...] (9):3922 (1987) [...] Acids Res., [...] hydrogenase [...] -ids Res., [...] alysis of the [...]
| | Glutamine synthetase | Jakoby, M. et al. "Isolation of Corynebacterium glutamicum glnA gene encoding glutamine synthetase," FEMS Microbiol. Lett., 154(1):81-88 (1997)
164I | lpd | Dihydrolipoamide dehydrogenase
| | Attachment site Corynephage 304L | Moreau, S. et al. "Analysis of the integration functions of φ304L: An integrase module among corynephages," Virology, 255(1):150-159 (1999)
| argS; lysA | Arginyl-tRNA synthetase; diaminopimelate decarboxylase (partial) | Oguiza, J.A. et al. "A gene encoding arginyl-tRNA synthetase is located in the upstream region of the lysA gene in Brevibacterium lactofermentum: Regulation of argS-lysA cluster expression by arginine," J. Bacteriol., 175(22):7356-7362 (1993)
Z21502 | dapA; dapB | Dihydrodipicolinate synthase; dihydrodipicolinate reductase | Pisabarro, A. et al. "A cluster of three genes (dapA, orf2, and dapB) of Brevibacterium lactofermentum encodes dihydrodipicolinate reductase, and a third polypeptide of unknown function," J. Bacteriol., 175(9):2743-2749 (1993)
Z29563 | thrC | | Malumbres, M. et al. "Analysis and expression of the thrC gene of [...] the encoded threonine synthase," Appl. Environ. Microbiol., 60(7):2209-2219 (1994)
Z46753 | 16S rDNA | Gene for 16S ribosomal RNA
L482 | sigA | SigA sigma factor | Oguiza, J.A. et al. "Multiple sigma factor genes in Brevibacterium lactofermentum: Characterization of sigA and sigB," J. Bacteriol., 178(2):550-553 (1996)
Z49823 | galE; dtxR | catalytic activity UDP-galactose 4-epimerase; diphtheria toxin regulatory | Oguiza, J.A. et al. "The galE gene encoding the UDP-galactose 4-epimerase of Brevibacterium lactofermentum is coupled transcriptionally to the dmdR gene," Gene, 177:103-107 (1996)
Z49824 | orf1; sigB | SigB sigma factor | Oguiza, J.A. et al. "Multiple sigma factor genes in Brevibacterium lactofermentum: Characterization of sigA and sigB," J. Bacteriol., 178(2):550
Z66534 | | Transposase | Correia, A. et al. "Cloning and characterization of an IS-like element present in the genome of Brevibacterium lactofermentum ATCC 13869," Gene, 170(1):91-94 (1996)

A sequence for this gene was published in the indicated reference. However, the sequence obtained by the inventors of the present application is significantly longer than the published version. It is believed that the published version relied on an incorrect start codon, and thus represents only a fragment of the actual coding region.
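The reasoning in the footnote to Table 2 (a published sequence that begins at the wrong start codon covers only part of the true coding region) can be pictured with a small sketch. The Python function below is a hypothetical illustration and not the inventors' method: the function name, the start and stop codon sets, and the toy sequence are all assumptions introduced here. It walks upstream from an annotated start codon, codon by codon in the same reading frame, and returns the most upstream start codon reached before an in-frame stop codon.

```python
# Hypothetical illustration only: names, codon sets and the toy sequence
# are assumptions and do not come from the patent.
START_CODONS = {"ATG", "GTG", "TTG"}  # start codons commonly used in bacteria
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_orf_upstream(dna: str, annotated_start: int) -> int:
    """Return the index of the most upstream in-frame start codon that can be
    reached from annotated_start without crossing an in-frame stop codon."""
    best = annotated_start
    pos = annotated_start - 3
    while pos >= 0:
        codon = dna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop ends the open reading frame upstream
        if codon in START_CODONS:
            best = pos  # a longer coding region is possible from here
        pos -= 3
    return best


# Toy sequence: stop, start, one codon, annotated start, one codon, stop.
toy = "TAA" + "ATG" + "GCT" + "ATG" + "AAA" + "TAA"
print(extend_orf_upstream(toy, 9))  # prints 3
```

In the toy sequence the annotated start at index 9 yields a two-codon reading frame, while the upstream start at index 3 yields a four-codon frame, which mirrors the situation the footnote describes. Choosing the correct start codon for a real gene needs further evidence, such as a ribosome binding site or protein sequence data, which this sketch ignores.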
TABLE 3: Corynebacterium and Brevibacterium Strains Which May be Used in the Practice of the Invention
Brevibacterium ammoniagenes 21054
Brevibacterium ammoniagenes 19350
Brevibacterium ammoniagenes 19352
Brevibacterium ammoniagenes 19353
Brevibacterium ammoniagenes 19354
Brevibacterium ammoniagenes 19355
Brevibacterium ammoniagenes 19356
Brevibacterium ammoniagenes 10355
Brevibacterium ammoniagenes 21077
Brevibacterium ammoniagenes 215537
Brevibacterium ammoniagenes 21580
Brevibacterium ammoniagenes 3150
Brevibacterium butanicum 21196
Brevibacterium divaricatum 21792 P928
Brevibacterium flavum 21474
Brevibacterium flavum 21129
Brevibacterium flavum 21518
Brevibacterium flavum B11474
Brevibacterium flavum B11472
Brevibacterium flavum 21127
Brevibacterium flavum 21128
Brevibacterium flavum 21427
Brevibacterium flavum 21475
Brevibacterium flavum 21517
Brevibacterium flavum 21528
Brevibacterium flavum 21529
Brevibacterium flavum B11477
Brevibacterium flavum B11478
Brevibacterium flavum 21127
Brevibacterium flavum B11474
Brevibacterium healii 15527
Brevibacterium ketoglutamicum 21004
Brevibacterium ketoglutamicum 21089
Brevibacterium ketosoreductum 21914
Brevibacterium lactofermentum
Brevibacterium lactofermentum 74
Brevibacterium lactofermentum 77
Brevibacterium lactofermentum 21798
Brevibacterium lactofermentum 21799
Brevibacterium lactofermentum 21800
Brevibacterium lactofermentum 21801
Brevibacterium lactofermentum B11470
Brevibacterium lactofermentum B11471
Brevibacterium lactofermentum 21086
Brevibacterium lactofermentum 21420
Brevibacterium lactofermentum 21086
Brevibacterium lactofermentum 131269
Brevibacterium linens 9174
Brevibacterium linens 19391
Brevibacterium linens 8377
Brevibacterium paraffinolyticum 11160
Brevibacterium spec. 717.73
Brevibacterium spec. 717.73
Brevibacterium spec. 14604
Brevibacterium spec. 21860
Brevibacterium spec. 21864
Brevibacterium spec. 21865
Brevibacterium spec. 21866
Brevibacterium spec. 19240
Corynebacterium acetoacidophilum 21476
Corynebacterium acetoacidophilum 13870
Corynebacterium acetoglutamicum B11473
Corynebacterium acetoglutamicum B11475
Corynebacterium acetoglutamicum 15806
Corynebacterium acetoglutamicum 21491
Corynebacterium acetoglutamicum 31270
Corynebacterium acetophilum B3671
Corynebacterium ammoniagenes 6872 2399
Corynebacterium ammoniagenes 15511
Corynebacterium fujiokense 21496
Corynebacterium glutamicum 14067
Corynebacterium glutamicum 39137
Corynebacterium glutamicum 21254
Corynebacterium glutamicum 21255
Corynebacterium glutamicum 31830
Corynebacterium glutamicum 13032
Corynebacterium glutamicum 14305
Corynebacterium glutamicum 15455
Corynebacterium glutamicum 13058
Corynebacterium glutamicum 13059
Corynebacterium glutamicum 13060
Corynebacterium glutamicum 21492
Corynebacterium glutamicum 21513
Corynebacterium glutamicum 21526
Corynebacterium glutamicum 21543
Corynebacterium glutamicum 13287
Corynebacterium glutamicum 21851
Corynebacterium glutamicum 21253
Corynebacterium glutamicum 21514
Corynebacterium glutamicum 21516
Corynebacterium glutamicum 21299
Corynebacterium glutamicum 21300
Corynebacterium glutamicum 39684
Corynebacterium glutamicum 21488
Corynebacterium glutamicum 21649
Corynebacterium glutamicum 21650
Corynebacterium glutamicum 19223
Corynebacterium glutamicum 13869
Corynebacterium glutamicum 21157
Corynebacterium glutamicum 21158
Corynebacterium glutamicum 21159
Corynebacterium glutamicum 21355
Corynebacterium glutamicum 31808
Corynebacterium glutamicum 21674
Corynebacterium glutamicum 21562
Corynebacterium glutamicum 21563
Corynebacterium glutamicum 21564
Corynebacterium glutamicum 21565
Corynebacterium glutamicum 21566
Corynebacterium glutamicum 21567
Corynebacterium glutamicum 21568
Corynebacterium glutamicum 21569
Corynebacterium glutamicum 21570
Corynebacterium glutamicum 21571
Corynebacterium glutamicum 21572
Corynebacterium glutamicum 21573
Corynebacterium glutamicum 21579
Corynebacterium glutamicum 19049
Corynebacterium glutamicum 19050
Corynebacterium glutamicum 19051
Corynebacterium glutamicum 19052
Corynebacterium glutamicum 19053
Corynebacterium glutamicum 19054
Corynebacterium glutamicum 19055
Corynebacterium glutamicum 19056
Corynebacterium glutamicum 19057
Corynebacterium glutamicum 19058
Corynebacterium glutamicum 19059
Corynebacterium glutamicum 19060
Corynebacterium glutamicum 19185
Corynebacterium glutamicum 13286
Corynebacterium glutamicum 21515
Corynebacterium glutamicum 21527
Corynebacterium glutamicum 21544
Corynebacterium glutamicum 21492
Corynebacterium glutamicum B8183
Corynebacterium glutamicum B8182
Corynebacterium glutamicum B12416
Corynebacterium glutamicum B12417
Corynebacterium spec. P4446
Corynebacterium spec. 31088
Corynebacterium spec. 31089
Corynebacterium spec. 31090
Corynebacterium spec. 31090
Corynebacterium spec. 31090
Corynebacterium spec. 15954
Corynebacterium spec. 21857
Corynebacterium spec. 21862
Corynebacterium spec. 21863
ATCC: American Type Culture Collection, Rockville, MD, USA
FERM: Fermentation Research Institute, Chiba, Japan
NRRL: ARS Culture Collection, Northern Regional Research Laboratory, Peoria, IL, USA
CECT: Coleccion Espanola de Cultivos Tipo, Valencia, Spain
NCIMB: National Collection of Industrial and Marine Bacteria Ltd., Aberdeen, UK
CBS: Centraalbureau voor Schimmelcultures, Baarn, NL
NCTC: National Collection of Type Cultures, London, UK
DSMZ: Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig, Germany
For reference see Sugawara, H. et al. (1993) World directory of collections of cultures of microorganisms: Bacteria, fungi and yeasts (4th edn), World federation for culture collections world data center on microorganisms, Saitama, Japan.
2007202378 25 May 2007 D 4 length
(NT)
rxaOOO23 3579 rxaOOO44 1059 rxaOOO64 1401 Genbank Hit Table 4: Alignment Results Length Accession Name of Genbank I-it GB-EST33:A[776129 483 GB-EST33:AI776129 483 EMVIPAT: E11760 6911 A1776129 EST257217 tomato resistant, Cornell Lycopersicon esculentum cDNA clone CLER17D3, mRNA sequence.
AI776129 EST257217 tomato resistant, Cornell Lycopersicon esculentum cDNA clone cLER17D3, mRNA sequence.
E11760 Base sequence of sucrase gene.
Source of Genbank Hit Lycopersicon esculentum Lycopersicon esculentum Corynebacterium glutamicum Unknown.
Escherichla Coll Corynebacterlum glutamicum Drosophila melanogaster Drosophila melanogaster homology Date of (GAP) Deposit 40,956 40.956 29-Jun-99 GB PAT: 126 124 GBBA2:ECOUW69 GB-PAT:E16763 GB-HTG2:AC007892 6911 176195 2517 134257 126124 U00006 E16763 AC007892 Sequence 4 from patent US 5556776.
E. coli chromosomal region from 89.2 to 92.8 minutes.
gDNA encoding aspartate transferase (AAT).
Drosophila melanogaster chromosome 3 clone BACR02O03 (D797) RPCI-98 02.O.3 map 99B-99B strain y; cn bw sp, SEQUENCING IN PROGRESS, 113 unordered pieces.
Drosophila melanogaster chromosome 3 clone BACR02O03 (D797) RPCI-98 02.O.3 map 99B-99B strain y; cn bw sp, SEQUENCING IN PROGRESS, 113 unordered pieces.
42.979 08-OCT- 1997 (Rel.
52, Created) 42,979 07-OCT.
1996 39,097 17-DEC- 1993 95,429 28-Jul-99 31,111 2-Aug.99
CD
Ch GB-HTG2:AC007892 134257 AC007892 31.111 2-Aug-99 rxaOO072 rxaOOlOS 798 rxaOOlO6 579 1170 GB-BA:MV0O2 56414 GB-BA1:ECU29581 71128 G13_BA2:AE000366 10405 GB-EST15:AA494237 367 GB8BA2:AF161327 2021 GB PAT:ARO41 189 654 GB8PR4:ACOO71 10 148336 GB-HTG3:AC008537 170030 GB-HTG3:AC008537 170030 AL008967 U29581 AE000366 AA4 94237 AF161327 AR04 1189 AC0071 10 AC008537 AC008537 Mycobacterium tuberculosis H37Rv complete genome; segmelt ,22/1 62. Mycobacterium tuberculosis Escherichla Coli K-1 2 genome; approximately 63 to64 minutes: Escheuichia Coill Escherichia Coil K-12 MG 1655 section 256 of 400 of the complete genome. Escherichia col ng83104.sl NCI_OGAP -Pr6 H-omo sapiens cDNA clone IMAGE:941407 Homo sapiens similar to SW:DYRLACCA P00381 DIHYDROFOLATE
REDUCTASE;,
MRNA sequence.
Corynebacterium diphtheriae histidine kinase ChrS (chrS) and response regulator ChrA (chrA) genes, complete cds. Corynebacterium diphtheriae
Sequence 4 from patent US 5811286. Unknown.
Homo sapiens chromosome 17, clone hRPK.472_J_18, complete sequence. Homo sapiens
Homo sapiens chromosome 19 clone CIT-HSPC-490E21, SEQUENCING IN PROGRESS, 93 unordered pieces. Homo sapiens
Homo sapiens chromosome 19 clone CIT-HSPC-490E21, SEQUENCING IN PROGRESS, 93 unordered pieces. Homo sapiens
37,753 35,669 35,669 42,896 40,210 41,176 36,783 40.296 40,296 1 7-Jun-98 14-Jan-97 12-Nov-98 20-Aug-97 9-Sep-99 29-Sep-99 1999 2-Sep-99 2-Sep-99 2007202378 25 May 2007 rxaO0l 16 1284 GB-BA2:AF062345 16458 AF062345 GBPAT:118647 GBGSSI3:A044619 7 rxaOOl3l 732 GB-BA1:MTY20B11 GBBA1:5AR7932 GB-BA: MTY20B1 11 rxa00132 1557 GB-BAI:MTY2OBl1 GB-IN2:TVU40872 GBHTG6:AC010706 rxa00145 1059 rxa00146 1464 rxaOO147 1302 GB-BA1 :MTCY281 2 GB-BA1 :PSEPYRBX
GB-BAI:LLPYRBDNA
GB-BA1 :MTCY2B1 2 GBBA1:MTCYI54 GBBA1:MSGYI54 OB-BAl :MTCY2B12 GBBAI:MSGB937C
S
GBBA1:PAU81259 3300 751 36330 15176 36330 36330 1882 169265 20431 2273 1468 20431 13935 40221 20431 38914 7285 118647 AQ446 197 Z951 21 AJ007932 Z95121 Z95121 U40872 ACO 10706 Z810111 L1 9649 X84262 Z81011I Z98209 AD000002 Z81011 L78820 U8 1259 Table 4 (continued) Caulobacter crescentus Sstl (sstl), S.Iayer protein subunit (rsaA), ABC transporter (rsaD), membrane forming unit (rsaE), putative GDP-mannose-4,6dehydratase (IpsA), putative acetyltransferase (lpsB). putative perosamine synthetase (IpsO), putative mannosyltransferase (IpsD), putative m annosyltra nsfe rase (IpsE), outer membrane protein (rsaF), and putative perosamine transferase (IpsE) genes, complete cds.
Sequence 6 from patent US 5500353.
nbxb0062D16r CUGI Rice BAC Library Oryza sativa genomic clone nbxb0062D16r, genomic survey sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 139/162.
Streptomyces argillaceus mithramycin biosynthetic genes.
Mycobacterium tuberculosis H37Rv complete genome; segment 139/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 139/162.
Trichomonas vaginalis S-adenosyl-L-homocysteine hydrolase gene, complete cds.
Drosophila melanogaster chromosome X clone BACR36D15 (D887) RPCI-98 36.D.15 map 13C-13E strain y; cn bw sp, SEQUENCING IN PROGRESS, unordered pieces.
Mycobacterium tuberculosis H37Rv complete genome; segment 61/162.
Pseudomonas aeruginosa aspartate transcarbamoylase (pyrB) and dihydroorotase-like (pyrX) genes, complete cds's.
L.leichmannii pyrB gene.
Mycobacterium tuberculosis H37Rv complete genome; segment 61/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 121/162.
Mycobacterium tuberculosis sequence from clone y154.
Mycobacterium tuberculosis H37Rv complete genome; segment 61/162.
Mycobacterium leprae cosmid B937 DNA sequence.
Pseudomonas aeruginosa dihydrodipicolinate reductase (dapB) gene, partial cds, carbamoylphosphate synthetase small subunit (carA) and carbamoylphosphate synthetase large subunit (carB) genes, complete cds, and FtsJ homolog (ftsJ) gene, partial cds.
Caulobacter crescentus Unknown.
Oryza sativa Mycobacterium tuberculosis Streptomyces argillaceus Mycobacterium tuberculosis Mycobacterium tuberculosis Trichomonas vaginalis Drosophila melanogaster Mycobacterium tuberculosis Pseudomonas aeruginosa Lactobacillus leichmannii Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium leprae Pseudomonas aeruginosa 36.235 36,821 38.124 43,571 41,116 39,726 36,788 61,914 51,325 63.365 56.080 47,514 60,714 39,229 36,618 61,527 59,538 55.396 19-OCT.
1999 07-OCT- 1996 8-Apr-99 17-Jun-98 15-Jun-99 17-Jun-98 17-Jun-98 31-OCT.
1996 22-Nov-99 1 8-Jun-98 26-Jul-93 29-Apr-97 1 8-Jun-95 17-Jun-98 03-DEC- 1996 18-Jun-98 15-Jun-96 23-DEC- 1996 rxaOO156 1233 ncaOlSB 233 GBBA1:SC9BIO 33320 AL009204 Streptomyces coelicolor cosmid 9B10. Srpoye olclr 5,6 0Fb9 Streptomyces coelicolor 52.666 10-Feb-99 2007202378 25 May 2007 GBBA2:AF002133 15437 AF002133 rxaOO166 783 rxaOOl98 672 rxa00216 1113 GB BAl: 0854 17 GBHTG3:ACOO8 167 GBHTG3:AC008167 GBHTG4:AC010118 GBBAl :A8024708 GBBA1:AB024708 GB-EST24:A1232702 GBHTG2:HSDJB5OE 9 GBHTG2:HS0J85OE 9 GBPR2:CNSO1OSA 7 984 74223 74223 30605 8734 8734 528 117353 1117353 159400 110000C 1 10000 11 0000 19771 2477 13893w 1758 15022 18897 4232 085417 AC0081 67 AC008 167 AC0101 18 AB024708 AB024708 A1232702 AL12 1758 AL12 1758 AL12 1766 IAC005079 IAC005079 IAC005079 X99694 AF1 28444 8 AC01011' AF124511 1 AC00459 2 AC00690 X6031 2 Table 4 (continued) Mycobacterium avium strain GIR10 transcriptional regulator (may81) gene, partial cds. aconitase (acn). invasln I (Cmvi), Invasin 2 (Inv2). transcriptional regulator (moxR), ketoacyl-reductase (fabG). enoyl-reductase (inhA) and ferrochelatase (mav272) genes, complete cds.
Propionibacterium freudenreichii hemY, hemH, hemB, hemX, hemR and hemL genes, complete cds.
Homo sapiens clone NH0172O13, SEQUENCING IN PROGRESS, 7 unordered pieces.
Homo sapiens clone NH0172O13, SEQUENCING IN PROGRESS, 7 unordered pieces.
Drosophila melanogaster chromosome 3L/62B1 clone RPCI98-10O15, SEQUENCING IN PROGRESS, 51 unordered pieces.
Corynebacterium glutamicum gltB and gltD genes for glutamine 2-oxoglutarate aminotransferase large and small subunits, complete cds.
Corynebacterium glutamicum gltB and gltD genes for glutamine 2-oxoglutarate aminotransferase large and small subunits, complete cds.
EST229390 Normalized rat kidney, Bento Soares Rattus sp. cDNA clone 3' end, mRNA sequence.
Homo sapiens chromosome 20 clone RP5-850E9, SEQUENCING IN PROGRESS in unordered pieces.
Homo sapiens chromosome 20 clone RP5-850E9, SEQUENCING IN PROGRESS in unordered pieces.
Human chromosome 14 DNA sequence -IN PROGRESS BAC R-4121-8 of RPCI-11 library from chromosome 14 of Homo sapiens (Human), complete sequence.
Homo sapiens clone RG252P22, SEQUENCING IN PROGRESS, 3 unordered pieces.
Homo sapiens clone RG252P22, SEQUENCING IN PROGRESS, 3 unordered pieces.
Homo sapiens clone RG252P22, SEQUENCING IN PROGRESS, 3 unordered pieces.
Plasmid pEA3 nitrogen fixation genes.
Rhodobacter capsulatus molybdenum cofactor biosynthetic gene cluster, partial sequence.
Drosophila melanogaster chromosome 3L/70C1 clone RPCI98-9B18, SEQUENCING IN PROGRESS, 64 unordered pieces.
8 Corynebacterium glulamicum 3-dehydroquinase (aroD) and shikimate Mycobacterium avium Propionibaclerium freudenreichii Homo sapiens Homo sapiens Drosophila melanogaster Corynebacterium glutamicum Corynebacterium glutamicum Rattus sp.
Homo sapiens Homo sapiens Homo sapiens Homo sapiens Homo sapiens Homo sapiens Enterobacter agglomerans Rhodobacter capsulatus Drosophila melanogasler Corynebacterlum glutamicum Homo sapiens Caenorhabditls elegans Corynebacterium glulamicum 46,667 37,451 37,451 38,627 92,113 93,702 34.221 37,965 37,965 38,796 38,227 38,227 38,227 48.826 40,135 39,527 98,237 36,616 37,095 100,000 6-Feb-99 21 -Aug-99 21-Aug-99 16-OCT- 1999 13-MAR- 1999 13-MAR- 1999 31-Jan-99 03-DEC- 1999 03-DEC- 1999 11 -N ov-99 22-Nov-98 22-Nov-98 22-Nov-98 2-Aug-96 22-MAR- 1999 16-OCT- 1999 18-MAY- 1999 18B-Apr-98 26-Feb-99 30-Jan-92 54,191 26-MAR- 1998 rxaO0219 1065 GBHTG2:AC005079 -0 GB,,HTG2:AC005079 GBHTG2:AC005079 -1 rxa00223 1212 GBBA1:PPEA3NIF GBBA2:AF128444 GB-HTG4:ACO 10111 rxa00229 803 GB-BA2:AF1 24518 GB-PR3:A0004593 GB-HTG2:AC006907 rxaOO241 1626 GB-BA1:CGLYSI 3 7 dehydrogenase (aroE) genes, complete cds.
Homo sapiens PAC clone DJ0964C11 from 7p14-p15, complete sequence.
Caenorhabditis elegans clone Y76B12, SEQUENCING IN PROGRESS, 25 unordered pieces.
C.glutamicum lysI gene for L-lysine permease.
2007202378 25 May 2007 GBHTG1:PFMAL13P 192581 GBHTG1:PFMAL13P 192581 GB_1N2:EI-U89655 3219 GBJIN2:EH-UB9655 3219 AL049 180 ALD491 80 U89655 U89655 Table 4 (continued) Plasmodium falciparumn chromosome 13 strain 3D7, SEQUENCING IN PROGRESS in unordered pieces.
Plasmodium falciparum chromosome 13 strain 3D7, SEQUENCING IN PROGRESS in unordered pieces.
Entamoeba histolytica unconventional myosin IB mRNA, complete cds.
Entamoeba histolytica unconventional myosin IB mRNA, complete cds.
rxa00262 1197
rxa00266 531 GB-RO:AF016190 2939 AF016190 Mus musculus connexin-36 (CX36) gene, complete cds.
EMPAT:E09719 3505 E09719 DNA encoding precursor protein of alkaline cellulase.
rxa00278 1155 rxa00295 1125 rxa00323 1461 fxa00324 3258 nxa00330 1566 rxa00335 1554 GBPAT:E02133 3 GB-lN1:CEtLK05F6 GB-BA1:CGU43535 GBRO:RNU30789 GBBA2:CGU3I2BI GBBAi:BRLBIOBA GBPAT:E03937 GB-BA1:MTCY427 GBBAI:MSGB32CS GB-BA1:MTCY427 GBBA1:MSGB32CS GBBA1:MTCY427
GBOM:BOVELA
GBBA1:CGTHRC GBPAT:109078 GBBA1:BLTHRESY
N
GBBAI:CGGLNA
3494 36912 2531 3510 1614 1647 1005 38110 36404 38110 36404 38110 3242 3120 3146 1892 3686 E02133 AF040653 U43535 U30789 U31281 014084 E03937 Z70692 178818 Z70692 L78818 Z70692 J02717 X56037 109078 Z29563 Y13221 gDNA encoding alkaline cellulase.
Caenorhabditis elegans cosmid K05F6.
Corynebacterium glutamicum multidrug resistance protein (cmr) gene, complete cds.
Rattus norvegicus clone N27 mRNA.
Corynebacterium glutamicum biotin synthase (bioB) gene, complete cds.
Brevibacterium flavum gene for biotin synthetase, complete cds.
DNA sequence encoding Brevibacterium flavum biotin-synthase.
Mycobacterium tuberculosis H37Rv complete genome; segment 99/162.
Mycobacterium leprae cosmid B32 DNA sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 99/162.
Mycobacterium leprae cosmid B32 DNA sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 99/162.
Bovine elastin a mRNA, complete cds.
Corynebacterium glutamicum thrC gene for threonine synthase (EC 4.2.99.2).
Sequence 4 from Patent WO 8809819.
Brevibacterium lactofermentum: ATCC 13869;; DNA (genomic).
Corynebacterium glutamicum glnA gene.
Plasmodium falciparum Plasmodium falciparum Entamoeba histolytica Entamoeba histolytica Mus musculus Bacillus sp.
Bacillus sp.
Caenorhabditis elegans Corynebacterium glutamicum Rattus norvegicus Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium tuberculosis Bos taurus Corynebacterium glutamicum Unknown.
Corynebacterium glutamicum Corynebacterium glutamicum 34,947 11-Aug-99 34,947 11-Aug-99 36,496 23-MAY-1997 37,544 23-MAY-1997 41,856 9-Feb-99 34,741 08-OCT-1997 (Rel.
52, Created) 34,741 29-Sep-97 36943 6-Jan-98 36,658 9-Apr-97 38190 20-Aug-96 99,111 21-Nov-96 (.0 98,489 3-Feb-99 c 98,207 29-Sep-97 35.615 24-Jun-99 60,917 15-Jun-96 44,606 24-Jun-99 52,516 15-Jun-96 38,079 24-Jun-99 39,351 27-Apr-93 99,808 17-Jun-97 99,617 02-DEC- 1994 99,170 20-Sep-95 100,000 28-Aug-97 2007202378 25 May 2007 Table 4 (continued) AF005635 Corynebacteriun glutamnicum glutamine synthetase (ginA) gene, complete GB_8A2:AF005635 1690 Mycobacteriumi leprae cosmid B27 DNA sequence.
rxa00347 891 GB BA1:MSGB27CS GB_.EST27:A1455217 GBBA2:SSU30252 38793 624 2891 GB..EST21:AA91 1262 581 L788 17 rxa00351 1578 rxca00365 727 rxa00366 480 rxa0036 7 4653 rxa00371 1917 GB..BA1 :MLU1 5187 GB_..N2:AC004373 GB-lN2:.AF145653 GBBA1:AB024708 GBBA1:MTCY1A6a GBBA1:SC3A3 GB_BA1:AB024708 GBBAI:MTCY1A6 GBBA1:SC3A3 GB,,BA1 :AB024 708 GBBA1:MTCY1A6 GB..BA1 :SC3A3
GBVI:SBVORFS
36138 72722 3197 8734 37751 159011 8734 37751 15901 8734 3775 1 15901 758 380 A1455217 LD21828.3prime LO Drosophila melanogaster embryo pOT2 Drosophila melanogaster cDNA clone L021828 3prime, niRNA sequence.
U30252 Synechococcus PCC7942 nucleoside diphosphate kinase and ORF2 protein genes, complete cds, ORF1 protein gene, partial cds, and neutral site I for vector use.
AA911262 oe75a02.s1 NCI_CGAP_Lu5 Homo sapiens cDNA clone IMAGE:1417418 3' similar to gb:A18757 UROKINASE PLASMINOGEN ACTIVATOR SURFACE RECEPTOR, GPI-ANCHORED (HUMAN);, mRNA sequence.
U15187 Mycobacterium leprae cosmid L296.
AC004373 Drosophila melanogaster DNA sequence (P1 DS05273 complete sequence.
AF145653 Drosophila melanogaster clone GH08860 BcDNA.GH08860 (BcDNA.GH08860) mRNA, complete cds.
AB024708 Corynebacterium glutamicum gltB and gltD genes for glutamine 2-oxoglutarate aminotransferase large and small subunits, complete cds.
Z83864 Mycobacterium tuberculosis H37Rv complete genome; segment 159/162.
AL109849 Streptomyces coelicolor cosmid 3A3.
AB024708 Corynebacterium glutamicum gltB and gltD genes for glutamine 2-oxoglutarate aminotransferase large and small subunits, complete cds.
Z83864 Mycobacterium tuberculosis H37Rv complete genome; segment 159/162.
AL109849 Streptomyces coelicolor cosmid 3A3.
AB024708 Corynebacterium glutamicum gltB and gltD genes for glutamine 2-oxoglutarate aminotransferase large and small subunits, complete cds.
Z83864 Mycobacterium tuberculosis H37Rv complete genome; segment 159/162.
AL109849 Streptomyces coelicolor cosmid 3A3.
M89923 Sugarcane bacilliform virus ORF 1, 2, and 3 DNA, complete cds.
Corynebacterium glutamlocum Mycobacteriuni eprae Drosophila melanogaster Synechococcus PCC7942 Homo sapiens Mycobacterium leprae Drosophila melanogaster Drosophila melanogaster Corynebacterium glutamcum Mycobacterium tuberculosis Streptomyces coelicolor A3(2) Corynebacterium glutamicum Mycobacteriun tuberculosis Streptomyces coelicolor A3(2) Corynebacerium glutamicum Mycobacterium tuberculosis Streptomyces coelicolor 98,906 14-Jun-99 66,345 1 5-Jun-96 34,510 09-MAR- 1999 37,084 29-OCT- 1999 37.500 21-Apr-98 52.972 46,341 49,471 96.556 39,496 37.946 99,374 41.333 37,554 99,312 36,971 37.905 1995 17-Jul-98 14-Jun-99 13-MAR- 1999 17-Jun-98 16-Aug-99 13-MAR- 1999 17-Jun-98 16-Aug-99 13-MAR- 1999 17-Jun-98 16-Aug-99 12-Jun-93 24-Aug-99 22-Jan-98 GBJIN I:CELKO9H9 37881 A1967505 Ljitnpest03- 2 15-cI10 Ljirnp Lambda HybriZa p two -hybrid library Lotus japonicus cONA clone LP21 5-03--dO 5' similar to 60S ribosomal protein L39.
mRNA sequence.
AF043700 Caenorhabditis elegans cosmid K09H9.
Sugarcane bacilliform virus 35,843 Lotus japonlcus 42,593 Caenorhabdltls elegans 34,295 2007202378 25 May 2007 rxa00377 1245 GBBA1:CCU13664 '1678 GBPL1:ANSDGENE 1299 U13664 Y08866 Table 4 (continued) Caulobacter crescentus uroporphyrinogen decarboxylase homolog (hemE) gene, partial cds.
A.nidulans sD gene.
GBGSS4:AQ730303 rxa00382 1425 GBBAI:PAHEML GB_BA1:MTY25D1O GBBA1:MSGY224 rxa00383 1467 GB -BA1:MLCB1222 GB-TG2:AC006269 GBHTG2:AC007638 rxa00391 843 GBEST38:AW01705 3 GBPAT:AR065852 GB)Jl:AF1 48805 rxa00393 1017 rxa00402 623 rxaDO403 1254 GBBA1 :MTY25D1 0 GBBAI:MSGY224 GBBA1:MLB13O6 GBBA2:AF052652 GB BA2:AF109162 GBBA2:AF092918 GBBA2:AF052652 GB BA1:MTVO16 483 4444 40838 40051 34714 167171 178053 613 32207 28559 40838 40051 7762 2096 4514 20758 2096 53662 X82072 Z95558 AD000004 ALD49491 AC006269 AC007638 AW01 7053 AR065852 AF1 48805 Z95558 AD000004 Y13803 AF052652 AF109162 AF09291 8 AF052652 AL02 1841 Mycobacterium tuberculosis H37Rv complete genome; segment 28/162.
Mycobacterium tuberculosis sequence from clone y224.
Mycobacterium leprae cosmid B1222.
Homo sapiens chromosome 17 clone hRPK.515_E_23 map 17, SEQUENCING IN PROGRESS 2 ordered pieces.
Homo sapiens chromosome 17 clone hRPK.515_0_17 map 17, SEQUENCING IN PROGRESS 8 unordered pieces.
EST272398 Schistosoma mansoni male, Phil LoVerde/Joe Merrick Schistosoma mansoni cDNA clone SMMAS14 5'end, mRNA sequence.
Sequence 20 from patent US 5849564.
Kaposi's sarcoma-associated herpesvirus ORF 68 gene, partial cds; and ORF 69, kaposin, v-FLIP, v-cyclin, latent nuclear antigen, ORF K14, v-GPCR, putative phosphoribosylformylglycinamidine synthase, and LAMP (LAMP) genes, complete cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 28/162.
Mycobacterium tuberculosis sequence from clone y224.
Mycobacterium leprae cosmid B1306 DNA.
Corynebacterium glutamicum homoserine O-acetyltransferase (metA) gene, complete cds.
Corynebacterium diphtheriae heme uptake locus, complete sequence.
Pseudomonas alcaligenes outer membrane Xcp-secretion system gene cluster.
Corynebacterium glutamicum homoserine O-acetyltransferase (metA) gene, complete cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 143/162.
AQ730303 HS_5505_B1_C04_T7A RPCI-11 Human Male BAC Library Homo sapiens genomic clone Plate=1081 Col=7 Row=F, genomic survey sequence.
P.aeruginosa hemL gene.
Caulobacter crescentus Emericella nidulans Homo sapiens Pseudomonas aeruginosa Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium leprae Homo sapiens Homo sapiens Schistosoma mansoni Unknown.
Kaposi's sarcomaassociated herpesvirus Mycobacteriurn tuberculosis Mycobacterlum tuberculosis Mycobacterium leprae Corynebacterium glutamicum Corynebacterium diphtheriae Pseudomonas alcaligenes Corynebacterlum glutamicum 36,832 39,603 36,728 54,175 61,143 61,143 43,981 35.444 3.821 40,472 38.586 38.509 36,308 39,282 39.228 99,672 40,830 50,161 99,920 52,898 24-MAR- 1995 17-OCT- 1996 15-Jul-99 1 8-DEC- 1995 17-Jun-98 03-DEC- 1996 27-Aug-99 10-Jun-99 22-MAY- 1999 10O-Sep-99 29-Sep-99 2-Aug-99 17-Jun-98 03-DEC- 1996 24-Jun-97 19-MAR- 1998 8-Jun-99
OG-DEC-
1998 19-MAR- 1998 23-Jun-99 Mycobacterium tuberculosis GB_E5T23:A1111288 750 All111288 SWOvAMCAQO2AO5SK Onchocerca volvulus adult male cDNA (SAW98MLW-Onchocerca volVUlus OvAM) Onchocerca volvutus cONA clone SWOvAMCAQO2AO5 mRNA sequence.
37,565 31-Aug-98 2007202378 25 May 2007 rxaOO4OS 613 GBBAI:MTV016 GBPR4:AC005145 GBBA1:MTVO16 rxa00420 1587 GBBA1:MTY13D12 GB-BAI:MSGY126 GBBA1:MSGB971C
S
rxa00435 1296 GB-BA1:AFACBBTZ GBHTG4:AC00954 1 GB-HTG4:AC009541 rxa00437 579 GBPR4:AC005951 GBBA1:SC2A 1 GBPR4:AC005951 rxa00439 591 GB-BAI:MTV016 GBPL2:AF167358 GB-HTG3:ACOO9I 20 rxa0O440 582 GBBA2:SKZ86I 11 GBBA1:SC2EI GB BA1 :SC2E1 rxa00441 1257 GBPR2:H-S173D1 GBHTG2:HSDJ719K 53662 143678 53662 37085 37 164 37566 2760 '169583 169583 155450 22789 155450 53662 1022 269445 7860 38962 38962 117338 267114 AL021841 ACOO05145 AL02 1841 Z80343 AD00001 2 L78821 M68904 AC009541 AC009541 AC005951 AL031 184 AC005951 AL021841 AF167358 AC009 120 Z861 11 AL023797 AL023797 AL031 984 AL109931 AL1 09931 AL034355 AC009367 AC009367 Table 4 (continued) Mycobacterium tuberculosis H37Rv complete genome; segment 1431162. Mycobacterium tuberculosis Homo sapiens Xp22-166-169 GSHB-523A23 (Genome Systems Human BAC Homo sapiens library) complete sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 143/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 156/162.
Mycobacterium tuberculosis sequence from clone y126.
Mycobacterium leprae cosmid B971 DNA sequence.
Alcaligenes eutrophus chromosomal transketolase (cbbTc) and phosphoglycolate phosphatase (cbbZc) genes, complete cds.
Homo sapiens chromosome 7, SEQUENCING IN PROGRESS 25 unordered pieces.
Homo sapiens chromosome 7, SEQUENCING IN PROGRESS 25 unordered pieces.
Homo sapiens chromosome 17, clone hRPK.372_K_20, complete sequence.
Streptomyces coelicolor cosmid 2A11.
Homo sapiens chromosome 17, clone hRPK.372_K_20, complete sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 143/162.
Rumex acetosa expansin (EXP3) gene, partial cds.
Homo sapiens chromosome 16 clone RPCI-11_464E3, SEQUENCING IN PROGRESS 34 unordered pieces.
Streptomyces lividans rpsP, trmD, rplS, sipW, sipX, sipY, sipZ, mutT genes and 4 open reading frames.
Streptomyces coelicolor cosmid 2E1.
Streptomyces coelicolor cosmid 2E1.
Human DNA sequence from clone 173D1 on chromosome 1p36.21-36.33. Contains ESTs, STSs and GSSs, complete sequence.
Homo sapiens chromosome X clone RP4-719K3 map q21.1-21.31, SEQUENCING IN PROGRESS in unordered pieces.
Homo sapiens chromosome X clone RP4-719K3 map q21.1-21.31, SEQUENCING IN PROGRESS in unordered pieces.
Streptomyces coelicolor cosmid D78.
Drosophila melanogaster chromosome 3L/76A2 clone RPCI98-48B15, SEQUENCING IN PROGRESS 44 unordered pieces.
Drosophila melanogaster chromosome 3L/76A2 clone RPCI98-48B15, SEQUENCING IN PROGRESS 44 unordered pieces.
Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium Ieprae Ralstonia eutropha Homo sapiens Homo sapiens Homo sapiens Streptomyces coelicolor Homo sapiens Mycobacterium tuberculosis Rumex acetosa Homo sapiens Streptomyces lvidans Streptomyces coelicolor Streptomyces coelicolor Homo sapiens Homo sapiens Homo sapiens Streptomyces coelicolor Drosophila melanogaster DrosophIla melanogaster 57,259 34,.179 40,169 62,03 1 61.902 39.651 38.677 36,335 36.335 31,.738 43,262 37,647 37,088 46,538 43,276 43.080 42.931 36,702 38,027 34,521 34,521 56,410 34.959 34.959 23-Jun.99 08-DEC- 1998 23-Jun-99 17-Jun-98 1 0-DEC- 1996 15-Jun-96 27-Jul-94 1 2-OCT- 1999 I 2-OCT- 1999 1 8-Nov-98 5-Aug-98 1 8-Nov-98 23-Jun-99 17-Aug-99 3-Aug-99 27-OCT- 1999 4-Jun-98 4-Jun-98 23-Nov-99 03-DEC- 1999 03-DEC- 1999 26-Nov-98 I 6-OCT- 1999 16-OCT.
1999 rxa00446 987 GBHTG2:HSDJ719K 267114 3 GBBA1:5CD78 36224 GBHTG4:AC009367 226055 GBHTG4:AC009367 226055 2007202378 25 May 2007 rxaOO44B 1143 rxaO450 424 GB_PR3:AC003670 GBYHTG2:AF029367 GB-HTG2:AF029367 GBHTG2:AC007824 88945 AC003670 148676 AF029367 148676 133361 AF029367 AC007824 Table 4 (continued) H-omo sapiens 12q 13.1 PAC RPCI1I- 130F5 (Roswell Park Cancer Institute Homo sapiens Human PAC library) complete sequence.
Homo sapiens chromosome 12 clone RPCI-1 130F5 map 12q13.1I. Homo sapiens SEQUENCING IN PROGRESS 156 unordered pieces.
Homo sapiens chromosome 12 clone RPCI-1 130F5 map 12q13.1. Homo sapiens SEQUENCING IN PROGRESS 156 unordered pieces.
Drosophila melanogaster chromosome 3 clone BACR02L_16 (0715) RPCI-98 Drosophila mel; 021-.16 map 89E-90A strain y: cn bw sp, -SEQUENCING IN PROGRESS unordered pieces.
Drosophila melanogaster chromosome 3 clone BACRO2LI16 (D7 15) RPCI-98 Drosophila met 021L.16 map 89E-90A strain y; cn bw sp, SEQUENCING IN PROGRESS 91 unordered pieces.
wkl 4aOS.xl1 NCICGAPLyml 2 Homo sapiens cDNA clone IMAGE:24 12278 Homo sapiens 3 similar to gb:Y00764 UBIQUINOL-CYTOCHROME C REDUCTASE 11 KO PROTEIN (HUMAN);, mRNA sequence.
Mycobacterium Ieprae cosmld B1779. Mycobacterium Drosophila melanogaster cosmid clone 86E4. Drosophila mel 927P31-2H-3.TP 927P1 Trypanosoma brucei genomic clone 927P1-2H3, Trypanosomab genomic survey sequence.
anogaster anogaster 35,682 31.373 31.373 40.000 40,000 9-Jun-98 1 8-OCT- 1997 I 8-OCT- 1997 2-Aug-99 2-Aug-99 GBHTG2:AC007824 133361 AC007824 GBEST35:A1818057 412 A1818057 35.714 24-Aug-99 rxaOO46l 975 GBBA1:MLCB1779 GBIN 1 :DMC86E4 GB-GSS15:AQ64032 43254 29352 467 Z98271 AL021 086 AQ640325 Ieprae anogaster )rucei 39,308 37.487 38,116 8-Aug-97 27-Apr-99 8-Jul-99 rxa00465 rxa00487 1692 rxa00488 1641 rxa00.489 1245 rxa00533 1155
GB_BAI:BAGUMA
GB_BA2:U00015 GBBA1 :MVTCY78 GBBA1:MTCY78 GBBA2:U00015 GB BA1:SCAJ1O601 GB_8A2:U00015 GB-HTG2:HS225E1 2 GBHTG2:H5225E12 GB_BAl :CGLYS 3866 42325 33818 33818 42325 4692 42325 126464 126464 2803 Y10499 U00015 Z77 165 Z77165 U0001 5 AJO1 0601 UOOO15 AL031772 AL031772 X57226 B.ammoniagenes guaA gene.
Mycobacterium leprae cosmid B1620.
Mycobacterium tuberculosis H37Rv complete genome; segment 145/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 145/162.
Mycobacterium leprae cosmid B1620.
Streptomyces coelicolor A3(2) DNA for whiD and whiK loci.
Mycobacterium leprae cosmid B1620.
Homo sapiens chromosome 6 clone RP1-225E12 map q24, SEQUENCING IN PROGRESS in unordered pieces.
Homo sapiens chromosome 6 clone RP1-225E12 map q24, SEQUENCING IN PROGRESS in unordered pieces.
C. glutamicum lysC-alpha, lysC-beta and asd genes for aspartokinase-alpha and -beta subunits, and aspartate beta semialdehyde dehydrogenase, respectively (EC 2.7.2.4; EC 1.2.1.11).
Corynebacterium ammoniagenes Mycobacterium leprae Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium leprae Streptomyces coelicolor Mycobacterium leprae Homo sapiens Homo sapiens Corynebacterium glutamicum 74,259 37,248 39,725 39,451 39,178 60,835 38,041 36,756 36,756 99,913 8-Jan-98 01-MAR-1994 17-Jun-98 17-Jun-98 01-MAR-1994 17-Sep-98 01-MAR-1994 03-DEC-1999 03-DEC-1999 17-Feb-97
Table 4 (continued)
X82928 C.glutamicum aspartate-semialdehyde dehydrogenase gene.
GBBA1:CGCYSCAS 1591 0 Corynebacterium 99,221 glutamicum GBPAT:A07546 GB-BA1:CGLYS rxa00534 1386
GBBAI:CORASKO
GB-PAT:E 14514 rxa00536 1494 GBBA1:CGLEUA GBBA1:MTV025 GBBA1:MTU88526 rxaOO537 2409 GBBA2:SCD25 GBBAI:MTCY7H7A GBBAI:MTU34956 rxaOO541 792 GBPAT:192052 GB.BA1 :MLCB5 GB-BA1 :MTCY369 rxaOO558 1470 GBBA1:BAPURF GB-BAI:MLU15182 GBBA1:MTCY7-7A rxa00579 1983 GBPAT:AR016483 EMVIPAT:EI 1273 2112 A07546 Recombinant DNA fragment (Pstl-Xhol).
2803 X57226 C. glutamicum tysO-alpha. lysC-beta and asd genes for aspartokinase-alpha and -beta subunits, and aspartate beta semlaldehyde dehydrogenase.
respectively (EC 2.7.2.4; EC 1.2. 1.11).
2957 Li16848 Corynebacterlum Ilavum aspartokinase (ask), and aspartate-semialdehyde dehydrogenase (asd) genes, complete cds.
1643 E14514 DNA encoding Brevibacterium aspartokinase.
3492 X70959 C.glutamicum gene leuA for isopropylmalate synthase.
121125 AL022121 Mycobacterium tuberculosis H37Rv complete genome; segment 155/162.
2412 U88526 Mycobacterium tuberculosis putative alpha-isopropyl malate synthase (leuA) gene, complete cds.
41622 AL118514 Streptomyces coelicolor cosmid D25.
10451 Z95618 Mycobacterium tuberculosis H37Rv complete genome; segment 39/162.
2462 U34956 Mycobacterium tuberculosis phosphoribosylformylglycinamidine synthase (purL) gene, complete cds.
2115 192052 Sequence 19 from patent US 5726299.
38109 Z95151 Mycobacterium leprae cosmid B5.
36850 Z80226 Mycobacterium tuberculosis H37Rv complete genome; segment 36/162.
1885 X91252 B.ammoniagenes purF gene.
40123 U15182 Mycobacterium leprae cosmid B2266.
10451 Z95618 Mycobacterium tuberculosis H37Rv complete genome; segment 39/162.
2104 AR016483 Sequence 1 from patent US 5776740.
2104 E11273 DNA encoding serine hydroxymethyl transferase.
synthetic construct Corynebacterium glutamicum Corynebacterium flavescens Corynebacterium glutamicum Corynebacterium glutamicum Mycobacterium tuberculosis Mycobacterium tuberculosis Streptomyces coelicolor A3(2) Mycobacterium tuberculosis Mycobacterium tuberculosis Unknown.
Mycobacterium leprae Mycobacterium tuberculosis Corynebacterium ammoniagenes Mycobacterium leprae Mycobacterium tuberculosis Unknown.
Corynebacterium glutamicum Corynebacterlum glutamicum Corynebacterium glutamicum 99,391 99,856 98,701 98,773 100,000 68,003 68,185 63,187 62,401 62,205 98,359 62,468 60,814 66,095 64,315 64,863 98,810 98,810 98,810 99,368 17-Feb-97 30-Jul-93 1 7-Feb-97 I11-Jun-93 28-Jul-99 10-Feb-99 24-Jun-99 26-Feb-97 21 -Sep-99 17-Jun-98 28-Jan-97 Ol-DEC- 1998 24-Jun-97 17-Jun-98 5-Jun-97 09-MAR- 1995 17-Jun-98
05-DEC-1998 08-OCT-1997 (Rel. 52, Created) 24-Jun-98 24-Jun-98 GBPAT:E12594 GBPAT:E12594 2104 E12594 2104 E12594 DNA encoding serine hydroxymethyltransferase from Brevibacterium flavum.
DNA encoding serine hydroxymethyltransferase from Brevibacterium flavum.
rxaGO580 1425 2007202378 25 May 2007 GBPAT:AR016483 2104 EM PAT:E11273 2104 AR01 6483 El 1273 rxa00581 1092 GBPAT:E12594 EMIPAT: E11273 2104 E12594 2104 El11273 GB-PAT:AR016483 2104 rxa00584 1248 GBBA1:CORAHPS 2570 GS-BAl AOPCZA361 37941 GBBAI:D90714 14358 rxa00618 1230 GBESTI9:AA802737 280 GBE5T28:A1534381 581 GBIN1:DMANILLIN rxaOO6l9 1551 GBBAI:MTCY369 GBPAT:A60305 rxaOO62O 1014 GBPL2:AF063247 GBBA1:STMAPP GBHTG3:AC008763 rxa00624 810 GBINl:CEY41E3 GB8EST13:AA362 167 GB IN1:CEY41E3 rxa00626 1386 GBBA1:MTCY369 GBBA1:MLU15187 4029 36850 38 109 1845 1450 2069 214575 150641 372 150641 36850 38109 36138 AR01 6483 L07603 AJ223998 D90714 AA802737 A1534381 X89858 Z80226 Z95 151 A60305 AFD63247 M91 546 AC008763 Z95559 AA362 167 Z95559 Z80226 Z95151 U1 5187 Table 4 (continued) Sequence 1 from patent us 5776740. Unknown.
DNA encoding serine hydroxymethyl transferase. Corynebacterium glutamicum
DNA encoding serine hydroxymethyltransferase from Brevibacterium flavum. Corynebacterium glutamicum
DNA encoding serine hydroxymethyl transferase. Corynebacterium glutamicum
Sequence 1 from patent US 5776740. Unknown.
Corynebacterium glutamicum 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase gene, complete cds. Corynebacterium glutamicum
Amycolatopsis orientalis cosmid PCZA361. Amycolatopsis orientalis
Escherichia coli genomic DNA. (16.8-17.1 min.). Escherichia coli
GM06236.5prime GM Drosophila melanogaster ovary BlueScript Drosophila melanogaster cDNA clone GM06236 5prime, mRNA sequence. Drosophila melanogaster
SD07186.5prime SD Drosophila melanogaster Schneider L2 cell culture pOT2 Drosophila melanogaster cDNA clone SD07186 5prime similar to X89858: Ani FBgn0011558 PID:g927407 SPTREMBL:024240, mRNA sequence. Drosophila melanogaster
D.melanogaster mRNA for anillin protein. Drosophila melanogaster
Mycobacterium tuberculosis H37Rv complete genome; segment 36/162. Mycobacterium tuberculosis
Mycobacterium leprae cosmid B5. Mycobacterium leprae
Sequence 5 from Patent WO9708323. unidentified
Pneumocystis carinii f. sp. ratti enolase mRNA, complete cds. Pneumocystis carinii f. sp. ratti
Streptomyces lividans aminopeptidase P (PepP) gene, complete cds. Streptomyces lividans
Homo sapiens chromosome 19 clone CITB3-E1_3214H19, SEQUENCING IN PROGRESS, 21 unordered pieces. Homo sapiens
Caenorhabditis elegans cosmid Y41E3, complete sequence. Caenorhabditls elegans EST71 561 Maci-ophage I Homo sapiens cDNA S'end, mRNA sequence. Homo sapiens Caenorhabdills elegans cosmid Y41IE3, complete sequence. Caenorhabditis elegans Mycobacteriumn tuberculosis H37Rv complete genome; segment 36/1 62. Mycobacterium tuberculosis Mycobacterlum teprae cosmid B5. Mycobacterium leprae Mycobacteriumn Ieprae cosmld 1296. Mycobacterium leprae 99,368 99.368 37,071 37,071 37,071 98,236 54,553 53,312 39,928 41,136 34,398 62.776 61,.831 61,785 4 1,060 37,126 40,020 36,986 38,378 37,694 57,971 58,806 38.007 1998 08-OCT- 1997 (Rel.
52, Created) 24-Jun-98 08-OCT.
1997 (Rel.
52, Created)
OS-DEC-
1998 26-Apr-93 29-MAR- 1999 7-Feb.99 25-Nov-98 1 8-MAR- 1999 8-Nov-95 17-Jun-98 24-Jun-97 06-MAR- 1998 5-Jan-99 12-Jun-93 3-Aug-99 2-Sep-99 21 -Apr-97 2-Sep-99 17-Jun-98 24-Jun-97 09-MAR- 1995 2007202378 25 May 2007 rxa00632 795 GBBA1:BRLBIOAD GB-PATE04041 GBPAT:E04040 rxa00633 1392 GBBA1:BRLBIOAD GB-PAT:E04040 GBBA2:EI-U38519 rxa00688 666 GBBA1:MTV041
GB-BAI:BRLSECY
GB_8A2:MBU77912 930 GB BA2:AF157493 GSBPAT:100836 GB-PAT:E0031 1 rxaOO717 1083 GBPAT:178753 GBPAT:192042 GBBA1:MTCI125 rxaO0llB 831 GB-BA1:MTCII25 GB-GSS1I2:AQ42075 rxa00727 1035 GB-HTG3:AC008332 2272 575 1272 2272 1272 1290 28826 1516 7163 25454 1853 1853 1187 1187 37432 37432 37432 671 118545 D14083 E04041 E04040 D14083 E04040 U3851 9 AL02 1958 014 162 U77912 AF1 57493 100836 E00311 178753 192042 Z98268 Z98268 Z98268 A0420755 ACGO8332 Table 4 (continued) Brevibacterium flavumn genes for 7,8-diaminopelargonic acid aminotra nsfe rase Corynebacterium and dethloblotin synthetase, complete cds. glutamlcum DNA sequence coding for desthiobiotinsynthetase. Corynebacterium glutamicum DNA sequence coding for diamino pelargonic acid aminotransferase. Corynebacterium glutamicum Brevibacterlum flavumn genes for 7.8-dlaminopelargonic acid aminotransferase Corynebacterium and dethloblotin synthetase, complete cds. glutamicumn DNA sequence coding for diamina pelargonic acid a minotra nsfe rase. Corynebacterlum glutamicum Erwinia herbicola adenosylmelhionine-8-amino-7-oxononanoate transaminase Erwinia herbicola (bioA) gene, complete cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 35/162. Mycobacterium tuberculosis
Brevibacterium flavum gene for SecY protein (complete cds) and gene for adenylate kinase (partial cds). Corynebacterium glutamicum
Mycobacterium bovis MBE50a gene, partial cds; and MBE50b, MBE50c, preprotein translocase SecY subunit (secY), adenylate kinase (adk), methionine aminopeptidase (map), RNA polymerase ECF sigma factor MBE50d, and MBE50e genes, complete cds. Mycobacterium bovis
Zymomonas mobilis ZM4 fosmid clone 42137, complete sequence. Zymomonas mobilis
Sequence 1 from Patent US 4758514. Unknown.
DNA coding of 2,5-diketogluconic acid reductase. unidentified
Sequence 9 from patent US 5693781. Unknown.
Sequence 9 from patent US 5726299. Unknown.
Mycobacterium tuberculosis H37Rv complete genome; segment 76/162. Mycobacterium tuberculosis
Mycobacterium tuberculosis H37Rv complete genome; segment 76/162. Mycobacterium tuberculosis
Mycobacterium tuberculosis H37Rv complete genome; segment 76/162. Mycobacterium tuberculosis
RPCI-11-168G18.TJ RPCI-11 Homo sapiens genomic clone RPCI-11-168G18, genomic survey sequence. Homo sapiens
Drosophila melanogaster chromosome 2 clone BACR48D10 (D867) RPCI-98 48.D.10 map 34A-34A strain y; cn bw sp, SEQUENCING IN PROGRESS, 78 unordered pieces. Drosophila melanogaster
Drosophila melanogaster chromosome 2 clone BACR48D10 (D867) RPCI-98 48.D.10 map 34A-34A strain y; cn bw sp, SEQUENCING IN PROGRESS, 78 unordered pieces. Drosophila melanogaster
97.358 98,074 93,814 95,690 95,755 55.564 60,030 99,563 60,030 39,116 47,419 47.419 37.814 37,814 50,647 55,228 40,300 35.750 40,634 40.634 3-Feb-99 29-Sep-97 29-Sep-97 3-Feb-99 29-Sep-97 4-Nov-96 17-Jun-98 3-Feb-99 27-Jan-99 5-Jul-99 21-MAY- 1993 29-Sep-97 3-Apr-98 Ol-DEC- 1998 17-Jun-98 17-Jun-98 17-Jun-98 23-MAR- 1999 6-Aug.99 6-Aug-99 ster ster GB-HTG3:AC008332 118545 AC008332 2007202378 25 May 2007 GBHTG3:AC008332 118545 rxa00766 966 rxaOl770 1293 rxa00779 1056 rxaOO780 669 rxa00838 1023 rxa00863 867 rxa00864 873 GB-HTG2:AC006789 GBHTG2:AC006789 GBBA1:D90810 GB-BA1 :MTV043 GBBAI:MLU15182 GBBA2:SCD25 GB-HTG1 :CER08A5 GBPL2:AF078693 GBBA1:MTCY98 GBBA1:AVINIFREG GB-BA2:AFOO1 780 GBEST12:30506 GBPL2:AC006258 GBE5T37:A1998439 GBBAl :BLDAPAB G38_PAT:E16749 GB-PAT:E1 4520 GBBAl :BLDAPAB GB-BA1 :CGDAPB 83823 83823 20476 68848 40123 41622 51920 51920 1492 31225 7099 6701 329 110469 455 3572 2001 2001 3572 1902 Table 4 (continued) AC008332 Drosophila melanogaster chromosome 2 clone B3ACR48DIO (D867) RPCI-98 Drosophila me la nog aster 33.888 48.D.10 map 34A-34A strain y; cn bw sp. -SEQUENCING IN PROGRESS-, 78 unordered pieces.
AC006789 Caenorhabditis elegans clone Y49F6, SEQUENCING IN PROGRESS, 2 unordered pieces. Caenorhabditis elegans 36,737
AC006789 Caenorhabditis elegans clone Y49F6, SEQUENCING IN PROGRESS, 2 unordered pieces. Caenorhabditis elegans 36,737
D90810 E.coli genomic DNA, Kohara clone #319 (37.4-37.8 min.). Escherichia coli 36,526
AL022004 Mycobacterium tuberculosis H37Rv complete genome; segment 40/162. Mycobacterium tuberculosis 66,193
U15182 Mycobacterium leprae cosmid B2266. Mycobacterium leprae 61,443
AL118514 Streptomyces coelicolor cosmid D25. Streptomyces coelicolor A3(2) 59,938
Z82281 Caenorhabditis elegans chromosome V clone R08A5, SEQUENCING IN PROGRESS, in unordered pieces. Caenorhabditis elegans 64,896
Z82281 Caenorhabditis elegans chromosome V clone R08A5, SEQUENCING IN PROGRESS, in unordered pieces. Caenorhabditis elegans 64,896
AF078693 Chlamydomonas reinhardtii putative O-acetylserine(thiol)lyase precursor (Crcys-1A) mRNA, nuclear gene encoding organellar protein, complete cds. Chlamydomonas reinhardtii 57,970
Z83860 Mycobacterium tuberculosis H37Rv complete genome; segment 103/162. Mycobacterium tuberculosis 54,410
M60090 Azotobacter chroococcum nifU, nifS, nifV, nifP, nifW, nifZ and nifM genes, complete cds. Azotobacter chroococcum 51,729
AF001780 Cyanothece PCC 8801 NifP (nifP), nitrogenase (nifB), FdxN (fdxN), NifS (nifS) and NifU (nifU) genes, complete cds, and NifH (nifH) gene, partial cds. Cyanothece PCC8801 36,309
Z30506 ATTS2430 AC16H Arabidopsis thaliana cDNA clone TA1306 3, mRNA sequence. Arabidopsis thaliana 44,308
AC006258 Arabidopsis thaliana BAC F18G18 from chromosome V near 60.5 cM, complete sequence. Arabidopsis thaliana 35,571
AI998439 701545695 A. thaliana, Columbia Col-0, rosette-2 Arabidopsis thaliana cDNA clone 701545695, mRNA sequence. Arabidopsis thaliana 36,044
Z21 502 BlIactotermentum dapA and dapB genes for dihydrodipicolinate synthase and Corynebacterium 99,539 dihydrodiplcollnate reductase. glutarnicumn E16749 gDNA encoding dihydrodipicolinate synthase (DOPS). Corynebacteriumn 99,539 glutamicum E14520 DNA encoding Brevlbacterlum dhydrodipicolinic acid synthase. Corynebactedrn 99,539 glutamicum Z21502 Blactofermentum dapA and dapB genes for dihydrodipicolinate synthase and Corynebacterium 99,885 dihydrodlplcollnate reductase. .giutamicumn X67737 C.glutamicum dapB gene for dihydrodipicolinate reductase. Corynebacterium 100,000 glutamlcum 8-Aug-99 25-Feb-99 25-Feb-99 29-MAY- 1997 24-Jun-99 09-MAR- 1995 21-Sep-99 1 4-OCT- 1998 14-OCT- 1998 3-Nov-99 17-Jun-98 26-Apr-93 08-MAR- 1999 11-MAR- 1994 28-DEC- 1998 8-Sep-99 16-Aug-93 28-Jul-99 28-Jul-99 16-Aug-93 1-Apr-93 2007202378 25 May 2007 Table 4 (continued) 2001 E14520 DNA encoding Brevibacterium dihydrodipicolinic acid synihase.
GB-PAT:E14 520 rxa00865 1026 GBBA1:BLDAPAB GBPAT:E16752 GBPAT:ARO3B1 13 rxa00867 650 GBBAI:M1VOO2 GBBA1:MLCB22 GB-BAI:SAU 19858 rxa00873 779 GB BAl :S00001 206 GB-BA1 :500001205 GB BA1 :D78198 rxaOO884 1263 GB-BA1:MTCY253 GBBAl :MSGY222 GBGSS15:A065460 0 rxa0C891 1102 GBBA1:MTCi418B GBBA1:500001206 GB-BA1 :SC0001205 rxaOO952 963 EMPAT:E 10963 3572 1411 1411 56414 40281 2838 9184 9589 2304 41230 41156 468 11700 9184 9589 3118 7725 7725 7726 7725 7725 7726 Z21502 E16752 AR038 113 AL 008967 Z98741 U 19858 AJOG01206 AJO01 205 D78198 Z81368 ADOOOO1O A0654600 Z96071 AJO01 206 AJO0l1205 El10963 X04960 E01688 E01 375 E01688 X04960 E01 375 Blactofermentum dapA and dap6 genes for dihydrodipicolinate synthase and dihydrodipicolinate reductase.
gDNA encoding dihydrodipicolinate reductase (DDPR).
Sequence 18 from patent US 5604414.
Mycobacterium tuberculosis H37Rv complete genome; segment 122/162.
Mycobacterium leprae cosmid B22.
Streptomyces antibioticus guanosine pentaphosphate synthetase (gpsI) gene, complete cds.
Streptomyces coelicolor A3(2), glycogen metabolism cluster II.
Streptomyces coelicolor A3(2) glycogen metabolism cluster I.
Pimelobacter sp. DNA for trehalose synthase, complete cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 106/162.
Mycobacterium tuberculosis sequence from clone y222.
Sheared DNA-1014.TF Sheared DNA Trypanosoma brucei genomic clone Sheared DNA-1014, genomic survey sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 7/162.
Streptomyces coelicolor A3(2), glycogen metabolism cluster II.
Streptomyces coelicolor A3(2) glycogen metabolism cluster I.
gDNA encoding tryptophan synthase.
Brevibacterium lactofermentum tryptophan operon.
Genomic DNA of trp operon of prepibacterium latophelmentamn.
DNA sequence of tryptophan operon.
Genomic DNA of trp operon of prepibacterium latophelmentamn.
Brevibacterium lactofermentum tryptophan operon.
DNA sequence of tryptophan operon.
Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Unknown.
Mycobacterium tuberculosis Mycobacterium leprae Streptomyces antibioticus Streptomyces coelicolor Streptomyces coelicolor Pimelobacter sp.
Mycobacterium tuberculosis Mycobaclerium tuberculosis Trypanosoma brucei Mycobacterium tuberculosis Streptomyces coelicolor Streptomyces coellcolor Corynebacterlum glutamicum Corynebacterium glutamicum unidentified Corynebacterium glutamicum unidentified Corynebacterium glutamicum Corynebacterium glutamicum 100,000 100,000 99,805 99,805 39,179 39,482 69.706 63,415 61,617 60,594 37,785 38,005 33,974 63,297 61,965 61,727 99.688 98,847 98,428 98,758 98,758 98.758 98,372 28-Jul-99 16-Aug-93 28-Jul-99 29-Sep-99 17-Jun-98 22-Aug-97 1996 29-MAR- 1999 29-MAR- 1999 5-Feb-99 17-Jun-98 03-DEC- 1996 22-Jun-99 18-Jun-98 29-MAR- 1999 29-MAR- 1999 08-OCT- 1997 (Ret.
52, Created) 10-Feb-99 29-Sep-97 29-Sep-97 29-Sep-97 10-Feb-99 29-Sep-97 rxa00954 644 rxa00955 1545 GB-BAI :BLTRP GB PAT:E01688 GB-PAT:EO1 375 GS-PAT:EO1 688 GB-BA1 :BLTRP GBPAT:E01375 2007202378 25 May 2007 Table 4 (continued) GBBA1:BLTRP GBPAT:E01688 EMPAT:E10963 rxa00956 1237 GBBA1:BLTRP GBPAT:E01375 rxa00957 1677 GBBAI:BLTRP GBPAr:EO 1375 GBPAT:E01688 rxa00958 747 GBBA1:BLTRP GBPAT:E01375 GBPAT:E0 1688 rxa00970 1050 GBBA1:CGHOMTHR GBPAT:109077 GBPAT:E01358 rxa00972 1458 GBPAT:E16755 GBPAT:AR038110 GBPAT:E14508 rxa00981 753 GBOV:GGA245664 GBL2:AC007887
GBGSSI:CNSOORN
W
7725 7725 3118 7725 7726 7725 7726 7725 7725 7726 7725 3685 3685 2615 3579 3579 3579 512 159434 542 63033 3619 63033 X04960 EQ01688 E10963 X04960.
E01375 X04960 E01375 E01688 X04960 E01375 E01688 Y00546 I09077 E01358 E16755 AR038110 E14508 AJ245664 AC007887 AL087338 AL021246 Y13070 AL021246
Brevibacterium lactofermentum tryptophan operon. Corynebacterium glutamicum
Genomic DNA of trp operon of prepibacterium latophelmentamn. unidentified
gDNA encoding tryptophan synthase. Corynebacterium glutamicum
Brevibacterium lactofermentum tryptophan operon. Corynebacterium glutamicum
DNA sequence of tryptophan operon. Corynebacterium glutamicum
Brevibacterium lactofermentum tryptophan operon. Corynebacterium glutamicum
DNA sequence of tryptophan operon. Corynebacterium glutamicum
Genomic DNA of trp operon of prepibacterium latophelmentamn. unidentified
Brevibacterium lactofermentum tryptophan operon. Corynebacterium glutamicum
DNA sequence of tryptophan operon. Corynebacterium glutamicum
Genomic DNA of trp operon of prepibacterium latophelmentamn. unidentified
Corynebacterium glutamicum hom-thrB genes for homoserine dehydrogenase and homoserine kinase. Corynebacterium glutamicum
Sequence 1 from Patent WO 8809819. Unknown.
DNA encoding for homoserine dehydrogenase (HDH) and homoserine kinase (HK). Corynebacterium glutamicum
gDNA encoding diaminopimelate decarboxylase (DDC) and arginyl-tRNA synthase. Corynebacterium glutamicum
Sequence 15 from patent US 5804414. Unknown.
DNA encoding Brevibacterium diaminopimelic acid decarboxylase and arginyl-tRNA synthase. Corynebacterium glutamicum
Gallus gallus partial mRNA for ATP-citrate lyase (ACL gene). Gallus gallus
Genomic sequence for Arabidopsis thaliana BAC F1504 from chromosome 1, complete sequence. Arabidopsis thaliana
Arabidopsis thaliana genome survey sequence T7 end of BAC F1407 of IGF library from strain Columbia of Arabidopsis thaliana, genomic survey sequence. Arabidopsis thaliana
Mycobactedriu tuberculosis H37Rv complete genome; segment 1081162. Mycobacterium, tuberculosis Scoelicolor valS, fpgs, ndk genes. Streptomyces coelicolor Mycobacterium tuberculosis H37Rv complete genome; segment 108/162. Mycobacterium tuberculosis 98,372 98,242 98.949 99.107 98,945 99,165 98,927 98,867 98,792 98,792 98,658 99,905 99,810 97,524 99,931 99,93 1 99.931 37,538 37,600 41,264 40,773 58.119 38.167 10-Feb-99 29-Sep-97 08-OCT- 1997 (Rel.
52, Created) 10-Feb-99 29-Sep-97 10-Feb-99 29-Sep-97 29-Sep-97 10-Feb-99 29-Sep-97 29-Sep-97 12-Sep-93 02-DEC- 1994 29-Sep-97 28-Jul-99 29-Sep-99 28-Jul-99 28-Sep-99 04-OCT- 1999 28-Jun-99 17-Jun-98 03-MAR- 1998 17-Jun-98 rxa00989 1644 GBBA1:MTVOO8 GB-BA1 :SCVALSFP GBBA1:MTVOO8 2007202378 25 May 2007 Table 4 (continued) Corynebacterium glutamicum L-proiine:NADP+ 5-oxidoreductase (proC) gene. Corynebacterium rxa00997 705 GBBA2:CGU31225 1817 U31225 GBHTG1:CEY39CI2 282838 GBINI:CEBO0O1 39416 nxa01019 1110 GBHTG2:AC005052 144734 GB-HTG2:AC005052 144734 GB GSS9:A0171808 512 rxa0lO26 1782 GBBAI:SC1C2 42210 GBBA:ATLEUCD 2982 GBBAI:MTV012 70287 rxa0lO27 1131 GB-BA1:MLCB637 44882 GBBA1:MTCY349 43523 GB_BA1:SPUNGMUT 1172 x rxaOlO73 954 GB_BA1:BACOUTB 1004 GB-PR4:AC007938 167237 GBPL2.ATA0006282 92577 rxaGlO79 2226 GB_BA2:AF1 12535 4363 GB-BA1:CANRDFGE 6054
N
GBBA1:M1V012 70287 rxa0lO80 567 GB-BA2:AF112535 4363 GB_BAI:CANRDFGE 6054
N
GBBA1:STNRD 4894 fxa01087 999 GB-IN2:AF063412 1093 GBPR3:HS24M15 134539 GB-IN2:ARU85702 1240 AL009026 Z69634 AC005052 AC005052 AQ171808 AL031 124 X84647 AL021 287 Z99263 Z83018 Z21 702 M1 5811 AC007938 AC006282 AF1 12535 Y09572 AL021 287 AF112535 Y09572 X73226 AF063412 Z94055 U85702 complete cds.
Caenorhabditis elegans chromosome IV clone Y39C12, SEQUENCING IN PROGRESS, in unordered pieces.
Caenorhabditis elegans cosmid B0001, complete sequence.
Homo sapiens clone RG038K21, SEQUENCING IN PROGRESS, 3 unordered pieces.
Homo sapiens clone RG038K21, SEQUENCING IN PROGRESS, 3 unordered pieces.
HS_3179_A1_G03_T7 CIT Approved Human Genomic Sperm Library D Homo sapiens genomic clone Plate=3179 Col=5 Row=M, genomic survey sequence.
Streptomyces coelicolor cosmid 1C2.
A.teichomyceticus leuC and leuD genes.
Mycobacterium tuberculosis H37Rv complete genome; segment 132/162.
Mycobacterium leprae cosmid B637.
Mycobacterium tuberculosis H37Rv complete genome; segment 131/162.
S.pneumoniae ung gene and mutX genes encoding uracil-DNA glycosylase and 8-oxodGTP nucleoside triphosphatase.
Bacillus subtilis outB gene encoding a sporulation protein, complete cds.
Homo sapiens clone UWGC:djs201 from 7q31, complete sequence.
Arabidopsis thaliana chromosome II BAC F13K3 genomic sequence, complete sequence.
Corynebacterium glutamicum putative glutaredoxin NrdH (nrdH), NrdI (nrdI), and ribonucleotide reductase alpha-chain (nrdE) genes, complete cds.
Corynebacterium ammoniagenes nrdH, nrdI, nrdE, nrdF genes.
Mycobacterium tuberculosis H37Rv complete genome; segment 132/162.
Corynebacterium glutamicum putative glutaredoxin NrdH (nrdH), NrdI (nrdI), and ribonucleotide reductase alpha-chain (nrdE) genes, complete cds.
Corynebacterium ammoniagenes nrdH, nrdI, nrdE, nrdF genes.
S.typhimurium nrdEF operon.
Limnadia lenticularis elongation factor 1-alpha mRNA, partial cds.
Human DNA sequence from PAC 24M15 on chromosome 1. Contains tenascin-R (restrictin). EST.
Anathix ralla elongation factor-1 alpha (EF-1a) gene, partial cds.
glutamicum Caenorhabditis elegans Caenorhabditis elegans Homo sapiens Homo sapiens Homo sapiens Streptomyces coelicolor Actinoplanes teichomyceticus Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium tuberculosis Streptococcus pneumoniae Bacillus subtilis Homo sapiens Arabidopsis thaliana Corynebacterium glutamicum Corynebacterium ammoniagenes Mycobacterium tuberculosis Corynebacterium glutamicum Corynebacterium ammoniagenes Salmonella typhimurium Limnadia lenticularis Homo sapiens Anathix ralla 40,841 36,416 36,416 39,172 39,172 34,661 68,275 65,935 40,454 38,636 51,989 38,088 53,723 34,322 36,181 99,820 75,966 38,296 100,000 65,511 52,477 43,750 37,475 37,319 2-Aug-96 26-OCT-1999 2-Sep-99 12-Jun-98 12-Jun-98 17-OCT-1998 15-Jan-99 04-OCT-1995 23-Jun-99 17-Sep-97 17-Jun-98 15-Jun-94 26-Apr-93 1-Jul-99 13-MAR-1999 5-Aug-99 18-Apr-98 23-Jun-99 5-Aug-99 18-Apr-98 03-MAR-1997 29-MAR-1999
23-Nov-99 16-Jul-97 2007202378 25 May 2007 rxa01095 857 GBBA1:MTCYO1B2 GBHTG5:AC011632 1632 rxa01097 477 GB-BA2:AF030405 GB_8A2:AF030405 rxa0 1098 897 GBBA2:AF030405 GBBAl :MSGY223 GBBA1:MLCB161O rxa0llOO 861 GBBA2:AF051846 35938 175917 175917 774 774 774 42061 40055 738 GBBA2:AF060558 636 GBHTG1:HSDJ14OA 221755 9 rxa0ll0l 756 GB-BA2:AF060558 636 GB-BA1:SC4G6 36917 GBBAI:STMHISOPA 3981 rxa01 104 729 GBBA1:STMHISOPA 3981 GBBA1:SC4G6 36917 GB-BA1 :MTCY336 32437 rxa01l105 1221 GBBA1:MTCY336 32437 GBBA1:MSGY223 42061 GBBA1:MLCB16IO 40055 rxa0llO6 1449 GBBAI:MSGY223 42061 Table 4 (continued) Z95554 Mycobacterium tuberculosis H37Rv complete genome; segment 72/162.
AC011632 Homo sapiens clone RP11-3N13, WORKING DRAFT SEQUENCE, 9 unordered pieces.
AC011632 Homo sapiens clone RP11-3N13, WORKING DRAFT SEQUENCE, 9 unordered pieces.
AF030405 Corynebacterium glutamicum cyclase (hisF) gene, complete cds.
AF030405 Corynebacterium glutamicum cyclase (hisF) gene, complete cds.
AF030405 Corynebacterium glutamicum cyclase (hisF) gene, complete cds.
AD000019 Mycobacterium tuberculosis sequence from clone y223.
AL049913 Mycobacterium leprae cosmid B1610.
AF051846 Corynebacterium glutamicum phosphoribosylformimino-5-amino-1-phosphoribosyl-4-imidazolecarboxamide isomerase (hisA) gene, complete cds.
AF060558 Corynebacterium glutamicum glutamine amidotransferase (hisH) gene, complete cds.
AL109917 Homo sapiens chromosome 1 clone RP1-140A9, SEQUENCING IN PROGRESS, in unordered pieces.
AF060558 Corynebacterium glutamicum glutamine amidotransferase (hisH) gene, complete cds.
AL096884 Streptomyces coelicolor cosmid 4G6.
M31628 S.coelicolor histidine biosynthesis operon encoding hisD, partial cds., and hisC, hisB, hisH, and hisA genes, complete cds.
M31628 S.coelicolor histidine biosynthesis operon encoding hisD, partial cds., and hisC, hisB, hisH, and hisA genes, complete cds.
AL096884 Streptomyces coelicolor cosmid 4G6.
Z95586 Mycobacterium tuberculosis H37Rv complete genome; segment 70/162.
Z95586 Mycobacterium tuberculosis H37Rv complete genome; segment 70/162.
AD000019 Mycobacterium tuberculosis sequence from clone y223.
AL049913 Mycobacterium leprae cosmid B1610.
AD000019 Mycobacterium tuberculosis sequence from clone y223.
Mycobacterium tuberculosis Homo sapiens Homo sapiens Corynebacterium glutamicum Corynebacterlum glutamicumn Corynebacterlum glutamnicumn Mycobacterlum tuberculosis Mycobacteriumn leprae Corynebacterlum glutamicum Corynebacterium glutamicum Homo sapiens Corynebacterlum glutamnicumn Streptomyces coelicolor A3(2) Streptomyces coelicolor Streptomyces coelicolor Sireptomyces coelicolor A3(2) Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium, tuberculosis Mycobacterium leprae Mycobacterium tuberculosis 43,243 36.471 36,836 100,000 41,206 97,933 40,972 61,366 97,1 54 95,455 30.523 94,462 38,378 60,053 58,333 39,045 60,364 60,931 36,851 60,902 37.233 13-Nov-97 lO-DEC- 1996 27-Aug.99 12-MAR- 1998 29-Apr-98 23-Nov-99 29-Apr.98 23-Jul-99 26-Apr-93 26-Apr.93 23-Jul-99 24-Jun-99 24-Jun-99 1996 27-Aug-99 1996 17-Jun-98 19-Nov-99 119-Nov-99 13-Nov-97 13-Nov-97 2007202378 25 May 2007 rxa0l 145 1137 rxaOl1162 1449 nca01208 846 nxa01209 1528 1098 rxa01239 2556 rxa01 253 873
GBBAIMSHISCD
GBBA1:MTCY336 GBBA1:CORAIA GB-BA1 :BRLILVCA GB-PAT:E08232 GBPAT:A60299 GBPR3:HS24E5 GByPR3:AC005265 GB-HTG2:AC004965 GBHTG2:AC004965 GB-yL2:TAU55859 G13HTG3:AC01 1469 GB_-HTG3:ACO1 1461 GB-PLlABO1 0077 GB-BA1 :MTCYl0G2
GBJINI:LEIPRPP
GB {TG2:HSJ799D 6 GB,8A1 :MrCY48 GBj'R2:.AB029032 GBJ3GSS9:AQ10 72 GBPL2:F508 GBiPL2:F508 GB-lN1:CELC06Gl 2298 X65542 32437 Z95586 Table 4 (continued) M.smegmatis genes hisD and hlsC for histidiflol dehydrogenase and histidinol- Mycobacteriumn smegmatis 60,111 phosphate aminotransterase, respectively.
Mycobacterium tuberculosis H37Rv complete genome; segment 70/162. Mycobacterium tuberculosis 58,420
4705 L09232 Corynebacterium glutamicum acetohydroxy acid synthase (ilvB) and (ilvN) genes, and acetohydroxy acid isomeroreductase (ilvC) gene, complete cds.
1364 D14551 Brevibacterium flavum ilvC gene for acetohydroxy acid isomeroreductase, complete cds.
1017 E08232 DNA encoding acetohydroxy-acid isomeroreductase.
2869 A60299 Sequence 18 from Patent W09706261.
35506 Z82185 Human DNA sequence from Fosmid 24E5 on chromosome 22q11.2-qter contains parvalbumin, ESTs, STS.
43900 AC005265 Homo sapiens chromosome 19, cosmid F19750, complete sequence.
323792 AC004965 Homo sapiens clone DJ1106H14, SEQUENCING IN PROGRESS, 42 unordered pieces.
323792 AC004965 Homo sapiens clone DJ1106H14, SEQUENCING IN PROGRESS, 42 unordered pieces.
2397 U55859 Triticum aestivum heat shock protein 80 mRNA, complete cds.
113436 AC011469 Homo sapiens chromosome 19 clone CIT-HSPC_475D23, SEQUENCING IN PROGRESS, 31 unordered pieces.
113436 AC011469 Homo sapiens chromosome 19 clone CIT-HSPC_475D23, SEQUENCING IN PROGRESS, 31 unordered pieces.
77380 AB010077 Arabidopsis thaliana genomic DNA, chromosome 5, P1 clone: MYH19, complete sequence.
38970 Z92539 Mycobacterium tuberculosis H37Rv complete genome; segment 47/162.
1887 M76553 Leishmania donovani phosphoribosylpyrophosphate synthetase gene, complete cds.
130149 AL050344 Homo sapiens chromosome 1 clone RP4-799D16 map p34.3-36.1, SEQUENCING IN PROGRESS, in unordered pieces.
35377 Z74020 Mycobacterium tuberculosis H37Rv complete genome; segment 69/162.
6377 AB029032 Homo sapiens mRNA for KIAA 109 protein, partial cds.
01 355 AQ107201 HS,,.3098..A1-.C03-T7 CIT Approved Human Genomic Sperm Library D Homo sapiens genomic clone Piate=3098 Col=5 Row-E, genomic survey glutamicumn Corynebacterum glutamicum Corynebacterium glutamnicurn Aspergilius niger Homo sapiens Homo sapiens Homo sapiens Homo sapiens Triticumn aestivurn Homo sapiens Homo sapiens Arabidopsis thaliana Mycobacterium tuberculosis Leishmanla donovani Homo sapiens Mycobacterium tuberculosis Homo sapiens Homo sapiens Arabidopsis thaliana Arabidopsis thaliana 99.560 99.803 38.675 36,204 38,363 36,058 36,058 37,269 40,000 40,000 36,803 37,047 50,738 38,135 38.139 39.394 41,408 36,118 35,574 30-Jun-93 24-Jun-99 23-Feb-95 3-Feb-99 29-Sep-97 06-MAR- 1998 23-Nov-99 6-Jul-98 12-Jun-98 12-Jun-98 1-Feb-99 07-OCT- 1999 07-OCT- 1999 20-Nov-99 17-Jun-98 7-Jun-93 29-Nov-99 17-Jun-98 4-Aug-99 28-Aug-98 23-DEC- 1998 23-DEC- 1998 30-Nov-95 99923 99923 sequence.
AC005990 Arabidopsis thaliana chromosome 1 BAC F508 sequence, complete sequence.
31205 U41014 Caenorhabditis elegans cosmid C06G1.
Caenorhabdltis elegans 38,560 2007202378 25 May 2007 rxa01 321 1044 GBGSS14:AQ51884 441 3 GBHTG2:AC007473 194859 AQ51 8843 AC007473 Table 4 (continued) HS_5106 Al DIO,5P6E RPCI-1 I Human Male BAC Library Homo sapiens Homo sapiens genomic clone Plate=682 001=19 Row--G, genomic survey sequence.
Drosophila melanogaster chromosome 2 clone BACR38D12 (D590) RPCI-98 38.D.12 map 48A-48B strain y; cn bw sp, SEQUENCING IN PROGRESS, unordered pieces. Drosophila melanogaster
Drosophila melanogaster chromosome 2 clone BACR35FO1 (01 156) RPCI-98 Drosophila melanogasler 35.17.1 map 4BA-48C strain y; cn bw sp, SEQUENCING IN PROGRESS GBHTG4:AC011696 115847 AC011696 ,xa01 352 706 GBPL2:ATACOC5I67 83260 GB-PL2:ATAC005825 97380 GBI-HTG3:ACO1 1150 127222 rxa01360 259 GB-EST32:A1725583 728 GBPR2:I-1227P17 82951 GBE5T34:AV171099 173 rxa0l36l 629 GB -RO:AB008915S1 530 GBE5T22.A1050532 293 AC0051 67 AC005825 AC01 1150 A1725583 Z81 007 AV17 1099 AB008915 A1050532 AB008895 AB005237 AQ766840 AL022004 X96471 AL031 107 IAC004054 X96471 AQ769223 rxa0l3BI 944 rxa01393 993 rxa01394 822 GB RO:AB008895 GBPL1:AB005237 GB-GSS5:AQ766840 GBBA1:M1VOA3 GB.BA1 :CGLYSEG GB BA1:SC5A7 GBPR3:AC004054 GBBA1 :CGLYSEG GB GSS5:A0769223 3062 87835 491 68848 2374 40337 112 184 2374 500 -,108 unordered pieces.
Arabidopsis thaliana chromosome II BAC F12A24 genomic sequence, complete sequence.
Arabidopsis thaliana chromosome II BAC T24121 genomic sequence, complete sequence.
Homo sapiens clone 4-K17, LOW-PASS SEQUENCE SAMPLING.
BNLGHi12371 Six-day Cotton fiber Gossypium hirsutum cDNA 5' similar to (U86081) root hair defective 3 [Arabidopsis thaliana], mRNA sequence.
Human DNA sequence from PAC 227P17, between markers DXS6791 and DXS8038 on chromosome X contains CpG island, EST.
AV171099 Mus musculus head C57BL/6J 14, 17 day embryo Mus musculus cDNA clone 3200002M11, mRNA sequence.
Mus musculus mGpi1 gene, exon 1.
uc83d10.y1 Sugano mouse kidney mkia Mus musculus cDNA clone IMAGE:1432243 5' similar to TR:O35120 O35120 MGPI1P. ;, mRNA sequence.
Mus musculus mRNA for mGpi1p, complete cds.
Arabidopsis thaliana genomic DNA, chromosome 5, P1 clone: MJJ3, complete sequence.
HS_2026_A2_C09_T7C CIT Approved Human Genomic Sperm Library D Homo sapiens genomic clone Plate=2026 Col=18 Row=E, genomic survey sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 40/162.
C.glutamicum lysE and lysG genes.
Streptomyces coelicolor cosmid 5A7.
Homo sapiens chromosome 4 clone B220G8 map 4q21, complete sequence.
C.glutamicum lysE and lysG genes.
HS_3155_B2_G10_T7C CIT Approved Human Genomic Sperm Library D Homo sapiens genomic clone Plate=3155 Col=20 Row=N, genomic survey sequence.
Arabidopsis thallana Arabldopsis Ihaliana Homo sapiens Gossypium hlrsutum Homo sapiens Mus musculus Mus musculus Mus musculus Mus musculus Arabidopsis thaliana Homo sapiens Mycobacterium tuberculosis Corynebacterium glutamicum Streptomyces coelicolor Homo sapiens Corynebacterium glutsmicum Homo sapiens 41.121 40.634 38,290 34,311 34,311 37,722 38.492 39.738 46,237 45.574 44.097 41,316 36,606 37,916 37,419 34,831 35.138 37,277 100.000 38,400 05-MAY- 1999 2-Aug-99 26-OCT- 1999 1 1998 12-Apr-99 01 -OT- 1999 11-Jun-99 23-Nov-99 6-Jul-99 28-Sep-99 9-Jul-98 23-Nov-97 20-Nov-99 28-Jul-99 24-Jun-99 24-Feb-97 27-Jul-98 9-Jul-98 24-Feb-97 28-Jul-99 2007202378 25 May 2007 GB-BA1:CGLYSEG 2374 rxa014I6 630 GBBA1:S03C3 31382 GBBA1:MLCB22 40281 GB-BA1:M1V002 56414 rxa01442 '1347 GB-BA1:090827 18886 GB-BAI:Dg0828 14590 GB-BA2:AE000279 10855 rxa01446 1413 GB_8A1:SCHIO 39524 GB-BA1:MTYI3EIO 35019 GB8BA1:MLCB4 36310 fxa01483 1395 GB-BA1 :MTCY98 31225 GBBA1:MSGBI229C 30670
S
GB_8A2:AF027507 5168 rxa01486 757 GBBAI:MTVOO2 56414 GB-BA1:MLCB22 40281 GBBA:SC303 31382 rxa01489 1146 GB-BAI:CORFADS 1547 GBBA1:MLCB22 40281 GB-BA1 :SC1OA7 39739 rxa01491 774 GBBA1:MTVOO2 56414 GB-EST3:AA356956 255 GB-OV:OMDNAPROI 7327 rxaOlSO8 1662 GBIN1:0EF28C12 14653 GBIN1:CEF28C12 14653 rxaO1512 723 GB-BAI:SGE9 37730 GB-BA1 :MAU88875 840 X96471 AL031 231 Z98741 AL008967 090827 D90828 AE000279 AL049754 Z95324 AL023514 Z83860 L78812 AF027507 AL008967 Z98741 AL031231 D37967 Z98741 AL076618 AL008967 AA356956 X92380 Z93380 Z93380 AL049841 U88875 Table 4 (continued) C.gtutamicum lysE and lysG genes.
Streptomyces coelicolor cosmid 3C3.
Mycobacterium leprae cosmid B22.
Mycobacterium tuberculosis H37Rv complete genome; segment 122/162.
E.coli genomic DNA, Kohara clone #336 (41.2-41.6 min.).
E.coli genomic DNA, Kohara clone #336gap (41.6-41.9 min.).
Escherichia coli K-12 MG1655 section 169 of 400 of the complete genome.
Streptomyces coelicolor cosmid H10.
Mycobacterium tuberculosis H37Rv complete genome; segment 116/162.
Mycobacterium leprae cosmid B4.
Mycobacterium tuberculosis H37Rv complete genome; segment 103/162.
Mycobacterium leprae cosmid B1229 DNA sequence.
Mycobacterium smegmatis dGTPase (dgt), and primase (dnaG) genes, complete cds; tRNA-Asn gene, complete sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 122/162.
Mycobacterium leprae cosmid B22.
Streptomyces coelicolor cosmid 3C3.
Corynebacterium ammoniagenes gene for FAD synthetase, complete cds.
Mycobacterium leprae cosmid B22.
Streptomyces coelicolor cosmid 10A7.
Mycobacterium tuberculosis H37Rv complete genome; segment 122/162.
EST65614 Jurkat T-cells III Homo sapiens cDNA V~end, mRNA sequence.
O.mossambicus prolactin I gene.
Caenorhabditis elegans cosmid F28C12, complete sequence.
Caenorhabditis elegans cosmid F28C12, complete sequence.
Streptomyces coelicolor cosmid E9.
Mycobacterium avium hypoxanthine-guanine phosphoribosyl transferase gene, complete cds.
Co ryne bacterium glutamicum Streptomyces coelicolor Mycobacterlum leprae Mycobacterium tuberculosis Escherichla coil Eschericia coil Escherichia coli Streptomyces coelicolor Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium smegmatis Mycobacterlumn tuberculosis Mycobacterium leprae Streptomyces coelicolor Corynebacterium ammonlagenes Mycobacterium leprae Streptomyces coelicolor Mycobacterium tuberculosis Homo sapiens Tilapla mossambica Caenorhabditls etegans Caenorhabditis elegans Streptomyces coelicolor Mycobacterium avium 33.665 62,726 39,159 37,340 58,517 56,151 56,021 39,037 40,130 37,752 39,057 54,382 52.941 40,941 38,451 6 1.194 58,0211 38,414 36,930 37,062 37,647 38,289 37,984 38,469 39,021 57,521 24-Feb-97 10O-Aug-98 22-Aug-97 1 7-Jun-98 21-MAR- 1997 21-MAR- 1997 12-Nov-98 04-MAY- 1999 17-Jun-98 27-Aug-99 17-Jun-98 1 5-Jun-96 1 6-Jan-98 17-Jun-98 w 22-Aug-97 10-Aug-98 8-Feb-99 22-Aug-97 9-Jun-99 17-Jun-98 21 -Apr-97 I 9-OCT- 1995 23-Nov-98 23-Nov-98 19-MAY- -:1999 1997 2007202378 25 May 2007 GBBA1:MTYI5C1O 33050 Z95436 rxaOl5l4 711 GBBA1:MTCY7H7B GB-BA1 :MLCB2548
GBPLI:EGGTPCHI
rxa0l5i5 975 GB-BA1:ECOUW93 GBBA1:ECOUrW93 GB-BA1:MTCY49 =x01516 513 GBINI:DME238847 GBHTG3:ACOO921O GB_1N2:AF132179 rxaOl517 600 GBPL2:F6H8 GBPL2:AF038831 GSBPL2:ATAC005957 rxa01521 921 GB-BA1:ANANIFBH GBPR2:AC002461 GB PR2:AC002461 rxa01528 651 GBRO:MM437P9 GB...R3:AC005740 GBPR3:AC005740 rxa0lSSI 1998 GB-BA1I:MTCY22G1O GBBA2:ECOUW89 G13_BA1:SCQI1 rxa01561 1053 GS-IN1:CEY62H9A GB-PR4:HSU51OO3 GB.OM:PIGDAO1 rxa01599 1785 GB-BAI:MT01125 GBBA1:U.00021 24244 38916 242 338534 338534 39430 5419 103814 4842 82596 647 108355 5936 197273 197273 165901 186780 186780 35420 176 19t 15441 47396 3202 395 37432 39193 Z95557 AL023093 Z49757 U 14003 U 14003 Z73966 AJ238847 AC009210 AF1 32179 AF178045 AF038831 AC005957 J05111 AC002461 AC002461 AL049866 AC005740 AC005740 Z84724 5 U00006 AL096823 AL032630 U51003 M18444 Z98268 U00021 Table 4 (continued) Mycobacterium tuberculosis H37Rv complete genome; segment 154/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 153/162.
Mycobacterium leprae cosmid B2548.
E.gracilis mRNA for GTP cyclohydrolase I (core region).
Escherichia coli K-12 chromosomal region from 92.8 to 00.1 minutes.
Escherichia coli K-12 chromosomal region from 92.8 to 00.1 minutes.
Mycobacterium tuberculosis H37Rv complete genome; segment 93/162.
Drosophila melanogaster mRNA for drosophila dodeca-satellite protein 1 (DDP-1).
Drosophila melanogaster chromosome 2 clone BACR01I06 (D1054) RPCI-98 01.I.6 map 55D-55D strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 86 unordered pieces.
Drosophila melanogaster clone LD21677 unknown mRNA.
Arabidopsis thaliana BAC F6H8.
Sorosporium saponariae internal transcribed spacer 1, 5.8S ribosomal RNA gene; and internal transcribed spacer 2, complete sequence.
Arabidopsis thaliana chromosome II BAC T15.114 genomic sequence, complete sequence.
Anabaena sp. (clone AnH20.1) nitrogen fixation operon nifB, fdxN, nifS, nifU, and nifH genes, complete cds.
Human BAC clone RG204116 from 7q31, complete sequence.
Human BAC clone RG204116 from 7q31, complete sequence.
Mus musculus chromosome X, clone 437P9.
Homo sapiens chromosome 5p, BAC clone 50g21 (LBNL H154), complete sequence.
Homo sapiens chromosome 5p, BAC clone 50g21 (LBNL H154), complete sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 21/162.
E. coli chromosomal region from 89.2 to 92.8 minutes.
Streptomyces coelicolor cosmid Q11.
Caenorhabditis elegans cosmid Y62H9A, complete sequence.
Homo sapiens DLX-2 (DLX-2) gene, complete cds.
Pig D-amino acid oxidase (DAO) gene, exon 1.
Mycobacterium tuberculosis H37Rv complete genome; segment 76/162.
Mycobacterium leprae cosmid L247.
Mycobacterium tuberculosis Mycobacterium tuberculosis Mycobacterium leprae Euglena gracilis Escherichia coli Escherichia coli Mycobacterium tuberculosis Drosophila melanogaster Drosophila melanogaster Drosophila melanogaster Arabidopsis thaliana Sorosporium saponariae Arabidopsis thaliana Anabaena sp.
Homo sapiens Homo sapiens Mus musculus Homo sapiens Homo sapiens Mycobacterium tuberculosis Escherlchla coli Streptomyces coelicolor Caenorhabditls elegans Homo sapiens Sus scrofa Mycobacterium tuberculosis Mycobacterium lepre 40.086 43.343 38.177 64,876 38,943 37,500 38.010 36,346 37,897 36,149 35.846 40,566 38.095 38,206 36,623 34,719 37.500 37,031 38,035 38,371 38,064 60,775 38,514 37,730 39,340 63,300 36.756 17-Jun-98 18-Jun-98 27-Aug-99 1995 17-Apr-96 17-Apr-96 24-Jun-99 113-Aug-99 20-Aug-99 3-Jun-99 1 9-Aug-99 13-Apr-99 7-Jan-99 26-Apr-93 20-Aug-97 20-Aug-97 29-Jun-99 Ol-OCT- 1998 Ol-OCT- 1998 17-Jun-98 I 7-DEC- 1993 8-Jul-99 2-Sep-99 07-DEC- 1999 27-Apr-93 1 7-Junt98 29-Sep-94 2007202378 25 May 2007 Table 4 (continued) 38936 Z95117 Mycobacteriumn Ieprae cosmid B1351.
rxa0l617 795 rxa01657 723 rxaOI166O 675 rxa01678 651 nca01679 1359 rxa0l69O 1224 rxa01692 873 rxa01698 1353 GB BA1 :MLCB1 351 GB-PR2:HSMTMO GBYPR2:HSI3D1O GBPR2:HSMTMO GBBA1:MTCY1A1O GBEST6:D79278 GB-BA2:AF1 29925 GBBA1:MTVO13
GBRO:MMFVI
GB-PAT:A67508 GBVi:TVU95309 GBVi:TVU95303 GBVI:TVU95302 GBEST5:H9 1843 Mycobacteriumn leprae 217657 AL034384 Human chromosome Xq28, cosmid clones 7H3, 14Q7, C1230, 11127, F1096, Homo sapiens 36,756 24-Jun.97 40,811 5-Jul.99 153147 217657 25949 392 10243 11364 6480 6480 600 600 600 362 AL02 1407 AL034384 Z95387 079278 AF1 29925 AL021 309 X97719 A67508 U95309 U95303 U95302 H91843 G26925 AF139451 AL031 124 A1064232 AF1 17896 AF067 123 M37227 X1 3804 AF1 24600 A12197, 12G8, A09100: complete sequence bases 1. .217657.
Homo sapiens DNA sequence from PAC 13D10 on chromosome 6p22.3-23.
Contains CpG island.
Human chromosome Xq28, cosmid clones 7H3, 14D7, C1230, 11E7, F1096, A12197, 12G8, A09100; complete sequence bases 1..217657.
Mycobacterium tuberculosis H37Rv complete genome; segment 117/162.
HUM213D06B Human aorta polyA+ (TFujiwara) Homo sapiens cDNA clone GEN-213D06 mRNA sequence.
Thiobacillus ferrooxidans carboxysome operon, complete cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 134/162.
M.musculus retrovirus restriction gene Fv1.
Sequence 1 from Patent WO9743410.
Tula virus 064 nucleocapsid protein gene, partial cds.
Tula virus 052 nucleocapsid protein gene, partial cds.
Tula virus 024 nucleocapsid protein gene, partial cds.
ys~le~l.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:221208 3' similar to gb:X63749_rna1 GUANINE NUCLEOTIDE-BINDING PROTEIN G(T), ALPHA-1 (HUMAN);, mRNA sequence.
human STS SHGC-30023, sequence tagged site.
Gossypium robinsonii CelA2 pseudogene, partial sequence.
Streptomyces coelicolor cosmid 10C2.
GH04563.5prime GH Drosophila melanogaster head pOT2 Drosophila melanogaster cDNA clone GH04563 5prime, mRNA sequence.
Drosophila melanogaster neuropeptide F (npf) gene, complete cds.
Lactobacillus reuteri cobalamin biosynthesis protein J (cbiJ) gene, partial cds; and uroporphyrin-III C-methyltransferase (sumT) gene, complete cds.
Rat heavy neurofilament (NF-H) polypeptide, partial cds.
Rat mRNA for heavy neurofilament polypeptide NF-H C-terminus.
Corynebacterium glutamicum chorismate synthase (aroC), shikimate kinase (aroK), and 3-dehydroquinate synthase (aroB) genes, complete cds; and putative cytoplasmic peptidase (pepQ) gene, partial cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 111/162.
Homo sapiens Homo sapiens Mycobacterium tuberculosis Homo sapiens Thiobacillus ferrooxidans Mycobacterium tuberculosis Mus musculus Mus musculus Tula virus Tula virus Tula virus Homo sapiens Homo sapiens Gossypium robinsonii Streptomyces coelicolor Drosophila melanogaster Drosophila melanogaster Lactobacillus reuteri Rattus norvegicus Rattus sp.
Corynebacterium glutamicumn Mycobacterium tuberculosis 38,768 39,018 40,656 44.262 40,709 40,986 35.364 35.364 40,894 41,712 39.576 39,157 23-Nov-99 5-Jul-99 17-Jun-98 9-Feb-96 17-MAY- 1999 17-Jun-98 29-Aug-96 1999 28-OCT- 1997 28-OCT- 1997 28-OCT- 1997 29-Nov-95 GBSTS:G26925 362 GBPL2:AF139451 1202 GB BA1:SClC2 42210 GBEST22:A1064232 493 GB IN2:AFI 17896 1020 GBBA2:AP067123 1034 GBRO:RATNFHPEP 3085 GBRO:RSNF- 3085 GB-BA2:AF124600 4115 39,157 14-Jun-96 38,910 1-Jun-99 60,644 15-Jan-99 38,037 24-Nov-98 36,122 2-Jul-99 48,079 3-Jun-98 37,093 27-Apr-93 37.093 14-Jul-95 100,000 04-MAY- 1999..
36,323 17-Jun-98 GBBAI:MTCY1 59 33818 Z83863 2007202378 25 May 2007 GB-BA1:MSGB937C 38914
S
GBBA2:AF124600 4115 L78820 AF1 24600 rxa01699 693 GB-BA2:AF016585 41097 AF01 6585 rxa01712 805 rxa01719 684 rxa01720 1332 rxa01746 876 rxa01747 1167 rxa01757 924 GB-EST9:CI 9712 399 GB-EST21 :AA952466 278 GBEST21 :AA952466 278 GBHTG1:HSDJ534K 154416 7 GB-HTGI:HSOJ534K 154416 7 GBEST27:A1447108 431 GBPR4:AC006322 179640 GBPL2:TMO18A10 106184 GByPR4:AC006322 179640 GB-EST3:R46227 443 GBEST3:R46227 44.3 GB-BA1 :MTCY19O 34150 GBBAI:MLCB22 40281 GB-BAI:SC5F7 40024 GB-EST2I:AA918454 416 GBEST4:t-34042 345 GB-EST20:AA899038 450 C19712 AA952466 AA952466 AL1 09925 AL1 09925 A1447108 AC006322 AF013294 AC006322 R46227 R46227 Z70283 Z98741 AL096872 AA9 184 54 H34042 A899031 Table 4 (continued) Mycobacterium leprae cosmid B937 DNA sequence. Mycobacterium leprae Corynebacteriumn glutamicum chorismate synthase (aroC), shikimate kinase Corynebacterlum (aroK). and 3-dehydroquinate synthase (aroB) genes, complete cds; and glutamicum putative cytoplasmic peptIdase (pepQ) gene, partial cds.
Streptomyces caelestis cytochrome P-450 hydroxylase homolog (nidi) gene, partial cds; polyketide synthase modules 1 through 7 (nidA) genes, complete cds; and N-methyltransferase homolog gene, partial cds. Streptomyces caelestis
C19712 Rice panicle at ripening stage Oryza sativa cDNA clone E10821_1A, mRNA sequence. Oryza sativa
TENS1404 T. cruzi epimastigote normalized cDNA Library Trypanosoma cruzi cDNA clone 1404 mRNA sequence. Trypanosoma cruzi
TENS1404 T. cruzi epimastigote normalized cDNA Library Trypanosoma cruzi cDNA clone 1404 mRNA sequence. Trypanosoma cruzi
Homo sapiens chromosome 1 clone RP4-534K7, *** SEQUENCING IN PROGRESS ***, in unordered pieces. Homo sapiens
Homo sapiens chromosome 1 clone RP4-534K7, *** SEQUENCING IN PROGRESS ***, in unordered pieces. Homo sapiens
mnq9l eO8.xlI Stratagene mouse heart (#937316) Mus musculus cDNA clone IMAGE:586118 3', mRNA sequence. Mus musculus
Homo sapiens PAC clone DJ1060B11 from 7q11.23-q21.1, complete sequence. Homo sapiens
Arabidopsis thaliana BAC TM018A10. Arabidopsis thaliana
Homo sapiens PAC clone DJ1060B11 from 7q11.23-q21.1, complete sequence. Homo sapiens
yg52a03.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:36000 mRNA sequence. Homo sapiens
yg52a03.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:36000 3', mRNA sequence. Homo sapiens
Mycobacterium tuberculosis H37Rv complete genome; segment 98/162. Mycobacterium tuberculosis
Mycobacterium leprae cosmid B22. Mycobacterium leprae
Streptomyces coelicolor cosmid 5F7. Streptomyces coelicolor A3(2)
om38c02.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1543298 3' similar to WP:F28F8.3 CE09757 SMALL NUCLEAR RIBONUCLEOPROTEIN E mRNA sequence. Homo sapiens
EST110563 Rat PC-12 cells, NGF-treated (9 days) Rattus sp. cDNA clone RPNB181 5' end, mRNA sequence. Rattus sp.
8NCP6G8T7 Perithecial Neurospora crassa cDNA clone NP6G8 3' end, mRNA sequence. Neurospora crassa
62,780 100.000 40,260 45,425 40.876 4 1.367 35,651 35,651 39.671 35.817 35.698 37,243 42,812 42,655 59,294 57,584 61,.810 39,655 35,942 40,000 15-Jun-96 04-MAY- 1999 07-DEC- 1997 24-OCT- 1996 29-OCT- 1998 29-OCT- 1998 23-Nov-99 23-Nov-99 09-MAR- 1999 18-MAR- 1999 1 2-Jul-97 1 8-MAR- 1999 22-MAY- 1995 2?-MAY- 1995 17-Jun-98 22-Aug-97 22-Jul-99 23-Jun-98 2-Apr,98 1 2-Apr-'98 2007202378 25 May 2007 rxa01807 915 GBBAI:AP000063 0694 GBHTG4:AC010694 rxaOl82l 401 GB_BA1:CGL007732
GB_RO:RATALGL
GB OV:APIGY2 rxa01835 654 GBEST3O:A1629479 GB-STS:G48245 GBGSS3:B49052 rxa0l85O 1470 GBBA2:ECOUW67_ 0 GB-BA2:AE000392 GBBA2:U32715 rxa01878 1002 GB-HTGI:CEY64F11 GByHTGI:CEY64F11 GB-HTG1 :CEY64FI 1 rxa01892 852 GB-BA1:MTCY274 GB BA1:MLC8250 GB GAl :MSGB1 5290
S
rxa0l 894 978 GBBAI:MTCY274 GBINI:CELF46H5 GB-HTG3:AC009204 115857 4460 7601 1381 353 515 515 110000 10345 13136 177748 177748 177748 39991 40603 36985 39991 38886 115633 185300 AP000063 115857 AC010694 AC0 10694 AJ007732 M241 08 X78272 A1629479 G48245 B49052 U 18997 AE000392 U32715 Z99776 Z99776 Z99776 Z74024 Z97369 L78824 Z74024 U4 1543 AC009204 AF112536 Y09572 Table 4 (continued) Aeropyrum pemix genomic DNA, section 6/7.
Drosophila melanogaster clone RPCI98-6H2, *** SEQUENCING IN PROGRESS ***, 75 unordered pieces.
Drosophila melanogaster clone RPCI98-6H2, *** SEQUENCING IN PROGRESS ***, 75 unordered pieces.
Corynebacterium glutamicum 3' ppc gene, secG gene, amt gene, ocd gene and 5' soxA gene.
Rattus norvegicus (clone A2U.42) alpha2u globulin gene, exons 1-7.
Anas platyrhynchos (Super M) IgY upsilon heavy chain gene, exon 2.
486101 Dl0.x1 486 - leaf primordia cDNA library from Hake lab Zea mays cDNA, mRNA sequence.
SHGC-62915 Human Homo sapiens STS genomic, sequence tagged site.
RPCI11-4112.TV RPCI-11 Homo sapiens genomic clone RPCI-11-4112, genomic survey sequence.
Escherichia coli K-12 chromosomal region from 67.4 to 76.0 minutes.
Escherichia coli K-12 MG1655 section 282 of 400 of the complete genome.
Haemophilus influenzae Rd section 30 of 163 of the complete genome.
Caenorhabditis elegans chromosome IV clone Y64F11, *** SEQUENCING IN PROGRESS ***, in unordered pieces.
Caenorhabditis elegans chromosome IV clone Y64F11, *** SEQUENCING IN PROGRESS ***, in unordered pieces.
Caenorhabditis elegans chromosome IV clone Y64F11, *** SEQUENCING IN PROGRESS ***, in unordered pieces.
Mycobacterium tuberculosis H37Rv complete genome; segment 126/162.
Mycobacterium leprae cosmid B250.
Mycobacterium leprae cosmid B1529 DNA sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 126/162.
Aeropyrum pernix Drosophila melanogaster Drosophila melanogasler Corynebacterium glutamicum Rattus norvegicus Anas platyrhynchos Zea mays Homo sapiens Homo sapiens Eschericiia coi Escherlchla cell l-aemnophilus influenzae Rd Caenorhabdltis elegans Caenorhabditis elegans Caenorhabditis elegans Mycobacterlum tuberculosis Mycobacterium leprae Mycobacterlum leprae Mycobacterium tuberculosis Caenorhabditls elegans 35,450 35,450 100,000 38.692 36,962 38,109 37,021 37.021 37,196 38,021 39.860 37,564 37.564 37.576 35,910 64,260 64,260 37,229 38,525 31,579 99.733 70,321 1 6-OCT- 1999 1 6-OCT- 1999 7-Jan-99 1994 15-Feb-99 26-Apr.99 26-MAR- 1999 8-Apr-99 U 18997 12-Nov-98 29-MAY- 1998 I 4-OCT- 1998 14-001- 1998 14-OCT.
1998 19-Jun-98 27-Aug-99 15-Jun-96 19-Jun-98 29-Nov-96 18-Aug-99 5-Aug-99 18-Apr-98 40,067 22-Jun-99 Caenorhabditis-elegans cesmld F46H-5.
rxaOI92O 1125 GB.BA2:AFI 12536 1798 GBBAI:CANRDFGE 6054
N
Drosophila melanogaster chromosome 2 clone BACR03E19 (D1033) RPCI-98 03.E.19 map 36E-37C strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 94 unordered pieces. Drosophila melanogaster
Corynebaclerium glutamicum ribonucleotide reductase beta-chain (nrdF) Corynebacterium gene, complete ods. glutamicum Corynebaclerium ammoniagenes nrdH, nrdl, nrdE, nrdF genes. Corynebacterlum ammoniagenes 2007202378 25 May 2007 Table 4 (continued) 1228 AF050168 Corynebacterium ammoniagenes ribonucleoslde diphosphate reductase small Corynebacterium GB-BA2:AF0501 68 rxa01928 960 rxa01929 936 rxa01940 1059 rxa02022 1230 rxa02O24 859 GBBA1:CGPAN 2164 GBPL1:AP000423 154478 GB PLI:AP000423 154478 GB BA1:CGPAN 2164 GB-BA1:XCU33548 8429 GBBAI:XANHRPB6 1329
A
GB-IN2:CFU43371 1060 GB-BA2:AE001467 11601 GE3_RO:AF175967 3492 GBBA1:CGDAPE 1966 GBBA1:CGDNAARO 2612
P
GB-BA1:APU47055 6469 GBBA1:MTC1364 29540 GB GAl :MSGBI912C 38503
S
GB-BAI:MLU15180 38675 X96580 AP000423 AP000423 X95580 U33548 M99174 U43371 AE001467 AF175967 X81379 X85965 U47055 Z93777 L01536 U.115180 subunit (nrdF) gene, complete cds.
C.glutamicum panB, panC xylB genes.
Arabidopsis thaliana chloroplast genomic DNA, complete sequence, strain:Columbia.
Arabidopsis thaliana chloroplast genomic DNA, complete sequence, strain:Columbia.
C.glutamicum panB, panC xylB genes.
Xanthomonas campestris hrpB pathogenicity locus proteins HrpB1, HrpB2, HrpB3, HrpB4, HrpB5, HrpB6, HrpB7, HrpB8, HrpA1, and ORF62 genes, complete cds.
Xanthomonas campestris hrpB6 gene, complete cds.
Crithidia fasciculata inosine-uridine preferring nucleoside hydrolase (IUNH) gene, complete cds.
Helicobacter pylori, strain J99 section 28 of 132 of the complete genome.
Homo sapiens Leman coiled-coil protein (LCCP) mRNA, complete cds.
C.glutamicum dapE gene and orf2.
C.glutamicum ORF3 and aroP gene.
Anabaena PCC7120 nitrogen fixation proteins (nifE, nifN, nifX, nifW) genes, complete cds, and nitrogenase (nifK) and hesA genes, partial cds.
Mycobacterium tuberculosis H37Rv complete genome; segment 52/162.
M. leprae genomic dna sequence, cosmid b1912.
Mycobacterium leprae cosmid B1756.
ammoniagenes Corynebacterium glutamicumn Chloroplast Arabidopsls thaliana Chloroplast Arabidopsis thaliana Corynebacterlum glutamicum Xanthomonas campestrls pv. veslcatoria Xanthomonas campestris Crithidia fasciculata Helicobacter pylon J99 Mus musculus Corynebacterlum glutamicum Corynebacterium glutamlcum Anabaena PCC7120 Mycobacterium tuberculosis Mycobacterium Ieprae Mycobacterium leprae 72.082 100.000 35,917 33,925 100,000 38,749 39,305 61,417 38.560 40.275 100.000 38,889 36,647 59,415 57,093 57,210 23-Apr-98 11 -MAY- 1999 1 5-Sep-99 I 5-Sep-99 11 -MAY- 1999g 19-Sep-96 14-Sep-93 18-Jun-96 20-Jan-99 26-Sep-99 8-Aug-95 30-Nov-97 17-Feb-96 17-Jun-98 14-Jun-96 09-MAR- 1995 nca02027 rxa02031 rxa02072 1464 GB-GAl :CGGDI-A 2037 GB BAt :CGGDH 2037 GB GBAl:PAE1 8494 1628 X72855 X59404 Yl 8494 C.glutamlcum GDH-A gene.
Corynebacterium glutamicum, gdh gene for glutamate dehydrogenase.
Pseudomonas aeruginosa gdhA gene, strain PAC1.
Corynebacterlum glutamicum Corynebacterium glutamicum Pseudomonas aeruginosa 99,317 94,387 62,247 24-MAY- 1993 30-Jul-99 6-Feb-99 2007202378 25 May 2007 rxa02O85 2358 GB-BA1:MTCY22G8 22550 GB BA1:MLCB33 42224 GB-BA1:ECOUW85 91414 rxa02O93 927 GBEST14:AA448146 452 GBEST17:AA641937 444 GBPR3:AC003074 143029 rxaO2106 1179 GBBA1:SCIA6 37620 GB-PR4:AC005553 179651 GB-EST3:R49746 397 rxaO2111 1407 rxaO2ll2 960 rxa02134 1044 rxa02135 1197 GBBA1:SC6G1O GB-BA1:U00010 GB-BA1:MTCY336 GBHTG3:AC010579 GBGSS3:B09839 GBHTG3:AC010579 36734 41171 32437 157658 1191 157658 Z95585 Z94723 M87049 AA4481 46 AA641 937 AC003074 AL023496 AC005553 R49746 AL049497 u00010 Z95586 AC01 0579 B09839 AC01 0579 X8301 1 A1731 596 X83011 AL023807 AL022347 Table 4 (continued) Mycobacterium tuberculosis H37Rv complete genome; segment 491162. Mycobacterium tuberculosis Mycobacterium leprae cosmid B33. Mycobacterlum leprae E. coli genomic sequence of the region from 84.5 to 86.5 minutes. Eschertchla coli zwB2h01.rl Soares testisNHT Homo sapiens cONA clone IMAGE:782737 Homo sapiens mRNA sequence.
ns18b10.r1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1183963 mRNA sequence. Homo sapiens
Human PAC clone DJ0596009 from 7p15, complete sequence. Homo sapiens
Streptomyces coelicolor cosmid 1A6. Streptomyces coelicolor
Homo sapiens chromosome 17, clone hRPK.112_J_9, complete sequence. Homo sapiens
yg71g10.r1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:38768 5' similar to gb:V00567 BETA-2-MICROGLOBULIN PRECURSOR (HUMAN);, mRNA sequence. Homo sapiens
Streptomyces coelicolor cosmid 6G10. Streptomyces coelicolor
Mycobacterium leprae cosmid B1170. Mycobacterium leprae
Mycobacterium tuberculosis H37Rv complete genome; segment 70/162. Mycobacterium tuberculosis
Drosophila melanogaster chromosome 3 clone BACR09D08 (D1101) RPCI-98 09.D.8 map 96F-96F strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 121 unordered pieces. Drosophila melanogaster
T12A12-Sp6 TAMU Arabidopsis thaliana genomic clone T12A12, genomic survey sequence. Arabidopsis thaliana
Drosophila melanogaster chromosome 3 clone BACR09D08 (D1101) RPCI-98 09.D.8 map 96F-96F strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 121 unordered pieces. Drosophila melanogaster
S.coelicolor secY locus DNA. Streptomyces coelicolor
BNLGHi10185 Six-day Cotton fiber Gossypium hirsutum cDNA 5' similar to (AC004005) putative ribosomal protein L7 [Arabidopsis thaliana], mRNA sequence. Gossypium hirsutum
S.coelicolor secY locus DNA. Streptomyces coelicolor
Human DNA sequence from clone RP3-525L6 on chromosome 6p22.3-23 Contains CA repeat, STSs, GSSs and a CpG Island, complete sequence. Homo sapiens
Arabidopsis thaliana DNA chromosome 4, BAC clone F21 P8 (ESSA project). Arabidopsis thaliana 38,442 56,486 52,127 34,163 35,586 31,917 35,818 34.274 41,162 50.791 37,563 39.504 37,909 37,843 37,909 36,533 33,451 36,756 34.365 34,325 17-Jun-98 24-Jun.97 29-MAY- 1995 4-Jun-97 27-OCT.
1997 6-Nov-97 13-Jan-99 31 -DEC- 1998 1 8-MMY- 1995 24-IMAR- 1999 01-MAR- 1994 24-Jun-99 24-Sep-99 14-MAY- 1997 24-Sep-99 02-MAR- 1998 11-Jun-99 02-MAR.
1998 23-Nov-99 9-Jun-99 GB-BA1 :SCSECYDN 6154
A
GB-EST32AI731 596 568 G8..BA1 :SCSECYDN
A
GBPR3:HS525L6 GB-PL2:ATF21 P8 6154 168111 85785 GS-PL2:UB9959 GB~fL2U89959 106973 U89959 Arabidopsis thaliana BAC T7123, complete sequence. Aalosstain 384 2-u~9 Arabldopsls thaliana 33.874 26-iu'A--98 2007202378 25 May 2007 rxa02136 645 GBPL2:ATACOOS819 57752 AC005819 GB-PL2:F1 5K9 GB-PL2:U89959 GB-BAI:MTCYI 90 71097 AC005278 106973 U189959 34150 Z70283 Table 4 (continued) Arabidopsis thaliana chromosome 11 SAG T3A4 genomic sequence, complete sequence.
Arabidopsis thaliana chromosome 1 BAC F15K9 sequence, complete sequence.
Arabidopsis thaliana BAC T7123, complete sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 98/162.
rxa02139 1962 rxa02153 903 rxs02l54 414 rxa02155 1287 GBBSAi:MSGBI 5540 36548
S
GB BAt :MSGB1 551 C 36548
S
GB-BA2:AF049897 9196 GB-BAI :AF005242 1044 GBBA1:CGARGCJB 4355 0 GB-BA2:AF049897 9196 GBBA1:AF005242 1044 GBBA1:CGARGCJB 4355
D
GB.BA1 :CGARGCJB 4355 0 GBBA2:AF049897 9196 1-78814 Mycobacterium leprae cosmid B1554 DNA sequence.
178813 Mycobacteriumi leprae cosmid 8 1551 DNA sequence.
AF049897 Corynebactertum glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acelylglutamnate kinase (argB), acetylomithine transaminase (argD). ornithine carbamoyltransferase (argF). arginine repressor (argR). argininosuccinate synthase (argG), and argintnosuccinate lyase (argH) genes, complete cds.
AF005242 Corynebactertum glutamicumn N-acetylglutamate-5-semiatdehyde dehydrogenase (argC) gene, complete cds.
X86157 Oglutamicum argO. argJ. argB, argD, and argF genes.
AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
AF005242 Corynebacterium glutamicum N-acetylglutamate-5-semialdehyde dehydrogenase (argC) gene, complete cds.
X86157 C.glutamicum argC, argJ, argB, argD, and argF genes.
X86157 C.glutamicum argC, argJ, argB, argD, and argF genes.
AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
L78811 Mycobacterium leprae cosmid B1133 DNA sequence.
AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
Arabidopsis thatiana Arabidopsis thaliana Arabidapsis thaliana Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium leprae Corynebacterium glutamicum Corynebactenium glutamicum Corynebactertum glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamnicumn Corynebacterium glutamicum Corynebacterium glutamicum Mycobacterlum leprae Corynebacterium glutamicum 34,123 3-Nov-96 31,260 7-Nov-98 34.281 26-Jun-98 62,904 17-Jun-98 36,648 15-Jun-96 36,648 15-Jun-96 99,104 1-Jul-98 99,224 2-Jul-97 100.000 25-Jul1-96 98,551 1 -Jul-98 96,477 2-Jul-97 100.000 25-Jul-96 99,767 25-Jul-96 99,378 1-Jul-98 55,504 15-Jun-96 100,000 1-Jul-98 GB.BAI:MSGB1 133C 42106
S
GBBA2:AF049897 9196 rxaO2156 1074 2007202378 25 May 2007 Table 4 (continued) X86157 COglutamicum argC, argJ, argB. argD, and argF genes. GB BA1:CGARGCJB 4355
D
GBBA2:AE001816 10007 GB-BA2:AF049897 9196 AE001816 AF049897 rxaO2157 1296 Thermotoga maritima section 128 of 136 of the complete genome.
Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
C.glutamicum argC, argJ, argB, argD, and argF genes. GB-BAI :CGARGCJB 4355
D
GBBAI:MTCYO6H-11 38000 GB-BA2:AF049897 9196 X86 157 rxa02I58 1080 Z85982 Mycobacterium tubercuilosis H37Rv complete genome; segment 73/162.
AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
AF031518 Corynebacterium glutamicum ornithine carbamolytransferase (argF) gene, complete cds.
X86157 C.glutamicum argC, argJ, argB, argD, and argF genes.
GBBA2:AF031518 2045 GBBA1:CGARGCJ8
D
GB-BA2:AF049897 4355 Corynebacteium 100,000 25-Jul-96 glutamicum Thermotoga maritlma 50,238 2-Jun-99 Corynebacterium 99,612 1-Jul-98 glutamicum Corynebacterium 99,612 25-Jul-96 glulamicum Mycobacterium 57,278 17-Jun-98 tuberculosis Corynebacterium 100,000 1-Jul-98 glulamicum Corynebacterium 99,898 5-Jan-99 glutamicum Corynebacterium 100,000 25-Jul-96 glutamicum N Corynebacterium 99,843 1-Jul-98 glutamicum Corynebacterium 88,679 5-Jan-99 glutamicum Corynebacterlum 100,000 5-Jan-99 glutamicum Corynebacterium 99,774 1-Jul-98 glutamicumn Corynebacterlum 99,834 19-Nov-97 glutamlcum Streptomyces clavullgerus 65,913 22-Apr-96 Corynebacterium 88,524 1-Jul-98 glutamicum rxa02159 636 GB-BA2:AFO3I 51 8 GBBA2:AF041436 GBBA2:AF049897 rxaO216O 1326 9196 AF049897 Corynebaclerium glutamicum N-acetylglutamylphosphate reductase (argC).
ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
2045 AF031518 Corynebacterium glutamicum ornithine carbamolytransferase (argF) gene, complete cds.
516 AF041436 Corynebacterium glutamicum arginine repressor (argR) gene, complete cds.
9196 AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
1206 AF030520 Corynebacterium glutamicum argininosuccinate synthetase (argG) gene, complete cds.
1909 Z4911111 S.clavuligerus argG gene and argH gene (partial).
9196 AF049897 Corynebacterium glutamicum N-acetylglutamylphosphate reductase (argC), ornithine acetyltransferase (argJ), N-acetylglutamate kinase (argB), acetylornithine transaminase (argD), ornithine carbamoyltransferase (argF), arginine repressor (argR), argininosuccinate synthase (argG), and argininosuccinate lyase (argH) genes, complete cds.
GS.BA2:AF030520 GBBAI :SCARGGH- GB-BA2:AF049897 rxa02162 1554 2007202378 25 May 2007 Table 4 (continued) GBBA2:AF048764 1437 AF048764 Corynebacterium glutamicum arginiflosuccinate lyase (argH) genecomplete GB-BAI:MTCYO6H1 1 GBBA1:MTCY3I 38000 37630 rxaO2176 1251 GB-BAI:CGGLTG 3013 GB-PL2:PGU65399 2700 rxa02189 861 GBPR3:AC002468 115888 GB-BA:MSGBI97OC 39399
S
GSPR3AC002468 115888 rxa02193 1701 GB-BAI:BRLASPA 1987 GBPAT:E04307 1581 GB-BAI:ECOUW93 338534 rxa02194 966 GB-BA2AFO5O166 840 GBBAI:BRLASPA 1987 GB-PATE08649 188 rxa02195 393 GB_8A2:AF086704 264 GB-BAli:EAY17145 6019 GBSTS:G01 195 332 rxa02197 551 GB-BA1:MTCY261 27322 GB_8A1:MLCB2533 40245 GSBBAI:U00017 42157 rxa02198 2599 GB-BA1:U00017 42157 GB BA1:MLCB2533 40245 GB-BAI:MTCY261 27322 rxa02208 1025 GBBAI:U00017 42157 Z85982 Z73 101 X66112 U65399 AC002468 L78815 AC002468 025316 E04307 U 14003 AF0501 66 D25316 E08649 AF086704 Y17145 G01 195 Z97559 AL03531 0 U00017 U00017 AL035310 Z97559 UOC017 Mycabacteriumn tuberculosis H37Rv complete genome; segment 731162.
Mycobacterium tuberculosis H37Rv complete genome; segment 41/162.
C.glutamicum glt gene for citrate synthase and ORF.
Basidiomycete CECT 20197 phenoloxidase (pox 1) gene, complete cds.
Human Chromosome 15q26.1 PAC clone pDJ417d7, complete sequence.
Mycobacterium leprae cosmid B1970 DNA sequence.
Human Chromosome 15q26.1 PAC clone pDJ417d7, complete sequence.
Brevibacterium flavum aspA gene for aspartase, complete cds.
DNA encoding Brevibacterium flavum aspartase.
Escherichia coli K-12 chromosomal region from 92.8 to 00.1 minutes.
Corynebacterium glutamicum ATP phosphoribosyltransferase (hisG) gene, complete cds.
Brevibacterium flavum aspA gene for aspartase, complete cds.
DNA encoding part of aspartase from coryneform bacteria.
Corynebacterium glutamicum phosphoribosyl-ATP-pyrophosphohydrolase (hisE) gene, complete cds.
Eubacterium acidaminophilum grdR, grdI, grdH genes and partial ldc, grdT genes.
fruit fly STS Dm 1930 clone DS06959 17.
Mycobacterium tuberculosis H37Rv complete genome; segment 95/162.
Mycobacterium leprae cosmid B2533.
Mycobacterium leprae cosmid B2126.
Mycobacterium leprae cosmid B2126.
Mycobacterium leprae cosmid B2533.
Mycobacterium tuberculosis H37Rv complete genome; segment 95/162.
Mycobacterium leprae cosmid B2126.
Corynebacterium glutamicum Mycobacterium tuberculosis Mycobacterium tuberculosis Corynebacterium glutamicum basidiomycete CECT 20197 Homo sapiens Mycobacterium leprae Homo sapiens Corynebacterium glutamicum Corynebacterium glutamicum Escherichia coli Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Eubacterium acidaminophilum Drosophila melanogaster Mycobacterium tuberculosis Mycobacterium leprae Mycobacterium leprae Mycobacterium leprae Mycobacterium leprae Mycobacterium tuberculosis Mycobacterium leprae
87,561 64.732 36,998 39.910 38,474 35,941 40,286 33,689 99,353 99,367 37,651 98,214 93.805 100,000 100,000 39,075 35,542 33.938 65,517 36,770 38.674 65.465 37,577 59,823
1-Jul-98 17-Jun-98 17-Jun-98 17-Feb-95 19-Jul-97 16-Sep-98 15-Jun-96 16-Sep-98 6-Feb-99 29-Sep-97 17-Apr-96 5-Jan-99 6-Feb-99 29-Sep-97 8-Feb-99 5-Aug-98 28-Feb-95 17-Jun-98 27-Aug-99 01-MAR-1994 01-MAR-1994 27-Aug-99 17-Jun-98 01-MAR-1994
GB_BA1:AP000063 185300 AP000063 Aeropyrum pernix genomic DNA, section 6/7. Aeropyrum pernix 39.442 22-Jun-99
rxa02229 948 rxa02234 3462 rxa02235 727 GB_PR4:AC006236 GB_BA1:MSGY154 GB_BA1:MTCY154 GB_BA1:U00019 GB_BA1:MSGB937CS GB_BA1:MTCY2B12 GB_BA2:U01072 GB_BA1:MSU91572 127593 40221 13935 36033 38914 20431 4393 960 192791 192791 39150 5228 586 39150 193862 193862 1239 GB_HTG3:AC009364 GB_HTG3:AC009364 rxa02237 693 GB_BA1:MTCY21B4 GB_BA2:AF077324 GB_EST22:AU017763 rxa02239 1369 GB_BA1:MTCY21B4 GB_HTG3:AC010745 GB_HTG3:AC010745 rxa02240 1344 EM_PAT:E09855 AC006236 AD000002 Z98209 U00019 L78820 Z81011 U01072 U91572 AC009364 AC009364 Z80108 AF077324 AU017763 Z80108 AC010745 AC010745 E09855 A37831 AF117274 AB003693
Table 4 (continued)
Homo sapiens chromosome 17, clone hCIT.162_E_12, complete sequence.
Mycobacterium tuberculosis sequence from clone y154.
Mycobacterium tuberculosis H37Rv complete genome; segment 121/162.
Mycobacterium leprae cosmid B2235.
Mycobacterium leprae cosmid B937 DNA sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 61/162.
Mycobacterium bovis BCG orotidine-5'-monophosphate decarboxylase (uraA) gene.
Mycobacterium smegmatis carbamoyl phosphate synthetase (pyrAB) gene, partial cds and orotidine 5'-monophosphate decarboxylase (pyrF) gene, complete cds.
Homo sapiens chromosome 7, *** SEQUENCING IN PROGRESS ***, 57 unordered pieces.
Homo sapiens chromosome 7, *** SEQUENCING IN PROGRESS ***, 57 unordered pieces.
Mycobacterium tuberculosis H37Rv complete genome; segment 62/162.
Rhodococcus equi strain 103 plasmid RE-VP1 fragment f.
AU017763 Mouse two-cell stage embryo cDNA Mus musculus cDNA clone J0744A04 3', mRNA sequence.
Mycobacterium tuberculosis H37Rv complete genome; segment 62/162.
Homo sapiens clone NH0549D18, *** SEQUENCING IN PROGRESS ***, 30 unordered pieces.
Homo sapiens clone NH0549D18, *** SEQUENCING IN PROGRESS ***, 30 unordered pieces.
gDNA encoding S-adenosylmethionine synthetase.
Homo sapiens Mycobacteriumn tuberculosis Mycobacterium tuberculosis Mycobacteriumn ieprae Mycobacterium ieprae Mycobacterium tuberculosis Mycobacteriumn bovis Mycobacterium smegmatis Homo sapiens Homo sapiens Mycobacteriumn tuberculosis Rhodococcus equi Mus musculus Mycobacteriurn tuberculosis Homo sapiens Homo sapiens Corynebacterium glutamicum 37,191 53.541 40,407 40,541 66.027 71,723 67. 101 60,870 37,994 37,994 55,844 41,185 38,616 56,282 36.772 36,772 99,515 63,568 65,000 52,909 29-DEC- 1998 03-DEC- 1996 17-Jun-98 01-MAR- 1994 15-Jun-96 18-Jun-98 22-DEC.
1993 22-MAR- 1997 1 -Sep-99 1-Sep-99 c43 23-Jun-98 5-Nov-98 I 9-OCT- 1998 23-Jun-98 21 -Sep-99 21-Sep-99 07-OCT- 1997 (Rel.
52, Created) 1997 31-MAR- 1999 03-OCT- 1997 (Rel.
52, Creited) GB-PAT:A37831 5392 GB..BA2:AFI 17274 2303 EMBA1:A8003693 5589 Sequence 1 from Patent W09408014. Streptomyces pristinaespiraiis Streptomyces spectabilis flavoprotein homatog Dfp (dfp) gene, partial cds; and Streptomyces speclabitis S-adenosyimethionlne synthetase (metK) gene, complete cds.
Corynebacterium ammoniagenes DNA for rib aperon. complete cds. Corynebacterium ammoniagenes nca02246 1107 2007202378 25 May 2007 Table 4 (continued) 5589 E07957 gDNA encoding at feast guanosine triphosphate cyclohydrolase and riboflavin Corynebacterium nxa02247 756 rxa02248 1389 rxca02249 '600 rxa02250 643 rxa02262 1269 =x02263 488 GBPAT:E07957 GBPAT:132742 GBPAT:132743 EMBA1:A8003893 GBPAT:132742 GB PAT:132742 EM-BAl :AB003693 GBPAT:E07957 GBPAT:E07957 GB PAT:132742 GBPAT:132743 GB-PAT:E07957 GBPAT:I32742 EMBAl :AB003693 GBBA1 :CGL007732 GB-BA1 :CGAMTGEN
E
GB_ VI:HEHCMVCG GBBAI :CGL007732 GBBA1:CGL-007732 5569 132742 2689 132743 5589 A8003693 5589 132742 5589 132742 5589 AB003693 synthase.
Sequence 1 from patent US 5589355.
Sequence 2 from patent US 5589355.
Corynebacterium ammoniagenes DNA for rib operon, complete cds.
Sequence 1 from patent US 5589355.
Sequence 1 from patent US 5589355.
Corynebacterium ammoniagenes DNA for rib operon, complete cds.
5589 E07957 gDNA encoding at least guanosine triphosphate cyclohydrolase and riboflavin synthase.
5589 E07957 gDNA encoding at least guanosine triphosphate cyclohydrolase and riboflavin synthase.
5589 I32742 Sequence 1 from patent US 5589355.
2689 I32743 Sequence 2 from patent US 5589355.
5589 E07957 gDNA encoding at least guanosine triphosphate cyclohydrolase and riboflavin synthase.
5589 I32742 Sequence 1 from patent US 5589355.
5589 AB003693 Corynebacterium ammoniagenes DNA for rib operon, complete cds.
4460 AJ007732 Corynebacterium glutamicum 3' ppc gene, secG gene, amt gene, ocd gene and 5' soxA gene.
2028 X93513 C.glutamicum amt gene.
229354 X17403 Human cytomegalovirus strain AD169 complete genome.
4460 AJ007732 Corynebacterium glutamicum 3' ppc gene, secG gene, amt gene, ocd gene ammoniagenes Unknown.
Unknown.
Corynebacterium ammoniagenes Unknown.
Unknown.
Corynebacterium ammoniagenes Corynebacterium ammoniagenes Corynebacterium ammoniagenes Unknown.
Unknown.
Corynebacterium ammoniagenes Unknown.
Corynebacterium ammoniagenes Corynebactenium glutamicum Corynebacterium glutamicum humanrhhrpe~virtfs 5 Corynebacterium 52,909 29-Sep-97 52.909 6-Feb-97 57.937 6-Feb-97 57,937 03-OCT- 1997 (Rel.
52. Created) 57,937 6-Feb-97 61,843 6-Feb-97 61,843 03-OCT- 1997 (Rel.
52, Created) 61,843 29-Sep-97 64.346 29-Sep-97 64,346 6-Feb-97 64,346 6-Feb-97 56,318 29-Sep-97 56,318 6-Feb-97 56,318 03-OCT- 1997 (Rel.
52, Created) 100,000 7-Jan-99 100.000 29-MAY- 1996 38,65 1 10$eFbb99 100.000 7-Jan-99 37,526 7-Jan-99 96,928 08-OCr- 1997 (Rel.
52, Created) 96,781 7-Aug-98 36,264 20-Feb-99 36,197 17-MAR- 4460 AJ007732 and 5 soxA gene. glutamicum Corynebacterium glutamicum 3' ppc gene, secG gene, amt gene, ocd gene Corynebacterium and 5' soxA gene. glutamicum rxa02272 1368 EM-PAT:E09373 GBBA1:D38505 GB HTG2:AC006595 rxa02281 1545 GBG5 1 2:AQ4I11 0 1591 E09373 Creatinine deiminase gene.
Bacillus sp.
D38505 AC006595 AQ411010 Bacillus sp. gene for creatinine deaminase, complete cds.
Homo sapiens, *** SEQUENCING IN PROGRESS ***, 4 unordered pieces.
HS_2257_B1_H02_MR CIT Approved Human Genomic Sperm Library D Homo sapiens genomic clone Plate=2257 Col=3 Row=P, genomic survey sequence.
Bacillus sp.
Homo sapiens Homo sapiens 2007202378 25 May 2007 GBE5T23:A1128623 363 GB-PL2:ATACO7O1 9 102335 GBBA2:AF 116184 540 GBGSS9:AQ16>4310 507 A1128623 AC007019 AF1 16184 A01 64310 rxa02299 531 rxa02311 813 GB IHTG-C0 1168TK 578 XC3468 GBFHTG4AC006091 176878 AC006091 GBBA2:RRU6551O 16259 U65510 Table 4 (continued) qa62c01.sl Soares -fetal -heartNbHHI9W Homo sapiens cONA clone Homo sapiens IMAGE: 1691328 mRNA sequence.
Arabldopsis thaliana chromosome 11 BAG F7D8 genomic sequence, complete Arabidopsis thaliana sequence.
Corynebacterium glutamicum L-aspartate-alpha-decarboxylase precursor Corynebacterium (panD) gene, complete cds. glutamicum HS-2l7lA2_EOIMR CIT Approved Human Genomic Sperm Library D Homo sapiens Homo sapiens genomlc clone Plate=2171 Col=2 Row--l. genomic survey sequence.
Murine herpesvius; type 68 thymidine kinase and glycoprotein H genes. murine herpesvirus 68 Drosophila melanogaster chromosome 3 clone BACR48GO5 (D475) RPCI-98 Drosophila melanogaster 48.G.5 map 91F1-91F13 strain y; cn bw sp, SEQUENCING IN PROGRESS 4 unordered pieces..- Drosophila melanogaster. chromosome 3 clone BACR48GO5 (D475) RPCI-98 Drosophila melanogaster 48.G.5 map 91 F1-91F13 strain y; cn bw sp, SEQUENCING IN PROGRESS unordered pieces.
Rhodospirillum rubrum CO-induced hydrogenase operon (cooM, cooK cool-, Rhodospirillum nabrum cooX. cooU, cooH) genes, Iron sulfur protein (cooF) gene, carbon monoxide dehydrogenase (cooS) gene, carbon monoxide dehydrogenase accessory proteins (cooC, cooT, cooJ) genes, putative transcriptional activator (cooA) gene, nicotinate-nucleotide pyrophosphorylase (nadC) gene, complete cds, 1-aspartate oxidase (nadB) gene, and alkyl hydroperoxide reductase (ahpC) gene, partial cds.
37.017 1998 33,988 16-MAR- 1999 100,000 02-MAY- 1999 37,278 16-OC- 1998 40,288 3-Sep-96 36.454 27-OCT- 1999 36,454 27-OCT.
1999 37,828 9-Apr-97 rxa02315 1752 rxa02318 402 rxa02319 1080 GBBA1:MSGY224 GB-BA1:MTY25D1 0 GBBAI:M5GY224 GBHTG3:AC01 1348 GB-HTG3:ACOi 1348 GB-HTG3:ACO1 1412 GB-BA1 :MSGY224 GB-BA:MTY25D1O GBE5T23:A1117213 40051 40838 40051 111083 111083 89234 40051 40838 476 A0000004 Z95558 AD000004 AC01 1348 AC01 1348 AC01 1412 A0000004 Z95558 All117213 Mycobacterlum tuberculosis sequence from clone y224.
Mycobacterium tuberculosis H37Rv complete genome; segment 28/162.
Mycobacterium tuberculosis sequence from clone y224.
Homo sapiens chromosome 5 clone CIT-HSPC_303E13, *** SEQUENCING IN PROGRESS ***, 3 ordered pieces.
Homo sapiens chromosome 5 clone CIT-HSPC_303E13, *** SEQUENCING IN PROGRESS ***, 3 ordered pieces.
Homo sapiens chromosome 5 clone CIT978SKB8 8K21, *** SEQUENCING IN PROGRESS ***, ordered pieces.
Mycobacterium tuberculosis sequence from clone y224.
Mycobacterium tuberculosis H37Rv complete genome; segment 28/162.
ub83h02.r1 Soares 2NbMT Mus musculus cDNA clone IMAGE:1395123 sequence.
Mycobacterium tuberculosis Mycobacterlum tuberculosis Mycobacterium tuberculosis Homo sapiens Homo sapiens Homo sapiens Mycobacterium tuberculosis Mycobacterlum tuberculosis Mus musculus 49,418 49,360 38,150 35,821 35,821 36.181 37,792 37,792 35.084 03-DEC- 1996 17-Jun-98 03-DEC- 1996 06-OCT- 1999 06-OCr- 1999 06-OCr- 1999 03-DEC- 1996 17-Jun-98 -2-Sep-98.
2007202378 25 May 2007 rxa02345 1320 GBBAI:BAPURKE GB-BA1 :MTCY71 GBBAI:MTCY71 rxa02350 618 GB-BA1:BAPURKE GB-PL1:SCi 3OKBXV
GB_'PLI:SCXVORFS
rxa02373 1038 GBPAT:E00311 GB-PAT:106030 GBPAT:I00836 rxa02375 1350 GBBA2:CGU31230 2582 42729 42729 2582 129528 50984 1853 1853 1853 3005 169072 169072 41230 1 20754 120754 1783 26914 3005 rxa02380 777 rxa02382 1419 rxaO2400 693 rxa02432 1098 GB-HTG3:AC009946 GBHTG3:AC009946 GB-BAI :MTCY253 GBHTG4:AC010658 GBJITG4:ACO1 0858
GBBAI:CGPROAGE
N
GB-BA1 :MrCY428 GB-BA2:CGU31230 X91189 Z92771 Z92771 X91189 X94335 X90518 E00311 106030 100836 U31230 AC009946 AC009946 Z81368 AC010658 AC0l 0658 X82929 Z81451 U31230 X75504 186191 1113693 A0606842 Table 4 (continued) Bammoniagenes purK and purE genes.
Mycobacterium tuberculosis H37Rv complete genome; segment 141/162.
Mycobacterium tuberculosis H37Rv complete genome; segment 141/162.
B.ammoniagenes purK and purE genes.
S.cerevisiae 130kb DNA fragment from chromosome XV.
S.cerevisiae DNA of 51 Kb from chromosome XV right arm.
DNA coding of 2,5-diketogluconic acid reductase.
Sequence 4 from Patent EP 0305608.
Sequence 1 from Patent US 4758514.
Corynebacterium glutamicum Obg protein homolog gene, partial cds, gamma glutamyl kinase (proB) gene, complete cds, and (unkdh) gene, complete cds.
Homo sapiens clone NH0012C17, *** SEQUENCING IN PROGRESS ***, unordered pieces.
Homo sapiens clone NH0012C17, *** SEQUENCING IN PROGRESS ***, unordered pieces.
Mycobacterium tuberculosis H37Rv complete genome; segment 106/162.
Drosophila melanogaster chromosome 3L/75C1 clone RPCI98-3I20, *** SEQUENCING IN PROGRESS ***, 78 unordered pieces.
Drosophila melanogaster chromosome 3L/75C1 clone RPCI98-3I20, *** SEQUENCING IN PROGRESS ***, 78 unordered pieces.
C.glutamicum proA gene.
Mycobacterium tuberculosis H37Rv complete genome; segment 107/162.
Corynebacterium glutamicum Obg protein homolog gene, partial cds, gamma glutamyl kinase (proB) gene, complete cds, and (unkdh) gene, complete cds.
C.glutamicum aceA gene and thiX genes (partial).
Sequence 3 from patent US 5700661.
Sequence 3 from patent US 5439822.
HS_5i404_B2_E07_T7A RPCI-11 Human Male BAC Library Homo sapiens genomic clone Plate=960 Col=14 Row=J, genomic survey sequence.
Corynebacteriumn ammoniagenes Mycobacterium tuberculosis Mycobacterium tuberculosis Corynebacteriurn ammontagenes Saccharomyces cerevisiae Saccharomyces cerevlslae unidentified Unknown.
Unknown.
Corynebacterium glutamicumn Homo sapiens Homo sapiens Mycobacteriumn tuberculosis Drosophila melanogaster Drosophila melanogaster Corynebacteriumn glutamicum Mycobacterium tuberculosis Corynebacteriumn glutamnicumn Corynebaceriumn glutamicurn Unknown.
Unknown.
Homo sapiens 56,123 29-Sep-97 56.220 56,220 99,332 36,115 36.115 38,088 35,817 35,817 98,802 38,054 98,529 100,000 100.000 100,000 39.716 02-DEC- 1994 21 -MAY- 1993 2-Aug-96 8-Sep-99 8-Sep-99 17-Jun-98 1 6-OCT- 1999 16-OCT- 1999 23-Jan-97 17-Jun-98 2-Aug-96 9-Sep-94 10-Jun-98 26-Sept95 1 0-Jun'99 61,731 39,624 39,847 64.286 36.617 36i617 14-Jan-97 10-Feb-99 10-Feb-99 14-Jan-97 15-Jul-97 1-Nov-95 GBBAI :CGACEA 2427 GBPAT:186191 2135 GB-PAT:113693 2135 GBG5515:AQ60684 574 2 2007202378 25 May 2007 rxa02458 1413 GB-EST1 :T05804 406 GB-PL1:AB006699 77363 GB-BA2:AFi 14233 1852 GB-EST37:AWOI 306 578 GB-GSS 15:A065002 728 7 GB-BAI :MT0Y359 36021 Table 4 (continued) T05804 EST03693 Fetal brain, Stratagene (cat#936205) Homo sapiens cONA clone Homo sapiens HFBDG63 similar to EST containing Alu repeat, mRNA sequence.
A8006699 Arabidopsis thaliana genomic DNA, chromosome 5. P1 clone: MDJ22, Arabidopsis tha complete sequence.
AF1 14233 Corynebacterium giutamicum 5-enolpyruvylshikimate 3-phosphate synthase Corynebacteriu (aroA) gene, complete cds. giutaniicum AW0 13061 ODT-0033 Wniter flounder ovary Pleuronectes americanus cONA clone OT- Pleuronectes a 0033 5 similar to FRUCTOSE-BISPi-OSPHATE ALDOLASE B (LIVER).
mRNA sequence.
AQ650027 Sheared ONA-5L2.TF Sheared DNA Trypanosoma brucel genomic clone Trypanosoma Sheared DNA-5L-2, genomlc survey sequence.
Z83859 Mycobacterium tuberculosis H37Rv complete genome; segment 84/162. Mycobacteriunm ltiana
M
mericanus brucef 37,915 30-Jun-93 35,526 20-Nov-99 100,000 7-Feb-99 39.175 10-Sep-99 39,281 22-Jun-99 39,634 17-Jun-98 rxa02469 1554 rxa02497 1050 GBBAI:MLCB1788 GB-BA1:SCAJ1O6OI GB-BA2:CGU3I 224 39228 4692 422 AL008609 AJO10601 U31224 Mycobacterium ieprae cosmld B1788.
Streptomyces coelicolor A3(2) DNA for whiD and whiK loci.
Corynebacterium glutamicum (ppx) gene, partial cds.
rxa02499 933 GBBA1:MTCY2OG9 GBBAI:SCE7 GBBA2:CGU31225 GSBBA1 :NGI7PILA GB-HTG2:AC007984 GB-BA1 :MTCY2OG9 37218 Z77162 Mycobacterium tuberculosis H37Rv complete genome; segment 25/162.
16911 AL049819 Streptomyces coelicolor cosmid E7.
1817 U31225 Corynebacterium glutamicum L-proline:NADP+ 5-oxidoreductase (proC) gene, complete cds.
1920 X13965 Neisseria gonorrhoeae pilA gene.
129715 AC007984 Drosophila melanogaster chromosome 3 clone BACR05C10 (D781) RPCI-98 05.C.10 map 97D-97E strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 87 unordered pieces.
37218 Z77162 Mycobacterium tuberculosis H37Rv complete genome; segment 25/162.
tuberculosis Mycobacterium leprae Streptomyces coelicolor Corynebacterium glutamicum Mycobacterium tuberculosis Streptomyces coelicolor Corynebacterlum giutamicum Neisseria gonorrhoeae Drosophila melanogaster Mycobacterium tuberculosis Mycobacterium ieprae human herpesvlrus 1 Homo sapiens Homo sapiens Homo sapiens Mycobacterlumn tuberculosis Homo sapiens Homo sapiens Mycobacterium leprae Mycobacterium leprae 59.343 48.899 96,445 rxaO2501 1188 27-Aug-99 17-Sep-98 2-Aug-96 GBBA1:U00018 GBVi:HE1ICG rxa02503 522 GBPR3:AC005328 GBJ'R3:AC005545 GBPR3:AC005328 rxa02504 681 GBBAI :MTCY2OG9 GBJ'R3:AC005328 GBPR3:AC005545 rxa02516 1386 GB BAI:MLCL536 42991 U00018 Mycobacterlum leprae cosmid B2168.
59,429 17-Jun-98 39,510 1 0-MAY- 1999 97,749 2-Aug-96 43,249 30-Sep-93 33,406 2-Aug-99 39,357 17-Jun-98 51,768 01-MAR- 1994 39,378 1 7-Apr-97 39,922 28-Jul-98 39,922 3-Sep-98 34.911 28-Jul-98 54,940 17-Jun-98 41,265 28-Jul-98 41,265 3-Sep-98 37,723 04-DEC- 1998 37.723 01-MAR- 1994 152261 35414 43514 35414 37218 35414 43514 36224 X14112 AC005328 AC005545 AC005328 Z77 162 AC005328 AC005545 Z99125 Herpes simplex virus (HSV) type 1 complete genome.
Homo sapiens chromosome 19, cosmid R26660, complete sequence.
Homo sapiens chromosome 19, cosmid R26634, complete sequence.
Homo sapiens chromosome 19, cosmid R26660, complete sequence.
Mycobacterium tuberculosis H37Rv complete genome: segment 25/162.
Homo sapiens chromosome 19, cosmid R26660, complete sequence.
Homo sapiens chromosome 19, cosmid R26634, complete sequence.
Mycobacterium leprae cosmid L536.
GB_BA1:U00013 35881 U00013 Mycobacterium leprae cosmid B1496.
2007202378 25 May 2007 na02517 570 rxa02532 1170 rxaO2536 879 rxaB0255O 1434 rxaO2559 1026 rxa02622 1683 rxa02623 714 GBBA1:MTVOO7 GBBAI:MLCL536 GBBA1:UCOl 3 GSBAl :SCC22 GB-OV:AF137219 GBEST3O:A1645057 GBEST2O:AA822595 GBHTG2:AF130866 GBHTG2:AF130866 GBPLI-ATT12J5 GB-BA1 :MTCY279 GBBA1:MSGB197OC
S
GBBA2:SC2H4 OB-BAI :MTVOO4 GBPAT:128684 GBBA1:MTU27357 GBBA2:AEOO1 780 GBOV:AF064564 GB-OV:AF064564 32806 36224 35881 22115 831 301 429 118874 118874 84499 9150 39399 25970 69350 5100 5100 11997 49254 AL021 184 Z99 125 UOO013 AL096839 AF 1372 19 A1645057 AA822595 AF 130866 AF130866 AL035522 Z97991 L78815 AL031 514 AL009198 128684 U27357 AE001 780 AF064564 Table 4 (continued) Mycobacterium tuberculosis H37Rv complete genome: segment 64/162. Mycobacterium tuberculosis Mycobacterium leprae cosmid L536. Mycobacterium leprae Mycobacterium leprae cosmid B 1496. Mycobacterlum leprae Streptomyces coelicolor cosmid C22. Streptomyces coelicolor Amla calva mixed lineage leukemia-like protein (Mul) gene, partial cds. Amia catva vsS2alO.y1 Stratagene mouse Tcell 937311 Mus musculus cONA clone Mus musculus IMAGE: 1149882 mRNA sequence.
vsS2alO.ri Stratagene mouse Tcell 937311 Mus musculus cDNA clone Mus musculus IMAGE:1 1149882 5% mRNA sequence.
Honmo sapiens chromosome 8 clone PAC 172N13 map 8q24. Homo sapiens SEQUENCING IN PROGRESS In unordered pieces.
Homo sapiens chromosome 8 clone PAC 172N13 map 8q24, Homo sapiens SEQUENCING IN PROGRESS in unordered pieces.
Arabldopsis thaliana DNA chromosome 4, BAC clone T12J5 (ESSAII project). ArabIlopslIs thaliana Mycobacteriumn tuberculosis H37Rv complete genome; segment 17/1 62. Mycobacterium tuberculosis Mycobacterium leprae cosmid 81970 DNA sequence. Mycobacterium leprae Streptomyces coelicolor cosmid 2H4. Streptomyces coelicolor A3(2) Mycobacterium tuberculosis H37Rv complete genome: segment 144/162. Mycobacterium tuberculosis Sequence 1 from patent US 5573915. Unknown.
Mycobacterium tuberculosis cyclopropane mycolic acid synthase (crnal1) Mycobacterium gene, complete cds. tuberculosis Thermotoga maritima section 92 of 136 of the complete genome. rhermotoga maritima Fugu rubripes neurotibromatosis type 1 (N17l), A-klnase anchor protein Fugu rubripes (AKAP84). BAW protein (BAW). and WSB1 protein (WS81) genes, complete cds.
Fugu rubripes neurotibromatosis type 1 (NFl). A-kinase anchor protein Fugu rubripes (AKAP84), BAW protein (BAW), and WSB1 protein (WSB1) genes, complete cds.
HS_5268 Al G09_SP6E RPCI-1 I Human Male 13AC Library Homo sapiens Homo sapiens genomic clone Plate=844 Col=17 Row--M, genomic survey sequence.
Homo sapiens chromosome 9 clone RP1 1-111 M7 map 9, WORKING DRAFT Homo sapiens SEQUENCE, 51 unordered pieces.
HS_5014_A2_Cl 2_T7A RPCI-1 1 Human Male BAC Library Homo sapiens Homo sapiens genomic clone Plate=590 Col=24 Row=E. genomic survey sequence.
61,.335 37,018 37.0 18 37,071 36,853 41.860 42.353 40.754 40.754 35,063 37,773 39,024 37,906 47,358 39,138 39,138 44,914 39.732 36.703 17-Jun-98 04-DEC- 1 998 01-MAR- 1994 12-Jul-99 7-Sep-99 29-Apr-99 1-7-Feb-98 21-MAR- 1999 2 1-MAR- 1999 24-Feb-99 17-Jun-98 1 5-Jun-96 19-OCT.
1999 18-Jun-98 6-Feb-97 26-Sep-95 2-Jun-99 17-Aug-99 17-Aug-99 49254 AF064564 GB-GSS5:AQ818728 444 AQ818728 38,801 26-Aug-99 GB-HTGS:ACO1 1083 GB-GSS6AQ826948 198586 544 AC01 1083 A0826948 35,714 39,146 19-Nov-99 27-Aug'99 2007202378 25 May 2007 rxa02629 708 GBVI:BRSMGP
GBVI:BRSMGP
462 M86652 462 M86652 Table 4 (continued) Bovine respiratory syncytial virus membrane glycoprotein mRNA, complete cds.
Bovine respiratory syncytial virus membrane glycoprotein mRNA, complete cds.
Bovine respiratory syncytial 37,013 virus Bovine respiratory syncytial 37,013 virus rxa02645 1953 rxa02646 1392 rxa02648 1326 GB-PAT:A45577 1925 GB-PAT:A45581 1925 GB.BAI :CORILVA 1925 GB-BA1:CORILVA 1925 GB-PAT:A45585 1925 GBPAT:A45583 1925 GBOV:ICTCNC, 2049 GB-ESTi 1:AA265464 345 GB-GSS8:AQ006950 480 A45577 A45581 L01508 LO01508 A45585 A45583 M831 11 AA265464 AQ006950 Sequence 1 from Patent W09519442.
Sequence 5 from Patent W09519442.
Corynebacterium glutamicum threonine dehydratase (ilvA) gene, complete cds.
Corynebacteriumn glulamicum threonine dehydratase (ilvA) gene, complete cds.
Sequence 9 from Patent W095 19442.
Sequence 7 from Patent W095 19442.
Ictalurus pundtatus cyclic nucleotide-gated channel RNA sequence.
mx9lcOS.rl Soares mouse NML Mus musculus cDNA clone IMAGE:693706 mRNA sequence.
CIT-H-SP-2294EI4.TR CIT-HSP Homo sapiens genomic clone 2294E14, genomic survey sequence.
Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicum Corynebacterium glutamicumn Corynebacterium glutamicum Ictalurus punctatus Mus musculus Homo sapiens 39,130 39. 130 39,130 99,138 99.066 99,066 38,402 38,655 36.074 28-Apr-93 28-Apr.93 07-MAR- 1997 07-MAR- 1997 26-Apr-93 26-Apr-93 07-MAR- 1997 07-MAR- 1997 24-MAY- 1993 1997 27-Jun-98 rxa02653 rxaO2687 1068 rxa0271l7 1005 rxa02754 1461
GBBAI:CORPHEA
GBPAT:E04483 GB-PAT:E061 10 GBPL1J-iVCH4H GBJ'R2:HS31 01-5 GBPR3:AC004754 1088 948 948 59748 29718 M13774 E04483 E061 10 Y14573 Z69705 C.glutamicum pheA gene encoding prephenate dehydratase, complete cds.
DNA encoding prephenate dehydratase.
DNA encoding prephenate dehydratase.
Hordeum vulgare DNA for chromosome 4H1.
Human DNA sequence from cosmid 3101-5 from a contig from the tip of the short arm of chromosome 16, spanning 2Mb of 16p13.3. Contains EST and CpG Island.
Homo sapiens chromosome 16, cosmid clone RT286 (LANL), complete sequence.
Drosophila melanogaster chromosome 3 clone BACR16I18 (D815) RPCI-98 16.I.18 map 95A-95A strain y; cn bw sp, *** SEQUENCING IN PROGRESS ***, 101 unordered pieces.
Corynebacterium glutamnicum Corynebacterium glutamicum Corynebacterium glutamicum Hordeum vulgare Homo sapiens Homo sapiens Drosophila melanogaster 99.715 98,523 98,523 36,593 36,089 26-Apr-93 29-Sep-97 29-Sep-97 1999 22-Nov-99 39188 AC004754 36,089 28-MAY- 1998 GB-HTG2:AC008223 130212 AC008223 32,757 2-Augt99 2007202378 25 May 2007 rxa02758 1422 rxa02771 678 rxa02772 1158 GB-TG2:AC008223 GB-BA1:MTCY71 GBHTG5:ACO1 1678 GB-HTG5:ACO1 1678 GBBA2:AF064070 GB_8A2:AF038651 GBIN1:CELTI9B4 GB_EST36:AV1 93572 GB-BA2:AF038651 Table 4 (continued) 130212 AC008223 Drosophila melanogaster chromosome 3 done BACR11I8 (0815) RPCI-98 16.1.18 map 95A-95A strain y; cn bw sp., SEQUENCING IN PROGRESS 101 unordered pieces.
42729 Z92771 Mycobacterium tuberculosis H37Rv complete genome; segment 141/162.
171967 AC011678 Homo sapiens clone 14_B_7, *** SEQUENCING IN PROGRESS ***, 20 unordered pieces.
171967 AC011678 Homo sapiens clone 14_B_7, *** SEQUENCING IN PROGRESS ***, 20 unordered pieces.
23183 AF064070 Burkholderia pseudomallei putative dihydroorotase (pyrC) gene, partial cds; putative 1-acyl-sn-glycerol-3-phosphate acyltransferase (plsC), putative diadenosine tetraphosphatase (apaH), complete cds; type II O-antigen biosynthesis gene cluster, complete sequence; putative undecaprenyl phosphate N-acetylglucosaminyltransferase, and putative UDP-glucose 4-epimerase genes, complete cds; and putative galactosyl transferase gene, partial cds.
4077 AF038651 Corynebacterium glutamicum dipeptide-binding protein (dciAE) gene, partial cds; adenine phosphoribosyltransferase (apt) and GTP pyrophosphokinase (rel) genes, complete cds; and unknown gene.
37121 U80438 Caenorhabditis elegans cosmid T19B4.
360 AV193572 AV193572 Yuji Kohara unpublished cDNA: Strain N2 hermaphrodite embryo Caenorhabditis elegans cDNA clone yk618h8 mRNA sequence.
4077 AF038651 Corynebacterium glutamicum dipeptide-binding protein (dciAE) gene, partial cds; adenine phosphoribosyltransferase (apt) and GTP pyrophosphokinase (rel) genes, complete cds; and unknown gene.
35946 Z77724 Mycobacterium tuberculosis H37Rv complete genome; segment 114/162.
40429 U00011 Mycobacterium leprae cosmid B1177.
33818 Z83863 Mycobacterium tuberculosis H37Rv complete genome; segment 111/162.
172931 AC006581 Homo sapiens 12p21 BAC RPCI11-259O18 (Roswell Park Cancer Institute Human BAC Library) complete sequence.
172931 AC006581 Homo sapiens 12p21 BAC RPCI11-259O18 (Roswell Park Cancer Institute Human BAC Library) complete sequence.
33818 Z83863 Mycobacterium tuberculosis H37Rv complete genome; segment 111/162.
3694 M35195 Chicken tyrosine kinase (cek2) mRNA, complete cds.
5037 Z17372 M.smegmatis asd, ask-alpha, and ask-beta genes.
169 A1223401 qg48gOl.xl Soares-testisN-T Homo sapiens cDNA clone IMAGE:1838448 Drosophila melanogaster Mycobacterium tuberculosis Homo sapiens Homo sapiens Burkhotderia pseudomallel Corynebacteriumn glutamicum Caenorhabditls elegans Caenorhabditis elegans Corynebacterium glutamicumn Mycobacteriumn tub~erculosis Mycobacterium Ieprae Mycobacterium tuberculosis Homo sapiens Homo sapiens Mycobacterium tuberculosis Gallus gailus Mycobacteriumn smegmatis 32,757 37.838 35,331 33,807 36,929 2-Aug-99 1 0-Fe b-99 5-Nov-99 5-Nov-99 20-Jan-99 99,852 14-Sep-98 GBBAI:MTCY227 GB_BA1:UOO I I rxa02790 1266 GBBA1:MTCY159 GBPR4:AC006581 GB-PR4:AC0O65B1 rxa02791 951 GB-BA1:MTCY159 GBOV:Cl-KCEK2 GB-BA1 :MSASDASK rxa028O2 1 194 GBE5124:A1223401 43.836 48,588 99,914 38,339 38.996 37,640 37,906 35,280 39,765 38,937 38,495 04-DEC- 1996 22-Jul-99 14-Sep-98 17-Jun-98 01WAR- 1994 17-Jun-98 3-Jun-99 3-Jun-99 17-Jun-98 28-Apr-93 9-Aug-94 IH-omo sapiens 40,828 27-OCT- 1998 3 similar to WP:C25D7.8 CE08394 mRNA sequence.
2007202378 25 May 2007 GB-EST24:A223401 169 Table 4 (continued) A1223401 Qg48g01.x1 Soares -testis_-NHT Homo sapiens cONA clone IMAGE: 1838448 H-omo sapiens 3' similar to WP:C25D7.8 CE08394 mRNA sequence.
40,828 27-OCT- 1998 rxa02814 494 rxa02843 608 rxs03205 963 rxs03223 1237 GBBA1:MTCY7DI1 GB-BAI :MTCY7D1 1 GB PRI:HSAJ2g62 GB-BA1 :CGAJ4934 GBB3AI:MTC 1364 GB-BAI :MLU1 5180 GB BAl:BLSIGBGN GB-EST2I :M980237 GB-EST23:AII 58316 GB-IN1 :LMFL2743 GB-PR3:H-SDJ61 82 22070 22070 778 1160 29540 38675 2906 377 371 38368 119666 Z95120 Z95120 AJ002962 AJ004934 Z93777 U 15 180 Z49824 AA980237 Al1158316 AL031910 AL096710 Mycobacterium tuberculosis H-37Rv complete genome: segment 1 38/162. Mycobacteriurr tuberculosis Mycobacterlum tuberculosis H37Rv complete genome; segment 1381162. Mycobacteriurr tuberculosis Homo sapiens mRNA for hB-FABP. Homo sapiens Corynebacterium glutamicum dapD gene, complete CDS. Corynebactedu glutamicum Mycobacterium tuberculosis H-37Rv complete genome: segment 52/1 62. Mycobactedrn tube rculosis Mycobacteriumn leprae cosmid 81756. Mycobact eriun- B.lactofermentum orfl gene and sigB gene. Corynebacteriu ua32a12.rl Soares -mammary~glandNbMMG Mus musculus cDNA clone Mus musculus IMAGE: i348414 5'sImilar to TR:Q61025 061025 HYPOTHETICAL 15.2 KO PROTEIN. mRNA sequence.
ud27c05.rl Soares hymus_2NbMT Mus musculus cDNA clone Mus musculus IMAGE:144711 12 mRNA sequence.
Leishmanla major Friedlin chromosome 4 cosmid 12743. Leishmania ma Human DNA sequence from clone RPI-6162 on chromosome 6p11.2-12.3 Homo sapiens Contains Isoforms 1 and 3 of BPAG11 (bullous pemphIgoid antigen 1 (230I240kD). an exon of a gene similar to murine MACIF cytoskeletal protein, STSs and GSSs, complete sequence.
Human DNA sequence from clone RP1-61B2 on chromosome 6p1 1.2-12.3 Homo sapiens Contains isoforms 1 and 3 of BPAG1 (builous pemphigoid antigen 1 (230f240kD), an exon of a gene similar to murine MACIF cytoskeletal protein.
STSs and GSSs, complete sequence.
58,418 40,496 39,826 im 100.000 37.710 leprae 39.626 im glutamicur@8.854 41,489 17-Jun-98 17-Jun-98 8-Jan-98 17-Jun-98 17-Jun-98 09-MAR-1995 25-Apr-96 27-MAY-1998 30-Sep-98 1999 17-DEC-1999 jar 38.005 39,869 34,930 GB_PR3:HSDJ61B2 119666 AL096710 34,634 17-DEC-1999

Exemplification

Example 1: Preparation of total genomic DNA of Corynebacterium glutamicum ATCC 13032
A culture of Corynebacterium glutamicum (ATCC 13032) was grown overnight at 30°C with vigorous shaking in BHI medium (Difco). The cells were harvested by centrifugation, the supernatant was discarded and the cells were resuspended in 5 ml buffer-I (5% of the original volume of the culture; all indicated volumes have been calculated for 100 ml of culture volume). Composition of buffer-I: 140.34 g/l sucrose, 2.46 g/l MgSO4 x 7 H2O, 10 ml/l KH2PO4 solution (100 g/l, adjusted to pH 6.7 with KOH), 50 ml/l M12 concentrate (10 g/l (NH4)2SO4, 1 g/l NaCl, 2 g/l MgSO4 x 7 H2O, 0.2 g/l CaCl2, 0.5 g/l yeast extract (Difco), 10 ml/l trace-elements-mix (200 mg/l FeSO4 x H2O, 10 mg/l ZnSO4 x 7 H2O, 3 mg/l MnCl2 x 4 H2O, 30 mg/l H3BO3, 20 mg/l CoCl2 x 6 H2O, 1 mg/l NiCl2 x 6 H2O, 3 mg/l Na2MoO4 x 2 H2O, 500 mg/l complexing agent (EDTA or citric acid)), 100 ml/l vitamins-mix (0.2 mg/l biotin, 0.2 mg/l folic acid, p-aminobenzoic acid, 20 mg/l riboflavin, 40 mg/l Ca-pantothenate, 140 mg/l nicotinic acid, 40 mg/l pyridoxol hydrochloride, 200 mg/l myo-inositol). Lysozyme was added to the suspension to a final concentration of 2.5 mg/ml. After an approximately 4 h incubation at 37°C, the cell wall was degraded and the resulting protoplasts were harvested by centrifugation. The pellet was washed once with 5 ml buffer-I and once with 5 ml TE-buffer (10 mM Tris-HCl, 1 mM EDTA). The pellet was resuspended in 4 ml TE-buffer, and 0.5 ml SDS solution and 0.5 ml NaCl solution (5 M) were added. After addition of proteinase K to a final concentration of 200 µg/ml, the suspension was incubated for ca. 18 h at 37°C.
The DNA was purified by extraction with phenol, phenol-chloroform-isoamylalcohol and chloroform-isoamylalcohol using standard procedures. Then, the DNA was precipitated by adding 1/50 volume of 3 M sodium acetate and 2 volumes of ethanol, followed by a 30 min incubation at -20°C and a 30 min centrifugation at 12,000 rpm in a high speed centrifuge using a SS34 rotor (Sorvall). The DNA was dissolved in 1 ml TE-buffer containing RNaseA and dialysed at 4°C against 1000 ml TE-buffer for at least 3 hours. During this time, the buffer was exchanged 3 times. To aliquots of 0.4 ml of the dialysed DNA solution, 0.4 ml of 2 M LiCl and 0.8 ml of ethanol were added. After incubation at -20°C, the DNA was collected by centrifugation (13,000 rpm, Biofuge Fresco, Heraeus, Hanau, Germany). The DNA pellet was dissolved in TE-buffer. DNA prepared by this procedure could be used for all purposes, including Southern blotting or construction of genomic libraries.
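The volumes in Example 1 are stated for 100 ml of culture, and several steps are given as final concentrations or volume ratios. The arithmetic behind these figures can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the 50 mg/ml lysozyme stock used in the demonstration is an assumed value that does not come from the text, and "2 volumes of ethanol" is assumed to refer to the sample volume before sodium acetate is added.

```python
"""Scaling and dilution arithmetic for the volumes quoted in Example 1 (illustrative only)."""

REFERENCE_CULTURE_ML = 100.0  # Example 1 states all volumes for 100 ml of culture


def scale_volume(volume_ml, culture_ml):
    """Scale a volume quoted for 100 ml of culture to another culture volume."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    return volume_ml * culture_ml / REFERENCE_CULTURE_ML


def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Stock volume needed for a final concentration (C1 * V1 = C2 * V2).

    Both concentrations must use the same unit; final_volume_ml is the total
    volume after the stock has been added.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume_ml / stock_conc


def precipitation_volumes(sample_ml):
    """Return (3 M sodium acetate, ethanol) volumes: 1/50 volume and 2 volumes.

    Assumption: both ratios refer to the sample volume before any addition.
    """
    return sample_ml / 50.0, 2.0 * sample_ml


def final_molarity(stock_molar, stock_ml, total_ml):
    """Molarity of a component after dilution into total_ml."""
    return stock_molar * stock_ml / total_ml


if __name__ == "__main__":
    # 5 ml of buffer-I per 100 ml culture, scaled to a 250 ml culture
    print("buffer-I for 250 ml culture:", scale_volume(5.0, 250.0), "ml")
    # Lysozyme to 2.5 mg/ml in 5 ml, from an ASSUMED 50 mg/ml stock
    print("lysozyme stock:", stock_volume_ml(2.5, 50.0, 5.0), "ml")
    # LiCl step: 0.4 ml DNA + 0.4 ml 2 M LiCl + 0.8 ml ethanol
    print("final LiCl:", final_molarity(2.0, 0.4, 0.4 + 0.4 + 0.8), "M")
```

Under these assumptions, the LiCl step in the text (0.4 ml DNA solution, 0.4 ml of 2 M LiCl, 0.8 ml ethanol) works out to a final LiCl concentration of about 0.5 M in 1.6 ml.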
Example 2: Construction of genomic libraries in Escherichia coli of Corynebacterium glutamicum ATCC13032
Using DNA prepared as described in Example 1, cosmid and plasmid libraries were constructed according to known and well established methods (see Sambrook, J. et al. (1989) "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory Press, or Ausubel, F.M. et al. (1994) "Current Protocols in Molecular Biology", John Wiley & Sons).
Any plasmid or cosmid could be used. Of particular use were the plasmids pBR322 (Sutcliffe, J.G. (1979) Proc. Natl. Acad. Sci. USA, 75:3737-3741); pACYC177 (Chang & Cohen (1978) J. Bacteriol. 134:1141-1156), plasmids of the pBS series (pBSSK+, pBSSK- and others; Stratagene, LaJolla, USA), or cosmids such as SuperCos1 (Stratagene, LaJolla, USA) or Lorist6 (Gibson, T.J., Rosenthal, A. and Waterson, R.H. (1987) Gene 53:283-286). Gene libraries specifically for use in C. glutamicum may be constructed using plasmid pSL109 (Lee and A.J. Sinskey (1994) J. Microbiol. Biotechnol. 4:256-263).
Example 3: DNA Sequencing and Computational Functional Analysis
Genomic libraries as described in Example 2 were used for DNA sequencing according to standard methods, in particular by the chain termination method using ABI377 sequencing machines (see Fleischman, R.D. et al. (1995) "Whole-genome Random Sequencing and Assembly of Haemophilus Influenzae Rd.", Science, 269:496-512). Sequencing primers with the following nucleotide sequences were used: 5'-GGAAACAGTATGACCATG-3' (SEQ ID NO. 1157) or 5'-GTAAAACGACGGCCAGT-3' (SEQ ID NO. 1158).
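As a quick sanity check on the two sequencing primers quoted in Example 3, the following sketch computes their length, GC fraction, reverse complement and a rough melting temperature. It is not part of the described method: the function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb for short oligonucleotides that is assumed here purely for illustration.

```python
"""Basic properties of the two sequencing primers quoted in Example 3 (illustrative only)."""

# Primer sequences exactly as given in the text, written 5' -> 3'
PRIMERS = {
    "SEQ ID NO. 1157": "GGAAACAGTATGACCATG",
    "SEQ ID NO. 1158": "GTAAAACGACGGCCAGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt",
              f"GC={gc_fraction(seq):.2f}",
              f"Tm~{wallace_tm(seq)} C",
              "revcomp:", reverse_complement(seq))
```

The reverse complement is the sequence on the template strand to which each primer anneals, which is useful when locating the primer binding site in a vector sequence.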
Example 4: In vivo Mutagenesis
In vivo mutagenesis of Corynebacterium glutamicum can be performed by passage of plasmid (or other vector) DNA through E. coli or other microorganisms (e.g., Bacillus spp. or yeasts such as Saccharomyces cerevisiae) which are impaired in their capabilities to maintain the integrity of their genetic information. Typical mutator strains have mutations in the genes for the DNA repair system (e.g., mutHLS, mutD, mutT, etc.; for reference, see Rupp, W.D. (1996) DNA repair mechanisms, in: Escherichia coli and Salmonella, p. 2277-2294, ASM: Washington). Such strains are well known to those of ordinary skill in the art. The use of such strains is illustrated, for example, in Greener, A. and Callahan, M. (1994) Strategies 7: 32-34.
Example 5: DNA Transfer Between Escherichia coli and Corynebacterium glutamicum
Several Corynebacterium and Brevibacterium species contain endogenous plasmids (such as pHM1519 or pBL1) which replicate autonomously (for review see, e.g., Martin, J.F. et al. (1987) Biotechnology, 5:137-146). Shuttle vectors for Escherichia coli and Corynebacterium glutamicum can be readily constructed by using standard vectors for E. coli (Sambrook, J. et al. (1989), "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory Press or Ausubel, F.M. et al. (1994) "Current Protocols in Molecular Biology", John Wiley & Sons) to which an origin of replication for and a suitable marker from Corynebacterium glutamicum are added. Such origins of replication are preferably taken from endogenous plasmids isolated from Corynebacterium and Brevibacterium species. Of particular use as transformation markers for these species are genes for kanamycin resistance (such as those derived from the Tn5 or Tn903 transposons) or chloramphenicol resistance (Winnacker, E.L. (1987) "From Genes to Clones: Introduction to Gene Technology", VCH, Weinheim). There are numerous examples in the literature of the construction of a wide variety of shuttle vectors which replicate in both E. coli and C. glutamicum, and which can be used for several purposes, including gene overexpression (for reference, see, e.g., Yoshihama, M. et al. (1985) J. Bacteriol. 162:591-597, Martin, J.F. et al. (1987) Biotechnology, 5:137-146 and Eikmanns, B.J. et al. (1991) Gene, 102:93-98).
Using standard methods, it is possible to clone a gene of interest into one of the shuttle vectors described above and to introduce such hybrid vectors into strains of Corynebacterium glutamicum. Transformation of C. glutamicum can be achieved by protoplast transformation (Katsumata, R. et al. (1984) J. Bacteriol. 159:306-311), electroporation (Liebl, E. et al. (1989) FEMS Microbiol. Letters, 53:399-303) and, in cases where special vectors are used, also by conjugation (as described, e.g., in Schafer, A. et al. (1990) J. Bacteriol. 172:1663-1666). It is also possible to transfer the shuttle vectors for C. glutamicum to E. coli by preparing plasmid DNA from C. glutamicum (using standard methods well known in the art) and transforming it into E. coli. This transformation step can be performed using standard methods, but it is advantageous to use an Mcr-deficient E. coli strain, such as NM522 (Gough & Murray (1983) J. Mol. Biol. 166:1-19).
Genes may be overexpressed in C. glutamicum strains using plasmids which comprise pCG1 (U.S. Patent No. 4,617,267) or fragments thereof, and optionally the gene for kanamycin resistance from Tn903 (Grindley, N.D. and Joyce, C.M. (1980) Proc. Natl. Acad. Sci. USA 77(12): 7176-7180). In addition, genes may be overexpressed in C. glutamicum strains using plasmid pSL109 (Lee and A.J. Sinskey (1994) J. Microbiol. Biotechnol. 4: 256-263).
Aside from the use of replicative plasmids, gene overexpression can also be achieved by integration into the genome. Genomic integration in C. glutamicum or other Corynebacterium or Brevibacterium species may be accomplished by well-known methods, such as homologous recombination with genomic region(s), restriction endonuclease mediated integration (REMI) (see, e.g., DE Patent 19823834), or through the use of transposons. It is also possible to modulate the activity of a gene of interest by modifying the regulatory regions (e.g., a promoter, a repressor, and/or an enhancer) by sequence modification, insertion, or deletion using site-directed methods (such as homologous recombination) or methods based on random events (such as transposon mutagenesis or REMI). Nucleic acid sequences which function as transcriptional terminators may also be inserted 3' to the coding region of one or more genes of the invention; such terminators are well known in the art and are described, for example, in Winnacker, E.L. (1987) From Genes to Clones: Introduction to Gene Technology. VCH: Weinheim.
Example 6: Assessment of the Expression of the Mutant Protein
Observations of the activity of a mutated protein in a transformed host cell rely on the fact that the mutant protein is expressed in a similar fashion and in a similar quantity to that of the wild-type protein. A useful method to ascertain the level of transcription of the mutant gene (an indicator of the amount of mRNA available for translation to the gene product) is to perform a Northern blot (for reference see, for example, Ausubel et al. (1988) Current Protocols in Molecular Biology, Wiley: New York), in which a primer designed to bind to the gene of interest is labeled with a detectable tag (usually radioactive or chemiluminescent), such that when the total RNA of a culture of the organism is extracted, run on gel, transferred to a stable matrix and incubated with this probe, the binding and quantity of binding of the probe indicate the presence and also the quantity of mRNA for this gene. This information is evidence of the degree of transcription of the mutant gene. Total cellular RNA can be prepared from Corynebacterium glutamicum by several methods, all well known in the art, such as that described in Bormann, E.R. et al. (1992) Mol. Microbiol. 6: 317-326.
To assess the presence or relative quantity of protein translated from this mRNA, standard techniques, such as a Western blot, may be employed (see, for example, Ausubel et al. (1988) Current Protocols in Molecular Biology, Wiley: New York). In this process, total cellular proteins are extracted, separated by gel electrophoresis, transferred to a matrix such as nitrocellulose, and incubated with a probe, such as an antibody, which specifically binds to the desired protein. This probe is generally tagged with a chemiluminescent or colorimetric label which may be readily detected. The presence and quantity of label observed indicates the presence and quantity of the desired mutant protein present in the cell.
Example 7: Growth of Genetically Modified Corynebacterium glutamicum - Media and Culture Conditions
Genetically modified Corynebacteria are cultured in synthetic or natural growth media. A number of different growth media for Corynebacteria are both well known and readily available (Liebl et al. (1989) Appl. Microbiol. Biotechnol., 32:205-210; von der Osten et al. (1998) Biotechnology Letters, 11:11-16; Patent DE 4,120,867; Liebl (1992) "The Genus Corynebacterium", in: The Procaryotes, Volume II, Balows, A. et al., eds. Springer-Verlag). These media consist of one or more carbon sources, nitrogen sources, inorganic salts, vitamins and trace elements. Preferred carbon sources are sugars, such as mono-, di-, or polysaccharides. For example, glucose, fructose, mannose, galactose, ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch or cellulose serve as very good carbon sources. It is also possible to supply sugar to the media via complex compounds such as molasses or other by-products from sugar refinement. It can also be advantageous to supply mixtures of different carbon sources. Other possible carbon sources are alcohols and organic acids, such as methanol, ethanol, acetic acid or lactic acid. Nitrogen sources are usually organic or inorganic nitrogen compounds, or materials which contain these compounds. Exemplary nitrogen sources include ammonia gas or ammonia salts, such as NH4Cl or (NH4)2SO4, NH4OH, nitrates, urea, amino acids or complex nitrogen sources like corn steep liquor, soy bean flour, soy bean protein, yeast extract, meat extract and others.
Inorganic salt compounds which may be included in the media include the chloride, phosphorous or sulfate salts of calcium, magnesium, sodium, cobalt, molybdenum, potassium, manganese, zinc, copper and iron. Chelating compounds can be added to the medium to keep the metal ions in solution. Particularly useful chelating compounds include dihydroxyphenols, like catechol or protocatechuate, or organic acids, such as citric acid. It is typical for the media to also contain other growth factors, such as vitamins or growth promoters, examples of which include biotin, riboflavin, thiamin, folic acid, nicotinic acid, pantothenate and pyridoxin. Growth factors and salts frequently originate from complex media components such as yeast extract, molasses, corn steep liquor and others. The exact composition of the media compounds depends strongly on the immediate experiment and is individually decided for each specific case. Information about media optimization is available in the textbook "Applied Microbiol. Physiology, A Practical Approach" (eds. P.M. Rhodes, P.F. Stanbury, IRL Press (1997) pp. 53-73, ISBN 0 19 963577 3). It is also possible to select growth media from commercial suppliers, like Standard 1 (Merck) or BHI (brain heart infusion, DIFCO) or others.
All medium components are sterilized, either by heat (20 minutes at 1.5 bar and 121°C) or by sterile filtration. The components can either be sterilized together or, if necessary, separately. All media components can be present at the beginning of growth, or they can optionally be added continuously or batchwise.
Culture conditions are defined separately for each experiment. The temperature should be in a range between 15°C and 45°C. The temperature can be kept constant or can be altered during the experiment. The pH of the medium should be in the range of 5 to 8.5, preferably around 7.0, and can be maintained by the addition of buffers to the media. An exemplary buffer for this purpose is a potassium phosphate buffer. Synthetic buffers such as MOPS, HEPES, ACES and others can alternatively or simultaneously be used. It is also possible to maintain a constant culture pH through the addition of NaOH or NH4OH during growth. If complex medium components such as yeast extract are utilized, the necessity for additional buffers may be reduced, due to the fact that many complex compounds have high buffer capacities. If a fermentor is utilized for culturing the microorganisms, the pH can also be controlled using gaseous ammonia.
The incubation time is usually in a range from several hours to several days. This time is selected in order to permit the maximal amount of product to accumulate in the broth. The disclosed growth experiments can be carried out in a variety of vessels, such as microtiter plates, glass tubes, glass flasks or glass or metal fermentors of different sizes. For screening a large number of clones, the microorganisms should be cultured in microtiter plates, glass tubes or shake flasks, either with or without baffles. Preferably 100 ml shake flasks are used, filled with 10% (by volume) of the required growth medium. The flasks should be shaken on a rotary shaker (amplitude 25 mm) using a speed range of 100-300 rpm. Evaporation losses can be diminished by the maintenance of a humid atmosphere; alternatively, a mathematical correction for evaporation losses should be performed.
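The text does not specify how the mathematical correction for evaporation is carried out. One simple possibility, shown here purely as a sketch, assumes that only water evaporates, so that a measured concentration is scaled by the ratio of the remaining volume to the starting volume. The function names and example volumes are hypothetical.

```python
# Sketch: shake-flask fill volume and a simple evaporation correction.
# Assumption (not from the text): only water is lost, so solutes are
# concentrated by the factor initial_volume / final_volume.

def fill_volume_ml(flask_volume_ml: float, fill_fraction: float = 0.10) -> float:
    """Medium volume for a flask filled to the given fraction (10% in the text)."""
    return flask_volume_ml * fill_fraction


def evaporation_corrected(measured_conc: float,
                          initial_volume_ml: float,
                          final_volume_ml: float) -> float:
    """Concentration referred back to the starting culture volume."""
    if initial_volume_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("volumes must be positive")
    return measured_conc * final_volume_ml / initial_volume_ml


if __name__ == "__main__":
    start = fill_volume_ml(100.0)          # 100 ml flask, 10% fill
    # Hypothetical case: 2 ml lost to evaporation, 5.0 g/l measured at the end.
    print(start, evaporation_corrected(5.0, start, start - 2.0))
```

In practice the remaining volume would typically be determined by weighing the flasks before and after incubation.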
If genetically modified clones are tested, an unmodified control clone or a control clone containing the basic plasmid without any insert should also be tested. The medium is inoculated to an OD600 of 0.5-1.5 using cells grown on agar plates, such as CM plates (10 g/l glucose, 2.5 g/l NaCl, 2 g/l urea, 10 g/l polypeptone, 5 g/l yeast extract, 5 g/l meat extract, 22 g/l agar, pH 6.8 with 2 M NaOH) that had been incubated at 30°C. Inoculation of the media is accomplished by either introduction of a saline suspension of C. glutamicum cells from CM plates or addition of a liquid preculture of this bacterium.
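For illustration, the CM plate composition can be scaled to an arbitrary batch volume as sketched below. The concentrations are read from the recipe in the text (taking each component once, with NaCl at 2.5 g/l and agar at 22 g/l); the dictionary and function names are hypothetical.

```python
# Sketch: scale the CM plate recipe (g/l) to a batch volume.
# Concentrations follow the recipe quoted in the text; names are illustrative.
CM_PLATE_G_PER_L = {
    "glucose": 10.0,
    "NaCl": 2.5,
    "urea": 2.0,
    "polypeptone": 10.0,
    "yeast extract": 5.0,
    "meat extract": 5.0,
    "agar": 22.0,
}


def grams_for_batch(recipe_g_per_l: dict, volume_l: float) -> dict:
    """Return the mass in grams of each component for the given volume."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


if __name__ == "__main__":
    for name, grams in grams_for_batch(CM_PLATE_G_PER_L, 0.5).items():
        print(f"{name}: {grams:g} g")
```

The pH adjustment to 6.8 with 2 M NaOH is a titration step and is therefore not part of the calculation.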
Example 8: In vitro Analysis of the Function of Mutant Proteins
The determination of activities and kinetic parameters of enzymes is well established in the art. Experiments to determine the activity of any given altered enzyme must be tailored to the specific activity of the wild-type enzyme, which is well within the ability of one of ordinary skill in the art. Overviews about enzymes in general, as well as specific details concerning structure, kinetics, principles, methods, applications and examples for the determination of many enzyme activities may be found, for example, in the following references: Dixon and Webb (1979) Enzymes. Longmans: London; Fersht (1985) Enzyme Structure and Mechanism. Freeman: New York; Walsh (1979) Enzymatic Reaction Mechanisms. Freeman: San Francisco; Price, Stevens, L. (1982) Fundamentals of Enzymology. Oxford Univ. Press: Oxford; Boyer, ed. (1983) The Enzymes, 3rd ed. Academic Press: New York; Bisswanger (1994) Enzymkinetik, 2nd ed. VCH: Weinheim (ISBN 3527300325); Bergmeyer, Bergmeyer, Graßl, eds. (1983-1986) Methods of Enzymatic Analysis, 3rd ed., vol. I-XII, Verlag Chemie: Weinheim; and Ullmann's Encyclopedia of Industrial Chemistry (1987) vol. A9, "Enzymes". VCH: Weinheim, p. 352-363.
The activity of proteins which bind to DNA can be measured by several well-established methods, such as DNA band-shift assays (also called gel retardation assays). The effect of such proteins on the expression of other molecules can be measured using reporter gene assays (such as that described in Kolmar, H. et al. (1995) EMBO J. 14: 3895-3904 and references cited therein). Reporter gene test systems are well known and established for applications in both pro- and eukaryotic cells, using enzymes such as beta-galactosidase, green fluorescent protein, and several others.
The determination of activity of membrane-transport proteins can be performed according to techniques such as those described in Gennis, R.B. (1989) "Pores, Channels and Transporters", in Biomembranes, Molecular Structure and Function, Springer: Heidelberg, p. 85-137; 199-234; and 270-322.
Example 9: Analysis of Impact of Mutant Protein on the Production of the Desired Product
The effect of the genetic modification in C. glutamicum on production of a desired compound (such as an amino acid) can be assessed by growing the modified microorganism under suitable conditions (such as those described above) and analyzing the medium and/or the cellular component for increased production of the desired product (i.e., an amino acid). Such analysis techniques are well known to one of ordinary skill in the art, and include spectroscopy, thin layer chromatography, staining methods of various kinds, enzymatic and microbiological methods, and analytical chromatography such as high performance liquid chromatography (see, for example, Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 89-90 and p. 443-613, VCH: Weinheim (1985); Fallon, A. et al. (1987) "Applications of HPLC in Biochemistry" in: Laboratory Techniques in Biochemistry and Molecular Biology, vol. 17; Rehm et al. (1993) Biotechnology, vol. 3, Chapter III: "Product recovery and purification", page 469-714, VCH: Weinheim; Belter, P.A. et al. (1988) Bioseparations: downstream processing for biotechnology, John Wiley and Sons; Kennedy, J.F. and Cabral, J.M.S. (1992) Recovery processes for biological materials, John Wiley and Sons; Shaeiwitz, J.A. and Henry, J.D. (1988) Biochemical separations, in: Ullmann's Encyclopedia of Industrial Chemistry, vol. B3, Chapter 11, page 1-27, VCH: Weinheim; and Dechow, F.J. (1989) Separation and purification techniques in biotechnology, Noyes Publications).
In addition to the measurement of the final product of fermentation, it is also possible to analyze other components of the metabolic pathways utilized for the production of the desired compound, such as intermediates and side-products, to determine the overall efficiency of production of the compound. Analysis methods include measurements of nutrient levels in the medium (e.g., sugars, hydrocarbons, nitrogen sources, phosphate, and other ions), measurements of biomass composition and growth, analysis of the production of common metabolites of biosynthetic pathways, and measurement of gasses produced during fermentation. Standard methods for these measurements are outlined in Applied Microbial Physiology, A Practical Approach, P.M. Rhodes and P.F. Stanbury, eds., IRL Press, p. 103-129; 131-163; and 165-192 (ISBN: 0199635773) and references cited therein.
Example 10: Purification of the Desired Product from C. glutamicum Culture
Recovery of the desired product from the C. glutamicum cells or supernatant of the above-described culture can be performed by various methods well known in the art. If the desired product is not secreted from the cells, the cells can be harvested from the culture by low-speed centrifugation and lysed by standard techniques, such as mechanical force or sonication. The cellular debris is removed by centrifugation, and the supernatant fraction containing the soluble proteins is retained for further purification of the desired compound. If the product is secreted from the C. glutamicum cells, then the cells are removed from the culture by low-speed centrifugation, and the supernatant fraction is retained for further purification.
The supernatant fraction from either purification method is subjected to chromatography with a suitable resin, in which the desired molecule is either retained on a chromatography resin while many of the impurities in the sample are not, or where the impurities are retained by the resin while the sample is not. Such chromatography steps may be repeated as necessary, using the same or different chromatography resins. One of ordinary skill in the art would be well-versed in the selection of appropriate chromatography resins and in their most efficacious application for a particular molecule to be purified. The purified product may be concentrated by filtration or ultrafiltration, and stored at a temperature at which the stability of the product is maximized.
There is a wide array of purification methods known to the art, and the preceding method of purification is not meant to be limiting. Such purification techniques are described, for example, in Bailey, J.E. & Ollis, D.F. Biochemical Engineering Fundamentals, McGraw-Hill: New York (1986).
The identity and purity of the isolated compounds may be assessed by techniques standard in the art. These include high-performance liquid chromatography (HPLC), spectroscopic methods, staining methods, thin layer chromatography, NIRS, enzymatic assay, or microbiological assay. Such analysis methods are reviewed in: Patek et al. (1994) Appl. Environ. Microbiol. 60: 133-140; Malakhova et al. (1996) Biotekhnologiya 11: 27-32; Schmidt et al. (1998) Bioprocess Engineer. 19: 67-70; Ullmann's Encyclopedia of Industrial Chemistry (1996) vol. A27, VCH: Weinheim, p. 89-90, p. 521-540, p. 540-547, p. 559-566, p. 575-581 and p. 581-587; Michal, G. (1999) Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley and Sons; and Fallon, A. et al. (1987) Applications of HPLC in Biochemistry, in: Laboratory Techniques in Biochemistry and Molecular Biology, vol. 17.
Example 11: Analysis of the Gene Sequences of the Invention
The comparison of sequences and determination of percent homology between two sequences are art-known techniques, and can be accomplished using a mathematical algorithm, such as the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12, to obtain nucleotide sequences homologous to MP nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3, to obtain amino acid sequences homologous to MP protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, one of ordinary skill in the art will know how to optimize the parameters of the program (e.g., XBLAST and NBLAST) for the specific sequence being analyzed.
Another example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Meyers and Miller ((1988) Comput. Appl. Biosci. 4: 11-17). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art, and include ADVANCE and ADAM, described in Torelli and Robotti (1994) Comput. Appl. Biosci. 10:3-5; and FASTA, described in Pearson and Lipman (1988) P.N.A.S. 85:2444-8.
The percent homology between two amino acid sequences can also be determined using the GAP program in the GCG software package (available at http://www.gcg.com), using either a Blosum 62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. The percent homology between two nucleic acid sequences can be determined using the GAP program in the GCG software package, using standard parameters, such as a gap weight of 50 and a length weight of 3.
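To make the idea of a global alignment and a percent identity value concrete, the following is a minimal sketch of a Needleman-Wunsch style alignment. It is not the GCG GAP program: it uses a simple match/mismatch score and a single linear gap penalty instead of a Blosum 62 or PAM250 matrix with separate gap and length weights, and all function names and default scores are assumptions made for illustration.

```python
# Sketch: global alignment (Needleman-Wunsch style) with a linear gap penalty,
# followed by a percent identity calculation over the aligned columns.
# Simplified stand-in for illustration; not the GCG GAP implementation.

def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a: str, aln_b: str) -> float:
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aln_a:
        return 0.0
    same = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * same / len(aln_a)


if __name__ == "__main__":
    top, bottom = global_align("ACGT", "ACT")
    print(top, bottom, percent_identity(top, bottom))
```

Different programs normalize percent identity differently (for example, by the shorter sequence rather than by alignment length), so values from this sketch are not directly comparable with GAP output.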
A comparative analysis of the gene sequences of the invention with those present in Genbank has been performed using techniques known in the art (see, e.g., Baxevanis and Ouellette, eds. (1998) Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. John Wiley and Sons: New York). The gene sequences of the invention were compared to genes present in Genbank in a three-step process. In a first step, a BLASTN analysis (e.g., a local alignment analysis) was performed for each of the sequences of the invention against the nucleotide sequences present in Genbank, and the top 500 hits were retained for further analysis. A subsequent FASTA search (e.g., a combined local and global alignment analysis, in which limited regions of the sequences are aligned) was performed on these 500 hits. Each gene sequence of the invention was subsequently globally aligned to each of the top three FASTA hits, using the GAP program in the GCG software package (using standard parameters). In order to obtain correct results, the length of the sequences extracted from Genbank was adjusted to the length of the query sequences by methods well known in the art. The results of this analysis are set forth in Table 4. The resulting data is identical to that which would have been obtained had a GAP (global) analysis alone been performed on each of the genes of the invention in comparison with each of the references in Genbank, but required significantly reduced computational time as compared to such a database-wide GAP (global) analysis. Sequences of the invention for which no alignments above the cutoff values were obtained are indicated on Table 4 by the absence of alignment information.
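The three-step comparison can be pictured as a filtering funnel: a fast search keeps the top 500 database entries, a second search narrows these to the top three, and only those three are aligned globally. The sketch below shows that control flow only. The scoring callables are placeholders supplied by the caller; the toy `shared_kmers` and `positional_identity` functions are hypothetical stand-ins and do not reproduce BLASTN, FASTA or GAP.

```python
# Sketch: the three-step funnel (fast search -> refined search -> global
# alignment). The scorers are caller-supplied placeholders, not real
# BLASTN/FASTA/GAP implementations.
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def top_hits(query: str, candidates: Sequence[str], scorer: Scorer, keep: int) -> List[str]:
    """Rank candidates by score (highest first) and keep the best `keep`."""
    ranked = sorted(candidates, key=lambda c: scorer(query, c), reverse=True)
    return ranked[:keep]


def three_step_comparison(query: str,
                          database: Sequence[str],
                          local_score: Scorer,
                          refined_score: Scorer,
                          global_score: Scorer,
                          first_keep: int = 500,
                          final_keep: int = 3) -> List[Tuple[str, float]]:
    step1 = top_hits(query, database, local_score, first_keep)   # cf. BLASTN step
    step2 = top_hits(query, step1, refined_score, final_keep)    # cf. FASTA step
    return [(hit, global_score(query, hit)) for hit in step2]    # cf. GAP step


def shared_kmers(a: str, b: str, k: int = 3) -> float:
    """Toy similarity: number of distinct k-mers the two sequences share."""
    kmers = lambda s: {s[i:i + k] for i in range(len(s) - k + 1)}
    return float(len(kmers(a) & kmers(b)))


def positional_identity(a: str, b: str) -> float:
    """Toy global score: count of identical characters at equal positions."""
    return float(sum(x == y for x, y in zip(a, b)))


if __name__ == "__main__":
    db = ["ACGTACGT", "ACGTTTTT", "GGGGGGGG", "ACGAACGT"]
    print(three_step_comparison("ACGTACGT", db, shared_kmers, shared_kmers,
                                positional_identity, first_keep=3, final_keep=2))
```

The design point is that the expensive global alignment runs on only a handful of pre-filtered candidates per query rather than on the whole database.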
It will further be understood by one of ordinary skill in the art that the GAP alignment homology percentages set forth in Table 4 under the heading "% homology (GAP)" are listed in the European numerical format, wherein a "," represents a decimal point. For example, a value of "40,345" in this column represents "40.345%".
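When such table values are processed by software, the decimal comma has to be converted before parsing. A minimal sketch follows; the function name is hypothetical, and it assumes the values contain no thousands separators.

```python
# Sketch: parse a percentage written with a decimal comma ("40,345" -> 40.345).
# Assumes no thousands separators are present in the input.

def parse_european_percent(text: str) -> float:
    cleaned = text.strip().rstrip("%").strip()
    return float(cleaned.replace(",", "."))


if __name__ == "__main__":
    print(parse_european_percent("40,345"))
```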
Example 12: Construction and Operation of DNA Microarrays
The sequences of the invention may additionally be used in the construction and application of DNA microarrays (the design, methodology, and uses of DNA arrays are well known in the art, and are described, for example, in Schena, M. et al. (1995) Science 270: 467-470; Wodicka, L. et al. (1997) Nature Biotechnology 15: 1359-1367; DeSaizieu, A. et al. (1998) Nature Biotechnology 16: 45-48; and DeRisi, J.L. et al. (1997) Science 278: 680-686).
DNA microarrays are solid or flexible supports consisting of nitrocellulose, nylon, glass, silicone, or other materials. Nucleic acid molecules may be attached to the surface in an ordered manner. After appropriate labeling, other nucleic acids or nucleic acid mixtures can be hybridized to the immobilized nucleic acid molecules, and the label may be used to monitor and measure the individual signal intensities of the hybridized molecules at defined regions. This methodology allows the simultaneous quantification of the relative or absolute amount of all or selected nucleic acids in the applied nucleic acid sample or mixture. DNA microarrays, therefore, permit an analysis of the expression of multiple (as many as 6800 or more) nucleic acids in parallel (see, e.g., Schena, M. (1996) BioEssays 18(5): 427-431).
The sequences of the invention may be used to design oligonucleotide primers which are able to amplify defined regions of one or more C. glutamicum genes by a nucleic acid amplification reaction such as the polymerase chain reaction. The choice and design of the 5' or 3' oligonucleotide primers or of appropriate linkers allows the covalent attachment of the resulting PCR products to the surface of a support medium described above (and also described, for example, in Schena, M. et al. (1995) Science 270: 467-470).
Nucleic acid microarrays may also be constructed by in situ oligonucleotide synthesis as described by Wodicka, L. et al. (1997) Nature Biotechnology 15: 1359-1367. By photolithographic methods, precisely defined regions of the matrix are exposed to light. Protective groups which are photolabile are thereby activated and undergo nucleotide addition, whereas regions that are masked from light do not undergo any modification. Subsequent cycles of protection and light activation permit the synthesis of different oligonucleotides at defined positions. Small, defined regions of the genes of the invention may be synthesized on microarrays by solid phase oligonucleotide synthesis.
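The cycle-by-cycle logic of light-directed synthesis described above can be pictured with a small simulation: in each cycle one nucleotide is offered, and only the array positions exposed to light are extended. This is a conceptual sketch only; the function name, the data layout and the example masks are hypothetical, and no chemistry (coupling efficiency, deprotection) is modeled.

```python
# Sketch: conceptual simulation of light-directed in situ oligonucleotide
# synthesis. Each cycle offers one base; only exposed positions are extended.
from typing import Iterable, List, Set, Tuple


def synthesize(n_positions: int, cycles: Iterable[Tuple[str, Set[int]]]) -> List[str]:
    """Return the oligonucleotide built at each array position.

    cycles: (base, exposed_positions) pairs applied in order; positions that
    stay masked in a cycle are left unchanged in that cycle.
    """
    oligos = [""] * n_positions
    for base, exposed in cycles:
        for pos in exposed:
            if not 0 <= pos < n_positions:
                raise IndexError(f"position {pos} is outside the array")
            oligos[pos] += base
    return oligos


if __name__ == "__main__":
    example_cycles = [("A", {0, 1}), ("C", {1, 2}), ("G", {0}), ("T", {0, 1, 2})]
    print(synthesize(3, example_cycles))
```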
The nucleic acid molecules of the invention present in a sample or mixture of nucleotides may be hybridized to the microarrays. These nucleic acid molecules can be labeled according to standard methods. In brief, nucleic acid molecules (e.g., mRNA molecules or DNA molecules) are labeled by the incorporation of isotopically or fluorescently labeled nucleotides, e.g., during reverse transcription or DNA synthesis.
Hybridization of labeled nucleic acids to microarrays is described (e.g., in Schena, M. et al. (1995) supra; Wodicka, L. et al. (1997), supra; and DeSaizieu, A. et al. (1998), supra). The detection and quantification of the hybridized molecule are tailored to the specific incorporated label. Radioactive labels can be detected, for example, as described in Schena, M. et al. (1995) supra, and fluorescent labels may be detected, for example, by the method of Shalon et al. (1996) Genome Research 6: 639-645.
The application of the sequences of the invention to DNA microarray technology, as described above, permits comparative analyses of different strains of C. glutamicum or other Corynebacteria. For example, studies of inter-strain variations based on individual transcript profiles and the identification of genes that are important for specific and/or desired strain properties such as pathogenicity, productivity and stress tolerance are facilitated by nucleic acid array methodologies. Also, comparisons of the profile of expression of genes of the invention during the course of a fermentation reaction are possible using nucleic acid array technology.
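A common way to express such inter-strain or time-course comparisons, offered here only as an illustrative sketch, is the log2 ratio of signal intensities for each gene. The text does not prescribe this measure; the function name, the intensity floor and the example values are assumptions.

```python
# Sketch: per-gene log2 ratio of microarray signal intensities between a
# sample and a reference (e.g., two strains or two fermentation time points).
# The floor guards against division by zero for very weak signals.
import math
from typing import Dict


def log2_ratios(sample: Dict[str, float],
                reference: Dict[str, float],
                floor: float = 1.0) -> Dict[str, float]:
    """log2(sample/reference) for every gene present in both measurements."""
    return {
        gene: math.log2(max(sample[gene], floor) / max(reference[gene], floor))
        for gene in sample
        if gene in reference
    }


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units.
    strain_a = {"geneA": 800.0, "geneB": 100.0, "geneC": 50.0}
    strain_b = {"geneA": 200.0, "geneB": 400.0}
    print(log2_ratios(strain_a, strain_b))
```

Positive values indicate higher signal in the sample, negative values higher signal in the reference; real analyses would add background subtraction and normalization before taking ratios.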
Example 13: Analysis of the Dynamics of Cellular Protein Populations (Proteomics)
The genes, compositions, and methods of the invention may be applied to study the interactions and dynamics of populations of proteins, termed 'proteomics'. Protein populations of interest include, but are not limited to, the total protein population of C. glutamicum (e.g., in comparison with the protein populations of other organisms), those proteins which are active under specific environmental or metabolic conditions (e.g., during fermentation, at high or low temperature, or at high or low pH), or those proteins which are active during specific phases of growth and development.
Protein populations can be analyzed by various well-known techniques, such as gel electrophoresis. Cellular proteins may be obtained, for example, by lysis or extraction, and may be separated from one another using a variety of electrophoretic techniques. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) separates proteins largely on the basis of their molecular weight. Isoelectric focusing polyacrylamide gel electrophoresis (IEF-PAGE) separates proteins by their isoelectric point (which reflects not only the amino acid sequence but also posttranslational modifications of the protein). Another, more preferred method of protein analysis is the consecutive combination of both IEF-PAGE and SDS-PAGE, known as 2-D-gel electrophoresis (described, for example, in Hermann et al. (1998) Electrophoresis 19: 3217-3221; Fountoulakis et al. (1998) Electrophoresis 19: 1193-1202; Langen et al. (1997) Electrophoresis 18: 1184-1192; Antelmann et al. (1997) Electrophoresis 18: 1451-1463). Other separation techniques may also be utilized for protein separation, such as capillary gel electrophoresis; such techniques are well known in the art.
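Because SDS-PAGE separates proteins largely by molecular weight, an expected gel position can be estimated from a protein sequence. The sketch below sums average residue masses and adds one water molecule for the free termini. It is an illustrative calculation only: it ignores posttranslational modifications, the function name is hypothetical, and the residue masses are standard average values rather than figures taken from this document.

```python
# Sketch: approximate average molecular weight of an unmodified protein from
# its one-letter amino acid sequence (sum of residue masses plus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Average molecular weight in daltons; raises on unknown residue codes."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    try:
        return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS
    except KeyError as exc:
        raise ValueError(f"unknown residue code: {exc.args[0]}") from None


if __name__ == "__main__":
    print(round(molecular_weight("MKV"), 2))
```

An isoelectric point estimate for the IEF dimension would need a separate calculation based on the pKa values of the ionizable groups and is not attempted here.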
Proteins separated by these methodologies can be visualized by standard techniques, such as by staining or labeling. Suitable stains are known in the art, and include Coomassie Brilliant Blue, silver stain, or fluorescent dyes such as Sypro Ruby (Molecular Probes). The inclusion of radioactively labeled amino acids or other protein precursors (e.g., 35S-methionine, 35S-cysteine, 14C-labelled amino acids, 15N-amino acids, 15NO3 or 15NH4+ or 13C-labelled amino acids) in the medium of C. glutamicum permits the labeling of proteins from these cells prior to their separation. Similarly, fluorescent labels may be employed. These labeled proteins can be extracted, isolated and separated according to the previously described techniques.
Proteins visualized by these techniques can be further analyzed by measuring the amount of dye or label used. The amount of a given protein can be determined quantitatively using, for example, optical methods and can be compared to the amount of other proteins in the same gel or in other gels. Comparisons of proteins on gels can be made, for example, by optical comparison, by spectroscopy, by image scanning and analysis of gels, or through the use of photographic films and screens. Such techniques are well-known in the art.
To determine the identity of any given protein, direct sequencing or other standard techniques may be employed. For example, N- and/or C-terminal amino acid sequencing (such as Edman degradation) may be used, as may mass spectrometry (in particular MALDI or ESI techniques (see, e.g., Langen et al. (1997) Electrophoresis 18: 1184-1192)). The protein sequences provided herein can be used for the identification of C. glutamicum proteins by these techniques.
The information obtained by these methods can be used to compare patterns of protein presence, activity, or modification between different samples from various biological conditions (e.g., different organisms, time points of fermentation, media conditions, or different biotopes, among others). Data obtained from such experiments alone, or in combination with other techniques, can be used for various applications, such as to compare the behavior of various organisms in a given (e.g., metabolic) situation, to increase the productivity of strains which produce fine chemicals or to increase the efficiency of the production of fine chemicals.
EQUIVALENTS
Those of ordinary skill in the art will recognize, or will be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.
Comprises/comprising and grammatical variations thereof when used in this specification are to be taken to specify the presence of stated features, integers, steps or components or groups thereof, but do not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.

Claims (22)

1. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:449, or a complement thereof.
2. An isolated nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:450, or a complement thereof.
3. An isolated nucleic acid molecule which encodes a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:450, or a complement thereof.
4. An isolated nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to the entire nucleotide sequence of SEQ ID NO:449, or a complement thereof.
5. An isolated nucleic acid molecule comprising a fragment of at least contiguous nucleotides of the nucleotide sequence of SEQ ID NO:449, or a complement thereof.
6. An isolated nucleic acid molecule which encodes a polypeptide comprising an amino acid sequence which is at least 50% identical to the entire amino acid sequence of SEQ ID NO:450, or a complement thereof.
7. An isolated nucleic acid molecule comprising the nucleic acid molecule of any one of claims 1-6 and a nucleotide sequence encoding a heterologous polypeptide.
8. A vector comprising the nucleic acid molecule of any one of claims 1-7.
9. The vector of claim 8, which is an expression vector.
10. A host cell transfected with the expression vector of claim 9.
11. The host cell of claim 10, wherein said cell is a microorganism.
12. The host cell of claim 11, wherein said cell belongs to the genus Corynebacterium or Brevibacterium.
13. The host cell of claim 10, wherein the expression of said nucleic acid molecule results in the modulation in production of a fine chemical from said cell.
14. The host cell of claim 13, wherein said fine chemical is selected from the group consisting of: organic acids, proteinogenic and nonproteinogenic amino acids, purine and pyrimidine bases, nucleosides, nucleotides, lipids, saturated and unsaturated fatty acids, diols, carbohydrates, aromatic compounds, vitamins, cofactors, polyketides, and enzymes.
15. A method of producing a polypeptide, the method comprising culturing the host cell of claim 10 in an appropriate culture medium to, thereby, produce the polypeptide.
16. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO:450.
17. An isolated polypeptide comprising a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence of SEQ ID NO:450.
18. An isolated polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 50% identical to the entire nucleotide sequence of SEQ ID NO:449.
19. An isolated polypeptide comprising an amino acid sequence which is at least 50% identical to the entire amino acid sequence of SEQ ID NO:450.
20. An isolated polypeptide comprising a fragment of a polypeptide comprising the amino acid sequence of SEQ ID NO:450, wherein said polypeptide fragment maintains a biological activity of the polypeptide comprising the amino acid sequence of SEQ ID NO:450.
21. An isolated polypeptide comprising an amino acid sequence which is encoded by a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:449.
22. The isolated polypeptide of any of claims 16-21, further comprising a heterologous amino acid sequence.
23. A method for producing a fine chemical, the method comprising culturing the cell of claim 10 such that the fine chemical is produced.
24. The method of claim 23, wherein said method further comprises the step of recovering the fine chemical from said culture.
25. The method of claim 23, wherein said cell belongs to the genus Corynebacterium or Brevibacterium.
26. The method of claim 23, wherein said cell is selected from the group consisting of: Corynebacterium glutamicum, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium acetophilum, Corynebacterium ammoniagenes, Corynebacterium fujiokense, Corynebacterium nitrilophilus, Brevibacterium ammoniagenes, Brevibacterium butanicum, Brevibacterium divaricatum, Brevibacterium flavum, Brevibacterium healii, Brevibacterium ketoglutamicum, Brevibacterium ketosoreductum, Brevibacterium lactofermentum, Brevibacterium linens, Brevibacterium paraffinolyticum, and those strains of Table 3.
27. The method of claim 23, wherein expression of the nucleic acid molecule from said vector results in modulation of production of said fine chemical.
28. The method of claim 23, wherein said fine chemical is selected from the group consisting of: organic acids, proteinogenic and nonproteinogenic amino acids, purine and pyrimidine bases, nucleosides, nucleotides, lipids, saturated and unsaturated fatty acids, diols, carbohydrates, aromatic compounds, vitamins, cofactors, polyketides and enzymes.
29. The method of claim 23, wherein said fine chemical is an amino acid.
30. The method of claim 29, wherein said amino acid is selected from the group consisting of: lysine, glutamate, glutamine, alanine, aspartate, glycine, serine, threonine, methionine, cysteine, valine, leucine, isoleucine, arginine, proline, histidine, tyrosine, phenylalanine, and tryptophan.
31. A method for producing a fine chemical, comprising culturing a cell whose genomic DNA has been altered by the introduction of a nucleic acid molecule of any one of claims 1-6.
32. A method for diagnosing the presence or activity of Corynebacterium diphtheriae, comprising detecting the presence of at least one of the nucleic acid molecules of any one of claims 1-6 or the polypeptide molecules of any one of claims 16-21, thereby diagnosing the presence or activity of Corynebacterium diphtheriae.
33. A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the nucleic acid molecule is disrupted.
34. A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the nucleic acid molecule comprises one or more nucleic acid modifications as compared to the sequence of SEQ ID NO:449.
35. A host cell comprising the nucleic acid molecule of SEQ ID NO:449, wherein the regulatory region of the nucleic acid molecule is modified relative to the wild-type regulatory region of the molecule.
AU2007202378A 1999-06-25 2007-05-25 Coryneracterium glutamicum genes encoding metabolic pathway proteins Abandoned AU2007202378A1 (en)

Applications Claiming Priority (59)

Application Number Priority Date Filing Date Title
US60/141031 1999-06-25
DE19930476 1999-07-01
US60/142101 1999-07-02
DE19931420 1999-07-08
DE19931541 1999-07-08
DE19931424 1999-07-08
DE19931435 1999-07-08
DE19931418 1999-07-08
DE19931592 1999-07-08
DE19931419 1999-07-08
DE19931632 1999-07-08
DE19931510 1999-07-08
DE19931465 1999-07-08
DE19931443 1999-07-08
DE19931428 1999-07-08
DE19931634 1999-07-08
DE19931415 1999-07-08
DE19931457 1999-07-08
DE19931453 1999-07-08
DE19931636 1999-07-08
DE19931434 1999-07-08
DE19931573 1999-07-08
DE19931478 1999-07-08
DE19932206 1999-07-09
DE19932228 1999-07-09
DE19932126 1999-07-09
DE19932229 1999-07-09
DE19932230 1999-07-09
DE19932125 1999-07-09
DE19932130 1999-07-09
DE19932227 1999-07-09
DE19932186 1999-07-09
DE19933006 1999-07-14
DE19933005 1999-07-14
DE19932928 1999-07-14
DE19932922 1999-07-14
DE19933004 1999-07-14
DE19932926 1999-07-14
US60/148613 1999-08-12
DE19940766 1999-08-27
DE19940832 1999-08-27
DE19940765 1999-08-27
DE19940764 1999-08-27
DE19941396 1999-08-31
DE19941378 1999-08-31
DE19941380 1999-08-31
DE19941379 1999-08-31
DE19941394 1999-08-31
DE19942077 1999-09-03
DE19942087 1999-09-03
DE19942095 1999-09-03
DE19942129 1999-09-03
DE19942088 1999-09-03
DE19942086 1999-09-03
DE19942079 1999-09-03
DE19942076 1999-09-03
DE19942124 1999-09-03
US60/187970 2000-03-09
AU2006200807A AU2006200807A1 (en) 1999-06-25 2006-02-24 Coryneracterium glutamicum genes encoding metabolic pathway proteins

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU2006200807A Division AU2006200807A1 (en) 1999-06-25 2006-02-24 Coryneracterium glutamicum genes encoding metabolic pathway proteins

Publications (1)

Publication Number Publication Date
AU2007202378A1 true AU2007202378A1 (en) 2007-06-14

Family

ID=38197529

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2007202378A Abandoned AU2007202378A1 (en) 1999-06-25 2007-05-25 Coryneracterium glutamicum genes encoding metabolic pathway proteins

Country Status (1)

Country Link
AU (1) AU2007202378A1 (en)

Similar Documents

Publication Publication Date Title
US7439050B2 (en) Corynebacterium glutamicum genes encoding diaminopimelate epimerase
US7510854B2 (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
US6696561B1 (en) Corynebacterium glutamicum genes encoding proteins involved in membrane synthesis and membrane transport
AU2001223903B2 (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
EP2292763A1 (en) Corynebacterium glutamicum genes encoding proteins involved in carbon metabolism and energy production
CN101082049A (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
US20080064067A1 (en) Corynebacterium glutamicum genes encoding phosphoenolpyruvate: sugar phosphotransferase system proteins
US7270984B1 (en) Polynucleotides encoding a 6-phosphogluconolactonase polypeptide from corynebacterium glutamicum
AU2001223903A1 (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
EP2272953B1 (en) Corynebacterium glutamicum gene encoding phosphoenolpyruvate-protein phosphotransferase
US20060269975A1 (en) Corynebacterium glutamicum genes encoding proteins involved in DNA replication, protein synthesis, and pathogenesis
CA2592965A1 (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
CA2587128A1 (en) Corynebacterium glutamicum genes encoding proteins involved in carbon metabolism and energy production
AU783709B2 (en) Corynebacterium glutamicum genes encoding metabolic pathway proteins
AU2007203039A1 (en) Corynebacterium glutamicum genese encoding proteins involved in carbon metabolism and energy production
AU2007202378A1 (en) Coryneracterium glutamicum genes encoding metabolic pathway proteins
CA2590625A1 (en) Corynebacterium glutamicum genes encoding phosphoenolpyruvate: sugar phospho-transferase system proteins
AU2007202318A1 (en) Corynebacterium Glutamicum genes encoding phosphoenolpyruvate: sugar phosphotransferase system proteins
AU2006200807A1 (en) Coryneracterium glutamicum genes encoding metabolic pathway proteins
EP1702980A1 (en) Corynebacterium glutamicum gene encoding Hpr of phosphoenolpyruvate:sugar phosphotransferase system

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application