WO2014059541A1 - Novel cell wall deconstruction enzymes of Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides, and uses thereof
- Publication number
- WO2014059541A1 (application PCT/CA2013/050778)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- polypeptide
- corth2p4
- psehe2p4
- unknown
- protein
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- D—TEXTILES; PAPER
- D06—TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
- D06M—TREATMENT, NOT PROVIDED FOR ELSEWHERE IN CLASS D06, OF FIBRES, THREADS, YARNS, FABRICS, FEATHERS OR FIBROUS GOODS MADE FROM SUCH MATERIALS
- D06M16/00—Biochemical treatment of fibres, threads, yarns, fabrics, or fibrous goods made from such materials, e.g. enzymatic
- D06M16/003—Biochemical treatment of fibres, threads, yarns, fabrics, or fibrous goods made from such materials, e.g. enzymatic with enzymes or microorganisms
-
- A—HUMAN NECESSITIES
- A21—BAKING; EDIBLE DOUGHS
- A21D—TREATMENT OF FLOUR OR DOUGH FOR BAKING, e.g. BY ADDITION OF MATERIALS; BAKING; BAKERY PRODUCTS
- A21D8/00—Methods for preparing or baking dough
- A21D8/02—Methods for preparing dough; Treating dough prior to baking
- A21D8/04—Methods for preparing dough; Treating dough prior to baking treating dough with microorganisms or enzymes
- A21D8/042—Methods for preparing dough; Treating dough prior to baking treating dough with microorganisms or enzymes with enzymes
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23K—FODDER
- A23K10/00—Animal feeding-stuffs
- A23K10/10—Animal feeding-stuffs obtained by microbiological or biochemical processes
- A23K10/14—Pretreatment of feeding-stuffs with enzymes
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23K—FODDER
- A23K10/00—Animal feeding-stuffs
- A23K10/30—Animal feeding-stuffs from material of plant origin, e.g. roots, seeds or hay; from material of fungal origin, e.g. mushrooms
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23K—FODDER
- A23K20/00—Accessory food factors for animal feeding-stuffs
- A23K20/10—Organic substances
- A23K20/189—Enzymes
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23K—FODDER
- A23K50/00—Feeding-stuffs specially adapted for particular animals
- A23K50/30—Feeding-stuffs specially adapted for particular animals for swines
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23K—FODDER
- A23K50/00—Feeding-stuffs specially adapted for particular animals
- A23K50/70—Feeding-stuffs specially adapted for particular animals for birds
- A23K50/75—Feeding-stuffs specially adapted for particular animals for birds for poultry
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23L—FOODS, FOODSTUFFS OR NON-ALCOHOLIC BEVERAGES, NOT OTHERWISE PROVIDED FOR; PREPARATION OR TREATMENT THEREOF
- A23L29/00—Foods or foodstuffs containing additives; Preparation or treatment thereof
- A23L29/06—Enzymes
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/37—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/02—Monosaccharides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/14—Preparation of compounds containing saccharide radicals produced by the action of a carbohydrase (EC 3.2.x), e.g. by alpha-amylase, e.g. by cellulase, hemicellulase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P7/00—Preparation of oxygen-containing organic compounds
- C12P7/02—Preparation of oxygen-containing organic compounds containing a hydroxy group
- C12P7/04—Preparation of oxygen-containing organic compounds containing a hydroxy group acyclic
- C12P7/06—Ethanol, i.e. non-beverage
- C12P7/08—Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate
- C12P7/10—Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate substrate containing cellulosic material
-
- D—TEXTILES; PAPER
- D06—TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
- D06P—DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
- D06P5/00—Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
- D06P5/02—After-treatment
-
- D—TEXTILES; PAPER
- D06—TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
- D06P—DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
- D06P5/00—Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
- D06P5/15—Locally discharging the dyes
- D06P5/158—Locally discharging the dyes with other compounds
-
- D—TEXTILES; PAPER
- D21—PAPER-MAKING; PRODUCTION OF CELLULOSE
- D21C—PRODUCTION OF CELLULOSE BY REMOVING NON-CELLULOSE SUBSTANCES FROM CELLULOSE-CONTAINING MATERIALS; REGENERATION OF PULPING LIQUORS; APPARATUS THEREFOR
- D21C11/00—Regeneration of pulp liquors or effluent waste waters
- D21C11/0007—Recovery of by-products, i.e. compounds other than those necessary for pulping, for multiple uses or not otherwise provided for
-
- D—TEXTILES; PAPER
- D21—PAPER-MAKING; PRODUCTION OF CELLULOSE
- D21C—PRODUCTION OF CELLULOSE BY REMOVING NON-CELLULOSE SUBSTANCES FROM CELLULOSE-CONTAINING MATERIALS; REGENERATION OF PULPING LIQUORS; APPARATUS THEREFOR
- D21C5/00—Other processes for obtaining cellulose, e.g. cooking cotton linters ; Processes characterised by the choice of cellulose-containing starting materials
- D21C5/005—Treatment of cellulose-containing material with microorganisms or enzymes
-
- D—TEXTILES; PAPER
- D21—PAPER-MAKING; PRODUCTION OF CELLULOSE
- D21H—PULP COMPOSITIONS; PREPARATION THEREOF NOT COVERED BY SUBCLASSES D21C OR D21D; IMPREGNATING OR COATING OF PAPER; TREATMENT OF FINISHED PAPER NOT COVERED BY CLASS B31 OR SUBCLASS D21G; PAPER NOT OTHERWISE PROVIDED FOR
- D21H17/00—Non-fibrous material added to the pulp, characterised by its constitution; Paper-impregnating material characterised by its constitution
- D21H17/005—Microorganisms or enzymes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E50/00—Technologies for the production of fuel of non-fossil origin
- Y02E50/10—Biofuels, e.g. bio-diesel
Definitions
- the present invention relates to novel polypeptides and enzymes having activities relating to biomass processing and/or degradation (e.g., cell wall deconstruction), as well as polynucleotides, vectors, cells, compositions and tools relating to same, or functional variants thereof. More particularly, the present invention relates to secreted enzymes that may be isolated from the fungi Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80. Uses thereof in various industrial processes such as in biofuels, food preparation, animal feed, pulp and paper, textiles, detergents, waste treatment and others are also disclosed.
- Biomass-processing enzymes have a number of industrial applications such as in: the biofuel industry (e.g., improving ethanol yield and/or increasing the efficiency and economy of ethanol production); the food industry (e.g., production of cereal-based food products); the feed-enzyme industry (e.g., increasing the digestibility/absorption of nutrients); the pulp and paper industry (e.g., enhancing bleachability of pulp); the textile industry (e.g., treatment of cellulose-based fabrics); the waste treatment industry (e.g., de-colorization of synthetic dyes); the detergent industry (e.g., providing eco-friendly cleaning products); and the rubber industry (e.g., catalyzing the conversion of latex into foam rubber).
- Conversion of plant biomass to glucose may also be enhanced by supplementing cellulase cocktails with enzymes that degrade the other components of biomass, including hemicelluloses, pectins and lignins, and their linkages, thereby improving the accessibility of cellulose to the cellulase enzymes.
- Such enzymes include, without limitation: xylanases, mannanases, arabinanases, esterases, glucuronidases, xyloglucanases and arabinofuranosidases for hemicelluloses; lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases for lignin; and pectate lyase, pectin lyase, polygalacturonase, pectin acetyl esterase, alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase, rhamnogalacturonan lyase, and rhamnogalacturonan acetyl esterase, xylogalacturonosidase, xylogalacturonase, and rham
- lignin modifying enzymes may be used to alter the structure of lignin to produce novel materials, and hemicellulases may be employed to produce 5-carbon sugars from hemicelluloses, which may then be further converted to chemical products.
- Cereal-based food products such as pasta, noodles and bread can be prepared from dough which is usually made from the basic ingredients (cereal) flour, water and optionally salt.
- Suitable enzymes include, for example, xylanase, starch degrading enzymes, oxidizing enzymes, fatty material splitting enzymes, and protein degrading, modifying or crosslinking enzymes.
- Amylases are used for the conversion of plant starches to glucose.
- Pectin-active enzymes are used in fruit processing, for example to increase the yield of juices, and in fruit juice clarification, as well as in other food processing steps.
- enzymes are used to make the bleaching process more effective and to reduce the use of oxidative chemicals.
- enzymatic treatment is often used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans, and can also improve the softness/feel of fabrics.
- enzymes can enhance cleaning ability or act as a softening agent.
- enzymes play an important role in changing the characteristics of the waste, for example, to become more amenable to further treatment and/or for bio-conversion to value-added products.
- thermostable enzymes and proteins that are "thermostable" in that they retain a level of their function or protein activity at temperatures about 50°C. These thermostable enzymes are highly desirable, for example, to be able to perform reactions at elevated temperatures to avoid or reduce contamination by microorganisms (e.g., bacteria).
- the present invention relates to soluble, secreted proteins relating to biomass processing and/or degradation (e.g., cell wall deconstruction) that may be isolated from the fungi Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80, as well as polynucleotides, vectors, compositions, cells, antibodies, kits, products and uses associated with same. Briefly, these fungal strains were cultured in vitro, and genomic DNA along with total RNA were isolated therefrom.
- nucleic acids were then used to determine/assemble fungal genomic sequences and generate cDNA libraries.
- Bioinformatic tools were used to predict genes in the assembled genomic sequences, and those genes encoding proteins relating to biomass-degradation (e.g., cell wall deconstruction) were identified based on bioinformatics (e.g., the presence of conserved domains). Sequences predicted to encode proteins which are targeted to the mitochondria or bound to the cell wall were removed.
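The selection and filtering step described above can be sketched roughly as follows. This is a minimal illustration only: the record fields (`domains`, `localization`), the domain labels, and the localization labels are hypothetical, since the source does not name the bioinformatic tools, domain databases, or output formats that were used.

```python
from dataclasses import dataclass, field

# Hypothetical labels standing in for "conserved domains" associated with
# biomass degradation; the source does not list the actual domains.
BIOMASS_DOMAINS = {"glycoside_hydrolase", "carbohydrate_esterase", "pectate_lyase"}

# Predicted localizations that the text says were removed.
EXCLUDED_LOCALIZATIONS = {"mitochondrial", "cell_wall_bound"}


@dataclass
class PredictedGene:
    gene_id: str                                  # illustrative identifier
    domains: set = field(default_factory=set)     # predicted conserved domains
    localization: str = "secreted"                # predicted localization (assumed label)


def select_candidates(genes):
    """Keep genes with at least one biomass-related domain whose product is
    not predicted to be mitochondrial or cell-wall bound."""
    return [
        g for g in genes
        if g.domains & BIOMASS_DOMAINS
        and g.localization not in EXCLUDED_LOCALIZATIONS
    ]


genes = [
    PredictedGene("gene_1", {"glycoside_hydrolase"}, "secreted"),
    PredictedGene("gene_2", {"glycoside_hydrolase"}, "mitochondrial"),
    PredictedGene("gene_3", {"kinase"}, "secreted"),
    PredictedGene("gene_4", {"pectate_lyase"}, "cell_wall_bound"),
]
selected = [g.gene_id for g in select_candidates(genes)]
print(selected)
```

Only `gene_1` passes both criteria in this toy input; a real pipeline would take its domain and localization calls from whatever prediction tools were actually run.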
- cDNA clones comprising full-length sequences predicted to encode soluble, secreted proteins relating to biomass-degradation were fully sequenced and cloned into appropriate expression vectors for protein production and characterization.
- the full-length genomic, exonic, intronic, coding and polypeptide sequences are disclosed herein, along with corresponding putative (biological) functions and/or protein activities, where available.
- the soluble, secreted, biomass degradation proteins of the present invention comprise a proteome which is referred to herein as the SSBD proteome of Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80.
- the present invention relates to an isolated polypeptide which is:
- polypeptide comprising an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to the polypeptide defined in (a);
- polypeptide comprising an amino acid sequence encoded by the nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514;
- polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);
- polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);
- the above mentioned polypeptide has a corresponding function and/or protein activity according to Tables 1A-1C.
- the above mentioned polypeptide comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039.
- the above mentioned polypeptide is a recombinant polypeptide.
- polypeptide is obtainable from a fungus.
- the fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella.
- the fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or
- the present invention relates to an antibody that specifically binds to any one of the above mentioned polypeptides.
- the present invention relates to an isolated polynucleotide molecule encoding any one of the above mentioned polypeptides.
- the present invention relates to an isolated polynucleotide molecule which is:
- polynucleotide molecule comprising a nucleic acid sequence encoding the polypeptide of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039;
- a polynucleotide molecule comprising a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to any one of the polynucleotide molecules defined in (a) to (d); or
- the above mentioned polynucleotide molecule is obtainable from a fungus.
- the fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella.
- the fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides.
- the present invention relates to a vector comprising any one of the above mentioned polynucleotide molecules.
- the vector comprises a regulatory sequence operatively linked to the polynucleotide molecule for expression of same in a suitable host cell.
- the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.
- the present invention relates to a recombinant host cell comprising any one of the above mentioned polynucleotide molecules or vectors.
- the present invention relates to a polypeptide obtainable by expressing the above mentioned polynucleotide or vector in a suitable host cell.
- the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.
- the present invention relates to a composition comprising any one of the above mentioned polypeptides or the recombinant host cells.
- the composition further comprises a suitable carrier.
- the composition further comprises a substrate of the polypeptide.
- the substrate is biomass.
- the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing a strain comprising the above mentioned polynucleotide molecule or vector under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide.
- the strain is a bacterial strain; a fungal strain; or a filamentous fungal strain.
- the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing the above mentioned recombinant host cell under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide.
- the present invention relates to a method for preparing a food product, the method comprising incorporating any one of the above mentioned polypeptides during preparation of the food product.
- the food product is a bakery product.
- the present invention relates to the use of the above mentioned polypeptide for the preparation or processing of a food product.
- the food product is a bakery product.
- the present invention relates to the use of any one of the above mentioned polypeptides for the preparation or processing of a food product.
- the food product is a bakery product.
- the present invention relates to the above mentioned polypeptide for use in the preparation or processing of a food product.
- the food product is a bakery product.
- the present invention relates to the use of any one of the above mentioned polypeptides for the preparation of animal feed. In some aspects, the present invention relates to the use of any one of the above mentioned polypeptides for increasing digestion or absorption of animal feed. In some aspects, the present invention relates to any one of the above mentioned polypeptides for use in the preparation of animal feed, or for increasing digestion or absorption of animal feed. In some embodiments, the animal feed is a cereal-based feed.
- the present invention relates to the use of any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some aspects the present invention relates to any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some embodiments, the processing comprises prebleaching and/or de-inking.
- the present invention relates to the use of any one of the above mentioned polypeptides for processing lignin. In some aspects the present invention relates to any one of the above mentioned polypeptides for processing lignin.
- the present invention relates to the use of any one of the above mentioned polypeptides for producing ethanol. In some aspects the present invention relates to any one of the above mentioned polypeptides for producing ethanol.
- the above mentioned uses are in conjunction with cellulose or a cellulase.
- the present invention relates to the use of any one of the above mentioned polypeptides for treating textiles or dyed textiles. In some aspects the present invention relates to any one of the above mentioned polypeptides for treating textiles or dyed textiles.
- In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for degrading biomass or pretreated biomass. In some aspects the present invention relates to any one of the above mentioned polypeptides for degrading biomass or pretreated biomass.
- the present invention relates to proteins and/or enzymes that are thermostable.
- a polypeptide of the present invention retains a level of its function and/or protein activity at about 50°C, about 55°C, about 60°C, about 65°C, about 70°C, about 75°C, about 80°C, or about 95°C.
- a polypeptide of the present invention retains a level of its function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C.
- a polypeptide of the present invention has optimal or maximal function and/or protein activity greater than 50°C, greater than 55°C, greater than 60°C, greater than 65°C, or greater than 70°C.
- a polypeptide of the present invention has optimal or maximal function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C.
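To make the nested temperature windows above concrete, the sketch below looks up which of the listed windows contain a measured optimum. It is an illustration under stated assumptions: the source qualifies every bound with "about", whereas this code treats the bounds as exact, and the function names are hypothetical.

```python
# Temperature windows in °C, taken from the ranges listed in the text.
# Assumption: "about" is ignored and the bounds are treated as exact.
CLAIMED_WINDOWS = [(50, 95), (50, 90), (50, 85), (50, 80), (50, 75), (50, 70), (50, 65)]


def windows_containing(optimum_c):
    """Return every listed window that contains the measured optimum."""
    return [(lo, hi) for lo, hi in CLAIMED_WINDOWS if lo <= optimum_c <= hi]


def narrowest_window(optimum_c):
    """Return the narrowest listed window containing the optimum, or None."""
    matches = windows_containing(optimum_c)
    if not matches:
        return None
    return min(matches, key=lambda w: w[1] - w[0])


print(narrowest_window(72))  # (50, 75)
print(narrowest_window(45))  # None
```

An enzyme with an optimum of 72°C falls inside five of the seven windows, the narrowest being 50°C to 75°C; one with an optimum of 45°C falls inside none.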
- Headings, and other identifiers e.g., (a), (b), (i), (ii), etc., are presented merely for ease of reading the specification and claims.
- the use of headings or other identifiers in the specification or claims does not necessarily require the steps or elements be performed in alphabetical or numerical order or the order in which they are presented.
- "DNA" or "RNA" molecule or sequence (as well as sometimes the term "oligonucleotide") refers to a molecule comprised generally of the deoxyribonucleotides adenine (A), guanine (G), thymine (T) and/or cytosine (C).
- In "RNA", T is replaced by uracil (U).
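The T-to-U relationship between a DNA sequence and its RNA spelling can be shown in a few lines. This is a minimal sketch; the function name `dna_to_rna` is illustrative and the input is restricted to the four conventional bases.

```python
def dna_to_rna(dna):
    """Return the RNA spelling of a DNA sequence by replacing T with U."""
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return dna.replace("T", "U")


print(dna_to_rna("ATGGCT"))  # AUGGCU
```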
- "rDNA" refers to recombinant DNA.
- polynucleotide or “nucleic acid molecule” refers to a polymer of nucleotides and includes DNA (e.g., genomic DNA, cDNA), RNA molecules (e.g., mRNA), and chimeras thereof.
- the nucleic acid molecule can be obtained by cloning techniques or synthesized.
- DNA can be double-stranded or single-stranded (coding strand or non-coding strand [antisense]).
- the terms "nucleic acid molecule" and "polynucleotide" also include analogs thereof (e.g., generated using nucleotide analogs, e.g., inosine or phosphorothioate nucleotides). Such nucleotide analogs can be used, for example, to prepare polynucleotides that have altered base-pairing abilities or increased resistance to nucleases.
- a nucleic acid backbone may comprise a variety of linkages known in the art, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds (referred to as "peptide nucleic acids" (PNA); Hyldig-Nielsen et al., PCT Int'l Pub. No. WO 95/32305), phosphorothioate linkages, methylphosphonate linkages or combinations thereof.
- Sugar moieties of the nucleic acid may be ribose or deoxyribose, or similar compounds having known substitutions, e.g., 2' methoxy substitutions (containing a 2'-O-methylribofuranosyl moiety; see PCT No.
- Nitrogenous bases may be conventional bases (A, G, C, T, U), known analogs thereof (e.g., inosine or others; see “The Biochemistry of the Nucleic Acids 5-36", Adams et al., ed., 11th ed., 1992), or known derivatives of purine or pyrimidine bases (see, Cook, PCT Int'l Pub. No. WO 93/13121) or "abasic" residues in which the backbone includes no nitrogenous base for one or more residues (Arnold et al., U.S. Pat. No. 5,585,481).
- a nucleic acid may comprise only conventional sugars, bases and linkages, as found in RNA and DNA, or may include both conventional components and substitutions (e.g., conventional bases linked via a methoxy backbone, or a nucleic acid including conventional bases and one or more base analogs).
- an "isolated nucleic acid molecule" refers to a polymer of nucleotides, and includes, but is not limited to, DNA and RNA.
- the "isolated" nucleic acid molecule is purified from its natural in vivo state, obtained by cloning, or chemically synthesized.
- the terms "gene" and "recombinant gene" refer to nucleic acid molecules which may be isolated from chromosomal DNA, and very often include an open reading frame encoding a protein, e.g., polypeptides of the present invention.
- a gene may include coding sequences, non-coding sequences, introns and regulatory sequences, as is well known.
- Amplification refers to any in vitro procedure for obtaining multiple copies ("amplicons") of a target nucleic acid sequence or its complement or fragments thereof.
- In vitro amplification refers to production of an amplified nucleic acid that may contain less than the complete target region sequence or its complement.
- In vitro amplification methods include, e.g., transcription-mediated amplification, replicase-mediated amplification, polymerase chain reaction (PCR) amplification, ligase chain reaction (LCR) amplification and strand-displacement amplification (SDA including multiple strand-displacement amplification method (MSDA)).
- Replicase-mediated amplification uses self-replicating RNA molecules, and a replicase such as Qβ-replicase (e.g., Kramer et al., U.S. Pat. No. 4,786,600).
- PCR amplification is well known and uses DNA polymerase, primers and thermal cycling to synthesize multiple copies of the two complementary strands of DNA or cDNA (e.g., Mullis et al., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159).
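As a rough illustration of why repeated thermal cycling yields many copies: under idealized conditions each cycle at most doubles the number of target molecules. The sketch below computes the expected copy number after a given number of cycles. The `efficiency` parameter (fraction of templates copied per cycle) is an illustrative assumption and does not come from the source.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` PCR cycles.

    Uses N = N0 * (1 + efficiency) ** cycles, where efficiency = 1.0 means
    perfect doubling each cycle. Real reactions plateau; this does not.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


perfect = pcr_copies(10, 30)         # 10 templates, 30 perfect cycles
reduced = pcr_copies(10, 30, 0.9)    # same run at 90% per-cycle efficiency
print(f"{perfect:.3e} copies at 100% efficiency")
print(f"{reduced:.3e} copies at 90% efficiency")
```

The model ignores reagent depletion and plateau effects, so it is only meaningful for the exponential phase of a reaction.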
- LCR amplification uses at least four separate oligonucleotides to amplify a target and its complementary strand by using multiple cycles of hybridization, ligation, and denaturation (e.g., EP Pat. App. Pub. No. 0320308).
- SDA is a method in which a primer contains a recognition site for a restriction endonuclease that permits the endonuclease to nick one strand of a hemimodified DNA duplex that includes the target sequence, followed by amplification in a series of primer extension and strand displacement steps (e.g., Walker et al., U.S. Pat. No. 5,422,252).
- oligonucleotide primer sequences of the present invention may be readily used in any in vitro amplification method based on primer extension by a polymerase (e.g., see Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14-25 and Kwoh et al., 1989, Proc. Natl. Acad. Sci.
- oligos are designed to bind to a complementary sequence under selected conditions.
- the terminology "amplification pair" or "primer pair" refers herein to a pair of oligonucleotides (oligos) of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes.
- "hybridizing" and "hybridizes" are intended to describe conditions for hybridization and washing under which nucleotide sequences at least about 60%, at least about 70%, at least about 80%, more preferably at least about 85%, even more preferably at least about 90%, more preferably at least 95%, more preferably at least 98% or more preferably at least 99% homologous to each other typically remain hybridized to each other.
- hybridization conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 1X SSC, 0.1% SDS at 50°C, preferably at 55°C, preferably at 60°C and even more preferably at 65°C.
- Highly stringent conditions include, for example, hybridizing at 68°C in 5x SSC/5x Denhardt's solution / 1.0% SDS and washing in 0.2x SSC/0.1% SDS at room temperature. Alternatively, washing may be performed at 42°C. The skilled artisan will know which conditions to apply for stringent and highly stringent hybridization conditions.
- a polynucleotide which hybridizes only to a poly(A) sequence (such as the 3' terminal poly(A) tract of mRNAs), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to specifically hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).
- identity and “percent identity” are used interchangeably herein.
- sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence).
- the amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position.
- the two sequences are the same length.
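- The position-by-position comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular program: it assumes the two sequences are already aligned, that `-` marks a gap, and that percent identity is the number of identical positions divided by the number of compared positions, times 100. The function name `percent_identity` and the `count_gaps` flag are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical positions / compared positions * 100.
    With count_gaps=False, columns containing a gap are skipped entirely,
    which raises the reported identity for gapped alignments.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    positions = 0
    for a, b in zip(aligned_a, aligned_b):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gaps:
            continue  # gap column left out of the calculation
        positions += 1
        if not is_gap and a == b:
            matches += 1

    if positions == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / positions


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("MKT-LV", "MKTALV"))
    print(percent_identity("MKT-LV", "MKTALV", count_gaps=False))
```

Ignoring gap columns makes the third call report a higher identity than the second for the same alignment, which is why the treatment of gaps has to be stated alongside any identity figure.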
- the term "identical” or “percent identity” in the context of two or more nucleic acid or amino acid sequences refers to two or more sequences or subsequences that are the same, or that have a specified percentage of amino acid residues or nucleotides that are the same (e.g., 60% or 65% identity, preferably, 70-95% identity, more preferably at least 95% identity), when compared and aligned for maximum correspondence over a window of comparison, or over a designated region as measured using a sequence comparison algorithm as known in the art, or by manual alignment and visual inspection. Sequences having, for example, 60% to 95% or greater sequence identity are considered to be substantially identical.
- Such a definition also applies to the complement of a test sequence.
- the described identity exists over a region that is at least about 15 to 25 amino acids or nucleotides in length, more preferably, over a region that is about 50 to 100 amino acids or nucleotides in length.
- Those having skill in the art will know how to determine percent identity between/among sequences using, for example, algorithms such as those based on CLUSTALW computer program (Thompson Nucl. Acids Res. 2 (1994), 4673-4680) or FASTDB (Brutlag Comp. App. Biosci. 6 (1990), 237-245), as known in the art.
- the FASTDB algorithm typically does not consider internal non-matching deletions or additions in sequences, i.e., gaps, in its calculation; this can be corrected manually to avoid an overestimation of the % identity.
- CLUSTALW does take sequence gaps into account in its identity calculations.
- the BLASTP program uses as defaults a wordlength (W) of 3, and an expectation (E) of 10.
- the present invention also relates to nucleic acid molecules the sequence of which is degenerate in comparison with the sequence of an above-described hybridizing molecule.
- the term "being degenerate as a result of the genetic code” means that due to the redundancy of the genetic code different nucleotide sequences code for the same amino acid.
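- As an illustration of this redundancy, the sketch below translates two different nucleotide sequences with a small excerpt of the standard genetic code and shows that they encode the same amino acid sequence. The two example sequences are made up for this illustration and are not sequences of the invention; the table deliberately contains only the codons needed here.

```python
# Excerpt of the standard genetic code (Met, Leu, Ser, Gly and stop codons only).
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical example sequences that differ at several nucleotide positions.
seq_a = "ATGCTTTCTGGTTAA"
seq_b = "ATGTTGAGCGGATGA"

if __name__ == "__main__":
    print(translate(seq_a), translate(seq_b))
```

Both sequences translate to the same peptide even though their nucleotide sequences differ, which is the sense in which one is degenerate with respect to the other.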
- the present invention also relates to nucleic acid molecules which comprise one or more mutations or deletions, and to nucleic acid molecules which hybridize to one of the herein described nucleic acid molecules, which show (a) mutation(s) or (a) deletion(s).
- homology refers to a similarity between two polypeptide sequences, but takes into account changes between amino acids (whether conservative or not).
- amino acids can be classified by charge, hydrophobicity, size, etc. It is also well known in the art that amino acid changes can be conservative (e.g., they do not significantly affect, or not at all, the function of the protein).
- homology introduces evolutionistic notions (e.g., pressure from evolution to retain a function of essential or important regions of a sequence, while enabling a certain drift of less important regions).
- the skilled person will be aware of the fact that several different computer programs are available to determine the homology between two sequences. For instance, a comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48): 444-453 (1970)) algorithm which has been incorporated into the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
- the percent identity between two nucleotide sequences is determined using the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
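- The global alignment idea behind the Needleman and Wunsch algorithm named above can be sketched as follows. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM62, PAM250, or NWSgapdna.CMP matrix, and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    Assumptions: flat match/mismatch scores and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Changing the gap penalty or the scoring scheme can change which alignment is optimal, which is why the matrix and weights are stated together with any reported percent identity.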
- the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Meyers and W.
- the nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences.
- Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al., (1990) J. Mol. Biol. 215:403-10.
- Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389- 3402.
- the default parameters of the respective programs e.g., XBLAST and NBLAST
- sufficiently complementary is meant a contiguous nucleic acid base sequence that is capable of hybridizing to another sequence by hydrogen bonding between a series of complementary bases.
- Complementary base sequences may be complementary at each position in sequence by using standard base pairing (e.g., G:C, A:T or A:U pairing) or may contain one or more residues (including abasic residues) that are not complementary by using standard base pairing, but which allow the entire sequence to specifically hybridize with another base sequence in appropriate hybridization conditions.
- Contiguous bases of an oligomer are preferably at least about 80% (81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100%), more preferably at least about 90% complementary to the sequence to which the oligomer specifically hybridizes.
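- A percentage of complementarity such as the 80% or 90% figures above can be computed by counting standard base pairs between an oligomer and its target site. The sketch below is only an illustration under stated assumptions: both strands are written 5' to 3', they pair in antiparallel orientation, the target site has the same length as the oligomer, and only G:C, A:T and A:U pairs count. The function name and example sequences are hypothetical.

```python
# Standard base pairs counted as complementary (G:C, A:T, A:U).
STANDARD_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
}


def percent_complementarity(oligo: str, target_site: str) -> float:
    """Percent of oligo bases that form a standard pair with the target site.

    Both sequences are read 5'->3'; the target is reversed so that the
    strands are compared in antiparallel orientation.
    """
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("oligo and target site must be non-empty and of equal length")
    paired = sum(
        (x, y) in STANDARD_PAIRS
        for x, y in zip(oligo.upper(), reversed(target_site.upper()))
    )
    return 100.0 * paired / len(oligo)


if __name__ == "__main__":
    oligo = "ACGTACGTAC"
    print(percent_complementarity(oligo, "GTACGTACGT"))  # exact reverse complement
    print(percent_complementarity(oligo, "GTACGTACGA"))  # one non-pairing position
```

An oligomer with one non-pairing position out of ten still meets an "at least about 80%" threshold, although whether it hybridizes specifically also depends on the hybridization conditions.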
- Appropriate hybridization conditions are well known to those skilled in the art, can be predicted readily based on sequence composition and conditions, or can be determined empirically by using routine testing (see Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd ed.
- the present invention refers to a number of units or percentages that are often listed in sequences. For example, when referring to "at least 80%, at least 85%, at least 90%...", or "at least about 80%, at least about 85%, at least about 90%...", every single unit is not listed, for the sake of brevity. For example, some units (e.g., 81, 82, 83, 84, 85,... 91, 92%...) may not have been specifically recited but are considered encompassed by the present invention. The non-listing of such specific units should thus be considered as within the scope of the present invention.
- Nucleic acid sequences may be detected by using hybridization with a complementary sequence (e.g., oligonucleotide probes) (see U.S. Patent Nos. 5,503,980 (Cantor), 5,202,231 (Drmanac et al.), 5,149,625 (Church et al.), 5,112,736 (Caldwell et al.), 5,068,176 (Vijg et al.), and 5,002,867 (Macevicz)).
- Hybridization detection methods may use an array of probes (e.g., on a DNA chip) to provide sequence information about the target nucleic acid which selectively hybridizes to an exactly complementary probe sequence in a set of four related probe sequences that differ by one nucleotide (see U.S. Patent Nos. 5,837,832 and 5,861,242 (Chee et al.)).
- a detection step may use any of a variety of known methods to detect the presence of nucleic acid by hybridization to an oligonucleotide probe.
- the types of detection methods in which probes can be used include Southern blots (DNA detection), dot or slot blots (DNA, RNA), and Northern blots (RNA detection).
- Labeled proteins could also be used to detect a particular nucleic acid sequence to which it binds (e.g., protein detection by far western technology: Guichet et al., 1997, Nature 385(6616): 548-552; and Schwartz et al., 2001, EMBO 20(3): 510-519).
- detection methods also include kits containing reagents of the present invention on a dipstick setup and the like.
- a detection method which is amenable to automation.
- a non-limiting example thereof includes a chip or other support comprising one or more (e.g., an array) of different probes.
- a "label” refers to a molecular moiety or compound that can be detected or can lead to a detectable signal.
- a label is joined, directly or indirectly, to a nucleic acid probe or the nucleic acid to be detected (e.g., an amplified sequence) or to a polypeptide to be detected.
- Direct labeling can occur through bonds or interactions that link the label to the polynucleotide or polypeptide (e.g., covalent bonds or non-covalent interactions), whereas indirect labeling can occur through the use of a "linker” or bridging moiety, such as additional nucleotides, amino acids or other chemical groups, which are either directly or indirectly labeled.
- Bridging moieties may amplify a detectable signal.
- Labels can include any detectable moiety (e.g., a radionuclide, ligand such as biotin or avidin, enzyme or enzyme substrate, reactive group, chromophore such as a dye or colored particle, luminescent compound including a bioluminescent, phosphorescent or chemiluminescent compound, and fluorescent compound).
- expression is meant the process by which a gene or otherwise nucleic acid sequence eventually produces a polypeptide. It involves transcription of the gene into mRNA, and the translation of such mRNA into polypeptide(s).
- peptide and oligopeptide are considered synonymous (as is commonly recognized) and each term can be used interchangeably as the context requires to indicate a chain of at least two amino acids coupled by peptidyl linkages.
- polypeptide is used herein for chains containing more than seven amino acid residues. All oligopeptide and polypeptide formulas or sequences herein are written from left to right and in the direction from amino terminus to carboxyl terminus.
- the one-letter code of amino acids used herein is commonly known in the art and can be found in Sambrook, et al., supra. Sequence Listings programs can convert easily this one-letter code of amino acids sequence into a three-letter code.
- mature polypeptide is defined herein as a polypeptide having a biological activity of a polypeptide of the present invention that is in its final form, following translation and any post-translational modifications, such as N-terminal processing, C-terminal truncation, removal of signal sequences, glycosylation, phosphorylation, etc.
- polypeptides of the present invention comprise mature polypeptides of any one of the polypeptides disclosed herein. Mature polypeptides of the present invention can be predicted using programs such as SignalP.
- mature polypeptide coding sequence is defined herein as a nucleotide sequence that encodes a mature polypeptide as defined above. As is well known, some nucleotide sequences are non-coding.
- the term "purified” or “isolated” refers to a molecule (e.g., polynucleotide or polypeptide) having been separated from a component of the composition in which it was originally present.
- an "isolated polynucleotide” or “isolated polypeptide” has been purified to a level not found in nature.
- a “substantially pure” molecule is a molecule that is lacking in most other components (e.g., 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100% free of contaminants).
- the term “crude” means molecules that have not been separated from the components of the original composition in which they were present.
- the units e.g., 66, 67...81 , 82, 83, 84, 85,...91 , 92%.
- an "isolated polynucleotide” or “isolated nucleic acid molecule” is a nucleic acid molecule (DNA or RNA) that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived.
- an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to the coding sequence.
- the term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding an additional polypeptide that is substantially free of cellular material, viral material, or culture medium (when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an "isolated nucleic acid fragment" is a nucleic acid fragment that is not naturally occurring as a fragment and would not be found in the natural state.
- an "isolated polypeptide” or “isolated protein” is intended to include a polypeptide or protein removed from its native environment.
- recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for the purpose of the invention, as are native or recombinant polypeptides which have been substantially purified by any suitable technique such as, for example, the single-step purification method disclosed in Smith and Johnson, Gene 67:31-40 (1988).
- variant refers herein to a polypeptide, which is substantially similar in structure (e.g., amino acid sequence) to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein without being identical thereto.
- a variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein.
- a variant can comprise additional modifications (e.g., post-translational modifications such as acetylation, phosphorylation, glycosylation, sulfation, sumoylation, prenylation, ubiquitination, etc).
- functional variant is intended to include a variant which is sufficiently similar in both structure and function to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein, to maintain at least one of its native biological activities.
- biomass refers to any cellulosic or lignocellulosic material and includes materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. Biomass may also comprise additional components, such as protein and/or lipid. Biomass may be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass could comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves.
- Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste or a combination thereof.
- biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, and animal manure or a combination thereof.
- Biomass that is useful for the invention may include biomass that has a relatively high carbohydrate value, is relatively dense, and/or is relatively easy to collect, transport, store and/or handle.
- biomass that is useful includes corn cobs, corn stover, sawdust, and sugar cane bagasse.
- the terms “cellulosic” or “cellulose-containing material” refer to a composition comprising cellulose.
- the term “lignocellulosic” refers to a composition comprising both lignin and cellulose.
- Lignocellulosic material may also comprise hemicellulose.
- the predominant polysaccharide in the primary cell wall of biomass is cellulose, the second most abundant is hemi-cellulose, and the third is pectin.
- the secondary cell wall, produced after the cell has stopped growing, also contains polysaccharides and is strengthened by polymeric lignin covalently cross-linked to hemicellulose.
- Cellulose is a homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in complex branched structures with a spectrum of substituents. Although generally polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other hemicelluloses, which help stabilize the cell wall matrix.
- Cellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees.
- the cellulose-containing material can be, but is not limited to, herbaceous material, agricultural residues, forestry residues, municipal solid wastes, waste paper, and pulp and paper mill residues.
- the cellulose-containing material can be any type of biomass including, but not limited to, wood resources, municipal solid waste, wastepaper, crops, and crop residues (e.g., see Wiselogel et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp.105-118, Taylor & Francis, Washington D.C.; Wyman. 1994.
- the cellulose may be in the form of lignocellulose, a plant cell wall material containing lignin, cellulose, and hemicellulose in a mixed matrix.
- cellulolytic enhancing activity or “cellulolysis-enhancing” is defined herein as a biological activity which enhances the hydrolysis of a cellulose-containing material by proteins having cellulolytic activity.
- cellulolytic activity is defined herein as a biological activity which hydrolyzes a cellulose- containing material.
- lignocellulolytic enhancing activity or "lignocellulolysis-enhancing” is defined herein as a biological activity which enhances the hydrolysis of a lignocellulose-containing material by proteins having lignocellulolytic activity.
- lignocellulolytic activity is defined herein as a biological activity which hydrolyzes a lignocellulose-containing material.
- thermostable refers to an enzyme that retains its function or protein activity at a temperature greater than 50°C; thus, a thermostable cellulose-degrading or cellulase-enhancing enzyme/protein retains the ability to degrade or enhance the degradation of cellulose at this elevated temperature.
- a protein or enzyme may have more than one enzymatic activity.
- some polypeptides of the present invention exhibit bifunctional activities such as xylosidase/arabinosidase activity.
- Such bifunctional enzymes may exhibit thermostability with regard to one activity, but not another, and still be considered as "thermostable”.
- Figure 1 is a schematic map of the pGBFIN-49 expression plasmid.
- Figures 2-4 show protein activity-temperature profiles of various secreted proteins from Thermoascus aurantiacus.
- Figures 5-7 show protein activity-temperature profiles of various secreted proteins from Myceliophthora fergusii (Corynascus thermophilus).
- Figure 8 shows protein activity-temperature profiles of various secreted proteins from Pseudocercosporella herpotrichoides.
- SEQ ID NOs: 1-600 relate to sequences from Thermoascus aurantiacus;
- SEQ ID NOs: 601-1467 relate to sequences from Myceliophthora fergusii (Corynascus thermophilus);
- SEQ ID NOs: 1468-3039 relate to sequences from Pseudocercosporella herpotrichoides.
- the present invention relates to isolated polypeptides secreted by Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides, (e.g., Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, or Pseudocercosporella herpotrichoides strain 494.80) having an activity relating to the processing or degradation of biomass (e.g., cell wall deconstruction).
- the present invention relates to isolated polypeptides comprising the amino acid sequences shown in any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039.
- the present invention relates to isolated polypeptides sharing a minimum threshold of amino acid sequence identity with any one of the above-mentioned polypeptides.
- the present invention relates to isolated polypeptides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to any one of the above-mentioned polypeptides.
- Other specific percentage units that have not been specifically recited here for brevity are nevertheless considered within the scope of the present invention.
- the present invention relates to a polypeptide encoded by a polynucleotide of the present invention, which includes genomic (e.g., SEQ ID NOs: 1-200, 601-889, or 1468-1991), and coding (e.g., SEQ ID NOs: 201-400, 890-1178, or 1992-2514) nucleic acid sequences disclosed herein, polynucleotides hybridizing under medium-high, high, or very high stringency conditions with a full-length complement thereof, as well as polynucleotides sharing a certain degree of nucleic acid sequence identity therewith.
- the present invention relates to a polypeptide comprising an amino acid sequence encoded by at least one exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-200, 601-889, or 1468-1991 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C) or a functional part thereof.
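- The SEQ ID NO ranges recited above lend themselves to a simple lookup. The sketch below maps a SEQ ID NO to its source organism and to its sequence type (genomic, coding, or polypeptide) using only the ranges as they are written in this text. The function name `classify_seq_id` is hypothetical, and a number that falls outside the recited ranges yields `None` rather than a guess.

```python
# Organism ranges as recited in the text (Python ranges exclude the upper bound).
ORGANISM_RANGES = [
    (range(1, 601), "Thermoascus aurantiacus"),
    (range(601, 1468), "Myceliophthora fergusii (Corynascus thermophilus)"),
    (range(1468, 3040), "Pseudocercosporella herpotrichoides"),
]

# Sequence-type ranges as recited in the text. SEQ ID NO: 2515 is not covered
# by any recited type range, so it is reported as None.
TYPE_RANGES = [
    (range(1, 201), "genomic"),
    (range(601, 890), "genomic"),
    (range(1468, 1992), "genomic"),
    (range(201, 401), "coding"),
    (range(890, 1179), "coding"),
    (range(1992, 2515), "coding"),
    (range(401, 601), "polypeptide"),
    (range(1179, 1468), "polypeptide"),
    (range(2516, 3040), "polypeptide"),
]


def _lookup(seq_id: int, table):
    """Return the label of the first range containing seq_id, or None."""
    for id_range, label in table:
        if seq_id in id_range:
            return label
    return None


def classify_seq_id(seq_id: int):
    """Return (organism, sequence_type) for a SEQ ID NO; None where not recited."""
    return _lookup(seq_id, ORGANISM_RANGES), _lookup(seq_id, TYPE_RANGES)


if __name__ == "__main__":
    for n in (450, 900, 1468, 2515, 3040):
        print(n, classify_seq_id(n))
```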
- the present invention relates to functional variants of any one of the above-mentioned polypeptides.
- the term "functional” or “biologically active” relates to the native enzymatic (e.g., catalytic) activity of a polypeptide of the present invention.
- the present invention relates to a polypeptide comprising a biological activity of any one of the enzymes described below, or a polynucleotide encoding same.
- Carbohydrase refers to any protein that catalyzes the hydrolysis of carbohydrates.
- glycoside hydrolase, “glycosyl hydrolase” or “glycosidase” refers to a protein that catalyzes the hydrolysis of the glycosidic bonds between carbohydrates or between a carbohydrate and a non-carbohydrate residue.
- Endoglucanases, cellobiohydrolases, beta-glucosidases, alpha-glucosidases, xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, alpha-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, beta-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
- Cellulase refers to a protein that catalyzes the hydrolysis of 1,4-D-glycosidic linkages in cellulose (such as bacterial cellulose, cotton, filter paper, phosphoric acid swollen cellulose, Avicel®); cellulose derivatives (such as carboxymethylcellulose and hydroxyethylcellulose); plant lignocellulosic materials, beta-D-glucans or xyloglucans.
- Cellulose is a linear beta-(1-4) glucan consisting of anhydrocellobiose units. Endoglucanases, cellobiohydrolases, and beta-glucosidases are examples of cellulases.
- Endoglucanase refers to a protein that catalyzes the hydrolysis of cellulose to oligosaccharide chains at random locations by means of an endoglucanase activity.
- Cellobiohydrolase refers to a protein that catalyzes the hydrolysis of cellulose to cellobiose via an exoglucanase activity, sequentially releasing molecules of cellobiose from the reducing or non-reducing ends of cellulose or cello-oligosaccharides.
- Beta-glucosidase refers to an enzyme that catalyzes the conversion of cellobiose and oligosaccharides to glucose.
- Hemicellulase refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galacto(gluco)mannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar.
- this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid.
- Hemicellulolytic enzymes, i.e., hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and beta-mannosidases.
- Hemicellulases also include the accessory enzymes, such as acetylesterases, ferulic acid esterases, and coumaric acid esterases.
- xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with beta-xylosidase only.
- several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and beta-xylosidases are examples of hemicellulases.
- Xylanase specifically refers to an enzyme that hydrolyzes the beta-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
- Beta-mannanase or “endo-1,4-beta-mannosidase” refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galacto(gluco)mannan) and produces short beta-1,4-mannooligosaccharides.
- Mannan endo-1,6-alpha-mannosidase refers to a protein that hydrolyzes 1,6-alpha-mannosidic linkages in unbranched 1,6-mannans.
- Beta-mannosidase (beta-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of beta-D-mannose residues from the non-reducing ends of oligosaccharides.
- Galactanase refers to a protein that catalyzes the hydrolysis of endo-1,4-beta-D-galactosidic linkages in arabinogalactans.
- Glucoamylase refers to a protein that catalyzes the hydrolysis of terminal 1,4-linked-D-glucose residues successively from non-reducing ends of the glycosyl chains in starch with the release of beta-D-glucose.
- Beta-hexosaminidase or “beta-N-acetylglucosaminidase” refers to a protein that catalyzes the hydrolysis of terminal N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosamines.
- Arabinofuranosidase refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
- Endo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans.
- Exo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-alpha-linkages in 1,5-arabinans or 1,5-alpha-L arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
- Beta-xylosidase refers to a protein that hydrolyzes short 1,4-beta-D-xylooligomers into xylose.
- Cellobiose dehydrogenase refers to a protein that oxidizes cellobiose to cellobionolactone.
- Chitosanase refers to a protein that catalyzes the endohydrolysis of beta-1,4-linkages between D-glucosamine residues in acetylated chitosan (i.e., deacetylated chitin).
- Exo-polygalacturonase refers to a protein that catalyzes the hydrolysis of terminal alpha-1,4-linked galacturonic acid residues from non-reducing ends thus converting polygalacturonides to galacturonic acid.
- Acetyl xylan esterase refers to a protein that catalyzes the removal of the acetyl groups from xylose residues.
- Acetyl mannan esterase refers to a protein that catalyzes the removal of the acetyl groups from mannose residues.
- ferulic esterase or "ferulic acid esterase” refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid.
- Coumaric acid esterase refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid.
- Acetyl xylan esterases, ferulic acid esterases and pectin methyl esterases are examples of carbohydrate esterases.
- Pectate lyase and pectin lyases refer to proteins that catalyze the cleavage of 1,4-alpha-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates (pectates and pectins, respectively).
- Endo-1,3-beta-glucanase or “laminarinase” refers to a protein that catalyzes the cleavage of 1,3-linkages in beta-D-glucans such as laminarin or lichenin.
- Laminarin is a linear polysaccharide made up of beta-1,3-glucan with beta-1,6-linkages.
- Lichenase refers to a protein that catalyzes the hydrolysis of lichenan, a linear 1,3-1,4-beta-D-glucan.
- Rhamnogalacturonan is composed of alternating alpha-1,4-rhamnose and alpha-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose.
- the side chains include Type I galactan, which is beta-1,4-linked galactose with alpha-1,3-linked arabinose substituents; Type II galactan, which is beta-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is alpha-1,5-linked arabinose with alpha-1,3-linked arabinose branches.
- the galacturonic acid substituents may be acetylated and/or methylated.
- "Exo-rhamnogalacturonanase” refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin from the non-reducing end.
- Rhamnogalacturonan acetylesterase refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
- Rhamnogalacturonan lyase refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a beta-elimination mechanism (e.g., see Pages et al., J. Bacteriol., 185:4727-4733 (2003)).
- Alpha-rhamnosidase refers to a protein that catalyzes the hydrolysis of terminal non-reducing alpha- L-rhamnose residues in alpha-L-rhamnosides.
- Certain proteins of the present invention may be classified as "Family 61 glycosidases" based on homology of the polypeptides to CAZy Family GH61.
- Family 61 glycosidases may exhibit cellulolytic enhancing activity or endoglucanase activity. Additional information on the properties of Family 61 glycosidases may be found in U.S. Patent Application Publication Nos. 2005/0191736, 2006/0005279, 2007/0077630, and in PCT Publication No. WO 2004/031378.
- Esterases represent a category of various enzymes including lipases, phospholipases, cutinases, and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
- EC 3 Hydrolases catalyze the hydrolysis of various bonds
- EC 4 Lyases cleave various bonds by means other than hydrolysis and oxidation
- EC 5 Isomerases catalyze isomerization changes within a single molecule
- EC 6 Ligases join two molecules with covalent bonds.
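The EC class definitions above map directly onto the first field of an EC number. The following is a minimal sketch, assuming the standard names for classes 1 and 2 (which are not defined in this passage) and a hypothetical helper name `ec_class`; it is an illustration, not part of the disclosure.

```python
# Top-level EC classes. Entries 3-6 follow the definitions listed above;
# entries 1 and 2 are the standard names and are an assumption here.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as '3.2.1.4'."""
    text = ec_number.strip()
    # Accept an optional leading "EC" label, e.g. "EC 4.2.2.2".
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    first_field = text.split(".")[0]
    return EC_CLASSES[int(first_field)]


if __name__ == "__main__":
    for number in ("3.2.1.4", "EC 4.2.2.2", "1.11.1.14"):
        print(number, "->", ec_class(number))
```

For example, the endoglucanases (EC 3.2.1.4) fall under the hydrolases, and the pectate lyases (EC 4.2.2.2) under the lyases.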
- polypeptides/enzymes of the present invention are not meant to be limited to specific enzyme classes as they currently exist.
- the skilled person would know how to appropriately reclassify (and assign the appropriate functions) to the enzymes of the present invention based on the amino acid sequence information provided herein. Such reclassifications are thus within the scope of the present invention.
- the present invention relates to a polypeptide comprising a biological activity of any one of the enzymes (or sub-classes thereof), or a polynucleotide encoding same.
- Cellulose-hydrolyzing enzymes including: endoglucanases (EC 3.2.1.4), which hydrolyze the beta-1 ,4- linkages between glucose units; exoglucanases (also known as cellobiohydrolases 1 and 2) (EC 3.2.1.91), which hydrolyze cellobiose, a glucose disaccharide, from the reducing and non-reducing ends of cellulose; and beta-glucosidases (EC 3.2.1.21), which hydrolyze the beta-1 ,4 glycoside bond of cellobiose to glucose;
- GH61 (glycoside hydrolase family 61) proteins (e.g., polysaccharide monooxygenases)
- Enzymes that degrade or modify xylan and/or xylan-lignin complexes including: xylanases, such as endo- 1,4-beta-xylanase (EC 3.2.1.8), which catalyze the endohydrolysis of 1 -4-beta-D-xylosidic linkages in xylans (or xyloglucans); xylosidases, such as xylan 1,4-beta-xylosidases (EC 3.2.1.37), which catalyze hydrolysis of 1 ,4-beta-D-xylans to remove successive D-xylose residues from the non-reducing terminals, and also cleaves xylobiose; arabinosidases, such as alpha-arabinofuranosidases (EC 3.2.1.55), which hydrolyze terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabino
- Enzymes that degrade or modify mannan including: mannanases, such as mannan endo-1,4-beta- mannosidase (EC 3.2.1.78), which catalyze random hydrolysis of 1 ,4-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans;
- beta-mannosidases (EC 3.2.1.25), which hydrolyze terminal, non-reducing beta-D-mannose residues in beta-D-mannosides; alpha-galactosidases (EC 3.2.1.22), which hydrolyze terminal, non-reducing alpha-D-galactose residues in alpha-D-galactosides (including galactose oligosaccharides, galactomannans and galactohydrolase); and mannan acetyl esterases;
- Enzymes that degrade or modify xyloglucans including: xyloglucanases such as xyloglucan-specific endo- beta-1 ,4-glucanase (EC 3.2.1.151), which involves endohydrolysis of 1 ,4-beta-D-glucosidic linkages in xyloglucan; and xyloglucan-specific exo-beta-1 ,4-glucanase (EC 3.2.1.155), which catalyzes exohydrolysis of 1 ,4-beta-D-glucosidic linkages in xyloglucan; endoglucanases / cellulases;
- Enzymes that degrade or modify glucans including: Enzymes that degrade beta-1 ,4-glucan, such as endoglucanases; cellobiohydrolases; and beta-glucosidases;
- Enzymes that degrade beta-1 ,3-1 ,4-glucan such as endo-beta-1,3(4)-glucanases (EC 3.2.1.6), which catalyzes endohydrolysis of 1 ,3- or 1 ,4-linkages in beta-D-glucans when the glucose residue whose reducing group is involved in the linkage to be hydrolyzed is itself substituted at C-3; endoglucanases (beta-glucanase, cellulase), and beta-glucosidases;
- Enzymes that degrade or modify arabinans including: arabinanases (EC 3.2.1.99), which catalyze endohydrolysis of 1 ,5-alpha-arabinofuranosidic linkages in 1 ,5-arabinans;
- Enzymes that degrade or modify starch including: amylases, such as alpha-amylases (EC 3.2.1.1), which catalyze endohydrolysis of 1 ,4-alpha-D-glucosidic linkages in polysaccharides containing three or more 1 ,4- alpha-linked D-glucose units; and glucosidases, such as alpha-glucosidases (EC 3.2.1.20), which hydrolyze terminal, non-reducing 1 ,4-linked alpha-D-glucose residues with release of alpha-D-glucose;
- pectate lyases (EC 4.2.2.2), which carry out eliminative cleavage of pectate to give oligosaccharides with 4-deoxy-alpha-D-gluc-4-enuronosyl groups at their non- reducing ends
- pectin lyases (EC 4.2.2.10), which catalyze eliminative cleavage of (1-4)-alpha-D- galacturonan methyl ester to give oligosaccharides with 4-deoxy-6-0-methyl-alpha-D-galact-4-enuronosyl groups at their non-reducing ends
- polygalacturonases (EC 3.2.1.15), which carry out random hydrolysis of 1 ,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans
- pectin esterases such as pectin acetyl esterase (EC 3.1.1.11), which hydrolyzes acetate
- Enzymes that degrade or modify lignin including: lignin peroxidases (EC 1.11.1.14), which oxidize lignin and lignin model compounds using hydrogen peroxide; manganese-dependent peroxidases (EC 1.11.1.13), which oxidize lignin and lignin model compounds using Mn2+ and hydrogen peroxide; versatile peroxidases (EC 1.11.1.16), which oxidize lignin and lignin model compounds using an electron donor and hydrogen peroxide and combine the substrate-specificity characteristics of the two other ligninolytic peroxidases, manganese peroxidase (EC 1.11.1.13) and lignin peroxidase (EC 1.11.1.14); and laccases (EC 1.10.3.2), a group of multi-copper proteins of low specificity acting on both o- and p-quinols, and often acting also on lignin; and
- Enzymes acting on chitin including: chitinases (EC 3.2.1.14), which catalyze random hydrolysis of N- acetyl-beta-D-glucosaminide 1 ,4-beta-linkages in chitin and chitodextrins; and hexosaminidases, such as beta-N-acetylhexosaminidase (EC 3.2.1.52), which hydrolyzes terminal non-reducing N-acetyl-D- hexosamine residues in N-acetyl-beta-D-hexosaminides.
- the present invention includes the polypeptides and their corresponding activities as defined in Tables 1A-1C, as well as functional variants thereof.
- a functional variant as used herein is intended to include a polypeptide which is sufficiently similar in structure and function to any one of the above-mentioned polypeptides (without being identical thereto) to maintain at least one of its native biological activities.
- a functional variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein.
- a functional variant can comprise additional modifications (e.g., post- translational modifications such as acetylation, phosphorylation, glycosylation, sulfatation, sumoylation, prenylation, ubiquitination, etc).
- functional variants of the present invention can contain one or more conservative substitutions of a polypeptide sequence disclosed herein. Such modifications can be carried out routinely using site-specific mutagenesis.
- conservative substitution is intended to indicate a substitution in which the amino acid residue is replaced with an amino acid residue having a similar side chain.
- Families of amino acids having similar side chains are known in the art and include amino acids with basic side chains (e.g., lysine, arginine and histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
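The side-chain families listed above can be used to test whether a proposed substitution is conservative in the sense defined here. The sketch below is illustrative only: the one-letter codes are the standard ones, while the names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are hypothetical, and treating "shares at least one listed family" as the criterion is an assumption.

```python
# Side-chain families as listed in the text, keyed by standard one-letter codes.
# Families overlap (e.g., threonine is both uncharged polar and beta-branched).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "non_polar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a: str, b: str) -> list:
    """Return the sorted names of the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if the residues differ and share at least one side-chain family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))


if __name__ == "__main__":
    for pair in (("K", "R"), ("D", "E"), ("K", "D")):
        print(pair, is_conservative(*pair), shared_families(*pair))
```

Under this criterion, lysine to arginine counts as conservative (both basic), whereas lysine to aspartic acid does not.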
- non-essential amino acid is a residue that can be altered in a polypeptide of the present invention without substantially altering its (biological) function or protein activity.
- amino acid residues that are conserved among the proteins of the present invention having similar biological activities (and their orthologs) are predicted to be particularly unamenable to alteration.
- functional variants can include functional fragments (i.e., biologically active fragments) of any one of the polypeptide sequences disclosed herein.
- Such fragments include fewer amino acids than the full length protein from which they are derived, but exhibit at least one biological activity of the corresponding full-length protein.
- biologically active fragments comprise a domain or motif with at least one activity of the full-length protein.
- a biologically active fragment of a protein of the invention can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acids in length.
- the present invention includes other functional variants of the polypeptides disclosed herein, which can be identified by techniques known in the art. For example, functional variants can be identified by screening combinatorial libraries of mutants (e.g., truncation mutants), of polypeptides of the present invention for biological activity. In another embodiment, a variegated library of variants can be generated by combinatorial mutagenesis at the nucleic acid level.
- a variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display).
- libraries of fragments of the coding sequence of a polypeptide of the present invention can be used to generate a variegated population of polypeptides for screening and subsequent selection of variants.
- a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector.
- an expression library can be derived which encodes N-terminal and internal fragments of various sizes of the protein of interest.
- recursive ensemble mutagenesis (REM)
- functional variants of the present invention can encompass orthologs of the genes and polypeptides disclosed herein.
- Orthologs of the polypeptides disclosed herein include proteins that can be isolated from other strains or species and possess a similar or identical biological activity. Such orthologs can be identified as comprising an amino acid sequence that is substantially homologous (shares a certain degree of amino acid sequence identity) with the polypeptides disclosed herein.
- substantially homologous refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., with similar side chain) amino acids or nucleotides to a second amino acid or nucleotide sequence such that the first and the second amino acid or nucleotide sequences have a common domain.
- amino acid or nucleotide sequences which contain a common domain having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity are defined herein as sufficiently identical.
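The identity thresholds above can be checked with a simple calculation once two sequences have been aligned. The following sketch assumes the sequences are already aligned to equal length with `-` marking gaps, and assumes that identity is counted over all aligned columns (a gap opposite a residue counts as a mismatch). Other conventions exist, and the text does not specify which one applies; the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_sufficiently_identical(aligned_a: str, aligned_b: str,
                              threshold: float = 70.0) -> bool:
    """Apply one of the thresholds listed above (70% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    a = "MKTAYIAKQR"
    b = "MKSAYLAKQR"  # two substitutions relative to a
    print(percent_identity(a, b), is_sufficiently_identical(a, b))
```

Producing the alignment itself (e.g., with a global or local alignment algorithm) is outside the scope of this sketch.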
- the present invention includes improved proteins derived from the polypeptides of the present invention.
- Improved proteins are proteins wherein at least one biological activity is improved. Such proteins may be obtained by randomly introducing mutations along all or part of the coding sequences of the polypeptides of the present invention such as by saturation mutagenesis, and the resulting mutants can be expressed recombinantly and screened for biological activity. For instance, the art provides for standard assays for measuring the enzymatic activity of the resulting protein and thus improved proteins may be selected.
- polypeptides of the present invention may be present alone (e.g., in an isolated or purified form), within a composition (e.g., an enzymatic composition for carrying out an industrial process), or in an appropriate host.
- polypeptides of the present invention can be recovered and purified from cell cultures (e.g., recombinant cell cultures) by methods known in the art.
- high performance liquid chromatography (HPLC)
- polypeptides of the present invention include naturally purified products, products of chemical synthetic procedures, and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect and mammalian cells. Depending on the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes.
- the present invention includes fusion proteins comprising a polypeptide of the present invention or a functional variant thereof, which is operatively linked to one or more unrelated polypeptide (e.g., heterologous amino acid sequences).
- unrelated polypeptides or “heterologous polypeptides” or “heterologous sequences” refer to polypeptides or sequences which are usually not present close to or fused to one of the polypeptides of the present invention.
- Such "unrelated polypeptides" or "heterologous polypeptides" have amino acid sequences corresponding to proteins which are not substantially homologous to the polypeptide sequences disclosed herein.
- fusion protein of the present invention comprises at least two biologically active portions or domains of polypeptide sequences disclosed herein.
- the term "operatively linked” is intended to indicate that all of the different polypeptides are fused in-frame to each other.
- an unrelated polypeptide can be fused to the N terminus or C terminus of a polypeptide of the present invention.
- a polypeptide of the present invention can be fused to a protein which enables or facilitates recombinant protein purification and/or detection.
- a polypeptide of the present invention can be fused to a protein such as glutathione S-transferase (GST), and the resulting fusion protein can then be purified/detected through the high affinity of GST for glutathione.
- Fusion proteins of the present invention can be produced by standard recombinant DNA techniques. For example, DNA fragments encoding different polypeptide sequences can be ligated together in frame in accordance with conventional techniques, for example by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling -in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation.
- the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers.
- PCR amplification of gene fragments can be carried out using anchor primers, which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (e.g., see Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992).
- expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide).
- a nucleic acid encoding a polypeptide of the present invention can be cloned into such an expression vector so that the fusion moiety is linked in-frame to the polypeptide of interest.
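The requirement that a fusion moiety be linked in-frame to the polypeptide of interest can be illustrated at the sequence level. The sketch below is a simplified check, assuming an N-terminal tag whose coding sequence starts in frame at its first nucleotide; the function names and the example sequences (a start codon followed by six histidine codons, and a toy target) are hypothetical and do not correspond to any sequence disclosed herein.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a sequence into complete codons, ignoring any trailing bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(tag_cds: str, target_cds: str) -> str:
    """Join an N-terminal tag to a target coding sequence, keeping the frame.

    Raises ValueError if the tag would shift the reading frame of the target
    or would terminate translation before the target is reached.
    """
    tag_cds, target_cds = tag_cds.upper(), target_cds.upper()
    if len(tag_cds) % 3 != 0:
        raise ValueError("tag length is not a multiple of three")
    if any(codon in STOP_CODONS for codon in codons(tag_cds)):
        raise ValueError("tag contains an in-frame stop codon")
    return tag_cds + target_cds


if __name__ == "__main__":
    his_tag = "ATG" + "CAT" * 6   # start codon plus six histidine codons
    target = "ATGGCTTAA"          # toy target: Met-Ala-stop
    print(codons(fuse_in_frame(his_tag, target)))
```

In practice the junction is usually fixed by the cloning sites of the expression vector, so a check of this kind is applied to the designed construct as a whole.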
- a polypeptide of the present invention can be fused to a heterologous signal sequence (e.g., at its N terminus) to facilitate its isolation, expression and/or secretion from certain host cells (e.g., mammalian and yeast host cells).
- Signal sequences are typically characterized by a core of hydrophobic amino acids, which are generally cleaved from the mature protein during secretion in one or more cleavage events.
- Such signal peptides may contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway.
- the gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous signal sequence (Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley & Sons, 1992).
- Other examples of eukaryotic heterologous signal sequences include the secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, California).
- useful prokaryotic heterologous signal sequences include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal (Pharmacia Biotech; Piscataway, New Jersey).
- the signal sequence can direct secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved.
- the protein can then be readily purified from the extracellular medium by known methods.
- a signal sequence can be linked to a fusion protein of the present invention to facilitate detection, purification, and/or recovery thereof.
- the sequence encoding a fusion protein of the present invention may be fused to a marker sequence, such as a sequence encoding a peptide, which facilitates purification of the fused polypeptide.
- the marker sequence can be a hexa-histidine peptide, such as the tag provided in a pQE vector (Qiagen, Inc.), among others, many of which are commercially available.
- hexa-histidine provides for convenient purification of the fusion protein.
- the HA tag is another peptide useful for purification; it corresponds to an epitope derived from the influenza hemagglutinin protein, as described, for instance, by Wilson et al., Cell 37:767 (1984).
- the nucleic acid sequences of the genes disclosed herein were determined by sequencing cDNA clones, mRNA transcripts, or genomic DNA obtained from Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80.
- polynucleotides encoding a polypeptide of the present invention comprising functional variants thereof.
- polynucleotides of the present invention comprise the coding nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514, or as set forth in Tables 1A-1C.
- polynucleotides of the present invention comprise the genomic nucleic acid sequence of any one of SEQ ID NOs: 1-200, 601-889, or 1468-1991; or as set forth in Tables 1A-1C.
- the present invention relates to a polynucleotide comprising at least one intronic or exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-200, 601-889, or 1468-1991 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C).
- polynucleotides comprising at least one of these intronic segments are within the scope of the present invention.
- the present invention relates to a polynucleotide comprising at least one exonic nucleic acid sequence comprised within SEQ ID NOs: 1-200, 601-889, or 1468-1991, or as set forth in Tables 2A- 2C.
- the present invention relates to isolated polynucleotides sharing a minimum threshold of nucleic acid sequence identity with any one of the above-mentioned polynucleotides.
- the present invention relates to isolated polynucleotides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to any one of the above-mentioned polynucleotides.
- Polynucleotides having the aforementioned thresholds of nucleic acid sequence identity can be created by introducing one or more nucleotide substitutions, additions or deletions into the coding nucleotide sequences of the present invention such that one or more amino acid substitutions, deletions or insertions are introduced into the encoded polypeptide. Such mutations may be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
- the present invention relates to a polynucleotide that hybridizes (or is hybridizable) under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full- length complement of any one of the polynucleotides defined above.
- very low stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 45°C.
- low stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 50°C.
- medium stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 55°C.
- medium-high stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 60°C.
- high stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 65°C.
- very high stringency conditions means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 70°C.
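The six stringency definitions above share the same hybridization buffer (42°C, 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, 12 to 24 hours) and wash solution (three 15-minute washes in 2X SSC, 0.2% SDS); they differ only in formamide concentration and wash temperature. The sketch below collects those two variables in one lookup table. The names `Stringency`, `STRINGENCY` and `levels_at_or_above` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    formamide_percent: int  # formamide in the hybridization buffer
    wash_temp_c: int        # temperature of the final washes, in degrees C


# Values transcribed from the definitions above, ordered from least to most
# stringent. All other conditions are common to every level.
STRINGENCY = {
    "very low": Stringency(25, 45),
    "low": Stringency(25, 50),
    "medium": Stringency(35, 55),
    "medium-high": Stringency(35, 60),
    "high": Stringency(50, 65),
    "very high": Stringency(50, 70),
}


def levels_at_or_above(level: str) -> list:
    """Return the levels whose wash temperature is at least that of `level`."""
    floor = STRINGENCY[level].wash_temp_c
    return [name for name, cond in STRINGENCY.items() if cond.wash_temp_c >= floor]


if __name__ == "__main__":
    for name, cond in STRINGENCY.items():
        print(f"{name:12s} {cond.formamide_percent}% formamide, wash at {cond.wash_temp_c} C")
```

Calling `levels_at_or_above("medium-high")` returns the three levels recited for the hybridizing polynucleotides: medium-high, high, and very high stringency.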
- a polynucleotide of the present invention (or a fragment thereof) can be isolated using the sequence information provided herein in conjunction with standard molecular biology techniques (e.g., as described in Sambrook et al., supra).
- suitable hybridization oligonucleotides (e.g., probes or primers)
- the oligonucleotides can be employed in hybridization and/or amplification reactions, for example, to amplify a template of cDNA, mRNA or genomic DNA, according to standard PCR techniques.
- a polynucleotide so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis.
- the present invention relates to polynucleotides encoding functional variants of any one of the polypeptides of the present invention, including a biologically active fragment or domain thereof.
- the present invention can include nucleic acid molecules (e.g., oligonucleotides) sufficient for use as primers and/or hybridization probes to amplify, sequence and/or identify nucleic acid molecules encoding a polypeptide of the present invention or fragments thereof.
- the present invention relates to polynucleotides (e.g., oligonucleotides) that comprise, span, or hybridize specifically to exon-exon or exon- intron junctions of the genomic sequences identified herein, such as those defined in Tables 2A-2C. Designing such polynucleotides/oligonucleotides would be within the grasp of a person of skill in the art in view of the target sequence information disclosed herein and are thus encompassed by the present invention.
- the present invention relates to polynucleotides comprising silent mutations or mutations that do not significantly alter the (biological) function or protein activity of the encoded polypeptide.
- Guidance concerning how to make phenotypically silent amino acid substitutions is provided for example in Bowie et al., Science 247:1306-1310 (1990) and in the references cited therein.
- DNA sequence polymorphisms of the genes disclosed herein may exist within a given population, which may differ from the sequences disclosed herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation.
- the present invention can include natural allelic variants and homologs of polynucleotides disclosed herein.
- polynucleotides of the present invention can comprise only a portion or a fragment of the nucleic acid sequences disclosed herein. Although such polynucleotides may not encode a functional polypeptide of the present invention, they are useful for example as probes or primers in hybridization or amplification reactions.
- Exemplary uses of such polynucleotides include: (1) isolating a gene (or allelic variant thereof) from a cDNA library; (2) in situ hybridization (e.g., FISH) to metaphase chromosomal spreads to provide precise chromosomal location of the gene as described in Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988); (3) Northern blot analysis for detecting expression of mRNA corresponding to a polypeptide disclosed herein, or a homolog, ortholog or variant thereof, in specific tissues and/or cells; and (4) probes and primers that can be used as a diagnostic tool to analyze the presence of a nucleic acid hybridizable to a polynucleotide disclosed herein in a given biological (e.g., tissue) sample.
- Oligonucleotides typically comprise a region of nucleotide sequence that hybridizes (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention.
- such oligonucleotides can be used for identifying and/or cloning other family members, as well as orthologs from other species.
- the oligonucleotide can be attached to a detectable label (e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme cofactor).
- Such oligonucleotides can also be used as part of a diagnostic method or kit for identifying cells which express a polypeptide of the present invention.
- full-length complements of any one of the polynucleotides of the present invention are also encompassed.
- the full-length complements are antisense molecules with respect to the coding strands of polynucleotides of the present invention, which hybridize (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention.
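The relationship between a coding strand, its full-length complement, and an oligonucleotide that hybridizes to a contiguous stretch of it can be sketched as follows. This is a deliberate simplification: it assumes perfect Watson-Crick complementarity over an unambiguous A/C/G/T alphabet, whereas real hybridization tolerates mismatches to a degree that depends on the stringency conditions. The function names and the sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def full_length_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_probe_target(polynucleotide: str, probe: str, min_len: int = 10) -> bool:
    """True if the probe is perfectly complementary to a contiguous stretch.

    `min_len` mirrors the shortest contiguous length recited above (10 nt).
    """
    if len(probe) < min_len:
        return False
    return full_length_complement(probe).upper() in polynucleotide.upper()


if __name__ == "__main__":
    coding = "GGATGCCATTGACCTT"                        # toy coding-strand fragment
    probe = full_length_complement("ATGCCATTGACC")     # antisense 12-mer
    print(full_length_complement(coding))
    print(contains_probe_target(coding, probe))
```

Applying `full_length_complement` twice returns the original sequence, which is a quick sanity check for the complement table.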
- sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases.
- the specific sequences disclosed herein can be readily used to isolate the corresponding complete genes from the organism sequenced herein, which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors.
- nucleotide sequences disclosed herein were determined by sequencing using an automated DNA sequencer, and all amino acid sequences of polypeptides disclosed herein were predicted by translation based on the genetic code. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art.
- a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
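The effect of a single-base insertion on the predicted amino acid sequence can be demonstrated with a short translation routine. The sketch below uses the standard genetic code and a toy sequence chosen for illustration; it assumes translation starts at the first nucleotide, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def first_divergence(a: str, b: str) -> int:
    """Index of the first differing residue, or -1 if none differ in the overlap."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1


if __name__ == "__main__":
    determined = "ATGGCTAAAGGTCTGTGA"
    # Insert one extra base after the fourth nucleotide.
    with_insertion = determined[:4] + "A" + determined[4:]
    p1, p2 = translate(determined), translate(with_insertion)
    print(p1, p2, first_divergence(p1, p2))
```

In this toy case the two predicted sequences agree only up to the codon containing the insertion; every residue after that point differs, as described above.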
- vectors e.g., expression vectors
- a polynucleotide encoding a polypeptide of the present invention
- the term "vector" includes a nucleic acid molecule capable of transporting another nucleic acid molecule to which it has been linked.
- the term "plasmid" refers to a circular double-stranded DNA loop into which additional DNA segments can be ligated.
- Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome.
- Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
- Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
- certain vectors are capable of directing the expression of genes to which they are operatively linked.
- Such vectors are referred to herein as "expression vectors".
- expression vectors useful in recombinant DNA techniques are often in the form of plasmids.
- the terms "plasmid" and "vector" can be used interchangeably herein, as the plasmid is the most commonly used form of vector.
- the invention is intended to include other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
- recombinant expression vectors of the invention can comprise a polynucleotide of the present invention in a form suitable for expression of the polynucleotide in a host cell, which means that the recombinant expression vector includes one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid sequence to be expressed.
- the term "operatively linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
- the term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).
- Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells and those which direct expression of the nucleotide sequence only in a certain host cell (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc.
- the expression vectors of the present invention can be introduced into host cells to thereby produce proteins or peptides, encoded by polynucleotides as described herein (e.g., polypeptides of the present invention).
- recombinant expression vectors of the present invention can be designed for expression of polypeptides of the present invention in prokaryotic or eukaryotic cells.
- these polypeptides can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel (supra).
- recombinant expression vectors of the present invention can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
- expression vectors of the present invention can include chromosomal-, episomal- and virus-derived vectors, e.g., vectors derived from bacterial plasmids, bacteriophage, yeast episome, yeast chromosomal elements, viruses such as baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.
- a DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few.
- promoters are preferred that are capable of directing a high expression level of biologically active polypeptides of the present invention (e.g., lignocellulose active proteins) from fungi.
- Such promoters are known in the art.
- the expression constructs may contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation.
- the coding portion of the mature transcripts expressed by the constructs will
- Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques.
- the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, transduction, infection, lipofection, cationic lipid-mediated transfection or electroporation.
- Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al., (Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), Davis et al., Basic Methods in Molecular Biology (1986) and other laboratory manuals.
- a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest.
- selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate.
- a polynucleotide encoding a selectable marker can be introduced into a host cell on the same vector as that encoding a polypeptide of the present invention, or on a separate vector. Cells stably transfected with a polynucleotide of the present invention can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).
- Fusion vectors add a number of amino acids to a protein encoded therein, e.g., to the amino terminus of the recombinant protein.
- Such fusion vectors typically serve three purposes: (1) to increase expression of recombinant protein; (2) to increase the solubility of the recombinant protein; and (3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification.
- a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.
- Vectors preferred for use in bacteria are, for example, disclosed in WO-A1-2004/074468.
- Other suitable vectors will be readily apparent to the skilled artisan.
- Known bacterial promoters suitable for use in the present invention include the promoters disclosed in WO-A1-2004/074468.
- the expression vectors will preferably contain selectable markers.
- markers include dihydrofolate reductase or neomycin resistance for eukaryotic cell culture and antibiotic resistance (e.g., tetracycline or ampicillin) for culturing in E. coli and other bacteria.
- Representative examples of appropriate host include bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium and certain Bacillus species; fungal cells such as Aspergillus species, for example A. niger, A. oryzae and A. nidulans, yeast cells such as Kluyveromyces, for example K. lactis and/or Pichia, for example P.
- insect cells such as Drosophila S2 and Spodoptera Sf9
- animal cells such as CHO, COS and Bowes melanoma
- plant cells
- Appropriate culture media and conditions for the above-described host cells are known in the art.
- Enhancers are cis-acting elements of DNA, usually about 10 to 300 bp in length, that act to increase transcriptional activity of a promoter in a given host cell type.
- enhancers include the SV40 enhancer, which is located on the late side of the replication origin at bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
- a secretion signal may be incorporated into the expressed polypeptide.
- the signals may be endogenous to the polypeptide or they may be heterologous signals.
- a polypeptide of the present invention may be expressed in a modified form, such as a fusion protein, and may include not only secretion signals but also additional heterologous functional regions.
- additional amino acids particularly charged amino acids
- peptide moieties may be added to the polypeptide to facilitate purification and/or detection.
- the present invention features cells, e.g., transformed host cells or recombinant host cells that contain a polynucleotide or vector of the present invention.
- a "transformed cell" or "recombinant cell" is a cell into which (or into an ancestor of which) has been introduced a polynucleotide or vector of the invention by means of recombinant DNA techniques.
- prokaryotic and eukaryotic cells are included, e.g., bacteria, fungi, yeast, and the like; especially preferred are cells from filamentous fungi, in particular the strain from which the polynucleotide and polypeptide sequences disclosed herein were derived.
- a cell of the present invention is typically not a wild-type strain or a naturally-occurring cell.
- Host cells of the present invention can include, but are not limited to: fungi (e.g., Aspergillus niger, Trichoderma reesei, Myceliophthora thermophila and Talaromyces emersonii); yeasts (e.g., Saccharomyces cerevisiae, Yarrowia lipolytica and Pichia pastoris); bacteria (e.g., Escherichia coli and Bacillus sp.); and plants (e.g., Nicotiana benthamiana, Nicotiana tabacum and Medicago sativa).
- a polynucleotide (or a polynucleotide which is comprised within a vector) may be homologous or heterologous with respect to the cell into which it is introduced.
- a polynucleotide is homologous to a cell if the polynucleotide naturally occurs in that cell.
- a polynucleotide is heterologous to a cell if the polynucleotide does not naturally occur in that cell.
- the present invention relates to a cell which comprises a heterologous or a homologous sequence corresponding to any one of the polynucleotides or polypeptides disclosed herein.
- a host cell can be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in a specific, desired fashion. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may facilitate optimal functioning of the protein.
- Various host cells have characteristic and specific mechanisms for post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems familiar to those of skill in the art can be chosen to ensure the desired and correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product can be used. Such host cells are well known in the art.
- host cells can also include, but are not limited to, mammalian cell lines such as CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines.
- a stably transfected cell line can produce the polypeptides of the present invention.
- a number of vectors suitable for stable transfection of mammalian cells are available to the public; methods for constructing such cell lines are also publicly known, e.g., in Ausubel et al. (supra).
- the present invention relates to methods of inhibiting the expression of a polypeptide of the present invention in a host cell, comprising administering to the cell or expressing in the cell a double-stranded RNA (dsRNA) molecule (or a molecule comprising a region of double-strandedness), wherein the dsRNA comprises a subsequence of a polynucleotide of the present invention.
- the dsRNA is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex nucleotides in length.
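As a rough illustration of how a subsequence of a coding sequence maps to a duplex of a given length, the following Python sketch enumerates every window of a chosen length and writes out the corresponding sense and antisense RNA strands. The function name, the default length of 21 and the example sequence are assumptions made for illustration only; the sketch applies no design rules (such as GC content or off-target screening) and does not represent a method disclosed herein.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def dsrna_candidates(coding_dna: str, length: int = 21):
    """Yield (start, sense_rna, antisense_rna) for each window of `length` nt.

    Both strands are written 5'->3'. `length` is a free parameter here.
    """
    if not 1 <= length <= len(coding_dna):
        raise ValueError("length must be between 1 and the sequence length")
    rna = coding_dna.upper().replace("T", "U")  # mRNA-like sense strand
    for start in range(len(rna) - length + 1):
        sense = rna[start:start + length]
        antisense = sense.translate(RNA_COMPLEMENT)[::-1]
        yield start, sense, antisense


example = "ATGGCTAAAGGTCACATGGCTAAAGGTCAC"  # hypothetical 30-nt coding fragment
candidates = list(dsrna_candidates(example, length=21))
print(len(candidates))
print(candidates[0])
```

A sequence of n nucleotides yields n - length + 1 windows, so the number of candidates grows linearly with the length of the coding sequence.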
- the dsRNA is preferably a small interfering RNA (siRNA) or a micro RNA (miRNA).
- the dsRNA is small interfering RNA (siRNAs) for inhibiting transcription.
- the dsRNA is micro RNA (miRNAs) for inhibiting translation.
- the present invention also relates to such double-stranded RNA (dsRNA) molecules, comprising a portion of the mature polypeptide coding sequence of any one of the coding sequences of the polypeptides disclosed herein, for inhibiting expression of that polypeptide in a cell. While the present invention is not limited by any particular mechanism of action, the dsRNA can enter a cell and cause the degradation of a single-stranded RNA (ssRNA) of similar or identical sequence, including endogenous mRNAs.
- RNA interference
- the dsRNAs of the present invention can be used in gene-silencing methods.
- the invention relates to methods to selectively degrade RNA using the dsRNAi's of the present invention.
- the process may be practiced in vitro, ex vivo or in vivo.
- the dsRNA molecules can be used to generate a loss-of-function mutation in a cell, an organ or an organism. Methods for making and using dsRNA molecules to selectively degrade RNA are well known in the art, see, for example, U.S. Patent No.
- the present invention relates to an isolated binding agent capable of selectively binding to a polypeptide of the present invention.
- Suitable binding agents may be selected from an antibody, an antigen binding fragment, or a binding partner.
- the binding agent selectively binds to an amino acid sequence selected from Tables 1A-1C, including to any fragment of any of the above sequences comprising at least one antibody binding epitope.
- the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay.
- when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art, including enzyme immunoassays (e.g., ELISA, immunoblot assays, etc.).
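One way to check whether a binding signal is statistically significantly higher than the background control is an exact one-sided permutation test, sketched below in Python. The choice of test, the function name and the absorbance readings are all assumptions made for illustration; the text above does not prescribe a particular statistical method, and the numbers are not experimental data.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(signal, background):
    """Exact one-sided permutation test on the difference of means.

    Returns the fraction of relabellings of the pooled wells whose mean
    difference is at least as large as the observed one.
    """
    pooled = list(signal) + list(background)
    n_signal = len(signal)
    observed = mean(signal) - mean(background)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_signal):
        chosen = set(idx)
        group = [pooled[i] for i in chosen]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if mean(group) - mean(rest) >= observed - 1e-12:  # tolerance for float error
            hits += 1
        total += 1
    return hits / total


# Hypothetical absorbance readings: antigen-coated wells vs. antibody-alone wells.
antigen_wells = [1.20, 1.35, 1.28, 1.31]
background_wells = [0.10, 0.12, 0.09, 0.11]
p = permutation_p_value(antigen_wells, background_wells)
print(p)
```

With four wells per group there are only 70 possible relabellings, so the smallest attainable p-value is 1/70; more replicates are needed if a stricter significance threshold is desired.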
- Antibodies are characterized in that they comprise immunoglobulin domains and as such, they are members of the immunoglobulin superfamily of proteins.
- An antibody of the invention includes polyclonal and monoclonal antibodies, divalent and monovalent antibodies, bi- or multi-specific antibodies, serum containing such antibodies, antibodies that have been purified to varying degrees, and any functional equivalents of whole antibodies.
- Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees.
- Whole antibodies of the present invention can be polyclonal or monoclonal.
- antibodies such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab)2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention. Methods for the generation and production of antibodies are well known in the art.
- Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975).
- Non-antibody polypeptides may be designed to bind specifically to a protein of the invention. Examples of the design of such polypeptides, which possess a prescribed ligand specificity are given in Beste et al., (Proc. Nat'l Acad. Sci. 96:1898-1903, 1999).
- a binding agent of the invention is immobilized on a substrate (such as artificial membranes, organic supports, biopolymer supports and inorganic supports), e.g., for use in a screening assay.
- antibodies and binding agents specifically binding to polypeptides of the present invention may be produced and used even in absence of knowledge of the precise biological function and/or protein activity of the polypeptide.
- Such antibodies and binding agents may be useful, for example, as diagnostic, classification, and/or research tools.
- the present invention relates to a composition comprising one or more polypeptides or polynucleotides of the present invention.
- the compositions are enriched in such a polypeptide.
- the term "enriched" indicates that the biological activity (e.g., biomass degradation or processing) of the composition has been increased, e.g., with an enrichment factor of at least 1.1.
- the composition may comprise a polypeptide of the present invention as the major component, e.g., a mono-component composition.
- the composition may comprise multiple enzymatic activities (e.g., those described herein).
- the polypeptide compositions may be prepared in accordance with methods known in the art and may be in the form of a liquid or a dry composition.
- the polypeptide composition may be in the form of a granulate or a microgranulate.
- the polypeptide to be included in the composition may be stabilized in accordance with methods known in the art. Examples are given below of preferred uses of the polypeptide compositions of the present invention.
- the dosage of the polypeptide composition of the invention and other conditions under which the composition is used may be determined on the basis of methods known in the art.
- the present invention relates to the use of the polypeptides (e.g., enzymes) of the present invention in a number of industrial and other processes.
- these advantages can include aspects such as lower production costs, higher specificity towards the substrate, greater synergies with existing enzymes, less antigenic effect, less undesirable side activities, higher yields when produced in a suitable microorganism, more suitable pH and temperature ranges, better properties of the final product, and food grade or kosher aspects.
- the present invention seeks to provide one or more of these advantages, or others.
- the polypeptides of the present invention may be used in new or improved methods for enzymatically degrading or converting plant cell wall polysaccharides from biomass into various useful products.
- plant cell walls contain associated pectins and lignins, the removal of which by enzymes of the current invention can improve accessibility to cellulases and hemicellulases, or which can themselves be converted to useful products. Therefore the polypeptides of the present invention may be used to degrade biomass or pretreated biomass to sugars. These sugars may be used as such or may be, for example, fermented into ethanol.
- polypeptides of the present invention may be used in improved methods for the processing of pretreated biomass.
- Pretreatment technologies may involve chemical, physical, or biological treatments. Examples of pre-treatment technologies include but are not limited to: steam explosion; ammonia; acid hydrolysis; alkaline hydrolysis; solvent extraction; crushing; milling; etc.
- Bioethanol is usually produced by the fermentation of glucose to ethanol by yeasts such as Saccharomyces cerevisiae. In addition to ethanol, other chemicals may be synthesized starting from glucose.
- Today, ethanol is produced mostly from sugars or starches obtained from sugar cane, fruits and grains.
- cellulosic ethanol is obtained from cellulose, the main component of wood, straw and much other plant material.
- Sources of biomass for cellulosic ethanol production comprise agricultural residues (e.g., leftover crop materials from stalks, leaves, and husks of corn plants), forestry wastes (e.g., chips and sawdust from lumber mills, dead trees, and tree branches), energy crops (e.g., dedicated fast-growing trees and grasses such as switchgrass), municipal solid waste (e.g., household garbage and paper products), and food processing and other industrial wastes (e.g., black liquor, paper manufacturing by-products, etc.).
- Plant biomass is a mixture of plant polysaccharides, including cellulose, hemicelluloses, and pectin, together with the structural polymer, lignin.
- Glucose is released from cellulose by the action of mixtures of enzymes, including: endoglucanases, exoglucanases (cellobiohydrolases 1 and 2) and beta-glucosidases.
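As a worked illustration of the cellulose-to-glucose-to-ethanol route outlined above, the following Python sketch computes stoichiometric (theoretical maximum) yields. It assumes complete hydrolysis of the anhydroglucose repeat unit, (C6H10O5)n + n H2O -> n C6H12O6, followed by fermentation, C6H12O6 -> 2 C2H5OH + 2 CO2. The molar masses are standard values; the function name and the two efficiency parameters are hypothetical, and real process yields are lower than these upper bounds.

```python
M_ANHYDROGLUCOSE = 162.14  # g/mol, C6H10O5 repeat unit of cellulose
M_GLUCOSE = 180.16         # g/mol, C6H12O6
M_ETHANOL = 46.07          # g/mol, C2H5OH

GLUCOSE_PER_CELLULOSE = M_GLUCOSE / M_ANHYDROGLUCOSE  # mass gained from added water
ETHANOL_PER_GLUCOSE = 2 * M_ETHANOL / M_GLUCOSE       # two ethanol per glucose


def theoretical_ethanol_kg(cellulose_kg: float,
                           hydrolysis_eff: float = 1.0,
                           fermentation_eff: float = 1.0) -> float:
    """Ethanol mass obtainable from `cellulose_kg` of pure cellulose.

    The efficiencies are fractions in [0, 1]; 1.0 gives the stoichiometric limit.
    """
    glucose_kg = cellulose_kg * GLUCOSE_PER_CELLULOSE * hydrolysis_eff
    return glucose_kg * ETHANOL_PER_GLUCOSE * fermentation_eff


print(round(GLUCOSE_PER_CELLULOSE, 3))         # kg glucose per kg cellulose
print(round(ETHANOL_PER_GLUCOSE, 3))           # kg ethanol per kg glucose
print(round(theoretical_ethanol_kg(1.0), 3))   # kg ethanol per kg cellulose
```

Lowering either efficiency parameter shows how incomplete enzymatic hydrolysis feeds directly into the final ethanol yield, which is the motivation for improving the enzyme mixtures discussed here.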
- Efficient large-scale conversion of cellulosic materials by such mixtures may require the full complement of enzymes, and can be enhanced by the addition of enzymes that attack the other plant cell wall components (e.g., hemicelluloses, pectins, and lignins), as well as chemical linkages between these components.
- polypeptides of the present invention that are highly expressed, or have high specific activity, stability, or resistance to inhibitors may improve the efficiency of the process, and lower enzyme costs. It would be an advantage to the art to improve the degradation and conversion of plant cell wall polysaccharides by composing cellulase mixtures using cellulase enzymes with such properties. Furthermore, polypeptides of the present invention that are able to function at extremes of pH and temperature are desirable, both since improved enzyme robustness decreases costs, and because enzymes that function at high temperature will allow high processing temperatures under high substrate consistency conditions that decrease viscosity and thus improve yields.
- Glycoside hydrolases from the family GH61 are known to stimulate the activity of cellulase cocktails on lignocellulosic substrates and are thus considered to exhibit cellulase-enhancing activity (Harris et al., Biochemistry 49, 3305 (2010)). Enhancement of cellulase cocktail efficiency by GH61 proteins of the present invention may contribute to lowering the costs of cellulase enzymes used for the production of glucose from plant cell biomass, as described above.
- GH61 (glycoside hydrolase family 61, sometimes referred to as EGIV) proteins are oxygen-dependent polysaccharide monooxygenases (PMOs) according to the latest literature.
- GH61 was originally classified as an endoglucanase, based on the measurement of very weak endo-1,4-β-D-glucanase activity in one family member.
- the term "GH61" as used herein, is to be understood as a family of enzymes, which share common conserved sequence portions and foldings to be classified in family 61 of the well-established CAZY GH classification system (http://www.cazy.org/GH61.html).
- the glycoside hydrolase family 61 is a member of the family of glycoside hydrolases EC 3.2.1. GH61 is used herein as being part of the cellulases.
- Enzymatic hydrolysis of plant hemicellulose yields 5-carbon sugars that may either be fermented to ethanol by some species of yeast or be converted to other types of chemical products. Enzymatic deconstruction of hemicellulose is also known to improve the accessibility of plant cell wall cellulose to cellulase enzymes for the production of glucose from lignocellulosic materials. Hemicellulase enzymes of the present invention that enhance glucose production from lignocellulose would find utility in the bioethanol industry and in other processes that rely on glucose or pentose streams from lignocellulose.
- Lignin is composed of methoxylated phenyl-propane units linked by ether linkages and carbon-carbon bonds.
- the chemical composition of lignin may, depending on species, include guaiacyl, 4-hydroxyphenyl, and syringyl groups.
- Enzymatic modification of lignin by the polypeptides of the present invention can be used for the production of structural materials from plant biomass, or alternatively improve the accessibility of plant cellulose and hemicelluloses to cellulase enzymes for the release of glucose from biomass as described above.
- Enzymes that degrade the lignin component of lignocellulose include lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases (Vicuna et al., 2000, Molecular Biotechnology 14: 173-176; Broda et al., 1996, Molecular Microbiology 19: 923-932).
- polypeptides of the present invention may also, in certain instances, be active in the decolorization of industrial dyes, and thus useful for the treatment and detoxification of chemical wastes.
- pectin-degrading polypeptides of the present invention can also enhance the action of cellulases on plant biomass by improving the accessibility of cellulase to the cellulose component of lignocellulose.
- polypeptides of the present invention may also be useful in other applications for hydrolyzing non-starch polysaccharide (NSP).
- esterases of the present invention can be useful in the bioenergy industry such as for the production of biodiesel and hydrolysis of hemicellulose.
- the present invention relates to methods for degrading or converting a cellulose-containing material, comprising: treating the cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity.
- the present invention relates to methods for producing a fermentation product, comprising: (a) saccharifying a cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity; (b) fermenting the saccharified cellulose-containing material of step (a) with one or more fermenting microorganisms to produce the fermentation product; and (c) recovering the fermentation product from the fermentation.
- the present invention relates to methods for preparing a food product comprising incorporating into the food product an effective amount of a polypeptide of the present invention. This can improve one or more properties of the food product relative to a food product in which the polypeptide is not incorporated.
- the phrase "incorporated into the food product" is defined herein as adding a polypeptide of the present invention to the food product, to any ingredient from which the food product is to be made, and/or to any mixture of food ingredients from which the food product is to be made.
- a polypeptide of the present invention may be added in any step of the food product preparation and may be added in one, two or more steps.
- the polypeptide of the present invention is added to the ingredients of a food product which can then be treated by methods including cooking, boiling, drying, frying, steaming or baking as is known in the art.
- the term "effective amount" is defined herein as an amount of the polypeptide (e.g., enzyme) of the present invention that is sufficient for providing a measurable effect on at least one property of interest of the food product.
- the term "improved property" is defined herein as any property of a food product which is improved by the action of a polypeptide (e.g., enzyme) of the present invention relative to a food product in which the polypeptide is not incorporated. The improved property may be determined by comparison of a food product prepared with and without addition of a polypeptide of the present invention. Organoleptic qualities may be evaluated using procedures well established in the food industry, and may include, for example, the use of a panel of trained taste-testers.
- the polypeptides of the present invention may be prepared in any form suitable for the use in question, e.g., in the form of a dry powder, agglomerated powder, or granulate, in particular a non-dusting granulate, liquid, in particular a stabilized liquid, or protected enzyme such as described in WO01/11974 and WO02/26044.
- Granulates and agglomerated powders may be prepared by conventional methods, e.g., by spraying the enzyme according to the invention onto a carrier in a fluid-bed granulator.
- the carrier may consist of particulate cores having a suitable particle size.
- the carrier may be soluble or insoluble, e.g., a salt (such as NaCl or sodium sulphate), sugar (such as sucrose or lactose), sugar alcohol (such as sorbitol), starch, rice, corn grits, or soy.
- the polypeptide of the present invention (and/or additional polypeptides/enzymes) may be contained in slow-release formulations. Methods for preparing slow-release formulations are well known in the art. Liquid enzyme preparations may, for instance, be stabilized by adding nutritionally acceptable stabilizers such as a sugar, a sugar alcohol or another polyol, and/or lactic acid or another organic acid, according to established methods.
- polypeptides of the present invention may also be incorporated in yeast-comprising compositions such as disclosed in EP-A-0619947, EP-A-0659344 and WO02/49441.
- one or more additional polypeptides/enzymes may be incorporated into a food product of the present invention.
- the additional enzyme may be of any origin, including mammalian and plant, and preferably of microbial (bacterial, yeast or fungal) origin and may be obtained by techniques conventionally used in the art. Enzymes may conveniently be produced in microorganisms. Microbial enzymes are available from a variety of sources; Bacillus species are a common source of bacterial enzymes, whereas fungal enzymes are commonly produced in Aspergillus species.
- additional polypeptides/enzymes include starch degrading enzymes, xylanases, oxidizing enzymes, fatty material splitting enzymes, or protein-degrading, modifying or crosslinking enzymes.
- Starch degrading enzymes include endo-acting enzymes such as alpha-amylase, maltogenic amylase, pullulanase or other debranching enzymes, and exo-acting enzymes that cleave off glucose (amyloglucosidase), maltose (beta-amylase), maltotriose, maltotetraose and higher oligosaccharides.
- Suitable xylanases are for instance xylanases, pentosanases, hemicellulase, arabinofuranosidase, glucanase, cellulase, cellobiohydrolase, beta-glucosidase, and others.
- Oxidizing enzymes are for instance glucose oxidase, hexose oxidase, pyranose oxidase, sulfhydryl oxidase, lipoxygenase, laccase, polyphenol oxidases and others.
- Fatty material splitting enzymes are for instance triacylglycerol lipases, phospholipases (such as A1, A2, B, C and D) and galactolipases.
- Protein degrading, modifying or crosslinking enzymes are for instance endo-acting proteases (serine proteases, metalloproteases, aspartyl proteases, thiol proteases), exo-acting peptidases that cleave off one amino acid, or a dipeptide, tripeptide, etc., from the N-terminal (aminopeptidases) or C-terminal (carboxypeptidases) ends of the polypeptide chain, asparagine- or glutamine-deamidating enzymes such as deamidase and peptidoglutaminase, or crosslinking enzymes such as transglutaminase.
- additional polypeptides/enzymes can include: amylases, such as alpha-amylase (which can be useful for providing sugars that are fermentable by yeast) or beta-amylase; cyclodextrin glucanotransferase; peptidase (e.g., an exopeptidase, which can be useful in flavour enhancement); transglutaminase; lipase (which can be useful for the modification of lipids present in the food or food constituents); phospholipase; cellulase; hemicellulase; protein disulfide isomerase; peroxidase; laccase; or an oxidase (e.g., glucose oxidase, hexose oxidase, aldose oxidase, pyranose oxidase, lipoxygenase or L-amino acid oxidase).
- esterases of the present invention have a number of applications in the food industry including, but not limited to, degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
- polypeptides of the present invention can be useful in the detergent industry, e.g., for removal of carbohydrate-based stains from soiled laundry.
- Enzymes are used in detergents in order to improve their efficacy in removing most types of dirt.
- esterases such as lipases of the present invention are particularly useful for removing fats and lipids.
- polypeptides of the present invention can be useful in the feed enzyme industry, e.g., for increasing nutritional quality, digestibility and/or absorption of animal feed.
- Feed enzymes have an important role to play in current farming systems, as they can increase the digestibility of nutrients, leading to greater efficiency in the production of animal products such as meat and eggs. At the same time, they can play a role in minimizing the environmental impact of increased animal production.
- Non-starch polysaccharides can increase the viscosity of the digesta which can, in turn, decrease nutrient availability and animal performance.
- Endoxylanases and phytases are the best-known feed-enzyme products.
- Phytase enzymes hydrolyse phytic acid and release inorganic phosphate, thereby avoiding the need to add inorganic phosphates to the diet and reducing phosphorus excretion.
- Addition of xylanases to feed has also been shown to have positive effects on animal growth. Adding specific nutrients to feed improves animal digestion and thereby reduces feed costs. Many feed additives are currently in use, and new concepts are continuously being developed.
- Use of specific enzymes such as non-starch carbohydrate degrading enzymes could break down fiber, releasing energy and increasing protein digestibility, because the protein becomes more accessible once the fiber is broken down. In this way, both the feed cost and the protein levels in the feed could be reduced.
- Non-starch polysaccharides (NSPs) are also present in virtually all feed ingredients of plant origin. NSPs are poorly utilized and can, when solubilized, exert adverse effects on digestion. Exogenous enzymes can contribute to a better utilization of these NSPs and as a consequence reduce any anti-nutritional effects. Accordingly, in a particular embodiment, hemicellulases and other polysaccharide-active polypeptides/enzymes of the present invention can be used for this purpose in cereal-based diets for poultry and, to a lesser extent, for pigs and other species.
- esterases of the present invention are useful in the feed industry such as for reducing the amount of phosphate in feed.
- xylanases of the present invention can be useful in the pulp and paper industry, e.g., for prebleaching of kraft pulp.
- Xylanases have been found to be most effective for that purpose.
- Xylanases attract increasing scientific and commercial attention due to applications in the pulp and paper industry for removal of hemicellulose from dissolving pulps or for enhancement of the bleachability of pulp and, thus, reduction of the use of environmentally harmful bleaching chemicals.
- a similar application of xylanases for pulp prebleaching is an already well-established technology and has greatly stimulated research on hemicellulases in the past decade.
- although lignin-active peroxidases of the present invention may also be active in modification of lignin and hence have bleaching properties, such enzymes are generally less attractive for bleaching due to the need to use and recycle expensive redox mediators.
- polypeptides such as xylanases of the present invention can be used to pre-bleach pulp to reduce the amount of bleaching chemicals needed to obtain a given brightness. It is suggested that xylanase depolymerises xylan blocks and increases accessibility or helps liberation of residual lignin by releasing xylan-chromophore fragments. When added to brownstock prior to bleaching, polypeptides such as xylanases of the present invention can save on bleaching chemicals. The enzymes hydrolyze surface xylans and are able to break linkages between hemicellulose and lignin.
- esterases of the present invention are useful for the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
- polypeptides such as xylanases of the present invention can be used in antibacterial formulations, as well as in pharmaceutical products such as throat lozenges, toothpastes, and mouthwash.
- Chitin is a beta-(1,4)-linked polymer of N-acetyl D-glucosamine (GlcNAc), found as a structural polysaccharide in fungal cell walls as well as in the exoskeleton of arthropods and the outer shell of crustaceans. Approximately 75% of the total weight of shellfish is considered waste, and a large proportion of the material making up the waste is chitin.
- polypeptides such as chitin-degrading enzymes of the present invention are useful in the modification and degradation of chitin, allowing the production of chitin-derived material, such as chitooligosaccharides and N-acetyl D-glucosamine, from chitin waste.
- polypeptides such as chitinase enzymes of the present invention can be useful as antifungal agents.
- polypeptides of the present invention can be used in the textile industry (e.g., for the treatment of textile substrates). More particularly, cellulases (e.g., endo-, exocellulases and cellobiohydrolases) have gained importance in the treatment of cellulose-containing fibers. During the washing of indigo-dyed denim textiles, enzymatic treatment by a polypeptide of the present invention can be used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans or other suitable fabrics. Polypeptides of the present invention can also improve the softness/feel of such fabrics.
- enzymes of the present invention can enhance cleaning ability or act as a softening agent.
- polypeptides such as cellulases of the present invention can be used in combination with polymeric agents in processes for providing a localized variation in the color density of fibers.
- polypeptides of the present invention can be used in the waste treatment industry (e.g., for changing the characteristics of the waste to become more amenable to further treatment and/or for bio-conversion to value-added products).
- Polypeptides such as lipases, cellulases, amylases, and proteases of the present invention can be used in addition to microorganisms to break down polymeric substances like proteins, polysaccharides and lipids, thereby facilitating this process.
- polypeptides of the present invention can be used in industries such as biocatalysis; sewage treatment; cleaning up oil pollution; the synthesis of fragrances; and enhancing the recovery of oil (e.g., during drilling).
- the polynucleotides, polypeptides and antibodies of the present invention can be useful as diagnostic and classification tools.
- the sequences disclosed herein can be used for designing hybridization probes or primers that are specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived).
- a skilled person would be able to select an epitope of a polypeptide of the present invention which is specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived) and generate an antibody or binding agent that binds specifically thereto.
- Such tools are useful, for example, in diagnostic methods for detecting the presence or absence of a particular organism (e.g., the organism from which the sequences disclosed herein were derived) in a sample; as research tools (e.g., for designing and producing microarrays for studying fungal gene expression); and for rapidly classifying an organism of interest based on the detection of a sequence or polypeptide specific for that organism.
- the skilled person would recognize that knowledge of the precise (biological) function or protein activity of a polypeptide of the present invention is not absolutely necessary for the aforementioned tools to be useful for diagnostic, research, or classification purposes.
- Sequences that are particularly useful in this regard are the genomic, coding and amino acid sequences corresponding to the polypeptides of the present invention annotated as "unknown" in Tables 1A-1C (as well as their corresponding exons and introns defined in Tables 2A-2C, where available). These sequences show little sequence identity with those in the art and thus can be useful as markers for identifying the organisms from which the sequences of the present invention were derived. The skilled person would know how to search various sequence databases to design specific hybridization oligonucleotides (e.g., probes and primers), as well as produce antibodies that specifically bind to the aforementioned sequences.
- the present invention relates to a method for identifying and/or classifying an organism (e.g., a fungal species) based on a biological sample, the method comprising detecting the presence or absence of any one of the polynucleotides or polypeptides of the present invention (e.g., those recited in the preceding paragraph) and determining that said organism is present or classifying said organism based on the presence of the polynucleotide or polypeptide.
- the detecting step can be carried out using one or more oligonucleotides or antibodies of the present invention.
- the detecting step can be carried out by performing an amplification and/or hybridization reaction.
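As a rough illustration of the detecting step, the sketch below checks in silico whether a marker oligonucleotide (or its reverse complement) occurs in a sample sequence and reports the organisms whose markers are found. The function names, the marker table and the toy sequences are hypothetical and are not taken from the sequences disclosed herein; exact string matching merely stands in for a real amplification or hybridization reaction.

```python
# Illustrative sketch only: exact string matching stands in for hybridization.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_detected(sample: str, probe: str) -> bool:
    """True if the probe matches either strand of the sample sequence."""
    sample, probe = sample.upper(), probe.upper()
    return probe in sample or reverse_complement(probe) in sample


def classify(sample: str, markers: dict[str, str]) -> list[str]:
    """List the organisms whose (hypothetical) marker probe is detected."""
    return [name for name, probe in markers.items() if probe_detected(sample, probe)]


# Hypothetical marker probes; real probes would be designed from marker sequences.
MARKERS = {
    "organism_A": "ACCGTAGG",
    "organism_B": "GGGGCCCCAAAA",
    "organism_C": "CCTACGGT",  # matches the opposite strand of the sample
}
SAMPLE = "TTGACCGTAGGCTA"

print(classify(SAMPLE, MARKERS))
```

In practice, probe specificity would depend on mismatch tolerance and reaction conditions, which this sketch does not model.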
- although the precise function or protein activity of a polypeptide of the present invention may not be known per se (e.g., in the case of proteins of the present invention labelled as "unknown" in Tables 1A-1C), the polypeptide may nevertheless be useful for carrying out an industrial process (e.g., cellulase-enhancing, cellulose-degrading, hemicellulose-degrading, cellulolysis-enhancing, lignocellulolysis-enhancing, and other biological functions listed in Tables 1A-1C).
- proteins labelled herein as "unknown" comprise proteins whose precise enzymatic activities may not be deducible from sequence comparisons, but that are nevertheless identified as interesting targets for industrial applications for other reasons (e.g., their expression is induced by growth under certain conditions such as in the presence of cellulosic and/or lignocellulosic biomass).
- Target ID: annotation and protein activity; function; sequence identifiers (including coding and amino acid)
- Theau2p4_000766: endoglucanase; endoglucanase GH5; cellulase GH5; cellulose-degrading; 43 44 45 15 215 415
- 000896: Aminopeptidase 2; protease; hydrolysis; 46 47 48 16 216 416
- Theau2p4_001291: arabinogalactan endo-1,4-beta-galactosidase; endo-1,4-beta-…; GH53; galactan-degrading; 55 56 57 19 219 419
- 001741: Vacuolar protease A; protease; hydrolysis; 73 74 75 25 225 425
- 007913: Probable pectinesterase A; pectinesterase CE8; …nesterase inhibitor; …degrading; 277 278 279 93 293 493
- 007928: Alpha-galactosidase; alpha-galactosidase GH27; galactoside-…; 280 281 282 94 294 494
- 008354 (_00062…): Probable glycosidase crf1; glycosidase GH16; …modifying; 298 299 300 100 300 500
- Corth2p4_002765: Probable beta-galactosidase B; Beta-galactosidase GH35; galactan-degrading; 121 122 123 641 930 1219
- Corth2p4_0…: cellulase-enhancing; polysaccharide …; cellulose-…; CBM…
- 04823: Feruloyl esterase B; feruloyl esterase CE1; …-modifying; 259 260 261 687 976 1265
- 04956: unknown; lignocellulose-…; …is-enhancing; 262 263 264 688 977 1266
- 05268: unknown; lignocellulose-…; …is-enhancing; 274 275 276 692 981 1270
- 05378: Aldose 1-epimerase; aldose epimerase; …modifying; 277 278 279 693 982 1271
- 06086: Aldose 1-epimerase; aldose epimerase; …modifying; 295 296 297 699 988 1277
- Corth2p4_0…: cellulase-enhancing; polysaccharide …; cellulose-…
- Corth2p4_006416: arabinogalactanase; arabinogalactanase GH53; galactan-degrading; 310 311 312 704 993 1282
- Corth2p4_0…: carbohydrate-…
- Corth2p4_0…: lignocellulolys…
- Corth2p4_0…: lignocellulolys…
- 07371: unknown; lignocellulose-…; …is-enhancing; 391 392 393 731 1020 1309
- 07378 (01019): Feruloyl esterase B; feruloyl esterase CE1; …-modifying; 394 395 396 732 1021 1310
- 07458: beta-glucuronidase; nucleotidase; …degradation; 415 416 417 739 1028 1317
- Corth2p4_007532: Putative beta-xylosidase; arabinofuranosidase; arabinofuranohydrolase GH43; GH43; hemicellulose-degrading; 433 434 435 745 1034 1323
- 07778: Vacuolar protease A; unknown; 508 509 510 770 1059 1348
- Corth2p4_0…: Leucine …; protein …
- Corth2p4_0…
- Corth2p4_0…
- Corth2p4_0…: protein …
- Corth2p4_0…: protein …
- Corth2p4_0…: protein …
- 08348: Pectin lyase B; pectin lyase PL1; …degrading; 592 593 594 798 1087 1376
Description
TITLE OF THE INVENTION
NOVEL CELL WALL DECONSTRUCTION ENZYMES OF THERMOASCUS AURANTIACUS, MYCELIOPHTHORA FERGUSII (CORYNASCUS THERMOPHILUS), AND PSEUDOCERCOSPORELLA HERPOTRICHOIDES, AND USES THEREOF
FIELD OF THE INVENTION
[0001] The present invention relates to novel polypeptides and enzymes having activities relating to biomass processing and/or degradation (e.g., cell wall deconstruction), as well as polynucleotides, vectors, cells, compositions and tools relating to same, or functional variants thereof. More particularly, the present invention relates to secreted enzymes that may be isolated from the fungi Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80. Uses thereof in various industrial processes such as in biofuels, food preparation, animal feed, pulp and paper, textiles, detergents, waste treatment and others are also disclosed.
SEQUENCE LISTING
[0002] This application contains a Sequence Listing in computer readable form entitled "Seq_Listing_THEAU_CORTH_PSEHE.txt", created October 9, 2013, having a size of about 8.48 MB. The computer readable form is incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0003] Biomass-processing enzymes have a number of industrial applications such as in: the biofuel industry (e.g., improving ethanol yield and/or increasing the efficiency and economy of ethanol production); the food industry (e.g., production of cereal-based food products); the feed-enzyme industry (e.g., increasing the digestibility/absorption of nutrients); the pulp and paper industry (e.g., enhancing bleachability of pulp); the textile industry (e.g., treatment of cellulose-based fabrics); the waste treatment industry (e.g., de-colorization of synthetic dyes); the detergent industry (e.g., providing eco-friendly cleaning products); and the rubber industry (e.g., catalyzing the conversion of latex into foam rubber).
[0004] In particular, driven by the limited availability of fossil fuels, there is a growing interest in the biofuel industry for improving the conversion of biomass into second-generation biofuels. This process is heavily dependent on inexpensive and effective enzymes for the conversion of lignocellulose to ethanol. Cellulase enzyme cocktails involve the concerted action of endoglucanases, cellobiohydrolases (also known as exoglucanases), and beta-glucosidases. The current cost of cellulose-degrading enzymes is too high for bioethanol to compete economically with fossil fuels. Cost reduction may result from the discovery of cellulase enzymes with, for example, higher specific activity, lower production costs, and/or greater compatibility with processing conditions including temperature, pH and the presence of inhibitors in the biomass, or produced as the result of biomass pre-treatment.
[0005] Conversion of plant biomass to glucose may also be enhanced by supplementing cellulase cocktails with enzymes that degrade the other components of biomass, including hemicelluloses, pectins and lignins, and their linkages, thereby improving the accessibility of cellulose to the cellulase enzymes. Such enzymes include, without limitation: xylanases, mannanases, arabinanases, esterases, glucuronidases, xyloglucanases and arabinofuranosidases for hemicelluloses; lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases for lignin; and pectate lyase, pectin lyase, polygalacturonase, pectin acetyl esterase, alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase, rhamnogalacturonan lyase, rhamnogalacturonan acetyl esterase, xylogalacturonosidase, and xylogalacturonase for pectins. Additionally, glycoside hydrolase family 61 (GH61) proteins have been shown to stimulate the activity of cellulase preparations.
[0006] These enzymes may also be useful for other purposes in processing biomass. For example, the lignin modifying enzymes may be used to alter the structure of lignin to produce novel materials, and hemicellulases may be employed to produce 5-carbon sugars from hemicelluloses, which may then be further converted to chemical products.
[0007] There is also a growing need for improved enzymes for food processing and feed applications. Cereal-based food products such as pasta, noodles and bread can be prepared from dough which is usually made from the basic ingredients (cereal) flour, water and optionally salt. As a result of a consumer-driven need to replace chemical additives with more natural products, several enzymes have been developed with dough and/or cereal-based food product-improving properties, which are used in all possible combinations depending on the specific application conditions. Suitable enzymes include, for example, xylanase, starch degrading enzymes, oxidizing enzymes, fatty material splitting enzymes, and protein degrading, modifying or crosslinking enzymes. Many of these enzymes are also used for treating animal feed or animal feed additives, to make them more digestible or to improve their nutritional quality. Amylases are used for the conversion of plant starches to glucose. Pectin-active enzymes are used in fruit processing, for example to increase the yield of juices, and in fruit juice clarification, as well as in other food processing steps.
[0008] There is also a growing need for improved enzymes in other industries. In the pulp and paper industry, enzymes are used to make the bleaching process more effective and to reduce the use of oxidative chemicals. In the textile industry, enzymatic treatment is often used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans, and can also improve the softness/feel of fabrics. When used in detergent compositions, enzymes can enhance cleaning ability or act as a softening agent. In the waste treatment industry, enzymes play an important role in changing the characteristics of the waste, for example, to become more amenable to further treatment and/or for bio-conversion to value-added products.
[0009] There is also a growing need for industrial enzymes and proteins that are "thermostable" in that they retain a level of their function or protein activity at temperatures of about 50°C. These thermostable enzymes are highly desirable, for example, to be able to perform reactions at elevated temperatures to avoid or reduce contamination by microorganisms (e.g., bacteria).
[0010] There thus remains a need in the above-mentioned industries and others for biomass-processing enzymes, polynucleotides encoding same, and recombinant vectors and strains for expressing same.
[0011] The present description refers to a number of documents, the content of which is herein incorporated by reference in their entirety.
SUMMARY OF THE INVENTION
[0012] In general, the present invention relates to soluble, secreted proteins relating to biomass processing and/or degradation (e.g., cell wall deconstruction) that may be isolated from the fungi Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80, as well as polynucleotides, vectors, compositions, cells, antibodies, kits, products and uses associated with same. Briefly, these fungal strains were cultured in vitro and genomic DNA along with total RNA were isolated therefrom. These nucleic acids were then used to determine/assemble fungal genomic sequences and generate cDNA libraries. Bioinformatic tools were used to predict genes in the assembled genomic sequences, and those genes encoding proteins relating to biomass-degradation (e.g., cell wall deconstruction) were identified based on bioinformatics (e.g., the presence of conserved domains). Sequences predicted to encode proteins which are targeted to the mitochondria or bound to the cell wall were removed. cDNA clones comprising full-length sequences predicted to encode soluble, secreted proteins relating to biomass-degradation were fully sequenced and cloned into appropriate expression vectors for protein production and characterization. The full-length genomic, exonic, intronic, coding and polypeptide sequences are disclosed herein, along with corresponding putative (biological) functions and/or protein activities, where available.
[0013] The soluble, secreted, biomass degradation proteins of the present invention comprise a proteome which is referred to herein as the SSBD proteome of Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80.
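The selection logic summarized in the preceding paragraph can be pictured with a short sketch. It assumes, purely for illustration, that each predicted protein has already been annotated with a conserved-domain flag and a predicted subcellular location; the record type, field names, location labels and gene identifiers below are hypothetical and do not reflect the actual bioinformatic tools used.

```python
from dataclasses import dataclass


@dataclass
class PredictedProtein:
    """Hypothetical record for one predicted gene product."""
    gene_id: str
    has_biomass_domain: bool   # conserved domain related to biomass degradation
    predicted_location: str    # e.g. "secreted", "mitochondrion", "cell_wall"


# Locations excluded in the summary: mitochondrial and cell-wall-bound proteins.
EXCLUDED_LOCATIONS = {"mitochondrion", "cell_wall"}


def select_candidates(proteins: list[PredictedProtein]) -> list[PredictedProtein]:
    """Keep biomass-related proteins that are not in an excluded location."""
    return [
        p for p in proteins
        if p.has_biomass_domain and p.predicted_location not in EXCLUDED_LOCATIONS
    ]


# Toy input with made-up identifiers.
predictions = [
    PredictedProtein("gene_1", True, "secreted"),
    PredictedProtein("gene_2", True, "mitochondrion"),
    PredictedProtein("gene_3", False, "secreted"),
    PredictedProtein("gene_4", True, "cell_wall"),
]

print([p.gene_id for p in select_candidates(predictions)])
```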
[0014] Accordingly, in some aspects the present invention relates to an isolated polypeptide which is:
(a) a polypeptide comprising the amino acid sequence of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039;
(b) a polypeptide comprising an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to the polypeptide defined in (a);
(c) a polypeptide comprising an amino acid sequence encoded by the nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514;
(d) a polypeptide comprising an amino acid sequence encoded by any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;
(e) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);
(f) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);
(g) a functional variant of the polypeptide defined in (a) comprising a substitution, deletion, and/or insertion at one or more residues; or
(h) a functional fragment of the polypeptide of any one of (a) to (g).
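As an illustration of the sequence identity thresholds recited in (b) and (f), the following sketch computes percent identity for two sequences that have already been aligned (gaps written as "-"). It assumes one common convention, namely identical positions divided by the number of alignment columns; other conventions and alignment programs exist, and nothing here is asserted to be the method used to evaluate the claimed thresholds. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns for two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if percent identity is at least the given threshold (e.g. 60.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment: 4 identical columns out of 6.
print(round(percent_identity("MKT-AY", "MKTLAF"), 2))
print(meets_threshold("MKT-AY", "MKTLAF", 60.0))
```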
[0015] In some embodiments, the above mentioned polypeptide has a corresponding function and/or protein activity according to Tables 1A-1C.
[0016] In some embodiments, the above mentioned polypeptide comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039.
[0017] In some embodiments, the above mentioned polypeptide is a recombinant polypeptide.
[0018] In some embodiments, the above mentioned polypeptide is obtainable from a fungus. In some embodiments, the fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella. In some embodiments, the fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides.
[0019] In some aspects, the present invention relates to an antibody that specifically binds to any one of the above mentioned polypeptides.
[0020] In some aspects, the present invention relates to an isolated polynucleotide molecule encoding any one of the above mentioned polypeptides.
[0021] In some aspects, the present invention relates to an isolated polynucleotide molecule which is:
(a) a polynucleotide molecule comprising a nucleic acid sequence encoding the polypeptide of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039;
(b) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 1-200, 601-889, or 1468-1991;
(c) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514;
(d) a polynucleotide molecule comprising any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;
(e) a polynucleotide molecule comprising a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to any one of the polynucleotide molecules defined in (a) to (d); or
(f) a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotide molecules defined in (a) to (e).
[0022] In some embodiments, the above mentioned polynucleotide molecule is obtainable from a fungus. In some embodiments, the fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella. In some embodiments, the fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides.
[0023] In some aspects, the present invention relates to a vector comprising any one of the above mentioned polynucleotide molecules. In some embodiments, the vector comprises a regulatory sequence operatively linked to the polynucleotide molecule for expression of same in a suitable host cell. In some embodiments, the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.
[0024] In some embodiments, the present invention relates to a recombinant host cell comprising any one of the above mentioned polynucleotide molecules or vectors. In some embodiments, the present invention relates to a polypeptide obtainable by expressing the above mentioned polynucleotide or vector in a suitable host cell. In some embodiments, the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.
[0025] In some aspects, the present invention relates to a composition comprising any one of the above mentioned polypeptides or the recombinant host cells. In some embodiments, the composition further comprises a suitable carrier. In some embodiments, the composition further comprises a substrate of the polypeptide. In some embodiments, the substrate is biomass.
[0026] In some aspects, the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing a strain comprising the above mentioned polynucleotide molecule or vector under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide. In some embodiments, the strain is a bacterial strain; a fungal strain; or a filamentous fungal strain.
[0027] In some aspects, the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing the above mentioned recombinant host cell under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide.
[0028] In some aspects, the present invention relates to a method for preparing a food product, the method comprising incorporating any one of the above mentioned polypeptides during preparation of the food product. In some embodiments, the food product is a bakery product.
[0029] In some aspects, the present invention relates to the use of the above mentioned polypeptide for the preparation or processing of a food product. In some embodiments, the food product is a bakery product.
[0030] In some aspects, the present invention relates to the use of any one of the above mentioned polypeptides for the preparation or processing of a food product. In some embodiments, the food product is a bakery product.
[0031] In some aspects, the present invention relates to the above mentioned polypeptide for use in the preparation or processing of a food product. In some embodiments, the food product is a bakery product.
[0032] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for the preparation of animal feed. In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for increasing digestion or absorption of animal feed. In some aspects, the present invention relates to any one of the above mentioned polypeptides for use in the preparation of animal feed, or for increasing digestion or absorption of animal feed. In some embodiments, the animal feed is a cereal-based feed.
[0033] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some aspects the present invention relates to any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some embodiments, the processing comprises prebleaching and/or de-inking.
[0034] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for processing lignin. In some aspects the present invention relates to any one of the above mentioned polypeptides for processing lignin.
[0035] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for producing ethanol. In some aspects the present invention relates to any one of the above mentioned polypeptides for producing ethanol.
[0036] In some embodiments, the above mentioned uses are in conjunction with cellulose or a cellulase.
[0037] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for treating textiles or dyed textiles. In some aspects the present invention relates to any one of the above mentioned polypeptides for treating textiles or dyed textiles.
[0038] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for degrading biomass or pretreated biomass. In some aspects the present invention relates to any one of the above mentioned polypeptides for degrading biomass or pretreated biomass.
[0039] In some embodiments, the present invention relates to proteins and/or enzymes that are thermostable. In some embodiments, a polypeptide of the present invention retains a level of its function and/or protein activity at about 50°C, about 55°C, about 60°C, about 65°C, about 70°C, about 75°C, about 80°C, or about 95°C. In some embodiments, a polypeptide of the present invention retains a level of its function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C. In some embodiments, a polypeptide of the present invention has optimal or maximal function and/or protein activity greater than 50°C, greater than 55°C, greater than 60°C, greater than 65°C, or greater than 70°C. In some embodiments, a polypeptide of the present invention has optimal or maximal function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C.
[0040] Unless defined otherwise, the scientific and technological terms and nomenclature used herein have the same meaning as commonly understood by a person of ordinary skill to which this invention pertains. Commonly understood definitions of molecular biology terms can be found for example in Dictionary of Microbiology and Molecular Biology, 2nd ed. (Singleton et al., 1994, John Wiley & Sons, New York, NY) or The Harper Collins Dictionary of Biology (Hale & Marham, 1991, Harper Perennial, New York, NY), Rieger et al., Glossary of genetics: Classical and molecular, 5th edition, Springer-Verlag, New York, 1991; Alberts et al., Molecular Biology of the Cell, 4th edition, Garland Science, New York, 2002; and, Lewin, Genes VII, Oxford University Press, New York, 2000. Generally, the procedures of molecular biology methods and the like are common methods used in the art. Such standard techniques can be found in reference manuals such as for example Sambrook et al., (2000, Molecular Cloning - A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratories); and Ausubel et al., (1994, Current Protocols in Molecular Biology, John Wiley & Sons, New York).
[0041] As used herein, the expressions "Myceliophthora (Corynascus)" and "Myceliophthora fergusii (Corynascus thermophilum)" are meant to reflect the recently proposed changes in the taxonomy of all existing Corynascus species, which should be renamed to Myceliophthora according to phylogenetic studies by van den Brink et al., ("Phylogeny of the industrial relevant, thermophilic genera Myceliophthora and Corynascus", Fungal Diversity (2012), 52:197-207). However, regardless of taxonomic classification, a person of skill in the art would be able to identify the organism used to determine the sequences disclosed herein, for example based on the strain's accession number (CBS 405.69).
[0042] Further objects and advantages of the present invention will be clear from the description that follows.
Definitions
[0043] Headings, and other identifiers, e.g., (a), (b), (i), (ii), etc., are presented merely for ease of reading the specification and claims. The use of headings or other identifiers in the specification or claims does not necessarily require the steps or elements be performed in alphabetical or numerical order or the order in which they are presented.
[0044] In the present description, a number of terms are extensively utilized. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided.
[0045] Nucleotide sequences are presented herein by single strand, in the 5' to 3' direction, from left to right, using the one-letter nucleotide symbols as commonly used in the art and in accordance with the recommendations of the IUPAC-IUB Biochemical Nomenclature Commission.
[0046] The use of the word "a" or "an" when used in conjunction with the term "comprising" in the claims and/or the specification may mean "one" but it is also consistent with the meaning of "one or more", "at least one", and "one or more than one".
[0047] As used in the specification and claims, the words "comprising" (and any form of comprising, such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "includes" and "include") or "containing" (and any form of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, un-recited elements or method steps.
[0048] The term "about" is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value. In general, the terminology "about" is meant to designate a possible variation of up to 10%. Therefore, a variation of 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10% of a value is included in the term "about".
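By way of illustration only, the "up to 10%" reading of "about" can be sketched as a simple numeric check. The function name `is_about` and the choice to treat the variation as symmetric and relative to the stated value are assumptions made for this sketch; the description itself does not prescribe a computation.

```python
def is_about(value: float, stated: float, max_variation: float = 0.10) -> bool:
    """Return True if `value` deviates from `stated` by no more than
    `max_variation` (a fraction, 0.10 meaning 10%) of the stated value.

    Assumption: the variation is symmetric and measured relative to `stated`.
    """
    return abs(value - stated) <= max_variation * abs(stated)


# A measured value of 57 falls within "about 60"; a value of 50 does not.
print(is_about(57, 60))  # True
print(is_about(50, 60))  # False
```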
[0049] The term "DNA" or "RNA" molecule or sequence (as well as sometimes the term "oligonucleotide") refers to a molecule comprised generally of the deoxyribonucleotides adenine (A), guanine (G), thymine (T) and/or cytosine (C). In "RNA", T is replaced by uracil (U).
[0050] The present description refers to a number of routinely used recombinant DNA (rDNA) technology terms. Nevertheless, definitions of selected examples of such rDNA terms are provided for clarity and consistency.
[0051] As used herein, "polynucleotide" or "nucleic acid molecule" refers to a polymer of nucleotides and includes DNA (e.g., genomic DNA, cDNA), RNA molecules (e.g., mRNA), and chimeras thereof. The nucleic acid molecule can be obtained by cloning techniques or synthesized. DNA can be double-stranded or single-stranded (coding strand or non-coding strand [antisense]). Conventional deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are included in the terms "nucleic acid molecule" and "polynucleotide" as are analogs thereof (e.g., generated using nucleotide analogs, e.g., inosine or phosphorothioate nucleotides). Such nucleotide analogs can be used, for example, to prepare polynucleotides that have altered base-pairing abilities or increased resistance to nucleases. A nucleic acid backbone may comprise a variety of linkages known in the art, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds (referred to as "peptide nucleic acids" (PNA); Hyldig-Nielsen et al., PCT Int'l Pub. No. WO 95/32305), phosphorothioate linkages, methylphosphonate linkages or combinations thereof. Sugar moieties of the nucleic acid may be ribose or deoxyribose, or similar compounds having known substitutions, e.g., 2' methoxy substitutions (containing a 2'-O-methylribofuranosyl moiety; see PCT No. WO 98/02582) and/or 2' halide substitutions. Nitrogenous bases may be conventional bases (A, G, C, T, U), known analogs thereof (e.g., inosine or others; see "The Biochemistry of the Nucleic Acids 5-36", Adams et al., ed., 11th ed., 1992), or known derivatives of purine or pyrimidine bases (see, Cook, PCT Int'l Pub. No. WO 93/13121) or "abasic" residues in which the backbone includes no nitrogenous base for one or more residues (Arnold et al., U.S. Pat. No. 5,585,481). A nucleic acid may comprise only conventional sugars, bases and linkages, as found in RNA and DNA, or may include both conventional components and substitutions (e.g., conventional bases linked via a methoxy backbone, or a nucleic acid including conventional bases and one or more base analogs).
[0052] An "isolated nucleic acid molecule", as is generally understood and used herein, refers to a polymer of nucleotides, and includes, but should not be limited to, DNA and RNA. The "isolated" nucleic acid molecule is purified from its natural in vivo state, obtained by cloning or chemically synthesized.
[0053] As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules which may be isolated from chromosomal DNA, and very often include an open reading frame encoding a protein, e.g., polypeptides of the present invention. A gene may include coding sequences, non-coding sequences, introns and regulatory sequences, as well known.
[0054] "Amplification" refers to any in vitro procedure for obtaining multiple copies ("amplicons") of a target nucleic acid sequence or its complement or fragments thereof. In vitro amplification refers to production of an amplified nucleic acid that may contain less than the complete target region sequence or its complement. In vitro amplification methods include, e.g., transcription-mediated amplification, replicase-mediated amplification, polymerase chain reaction (PCR) amplification, ligase chain reaction (LCR) amplification and strand-displacement amplification (SDA including multiple strand-displacement amplification method (MSDA)). Replicase-mediated amplification uses self-replicating RNA molecules, and a replicase such as Qβ-replicase (e.g., Kramer et al., U.S. Pat. No. 4,786,600). PCR amplification is well known and uses DNA polymerase, primers and thermal cycling to synthesize multiple copies of the two complementary strands of DNA or cDNA (e.g., Mullis et al., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159). LCR amplification uses at least four separate oligonucleotides to amplify a target and its complementary strand by using multiple cycles of hybridization, ligation, and denaturation (e.g., EP Pat. App. Pub. No. 0320308). SDA is a method in which a primer contains a recognition site for a restriction endonuclease that permits the endonuclease to nick one strand of a hemimodified DNA duplex that includes the target sequence, followed by amplification in a series of primer extension and strand displacement steps (e.g., Walker et al., U.S. Pat. No. 5,422,252). Two other known strand-displacement amplification methods do not require endonuclease nicking (Dattagupta et al., U.S. Patent No. 6,087,133 and U.S. Patent No. 6,124,120 (MSDA)). Those skilled in the art will understand that the oligonucleotide primer sequences of the present invention may be readily used in any in vitro amplification method based on primer extension by a polymerase (e.g., see Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14-25 and Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-1177; Lizardi et al., 1988, BioTechnology 6:1197-1202; Malek et al., 1994, Methods Mol. Biol., 28:253-260; and Sambrook et al., 2000, Molecular Cloning - A Laboratory Manual, Third Edition, CSH Laboratories). As commonly known in the art, the oligos are designed to bind to a complementary sequence under selected conditions. The terminology "amplification pair" or "primer pair" refers herein to a pair of oligonucleotides (oligos) of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes.
[0055] As used herein, the terms "hybridizing" and "hybridizes" are intended to describe conditions for hybridization and washing under which nucleotide sequences at least about 60%, at least about 70%, at least about 80%, more preferably at least about 85%, even more preferably at least about 90%, more preferably at least 95%, more preferably at least 98% or more preferably at least 99% homologous to each other typically remain hybridized to each other. A preferred, non-limiting example of such hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 1X SSC, 0.1% SDS at 50°C, preferably at 55°C, preferably at 60°C and even more preferably at 65°C. Highly stringent conditions include, for example, hybridizing at 68°C in 5x SSC/5x Denhardt's solution/1.0% SDS and washing in 0.2x SSC/0.1% SDS at room temperature. Alternatively, washing may be performed at 42°C. The skilled artisan will know which conditions to apply for stringent and highly stringent hybridization conditions. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook et al., supra; and Ausubel et al., supra (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, N.Y.). Of course, a polynucleotide which hybridizes only to a poly(A) sequence (such as the 3' terminal poly(A) tract of mRNAs), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to specifically hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).
[0056] The terms "identity" and "percent identity" are used interchangeably herein. For the purpose of this invention, it is defined here that in order to determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = number of identical positions/total number of positions (i.e., overlapping positions) x 100). Preferably, the two sequences are the same length. Thus, in accordance with the present invention, the term "identical" or "percent identity" in the context of two or more nucleic acid or amino acid sequences, refers to two or more sequences or subsequences that are the same, or that have a specified percentage of amino acid residues or nucleotides that are the same (e.g., 60% or 65% identity, preferably, 70-95% identity, more preferably at least 95% identity), when compared and aligned for maximum correspondence over a window of comparison, or over a designated region as measured using a sequence comparison algorithm as known in the art, or by manual alignment and visual inspection. Sequences having, for example, 60% to 95% or greater sequence identity are considered to be substantially identical. Such a definition also applies to the complement of a test sequence. Preferably, the described identity exists over a region that is at least about 15 to 25 amino acids or nucleotides in length, more preferably, over a region that is about 50 to 100 amino acids or nucleotides in length. Those having skill in the art will know how to determine percent identity between/among sequences using, for example, algorithms such as those based on the CLUSTALW computer program (Thompson Nucl. Acids Res. 22 (1994), 4673-4680) or FASTDB (Brutlag Comp. App. Biosci. 6 (1990), 237-245), as known in the art. Although the FASTDB algorithm typically does not consider internal non-matching deletions or additions in sequences, i.e., gaps, in its calculation, this can be corrected manually to avoid an overestimation of the % identity. CLUSTALW, however, does take sequence gaps into account in its identity calculations. Also available to those having skill in this art are the BLAST and BLAST 2.0 algorithms (Altschul Nucl. Acids Res. 25 (1997), 3389-3402).
The BLASTN program for nucleic acid sequences uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, and an expectation (E) of 10. The BLOSUM62 scoring matrix (Henikoff Proc. Natl. Acad. Sci., USA, 89, (1992), 10915) uses alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a comparison of both strands. Moreover, the present invention also relates to nucleic acid molecules the sequence of which is degenerate in comparison with the sequence of an above-described hybridizing molecule. When used in accordance with the present invention the term "being degenerate as a result of the genetic code" means that due to the redundancy of the genetic code different nucleotide sequences code for the same amino acid. The present invention also relates to nucleic acid molecules which comprise one or more mutations or deletions, and to nucleic acid molecules which hybridize to one of the herein described nucleic acid molecules, which show (a) mutation(s) or (a) deletion(s). The skilled person will appreciate that all these different algorithms or programs will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
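As a non-limiting illustration of the percent identity calculation defined above, the following minimal sketch computes the value from two sequences that have already been aligned. The function name `percent_identity`, the use of "-" as the gap symbol, and the choice to count every aligned column (including columns containing a gap in one sequence) in the denominator are assumptions of this sketch; programs such as CLUSTALW, FASTDB, or BLAST may treat gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    % identity = identical positions / total compared positions x 100.
    Assumption: columns where both sequences carry a gap are ignored;
    columns with a gap in only one sequence count as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = 0
    total = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        total += 1
        if res_a == res_b:
            identical += 1

    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


# Six aligned columns, four of them identical.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # 66.7
```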
[0057] In a related manner, the terms "homology" or "percent homology", refer to a similarity between two polypeptide sequences, but take into account changes between amino acids (whether conservative or not). As well known in the art, amino acids can be classified by charge, hydrophobicity, size, etc. It is also well known in the art that amino acid changes can be conservative (e.g., they do not significantly affect, or not at all, the function of the protein). A multitude of conservative changes are known in the art: serine for threonine, isoleucine for leucine, arginine for lysine, etc. Thus the term homology introduces evolutionistic notions (e.g., pressure from evolution to retain a function of essential or important regions of a sequence, while enabling a certain drift of less important regions).
[0058] The skilled person will be aware of the fact that several different computer programs are available to determine the homology between two sequences. For instance, a comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48): 444-453 (1970)) algorithm which has been incorporated into the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
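For illustration only, the global alignment approach of Needleman and Wunsch can be sketched as follows. This is not the GAP program: the sketch assumes a simple match/mismatch score and a single linear gap penalty rather than a BLOSUM62 or PAM250 matrix with separate gap and length weights, and the function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Globally align `a` and `b`; return (aligned_a, aligned_b, score).

    Assumptions: uniform match/mismatch scores and a linear gap penalty.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


print(needleman_wunsch("ACGT", "AGT"))  # ('ACGT', 'A-GT', 2)
```

The aligned strings returned by such a routine can then be compared column by column to obtain a percent identity; different scoring parameters may yield slightly different alignments.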
[0059] In yet another embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0) (available at the ALIGN Query using sequence data of the Genestream server IGH Montpellier France http://vega.igh.cnrs.fr/bin/align-guess.cgi) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
[0060] The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al., (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See the homepage of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/.
[0061] By "sufficiently complementary" is meant a contiguous nucleic acid base sequence that is capable of hybridizing to another sequence by hydrogen bonding between a series of complementary bases. Complementary base sequences may be complementary at each position in sequence by using standard base pairing (e.g., G:C, A:T or A:U pairing) or may contain one or more residues (including abasic residues) that are not complementary by using standard base pairing, but which allow the entire sequence to specifically hybridize with another base sequence in appropriate hybridization conditions. Contiguous bases of an oligomer are preferably at least about 80% (81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100%), more preferably at least about 90% complementary to the sequence to which the oligomer specifically hybridizes. Appropriate hybridization conditions are well known to those skilled in the art, can be predicted readily based on sequence composition and conditions, or can be determined empirically by using routine testing (see Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989) at §§ 1.90-1.91, 7.37-7.57, 9.47-9.51 and 11.47-11.57, particularly at §§ 9.50-9.51, 11.12-11.13, 11.45-11.47 and 11.55-11.57).
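As an illustrative sketch only, the percentage of standard base pairs between an oligomer and the site to which it hybridizes can be estimated as shown below. The function name `percent_complementary`, the requirement that both sequences be written 5' to 3' and be of equal length, and the treatment of any non-standard pair (including abasic positions) as non-complementary are assumptions of this sketch; actual hybridization behaviour also depends on the conditions used.

```python
# Standard Watson-Crick pairs named in the description: G:C, A:T and A:U.
_STANDARD_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
}


def percent_complementary(oligo: str, target_site: str) -> float:
    """Percent of positions forming a standard base pair.

    Both sequences are given 5'->3'; the oligo anneals antiparallel, so the
    target site is read in reverse when pairing positions.
    Assumption: sequences have equal, non-zero length.
    """
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (x, y) in _STANDARD_PAIRS
        for x, y in zip(oligo.upper(), reversed(target_site.upper()))
    )
    return 100.0 * paired / len(oligo)


print(percent_complementary("AAGC", "GCTT"))  # 100.0
print(percent_complementary("AAGC", "GCTA"))  # 75.0
```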
[0062] The present invention refers to a number of units or percentages that are often listed in sequences. For example, when referring to "at least 80%, at least 85%, at least 90%...", or "at least about 80%, at least about 85%, at least about 90%...", every single unit is not listed, for the sake of brevity. For example, some units (e.g., 81, 82, 83, 84, 85,... 91, 92%...) may not have been specifically recited but are considered encompassed by the present invention. The non-listing of such specific units should thus be considered as within the scope of the present invention.
[0063] Nucleic acid sequences may be detected by using hybridization with a complementary sequence (e.g., oligonucleotide probes) (see U.S. Patent Nos. 5,503,980 (Cantor), 5,202,231 (Drmanac et al.), 5,149,625 (Church et al.), 5,112,736 (Caldwell et al.), 5,068,176 (Vijg et al.), and 5,002,867 (Macevicz)). Hybridization detection methods may use an array of probes (e.g., on a DNA chip) to provide sequence information about the target nucleic acid which selectively hybridizes to an exactly complementary probe sequence in a set of four related probe sequences that differ by one nucleotide (see U.S. Patent Nos. 5,837,832 and 5,861,242 (Chee et al.)).
[0064] A detection step may use any of a variety of known methods to detect the presence of nucleic acid by hybridization to an oligonucleotide probe. The types of detection methods in which probes can be used include Southern blots (DNA detection), dot or slot blots (DNA, RNA), and Northern blots (RNA detection). Labeled proteins could also be used to detect a particular nucleic acid sequence to which it binds (e.g., protein detection by far western technology: Guichet et al., 1997, Nature 385(6616): 548-552; and Schwartz et al., 2001, EMBO 20(3): 510-519). Other detection methods include kits containing reagents of the present invention on a dipstick setup and the like. Of course, it might be preferable to use a detection method which is amenable to automation. A non-limiting example thereof includes a chip or other support comprising one or more (e.g., an array) of different probes.
[0065] A "label" refers to a molecular moiety or compound that can be detected or can lead to a detectable signal. A label is joined, directly or indirectly, to a nucleic acid probe or the nucleic acid to be detected (e.g., an amplified sequence) or to a polypeptide to be detected. Direct labeling can occur through bonds or interactions that link the label to the polynucleotide or polypeptide (e.g., covalent bonds or non-covalent interactions), whereas indirect labeling can occur through the use of a "linker" or bridging moiety, such as additional nucleotides, amino acids or other chemical groups, which are either directly or indirectly labeled. Bridging moieties may amplify a detectable signal. Labels can include any detectable moiety (e.g., a radionuclide, ligand such as biotin or avidin, enzyme or enzyme substrate, reactive group, chromophore such as a dye or colored particle, luminescent compound including a bioluminescent, phosphorescent or chemiluminescent compound, and fluorescent compound).
[0066] As used herein, by "expression" is meant the process by which a gene or other nucleic acid sequence eventually produces a polypeptide. It involves transcription of the gene into mRNA, and the translation of such mRNA into polypeptide(s).
[0067] The terms "peptide" and "oligopeptide" are considered synonymous (as is commonly recognized) and each term can be used interchangeably as the context requires to indicate a chain of at least two amino acids coupled by peptidyl linkages. The word "polypeptide" is used herein for chains containing more than seven amino acid residues. All oligopeptide and polypeptide formulas or sequences herein are written from left to right and in the direction from amino terminus to carboxyl terminus. The one-letter code of amino acids used herein is commonly known in the art and can be found in Sambrook, et al., supra. Sequence listing programs can easily convert this one-letter amino acid sequence code into a three-letter code.
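The conversion from one-letter to three-letter code mentioned above can be illustrated with a short sketch. The function name `to_three_letter` and the hyphen used as a separator are assumptions made for this example; only the twenty standard amino acids are covered, and the sequence is read from amino terminus to carboxyl terminus.

```python
# One-letter to three-letter codes for the twenty standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str, separator: str = "-") -> str:
    """Convert a one-letter amino acid sequence (N- to C-terminus) into
    three-letter code. Raises KeyError on a non-standard residue symbol."""
    return separator.join(ONE_TO_THREE[residue] for residue in sequence.upper())


print(to_three_letter("MKV"))  # Met-Lys-Val
```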
[0068] The phrase "mature polypeptide" is defined herein as a polypeptide having a biological activity of a polypeptide of the present invention that is in its final form, following translation and any post-translational modifications, such as N-terminal processing, C-terminal truncation, removal of signal sequences, glycosylation, phosphorylation, etc. In one embodiment, polypeptides of the present invention comprise mature forms of any one of the polypeptides disclosed herein. Mature polypeptides of the present invention can be predicted using programs such as SignalP. The phrase "mature polypeptide coding sequence" is defined herein as a nucleotide sequence that encodes a mature polypeptide as defined above. As is well known, some nucleotide sequences are non-coding.
[0069] As used herein, the term "purified" or "isolated" refers to a molecule (e.g., polynucleotide or polypeptide) having been separated from a component of the composition in which it was originally present. Thus, for example, an "isolated polynucleotide" or "isolated polypeptide" has been purified to a level not found in nature. A "substantially pure" molecule is a molecule that is lacking in most other components (e.g., 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100% free of contaminants). By contrast, the term "crude" means molecules that have not been separated from the components of the original composition in which they were present. For the sake of brevity, the intermediate units (e.g., 66, 67...81, 82, 83, 84, 85...91, 92%...) have not been specifically recited but are nevertheless considered within the scope of the present invention.
[0070] An "isolated polynucleotide" or "isolated nucleic acid molecule" is a nucleic acid molecule (DNA or RNA) that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived.
Thus, in one embodiment, an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to the coding sequence. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding an additional polypeptide that is substantially free of cellular material, viral material, or culture medium (when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an "isolated nucleic acid fragment" is a nucleic acid fragment that is not naturally occurring as a fragment and would not be found in the natural state.
[0071] As used herein, an "isolated polypeptide" or "isolated protein" is intended to include a polypeptide or protein removed from its native environment. For example, recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for the purpose of the invention, as are native or recombinant polypeptides which have been substantially purified by any suitable technique such as, for example, the single-step purification method disclosed in Smith and Johnson, Gene 67:31-40 (1988).
[0072] The term "variant" refers herein to a polypeptide, which is substantially similar in structure (e.g., amino acid sequence) to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein without being identical thereto. Thus, two molecules can be considered as variants even though their primary, secondary, tertiary or quaternary structures are not identical. A variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein. A variant can comprise additional modifications (e.g., post-translational modifications such as acetylation, phosphorylation, glycosylation, sulfatation, sumoylation, prenylation, ubiquitination, etc). As used herein, the term "functional variant" is intended to include a variant which is sufficiently similar in both structure and function to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein, to maintain at least one of its native biological activities.
[0073] As used herein, the term "biomass" refers to any cellulosic or lignocellulosic material and includes materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. Biomass may also comprise additional components, such as protein and/or lipid. Biomass may be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass could comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves. Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste or a combination thereof. Examples of biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, and animal manure or a combination thereof. Biomass that is useful for the invention may include biomass that has a relatively high carbohydrate value, is relatively dense, and/or is relatively easy to collect, transport, store and/or handle. In one embodiment of the present invention, biomass that is useful includes corn cobs, corn stover, sawdust, and sugar cane bagasse.
[0074] As used herein, the terms "cellulosic" or "cellulose-containing material" refer to a composition comprising cellulose. As used herein, the term "lignocellulosic" refers to a composition comprising both lignin and cellulose. Lignocellulosic material may also comprise hemicellulose. The predominant polysaccharide in the primary cell wall of biomass is cellulose, the second most abundant is hemicellulose, and the third is pectin. The secondary cell wall, produced after the cell has stopped growing, also contains polysaccharides and is strengthened by polymeric lignin covalently cross-linked to hemicellulose. Cellulose is a homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in complex branched structures with a spectrum of substituents. Although generally polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other hemicelluloses, which helps stabilize the cell wall matrix.
[0075] Cellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees. The cellulose-containing material can be, but is not limited to, herbaceous material, agricultural residues, forestry residues, municipal solid wastes, waste paper, and pulp and paper mill residues. The cellulose-containing material can be any type of biomass including, but not limited to, wood resources, municipal solid waste, wastepaper, crops, and crop residues (e.g., see Wiselogel et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp. 105-118, Taylor & Francis, Washington D.C.; Wyman, 1994, Bioresource Technology 50: 3-16; Lynd, 1990, Applied Biochemistry and Biotechnology 24/25: 695-719; Mosier et al., 1999, Recent Progress in Bioconversion of Lignocellulosics, in Advances in Biochemical Engineering/Biotechnology, T. Scheper, managing editor, Volume 65, pp. 23-40, Springer-Verlag, New York). It is understood herein that the cellulose may be in the form of lignocellulose, a plant cell wall material containing lignin, cellulose, and hemicellulose in a mixed matrix.
[0076] The phrase "cellulolytic enhancing activity" or "cellulolysis-enhancing" is defined herein as a biological activity which enhances the hydrolysis of a cellulose-containing material by proteins having cellulolytic activity. The term "cellulolytic activity" is defined herein as a biological activity which hydrolyzes a cellulose-containing material.
[0077] The phrase "lignocellulolytic enhancing activity" or "lignocellulolysis-enhancing" is defined herein as a biological activity which enhances the hydrolysis of a lignocellulose-containing material by proteins having lignocellulolytic activity. The term "lignocellulolytic activity" is defined herein as a biological activity which hydrolyzes a lignocellulose-containing material.
[0078] The term "thermostable", as used herein, refers to an enzyme that retains its function or protein activity at a temperature greater than 50°C; thus, a thermostable cellulose-degrading or cellulase-enhancing enzyme/protein retains the ability to degrade or enhance the degradation of cellulose at this elevated temperature. A protein or enzyme may have more than one enzymatic activity. For example, some polypeptides of the present invention exhibit bifunctional activities such as xylosidase/arabinosidase activity. Such bifunctional enzymes may exhibit thermostability with regard to one activity, but not another, and still be considered as "thermostable".
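The definition above can be illustrated with a short check over activity-temperature data. This is only a sketch under stated assumptions: the text does not say how much activity counts as "retained", so the cutoff `min_relative_activity` (here 0.5 of maximal activity) is an assumption, and the function name and data layout are hypothetical. The 50°C threshold and the rule that one retained activity suffices for a bifunctional enzyme come from the paragraph.

```python
THERMOSTABLE_ABOVE_C = 50.0  # temperature threshold stated in the definition


def is_thermostable(profiles, min_relative_activity=0.5):
    """Return True if at least one activity is retained above 50 degrees C.

    profiles maps an activity name to {temperature_C: relative_activity}.
    The relative-activity cutoff is an assumption, not taken from the text.
    """
    for profile in profiles.values():
        for temperature, activity in profile.items():
            if (temperature > THERMOSTABLE_ABOVE_C
                    and activity >= min_relative_activity):
                return True
    return False


# A bifunctional enzyme that keeps only one of its activities at 60 degrees C.
bifunctional = {
    "xylosidase": {40: 1.0, 60: 0.8},
    "arabinosidase": {40: 1.0, 60: 0.1},
}
print(is_thermostable(bifunctional))
```

With these example values the enzyme is reported as thermostable because its xylosidase activity persists at 60°C even though its arabinosidase activity does not.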
BRIEF DESCRIPTION OF DRAWINGS
[0079] In the appended drawings:
[0080] Figure 1 is a schematic map of the pGBFIN-49 expression plasmid.
[0081] Figures 2-4 show protein activity-temperature profiles of various secreted proteins from Thermoascus aurantiacus.
[0082] Figures 5-7 show protein activity-temperature profiles of various secreted proteins from Myceliophthora fergusii (Corynascus thermophilus).
[0083] Figure 8 shows protein activity-temperature profiles of various secreted proteins from Pseudocercosporella herpotrichoides.
[0084] In the appended Sequence Listing, SEQ ID NOs: 1-600 relate to sequences from Thermoascus aurantiacus; SEQ ID NOs: 601-1467 relate to sequences from Myceliophthora fergusii (Corynascus thermophilus); and SEQ ID NOs: 1468-3039 relate to sequences from Pseudocercosporella herpotrichoides.
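The SEQ ID NO ranges given above amount to a simple lookup from sequence number to source organism. The following sketch encodes exactly those three ranges; the function name `source_organism` is hypothetical and is not part of the Sequence Listing.

```python
# SEQ ID NO ranges as stated in the paragraph above (inclusive bounds).
SEQ_ID_RANGES = [
    (1, 600, "Thermoascus aurantiacus"),
    (601, 1467, "Myceliophthora fergusii (Corynascus thermophilus)"),
    (1468, 3039, "Pseudocercosporella herpotrichoides"),
]


def source_organism(seq_id: int) -> str:
    """Return the source organism for a SEQ ID NO, or raise ValueError."""
    for first, last, organism in SEQ_ID_RANGES:
        if first <= seq_id <= last:
            return organism
    raise ValueError(f"SEQ ID NO {seq_id} is outside the range 1-3039")


print(source_organism(1179))
```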
DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
POLYPEPTIDES OF THE INVENTION
[0085] In one aspect, the present invention relates to isolated polypeptides secreted by Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides (e.g., Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, or Pseudocercosporella herpotrichoides strain 494.80) having an activity relating to the processing or degradation of biomass (e.g., cell wall deconstruction).
[0086] In another aspect, the present invention relates to isolated polypeptides comprising the amino acid sequences shown in any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039.
[0087] In another aspect, the present invention relates to isolated polypeptides sharing a minimum threshold of amino acid sequence identity with any one of the above-mentioned polypeptides. In specific embodiments, the present invention relates to isolated polypeptides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to any one of the above-mentioned polypeptides. Other specific percentage units that have not been specifically recited here for brevity are nevertheless considered within the scope of the present invention.
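As an illustration of how an identity threshold of this kind might be checked, the sketch below computes percent identity from two sequences that have already been aligned (gaps written as "-"). It assumes identity is the number of matching columns divided by the alignment length; other conventions use a different denominator, and this passage does not state which applies. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical residue columns / alignment length,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Seven columns, five identical: one mismatch (V/I) and one gap column.
print(round(percent_identity("MKVLA-G", "MKILAAG"), 2))
```

Producing the alignment itself (for example with a global or local alignment program) is outside the scope of this sketch.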
[0088] In another aspect, the present invention relates to a polypeptide encoded by a polynucleotide of the present invention, which includes genomic (e.g., SEQ ID NOs: 1-200, 601-889, or 1468-1991), and coding (e.g., SEQ ID NOs: 201-400, 890-1178, or 1992-2514) nucleic acid sequences disclosed herein, polynucleotides hybridizing under medium-high, high, or very high stringency conditions with a full-length complement thereof, as well as polynucleotides sharing a certain degree of nucleic acid sequence identity therewith.
[0089] In another aspect, the present invention relates to a polypeptide comprising an amino acid sequence encoded by at least one exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-200, 601-889, or 1468-1991 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C) or a functional part thereof.
[0090] In another aspect, the present invention relates to functional variants of any one of the above-mentioned polypeptides. In another embodiment, the term "functional" or "biologically active" relates to the native enzymatic (e.g., catalytic) activity of a polypeptide of the present invention. In some embodiments, the present invention relates to a polypeptide comprising a biological activity of any one of the enzymes described below, or a polynucleotide encoding same.
[0091] "Carbohydrase" refers to any protein that catalyzes the hydrolysis of carbohydrates. "Glycoside hydrolase", "glycosyl hydrolase" or "glycosidase" refers to a protein that catalyzes the hydrolysis of the glycosidic bonds between carbohydrates or between a carbohydrate and a non-carbohydrate residue. Endoglucanases, cellobiohydrolases, beta-glucosidases, alpha-glucosidases, xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, alpha-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, beta-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
[0092] "Cellulase" refers to a protein that catalyzes the hydrolysis of 1,4-D-glycosidic linkages in cellulose (such as bacterial cellulose, cotton, filter paper, phosphoric acid swollen cellulose, Avicel®); cellulose derivatives (such as carboxymethylcellulose and hydroxyethylcellulose); plant lignocellulosic materials, beta-D-glucans or xyloglucans. Cellulose is a linear beta-(1-4) glucan consisting of anhydrocellobiose units. Endoglucanases, cellobiohydrolases, and beta-glucosidases are examples of cellulases.
[0093] "Endoglucanase" refers to a protein that catalyzes the hydrolysis of cellulose to oligosaccharide chains at random locations by means of an endoglucanase activity.
[0094] "Cellobiohydrolase" refers to a protein that catalyzes the hydrolysis of cellulose to cellobiose via an exoglucanase activity, sequentially releasing molecules of cellobiose from the reducing or non-reducing ends of cellulose or cello-oligosaccharides. "Beta-glucosidase" refers to an enzyme that catalyzes the conversion of cellobiose and oligosaccharides to glucose.
[0095] "Hemicellulase" refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galacto(gluco)mannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar. However, this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid. Hemicellulolytic enzymes, i.e., hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and beta-mannosidases. Hemicellulases also include the accessory enzymes, such as acetylesterases, ferulic acid esterases, and coumaric acid esterases. Among these, xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with beta-xylosidase only. In addition, several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and beta-xylosidases are examples of hemicellulases.
[0096] "Xylanase" specifically refers to an enzyme that hydrolyzes the beta-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
[0097] "Beta-mannanase" or "endo-1,4-beta-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galacto(gluco)mannan) and produces short beta-1,4-mannooligosaccharides.
[0098] "Mannan endo-1,6-alpha-mannosidase" refers to a protein that hydrolyzes 1,6-alpha-mannosidic linkages in unbranched 1,6-mannans.
[0099] "Beta-mannosidase" (beta-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of beta-D-mannose residues from the non-reducing ends of oligosaccharides.
[00100] "Galactanase", "endo-beta-1,6-galactanase" or "arabinogalactan endo-1,4-beta-galactosidase" refers to a protein that catalyzes the hydrolysis of endo-1,4-beta-D-galactosidic linkages in arabinogalactans.
[00101] "Glucoamylase" refers to a protein that catalyzes the hydrolysis of terminal 1,4-linked-D-glucose residues successively from non-reducing ends of the glycosyl chains in starch with the release of beta-D-glucose.
[00102] "Beta-hexosaminidase" or "beta-N-acetylglucosaminidase" refers to a protein that catalyzes the hydrolysis of terminal N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosamines.
[00103] "Alpha-L-arabinofuranosidase", "alpha-N-arabinofuranosidase", "alpha-arabinofuranosidase", "arabinosidase" or "arabinofuranosidase" refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
[00104] "Endo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans.
[00105] "Exo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-alpha-linkages in 1,5-arabinans or 1,5-alpha-L-arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
[00106] "Beta-xylosidase" refers to a protein that hydrolyzes short 1,4-beta-D-xylooligomers into xylose.
[00107] "Cellobiose dehydrogenase" refers to a protein that oxidizes cellobiose to cellobionolactone.
[00108] "Chitosanase" refers to a protein that catalyzes the endohydrolysis of beta-1,4-linkages between D-glucosamine residues in acetylated chitosan (i.e., deacetylated chitin).
[00109] "Exo-polygalacturonase" refers to a protein that catalyzes the hydrolysis of terminal alpha-1,4-linked galacturonic acid residues from non-reducing ends, thus converting polygalacturonides to galacturonic acid.
[00110] "Acetyl xylan esterase" refers to a protein that catalyzes the removal of the acetyl groups from xylose residues. "Acetyl mannan esterase" refers to a protein that catalyzes the removal of the acetyl groups from mannose residues. "Ferulic esterase" or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid. "Coumaric acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid. Acetyl xylan esterases, ferulic acid esterases and pectin methyl esterases are examples of carbohydrate esterases.
[00111] "Pectate lyase" and "pectin lyases" refer to proteins that catalyze the cleavage of 1,4-alpha-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates (pectates and pectins, respectively).
[00112] "Endo-1,3-beta-glucanase" or "laminarinase" refers to a protein that catalyzes the cleavage of 1,3-linkages in beta-D-glucans such as laminarin or lichenin. Laminarin is a linear polysaccharide made up of beta-1,3-glucan with beta-1,6-linkages.
[00113] "Lichenase" refers to a protein that catalyzes the hydrolysis of lichenan, a linear, 1,3-1,4-beta-D-glucan.
[00114] Rhamnogalacturonan is composed of alternating alpha-1,4-rhamnose and alpha-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose. The side chains include Type I galactan, which is beta-1,4-linked galactose with alpha-1,3-linked arabinose substituents; Type II galactan, which is beta-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is alpha-1,5-linked arabinose with alpha-1,3-linked arabinose branches. The galacturonic acid substituents may be acetylated and/or methylated.
[00115] "Exo-rhamnogalacturonanase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin from the non-reducing end.
[00116] "Rhamnogalacturonan acetylesterase" refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
[00117] "Rhamnogalacturonan lyase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a beta-elimination mechanism (e.g., see Pages et al., J. Bacteriol., 185:4727-4733 (2003)).
[00118] "Alpha-rhamnosidase" refers to a protein that catalyzes the hydrolysis of terminal non-reducing alpha-L-rhamnose residues in alpha-L-rhamnosides.
[00119] Certain proteins of the present invention may be classified as "Family 61 glycosidases" based on homology of the polypeptides to CAZy Family GH61. Family 61 glycosidases may exhibit cellulolytic enhancing activity or endoglucanase activity. Additional information on the properties of Family 61 glycosidases may be found in U.S. Patent Application Publication Nos. 2005/0191736, 2006/0005279, 2007/0077630, and in PCT Publication No. WO 2004/031378.
[00120] "Esterases" represent a category of various enzymes including lipases, phospholipases, cutinases, and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
[00121] The International Union of Biochemistry and Molecular Biology has developed a nomenclature for enzymes where each enzyme is described by a sequence of four numbers preceded by "EC". The first number broadly classifies the enzyme based on its mechanism. According to the naming conventions, enzymes are generally classified into six main family classes and many sub-family classes: EC 1 Oxidoreductases: catalyze oxidation/reduction reactions; EC 2 Transferases: transfer a functional group (e.g. a methyl or phosphate group); EC 3 Hydrolases: catalyze the hydrolysis of various bonds; EC 4 Lyases: cleave various bonds by means other than hydrolysis and oxidation; EC 5 Isomerases: catalyze isomerization changes within a single molecule; and EC 6 Ligases: join two molecules with covalent bonds. A number of bioinformatic tools are available to the skilled person to predict which main family class and sub-family class an enzyme molecule belongs to according to its sequence information. In some instances, certain enzymes (or families of enzymes) can be re-classified, for example, to take into account newly discovered enzyme functions or properties. Accordingly, the polypeptides/enzymes of the present invention are not meant to be limited to specific enzyme classes as they currently exist. The skilled person would know how to appropriately reclassify (and assign the appropriate functions) to the enzymes of the present invention based on the amino acid sequence information provided herein. Such reclassifications are thus within the scope of the present invention.
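The first field of an EC number determines the main class, as described above. A minimal sketch of that lookup follows; it covers only the six classes named in this paragraph, the function name `ec_main_class` is hypothetical, and partially specified numbers such as "EC 3.2.1.-" are accepted because only the first field is read.

```python
# The six main EC classes as listed in the paragraph above.
EC_MAIN_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_main_class(ec_number: str) -> str:
    """Return the main class for an EC number such as 'EC 3.2.1.4'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields: {ec_number!r}")
    main_class = EC_MAIN_CLASSES.get(int(fields[0]))
    if main_class is None:
        raise ValueError(f"main class not among the six listed: {ec_number!r}")
    return main_class


print(ec_main_class("EC 3.2.1.4"))   # endoglucanase
print(ec_main_class("EC 4.2.2.2"))   # pectate lyase
```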
[00122] In some embodiments, the present invention relates to a polypeptide comprising a biological activity of any one of the following enzymes (or sub-classes thereof), or a polynucleotide encoding same:
• Cellulose-hydrolyzing enzymes, including: endoglucanases (EC 3.2.1.4), which hydrolyze the beta-1,4-linkages between glucose units; exoglucanases (also known as cellobiohydrolases 1 and 2) (EC 3.2.1.91), which hydrolyze cellobiose, a glucose disaccharide, from the reducing and non-reducing ends of cellulose; and beta-glucosidases (EC 3.2.1.21), which hydrolyze the beta-1,4 glycoside bond of cellobiose to glucose;
• Proteins that enhance or accelerate the action of cellulose-degrading enzymes, including: glycoside hydrolase family 61 (GH61) proteins (e.g., polysaccharide monooxygenases), which enhance the action of cellulase enzymes on lignocellulose substrates;
• Enzymes that degrade or modify xylan and/or xylan-lignin complexes, including: xylanases, such as endo-1,4-beta-xylanase (EC 3.2.1.8), which catalyze the endohydrolysis of 1,4-beta-D-xylosidic linkages in xylans (or xyloglucans); xylosidases, such as xylan 1,4-beta-xylosidases (EC 3.2.1.37), which catalyze hydrolysis of 1,4-beta-D-xylans to remove successive D-xylose residues from the non-reducing terminals, and also cleave xylobiose; arabinosidases, such as alpha-arabinofuranosidases (EC 3.2.1.55), which hydrolyze terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabinosides (including arabinoxylans and arabinogalactans); alpha-glucuronidases (EC 3.2.1.139), which hydrolyze an alpha-D-glucuronoside to the corresponding alcohol and D-glucuronate; feruloyl esterases (EC 3.1.1.73), which catalyze hydrolysis of the 4-hydroxy-3-methoxycinnamoyl (feruloyl) group from an esterified sugar (which is usually arabinose in natural substrates); and acetylxylan esterases (EC 3.1.1.72), which catalyze deacetylation of xylans and xylo-oligosaccharides;
• Enzymes that degrade or modify mannan, including: mannanases, such as mannan endo-1,4-beta-mannosidase (EC 3.2.1.78), which catalyze random hydrolysis of 1,4-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans; mannosidases (EC 3.2.1.25), which hydrolyze terminal, non-reducing beta-D-mannose residues in beta-D-mannosides; alpha-galactosidases (EC 3.2.1.22), which hydrolyze terminal, non-reducing alpha-D-galactose residues in alpha-D-galactosides (including galactose oligosaccharides, galactomannans and galactolipids); and mannan acetyl esterases;
• Enzymes that degrade or modify xyloglucans, including: xyloglucanases, such as xyloglucan-specific endo-beta-1,4-glucanase (EC 3.2.1.151), which catalyzes endohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan, and xyloglucan-specific exo-beta-1,4-glucanase (EC 3.2.1.155), which catalyzes exohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan; and endoglucanases / cellulases;
• Enzymes that degrade or modify glucans, including: enzymes that degrade beta-1,4-glucan, such as endoglucanases, cellobiohydrolases, and beta-glucosidases; and enzymes that degrade beta-1,3-1,4-glucan, such as endo-beta-1,3(4)-glucanases (EC 3.2.1.6), which catalyze endohydrolysis of 1,3- or 1,4-linkages in beta-D-glucans when the glucose residue whose reducing group is involved in the linkage to be hydrolyzed is itself substituted at C-3, endoglucanases (beta-glucanase, cellulase), and beta-glucosidases;
• Enzymes that degrade or modify galactans, including: galactanases (EC 3.2.1.23), which hydrolyze terminal non-reducing beta-D-galactose residues in beta-D-galactosides;
• Enzymes that degrade or modify arabinans, including: arabinanases (EC 3.2.1.99), which catalyze endohydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans;
• Enzymes that degrade or modify starch, including: amylases, such as alpha-amylases (EC 3.2.1.1), which catalyze endohydrolysis of 1,4-alpha-D-glucosidic linkages in polysaccharides containing three or more 1,4-alpha-linked D-glucose units; and glucosidases, such as alpha-glucosidases (EC 3.2.1.20), which hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues with release of alpha-D-glucose;
• Enzymes that degrade or modify pectin, including: pectate lyases (EC 4.2.2.2), which carry out eliminative cleavage of pectate to give oligosaccharides with 4-deoxy-alpha-D-gluc-4-enuronosyl groups at their non-reducing ends; pectin lyases (EC 4.2.2.10), which catalyze eliminative cleavage of (1-4)-alpha-D-galacturonan methyl ester to give oligosaccharides with 4-deoxy-6-O-methyl-alpha-D-galact-4-enuronosyl groups at their non-reducing ends; polygalacturonases (EC 3.2.1.15), which carry out random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans; pectin esterases, such as pectin acetyl esterase (EC 3.1.1.11), which hydrolyzes acetate from pectin acetyl esters; alpha-arabinofuranosidases; beta-galactosidases; galactanases; arabinanases; rhamnogalacturonases (EC 3.2.1.-), which hydrolyze alpha-D-galacturonopyranosyl-(1,2)-alpha-L-rhamnopyranosyl linkages in the backbone of the hairy regions of pectins; rhamnogalacturonan lyases (EC 4.2.2.-), which degrade type I rhamnogalacturonan from plant cell walls and release disaccharide products; rhamnogalacturonan acetyl esterases (EC 3.1.1.-), which hydrolyze acetate from rhamnogalacturonan; and xylogalacturonosidases and xylogalacturonases (EC 3.2.1.-), which hydrolyze xylogalacturonan (xga), a galacturonan backbone heavily substituted with xylose, and which is one important component of the hairy regions of pectin;
• Enzymes that degrade or modify lignin, including: lignin peroxidases (EC 1.11.1.14), which oxidize lignin and lignin model compounds using hydrogen peroxide; manganese-dependent peroxidases (EC 1.11.1.13), which oxidize lignin and lignin model compounds using Mn2+ and hydrogen peroxide; versatile peroxidases (EC 1.11.1.16), which oxidize lignin and lignin model compounds using an electron donor and hydrogen peroxide and combine the substrate-specificity characteristics of the two other ligninolytic peroxidases: manganese peroxidase (EC 1.11.1.13) and lignin peroxidase (EC 1.11.1.14); and laccases (EC 1.10.3.2), a group of multi-copper proteins of low specificity acting on both o- and p-quinols, and often acting also on lignin; and
• Enzymes acting on chitin, including: chitinases (EC 3.2.1.14), which catalyze random hydrolysis of N-acetyl-beta-D-glucosaminide 1,4-beta-linkages in chitin and chitodextrins; and hexosaminidases, such as beta-N-acetylhexosaminidase (EC 3.2.1.52), which hydrolyzes terminal non-reducing N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosaminides.
[00123] In another embodiment, the present invention includes the polypeptides and their corresponding activities as defined in Tables 1A-1C, as well as functional variants thereof.
[00124] As alluded to above, the term "functional variant" as used herein is intended to include a polypeptide which is sufficiently similar in structure and function to any one of the above-mentioned polypeptides (without being identical thereto) to maintain at least one of its native biological activities. In another embodiment, a functional variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein. In another embodiment, a functional variant can comprise additional modifications (e.g., post-translational modifications such as acetylation, phosphorylation, glycosylation, sulfatation, sumoylation, prenylation, ubiquitination, etc).
[00125] In another embodiment, functional variants of the present invention can contain one or more conservative substitutions of a polypeptide sequence disclosed herein. Such modifications can be carried out routinely using site-specific mutagenesis. The term "conservative substitution" is intended to indicate a substitution in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acids having similar side chains are known in the art and include amino acids with basic side chains (e.g., lysine, arginine and histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
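The side-chain families listed above can be encoded as a simple lookup. The following Python sketch is illustrative only and is not part of the disclosed methods; the function name `is_conservative` and the rule "two different residues sharing at least one listed family" are assumptions made for this example.

```python
# Side-chain families as listed in paragraph [00125], in one-letter codes.
# The families overlap (e.g., threonine is both uncharged polar and beta-branched).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "non_polar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original, replacement):
    """Return True if two different residues share at least one family.

    Hypothetical helper: the sharing rule is an assumption for illustration.
    """
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )
```

Under this assumed rule, a lysine-to-arginine change counts as conservative, whereas a lysine-to-aspartic acid change does not.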
[00126] In another embodiment, functional variants of the present invention can contain one or more insertions, deletions or truncations of non-essential amino acids. As used herein, a "non-essential amino acid" is a residue that can be altered in a polypeptide of the present invention without substantially altering its (biological) function or protein activity. For example, amino acid residues that are conserved among the proteins of the present invention having similar biological activities (and their orthologs) are predicted to be particularly unamenable to alteration.
[00127] In another embodiment, functional variants can include functional fragments (i.e., biologically active fragments) of any one of the polypeptide sequences disclosed herein. Such fragments include fewer amino acids than the full length protein from which they are derived, but exhibit at least one biological activity of the corresponding full-length protein. Typically, biologically active fragments comprise a domain or motif with at least one activity of the full-length protein. A biologically active fragment of a protein of the invention can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the biological activities of the native form of a polypeptide of the present invention.
[00128] In another embodiment, the present invention includes other functional variants of the polypeptides disclosed herein, which can be identified by techniques known in the art. For example, functional variants can be identified by screening combinatorial libraries of mutants (e.g., truncation mutants) of polypeptides of the present invention for biological activity. In another embodiment, a variegated library of variants can be generated by combinatorial mutagenesis at the nucleic acid level. A variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display). There are a variety of methods that can be used to produce libraries of potential variants of the polypeptides of the present invention from a degenerate oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known in the art (e.g., see Narang (1983) Tetrahedron 39:3; Itakura et al., (1984) Annu. Rev. Biochem. 53:323; Itakura et al., (1984) Science 198:1056; Ike et al., (1983) Nucleic Acid Res. 11:477).
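To illustrate how a single degenerate oligonucleotide represents a set of discrete sequences, the following Python sketch expands a sequence written with the standard IUPAC nucleotide ambiguity codes. It is a minimal example only; the function name `expand_degenerate` and the input oligonucleotides are hypothetical and are not taken from the sequences disclosed herein.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every unambiguous sequence encoded by a degenerate oligonucleotide."""
    pools = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combination) for combination in product(*pools)]
```

For instance, the degenerate codon NNK, often used in library construction, expands to 4 x 4 x 2 = 32 codons under this scheme.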
[00129] In addition, libraries of fragments of the coding sequence of a polypeptide of the present invention can be used to generate a variegated population of polypeptides for screening and subsequent selection of variants. For example, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal and internal fragments of various sizes of the protein of interest.
[00130] Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. The most widely used techniques, which are amenable to high-throughput analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify variants of polypeptides of the present invention (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al., (1993) Protein Engineering 6(3): 327-331).
[00131] In another embodiment, functional variants of the present invention can encompass orthologs of the genes and polypeptides disclosed herein. Orthologs of the polypeptides disclosed herein include proteins that can be isolated from other strains or species and possess a similar or identical biological activity. Such orthologs can be identified as comprising an amino acid sequence that is substantially homologous (shares a certain degree of amino acid sequence identity) with the polypeptides disclosed herein. As used herein, the expression "substantially homologous" refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., with similar side chain) amino acids or nucleotides to a second amino acid or nucleotide sequence such that the first and the second amino acid or nucleotide sequences have a common domain. For example, amino acid or nucleotide sequences which contain a common domain having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity are defined herein as sufficiently identical.
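As an illustration of how a percent identity threshold such as those recited above might be checked, the following Python sketch computes identity over two sequences that have already been aligned. It is a simplified example under stated assumptions: the gap character `-`, the choice of denominator (all aligned columns other than gap-gap columns), and the name `percent_identity` are assumptions for this sketch and are not definitions taken from this passage.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-', and identity is the number of
    matching non-gap columns divided by the number of aligned columns,
    ignoring columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical ten-residue alignment differing at one position.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
meets_threshold = identity >= 70.0
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), and the resulting value can differ accordingly.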
[00132] In another embodiment, the present invention includes improved proteins derived from the polypeptides of the present invention. Improved proteins are proteins wherein at least one biological activity is improved. Such proteins may be obtained by randomly introducing mutations along all or part of the coding sequences of the polypeptides of the present invention such as by saturation mutagenesis, and the resulting mutants can be expressed recombinantly and screened for biological activity. For instance, the art provides for standard assays for measuring the enzymatic activity of the resulting protein and thus improved proteins may be selected.
Recovery and purification
[00133] In another aspect, polypeptides of the present invention may be present alone (e.g., in an isolated or purified form), within a composition (e.g., an enzymatic composition for carrying out an industrial process), or in an appropriate host. In one embodiment, polypeptides of the present invention can be recovered and purified from cell cultures (e.g., recombinant cell cultures) by methods known in the art. In another embodiment, high performance liquid chromatography ("HPLC") can be employed for the purification.
[00134] In another aspect, polypeptides of the present invention include naturally purified products, products of chemical synthetic procedures, and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect and mammalian cells. Depending on the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes.
Fusion proteins
[00135] In another aspect, the present invention includes fusion proteins comprising a polypeptide of the present invention or a functional variant thereof, which is operatively linked to one or more unrelated polypeptides (e.g., heterologous amino acid sequences). "Unrelated polypeptides" or "heterologous polypeptides" or "heterologous sequences" refer to polypeptides or sequences which are usually not present close to or fused to one of the polypeptides of the present invention. Such "unrelated polypeptides" or "heterologous polypeptides" have amino acid sequences corresponding to proteins which are not substantially homologous to the polypeptide sequences disclosed herein. Such "unrelated polypeptides" can be derived from the same or a different organism. In one embodiment, a fusion protein of the present invention comprises at least two biologically active portions or domains of polypeptide sequences disclosed herein. In the context of fusion proteins, the term "operatively linked" is intended to indicate that all of the different polypeptides are fused in-frame to each other. In another embodiment, an unrelated polypeptide can be fused to the N terminus or C terminus of a polypeptide of the present invention.
[00136] In another embodiment, a polypeptide of the present invention can be fused to a protein which enables or facilitates recombinant protein purification and/or detection. For example, a polypeptide of the present invention can be fused to a protein such as glutathione S-transferase (GST), and the resulting fusion protein can then be purified/detected through the high affinity of GST for glutathione.
[00137] Fusion proteins of the present invention can be produced by standard recombinant DNA techniques. For example, DNA fragments encoding different polypeptide sequences can be ligated together in frame in accordance with conventional techniques, for example by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers, which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (e.g., see Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the present invention can be cloned into such an expression vector so that the fusion moiety is linked in-frame to the polypeptide of interest.
Signal sequences
[00138] In another embodiment, a polypeptide of the present invention can be fused to a heterologous signal sequence (e.g., at its N terminus) to facilitate its isolation, expression and/or secretion from certain host cells (e.g., mammalian and yeast host cells). Signal sequences are typically characterized by a core of hydrophobic amino acids, which are generally cleaved from the mature protein during secretion in one or more cleavage events. Such signal peptides may contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway.
[00139] For example, the gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous signal sequence (Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley & Sons, 1992). Other examples of eukaryotic heterologous signal sequences include the secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, California). In yet another example, useful prokaryotic heterologous signal sequences include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal (Pharmacia Biotech; Piscataway, New Jersey).
[00140] The signal sequence can direct secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved. The protein can then be readily purified from the extracellular medium by known methods. In another embodiment, a signal sequence can be linked to a fusion protein of the present invention to facilitate detection, purification, and/or recovery thereof. For example, the sequence encoding a fusion protein of the present invention may be fused to a marker sequence, such as a sequence encoding a peptide, which facilitates purification of the fused polypeptide. In another embodiment, the marker sequence can be a hexa-histidine peptide, such as the tag provided in a pQE vector (Qiagen, Inc.), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for instance, hexa-histidine provides for convenient purification of the fusion protein. In another embodiment, the HA tag is another peptide useful for purification, which corresponds to an epitope derived from the influenza hemagglutinin protein, which has been described by Wilson et al., Cell 37:767 (1984), for instance.
POLYNUCLEOTIDES
[00141] The nucleic acid sequences of the genes disclosed herein were determined by sequencing cDNA clones, mRNA transcripts, or genomic DNA obtained from Thermoascus aurantiacus strain CBS 181.67, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69, and Pseudocercosporella herpotrichoides strain 494.80.
[00142] In another aspect, the present invention relates to polynucleotides encoding a polypeptide of the present invention, including functional variants thereof. In one embodiment, polynucleotides of the present invention comprise the coding nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514, or as set forth in Tables 1A-1C.
[00143] In another aspect, the present invention relates to genomic DNA sequences corresponding to the above mentioned coding sequences. In one embodiment, polynucleotides of the present invention comprise the genomic nucleic acid sequence of any one of SEQ ID NOs: 1-200, 601-889, or 1468-1991, or as set forth in Tables 1A-1C.
[00144] In another aspect, the present invention relates to a polynucleotide comprising at least one intronic or exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-200, 601-889, or 1468-1991 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C). Although only the positions of the exons are defined in Tables 2A-2C, a person of skill in the art would readily be able to determine the positions of the corresponding introns in view of this information. In some embodiments, polynucleotides comprising at least one of these intronic segments are within the scope of the present invention.
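By way of illustration, intron positions can be derived from a list of exon boundaries as the gaps between consecutive exons. The following Python sketch assumes 1-based, inclusive (start, end) coordinates on the genomic sequence; the function name `introns_from_exons` and the example coordinates are hypothetical and are not taken from Tables 2A-2C.

```python
def introns_from_exons(exons):
    """Return intron (start, end) pairs lying between consecutive exons.

    Assumption: coordinates are 1-based and inclusive, and exons do not overlap.
    """
    ordered = sorted(exons)
    introns = []
    for (_, previous_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only record an intron if at least one base separates the two exons.
        if next_start > previous_end + 1:
            introns.append((previous_end + 1, next_start - 1))
    return introns


# Hypothetical gene with three exons.
example_introns = introns_from_exons([(1, 120), (181, 400), (462, 900)])
```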
[00145] In yet another aspect, the present invention relates to a polynucleotide comprising at least one exonic nucleic acid sequence comprised within SEQ ID NOs: 1-200, 601-889, or 1468-1991, or as set forth in Tables 2A-2C.
[00146] In another aspect, the present invention relates to isolated polynucleotides sharing a minimum threshold of nucleic acid sequence identity with any one of the above-mentioned polynucleotides. In specific embodiments, the present invention relates to isolated polynucleotides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to any one of the above-mentioned polynucleotides. Other specific percentage units that have not been specifically recited here for brevity are nevertheless considered within the scope of the present invention. Polynucleotides having the aforementioned thresholds of nucleic acid sequence identity can be created by introducing one or more nucleotide substitutions, additions or deletions into the coding nucleotide sequences of the present invention such that one or more amino acid substitutions, deletions or insertions are introduced into the encoded polypeptide. Such mutations may be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
[00147] In another aspect, the present invention relates to a polynucleotide that hybridizes (or is hybridizable) under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotides defined above.
[00148] As used herein, "very low stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 45°C.
[00149] As used herein, "low stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 50°C.
[00150] As used herein, "medium stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 55°C.
[00151] As used herein, "medium-high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 60°C.
[00152] As used herein, "high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 65°C.
[00153] As used herein, "very high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 70°C.
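The six stringency definitions above share the same hybridization buffer and wash solution and differ only in formamide concentration and wash temperature. The following Python sketch tabulates those two variables as recited in paragraphs [00148] to [00153]; the data structure and the helper `is_at_least` (which ranks conditions by wash temperature) are illustrative assumptions and do not form part of the definitions.

```python
# Conditions common to all six definitions (probes of at least 100 nucleotides):
# prehybridization and hybridization at 42 C in 5X SSPE, 0.3% SDS, 200 ug/mL
# sheared and denatured salmon sperm DNA, for 12 to 24 hours; three 15-minute
# washes in 2X SSC, 0.2% SDS.

# name -> (formamide in percent, wash temperature in degrees C)
STRINGENCY = {
    "very low": (25, 45),
    "low": (25, 50),
    "medium": (35, 55),
    "medium-high": (35, 60),
    "high": (50, 65),
    "very high": (50, 70),
}


def is_at_least(level, minimum):
    """Compare two stringency levels by wash temperature (assumed ranking)."""
    return STRINGENCY[level][1] >= STRINGENCY[minimum][1]
```

Under this ranking, the conditions recited in paragraph [00147] correspond to the levels for which `is_at_least(level, "medium-high")` is true.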
[00154] In one embodiment, a polynucleotide of the present invention (or a fragment thereof) can be isolated using the sequence information provided herein in conjunction with standard molecular biology techniques (e.g., as described in Sambrook et al., supra). For example, suitable hybridization oligonucleotides (e.g., probes or primers) can be designed using all or a portion of the nucleic acid sequences disclosed herein and prepared by standard synthetic techniques (e.g., using an automated DNA synthesizer). The oligonucleotides can be employed in hybridization and/or amplification reactions, for example, to amplify a template of cDNA, mRNA or genomic DNA, according to standard PCR techniques. A polynucleotide so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis.
[00155] In another aspect, the present invention relates to polynucleotides encoding functional variants of any one of the polypeptides of the present invention, including a biologically active fragment or domain thereof.
[00156] In another aspect, the present invention can include nucleic acid molecules (e.g., oligonucleotides) sufficient for use as primers and/or hybridization probes to amplify, sequence and/or identify nucleic acid molecules encoding a polypeptide of the present invention or fragments thereof. In some embodiments, the present invention relates to polynucleotides (e.g., oligonucleotides) that comprise, span, or hybridize specifically to exon-exon or exon-intron junctions of the genomic sequences identified herein, such as those defined in Tables 2A-2C. Designing such polynucleotides/oligonucleotides would be within the grasp of a person of skill in the art in view of the target sequence information disclosed herein and are thus encompassed by the present invention.
[00157] In another aspect, the present invention relates to polynucleotides comprising silent mutations or mutations that do not significantly alter the (biological) function or protein activity of the encoded polypeptide. Guidance concerning how to make phenotypically silent amino acid substitutions is provided for example in Bowie et al., Science 247:1306-1310 (1990) and in the references cited therein. Furthermore, it will be apparent for the skilled person that DNA sequence polymorphisms of the genes disclosed herein may exist within a given population, which may differ from the sequences disclosed herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Accordingly, in one embodiment, the present invention can include natural allelic variants and homologs of polynucleotides disclosed herein.
[00158] In another aspect, polynucleotides of the present invention can comprise only a portion or a fragment of the nucleic acid sequences disclosed herein. Although such polynucleotides may not encode a functional polypeptide of the present invention, they are useful for example as probes or primers in hybridization or amplification reactions. Exemplary uses of such polynucleotides include: (1) isolating a gene (or allelic variant thereof) from a cDNA library; (2) in situ hybridization (e.g., FISH) to metaphase chromosomal spreads to provide precise chromosomal location of the gene as described in Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988); (3) Northern blot analysis for detecting expression of mRNA corresponding to a polypeptide disclosed herein, or a homolog, ortholog or variant thereof, in specific tissues and/or cells; and (4) probes and primers that can be used as a diagnostic tool to analyze the presence of a nucleic acid hybridizable to a polynucleotide disclosed herein in a given biological (e.g., tissue) sample. It would be within the grasp of a skilled person to design specific oligonucleotides in view of the nucleic acid sequences disclosed herein. Oligonucleotides typically comprise a region of nucleotide sequence that hybridizes (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention. In one embodiment, such oligonucleotides can be used for identifying and/or cloning other family members, as well as orthologs from other species. In another embodiment, the oligonucleotide can be attached to a detectable label (e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme cofactor). Such oligonucleotides can also be used as part of a diagnostic method or kit for identifying cells which express a polypeptide of the present invention.
[00159] As would be understood by the skilled person, full-length complements of any one of the polynucleotides of the present invention are also encompassed. In one embodiment, the full-length complements are antisense molecules with respect to the coding strands of polynucleotides of the present invention, which hybridize (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention.
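For illustration, the full-length complement of a coding strand, written in the conventional 5' to 3' direction, is its reverse complement. The following Python sketch is a minimal example limited to unambiguous DNA bases; the function name `full_length_complement` and the six-nucleotide input are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def full_length_complement(coding_strand):
    """Return the antisense strand of a DNA sequence, written 5' to 3'."""
    return coding_strand.translate(_COMPLEMENT)[::-1]


# Hypothetical coding-strand fragment.
antisense = full_length_complement("ATGGCC")
```

Applying the function twice returns the original sequence, which provides a simple consistency check.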
Sequencing errors
[00160] The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The specific sequences disclosed herein can be readily used to isolate the corresponding complete genes from the organism sequenced herein, which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors.
[00161] Unless otherwise indicated, all nucleotide sequences disclosed herein were determined by sequencing using an automated DNA sequencer, and all amino acid sequences of polypeptides disclosed herein were predicted by translation based on the genetic code. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
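The effect of a single-base insertion on the predicted translation can be illustrated with a short Python sketch using the standard genetic code. The sequences below are hypothetical and are not taken from the sequence listing; the function name `translate` is likewise an assumption made for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons in reading frame 1; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical 'actual' sequence and a read with one erroneous inserted base
# after the second codon.
actual = "ATGGCTGAACGTCAC"
determined = actual[:6] + "A" + actual[6:]

actual_protein = translate(actual)          # "MAERH"
predicted_protein = translate(determined)   # "MARTS"
```

In this example the first two residues agree and every residue from the point of insertion onward differs, which is the frame shift behavior described above.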
[00162] The person skilled in the art is capable of identifying such erroneously identified bases and knows how to correct such errors.
VECTORS
[00163] Another aspect of the invention pertains to vectors (e.g., expression vectors), containing a polynucleotide encoding a polypeptide of the present invention.
[00164] As used herein, the term "vector" includes a nucleic acid molecule capable of transporting another nucleic acid molecule to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors useful in recombinant DNA techniques are often in the form of plasmids. The terms "plasmid" and "vector" can be used interchangeably herein as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
[00165] In one embodiment, recombinant expression vectors of the invention can comprise a polynucleotide of the present invention in a form suitable for expression of the polynucleotide in a host cell, which means that the recombinant expression vector includes one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operatively linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signal). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells and those which direct expression of the nucleotide sequence only in a certain host cell (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the present invention can be introduced into host cells to thereby produce proteins or peptides, encoded by polynucleotides as described herein (e.g., polypeptides of the present invention).
[00166] In another embodiment, recombinant expression vectors of the present invention can be designed for expression of polypeptides of the present invention in prokaryotic or eukaryotic cells. For example, these polypeptides can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, supra. In another embodiment, recombinant expression vectors of the present invention can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
[00167] In another embodiment, expression vectors of the present invention can include chromosomal-, episomal- and virus-derived vectors, e.g., vectors derived from bacterial plasmids, bacteriophage, yeast episome, yeast chromosomal elements, viruses such as baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.
[00168] For expression, a DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled person. In a specific embodiment, promoters are preferred that are capable of directing a high expression level of biologically active polypeptides of the present invention (e.g., lignocellulose active proteins) from fungi. Such promoters are known in the art. The expression constructs may contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation. The coding portion of the mature transcripts expressed by the constructs will include a translation initiating AUG at the beginning and a termination codon appropriately positioned at the end of the polypeptide to be translated.
[00169] Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, transduction, infection, lipofection, cationic lipid-mediated transfection or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), Davis et al., Basic Methods in Molecular Biology (1986) and other laboratory manuals.
[00170] For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. A polynucleotide encoding a selectable marker can be introduced into a host cell on the same vector as that encoding a polypeptide of the present invention, or on a separate vector. Cells stably transfected with a polynucleotide of the present invention can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).
[00171] Expression of proteins in prokaryotes is often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, e.g., to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: (1) to increase expression of recombinant protein; (2) to increase the solubility of the recombinant protein; and (3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.
[00172] Vectors preferred for use in bacteria are, for example, disclosed in WO-A1-2004/074468. Other suitable vectors will be readily apparent to the skilled artisan. Known bacterial promoters suitable for use in the present invention include the promoters disclosed in WO-A1-2004/074468.
[00173] As indicated, the expression vectors will preferably contain selectable markers. Such markers include dihydrofolate reductase or neomycin resistance for eukaryotic cell culture and antibiotic resistance (e.g., tetracycline or ampicillin) for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium and certain Bacillus species; fungal cells such as Aspergillus species, for example A. niger, A. oryzae and A. nidulans; yeast cells such as Kluyveromyces, for example K. lactis and/or Pichia, for example P. pastoris; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS and Bowes melanoma; and plant cells. Appropriate culture media and conditions for the above-described host cells are known in the art.
[00174] Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act to increase transcriptional activity of a promoter in a given host cell type. Examples of enhancers include the SV40 enhancer, which is located on the late side of the replication origin at bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
[00175] For secretion of the translated protein into the lumen of the endoplasmic reticulum, into the periplasmic space or into the extracellular environment, an appropriate secretion signal may be incorporated into the expressed polypeptide. The signals may be endogenous to the polypeptide or they may be heterologous signals. In an embodiment, a polypeptide of the present invention may be expressed in a modified form, such as a fusion protein, and may include not only secretion signals but also additional heterologous functional regions. Thus, for instance, a region of additional amino acids, particularly charged amino acids, may be added to the N terminus of the polypeptide to improve stability and persistence in the host cell, during purification or during subsequent handling and storage. Also, peptide moieties may be added to the polypeptide to facilitate purification and/or detection.
HOST CELLS
[00176] In another aspect, the present invention features cells, e.g., transformed host cells or recombinant host cells that contain a polynucleotide or vector of the present invention. A "transformed cell" or "recombinant cell" is a cell into which (or into an ancestor of which) has been introduced a polynucleotide or vector of the invention by means of recombinant DNA techniques. Both prokaryotic and eukaryotic cells are included, e.g., bacteria, fungi, yeast, and the like. Especially preferred are cells from filamentous fungi, in particular the strain from which the polynucleotide and polypeptide sequences disclosed herein were derived.
[00177] In one embodiment, a cell of the present invention is typically not a wild-type strain or a naturally-occurring cell. Host cells of the present invention can include, but are not limited to: fungi (e.g., Aspergillus niger, Trichoderma reesei, Myceliophthora thermophila and Talaromyces emersonii); yeasts (e.g., Saccharomyces cerevisiae, Yarrowia lipolytica and Pichia pastoris); bacteria (e.g., Escherichia coli and Bacillus sp.); and plants (e.g., Nicotiana benthamiana, Nicotiana tabacum and Medicago sativa).
[00178] In another embodiment, a polynucleotide (or a polynucleotide which is comprised within a vector) may be homologous or heterologous with respect to the cell into which it is introduced. In this context, a polynucleotide is homologous to a cell if the polynucleotide naturally occurs in that cell. A polynucleotide is heterologous to a cell if the polynucleotide does not naturally occur in that cell. Accordingly, in an embodiment, the present invention relates to a cell which comprises a heterologous or a homologous sequence corresponding to any one of the polynucleotides or polypeptides disclosed herein.
[00179] In another embodiment, a host cell can be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in a specific, desired fashion. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may facilitate optimal functioning of the protein. Various host cells have characteristic and specific mechanisms for post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems familiar to those of skill in the art can be chosen to ensure the desired and correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product can be used. Such host cells are well known in the art.
[00180] In another embodiment, host cells can also include, but are not limited to, mammalian cell lines such as CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines. If desired, a stably transfected cell line can produce the polypeptides of the present invention. A number of vectors suitable for stable transfection of mammalian cells are available to the public; methods for constructing such cell lines are also publicly known, e.g., in Ausubel et al. (supra).
[00181] In another embodiment, the present invention relates to methods of inhibiting the expression of a polypeptide of the present invention in a host cell, comprising administering to the cell or expressing in the cell a double-stranded RNA (dsRNA) molecule (or a molecule comprising a region of double-strandedness), wherein the dsRNA comprises a subsequence of a polynucleotide of the present invention. In a preferred aspect, the dsRNA is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex nucleotides in length. The dsRNA is preferably a small interfering RNA (siRNA) or a micro RNA (miRNA). In a preferred aspect, the dsRNA is small interfering RNA (siRNA) for inhibiting transcription. In another preferred aspect, the dsRNA is micro RNA (miRNA) for inhibiting translation. The present invention also relates to such double-stranded RNA (dsRNA) molecules, comprising a portion of the mature polypeptide coding sequence of any one of the coding sequences of the polypeptides disclosed herein, for inhibiting expression of that polypeptide in a cell. While the present invention is not limited by any particular mechanism of action, the dsRNA can enter a cell and cause the degradation of a single-stranded RNA (ssRNA) of similar or identical sequences, including endogenous mRNAs. When a cell is exposed to dsRNA, mRNA from the homologous gene is selectively degraded by a process called RNA interference (RNAi). The dsRNAs of the present invention can be used in gene-silencing methods. In one aspect, the invention relates to methods to selectively degrade RNA using the dsRNAs of the present invention. The process may be practiced in vitro, ex vivo or in vivo. In one aspect, the dsRNA molecules can be used to generate a loss-of-function mutation in a cell, an organ or an organism. Methods for making and using dsRNA molecules to selectively degrade RNA are well known in the art, see, for example, U.S. Patent No. 6,506,559; U.S. Patent No. 6,511,824; U.S. Patent No. 6,515,109; and U.S. Patent No. 6,489,127.
In some instances, new phylogenic analyses of fungal species have resulted in taxonomic reclassifications. For example, following their phylogenic studies reported in van den Brink et al. ("Phylogeny of the industrial relevant, thermophilic genera Myceliophthora and Corynascus", Fungal Diversity (2012), 52:197-207), the authors proposed renaming all existing Corynascus species to Myceliophthora. Such changes in taxonomic classification are within the scope of the present invention and, regardless of future reclassifications, a person of skill in the art would be able to identify the organism used to determine the sequences disclosed herein, for example based on the strain's accession number (CBS 389.93; ATCC 62921; or CBS 625.91).
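By way of illustration only, the dsRNA embodiment of paragraph [00181], in which the dsRNA comprises a subsequence of a coding polynucleotide, can be sketched as a simple enumeration of candidate duplexes. The function names, the default length of 21 nucleotides and the toy sequence are assumptions made for this sketch and do not come from the disclosure; no ranking or efficacy prediction is attempted.

```python
# Illustrative sketch only: enumerate fixed-length sense/antisense RNA pairs
# taken from a coding DNA sequence. All names and the toy input are hypothetical.

RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def to_rna(coding_dna):
    """Return the mRNA-sense RNA string for a coding-strand DNA string."""
    return coding_dna.upper().replace("T", "U")


def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def candidate_duplexes(coding_dna, length=21):
    """List (offset, sense, antisense) for every window of `length` nucleotides."""
    sense_rna = to_rna(coding_dna)
    if not 1 <= length <= len(sense_rna):
        raise ValueError("length must be between 1 and the sequence length")
    duplexes = []
    for offset in range(len(sense_rna) - length + 1):
        sense = sense_rna[offset:offset + length]
        duplexes.append((offset, sense, reverse_complement_rna(sense)))
    return duplexes


# Toy 27-nt input, not a sequence from the disclosure.
toy_cds = "ATGGCCAAGTTCCTGACCGATGGCATC"
for offset, sense, antisense in candidate_duplexes(toy_cds, length=21)[:2]:
    print(offset, sense, antisense)
```

In practice, candidate windows would be filtered further (for example for off-target matches in the host genome), which this sketch leaves out.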
[00182] It should be understood herein that the level of expression of polypeptides of the present invention could be modified by adapting the codon usage ratio of a sequence of the present invention to that of the host or hosts in which it is meant to be expressed. This adaptation and the concept of codon usage ratio are all well known in the art.
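As a minimal sketch of the codon adaptation mentioned in paragraph [00182], the following replaces each codon of a coding sequence with a synonymous codon assumed to be preferred by the expression host. The preferred-codon table is a hypothetical, deliberately tiny example covering only a few amino acids; an actual adaptation would rely on a measured codon usage table for the chosen host and would usually weigh further factors than the single most frequent codon.

```python
# Illustrative sketch only. GENETIC_CODE is a small subset of the standard
# genetic code; PREFERRED_CODON is a hypothetical host preference table.

GENETIC_CODE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TTT": "F", "TTC": "F",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

PREFERRED_CODON = {"M": "ATG", "A": "GCC", "K": "AAG", "F": "TTC", "L": "CTG", "*": "TAA"}


def codons(cds):
    """Split a coding sequence into consecutive triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence using the subset code above."""
    return "".join(GENETIC_CODE[codon] for codon in codons(cds))


def adapt_codons(cds):
    """Swap every codon for the host-preferred synonymous codon."""
    return "".join(PREFERRED_CODON[GENETIC_CODE[codon]] for codon in codons(cds))


original = "ATGGCTAAATTTTTATGA"      # toy sequence encoding M-A-K-F-L-stop
adapted = adapt_codons(original)
print(adapted)                        # nucleotide sequence changes
print(translate(original) == translate(adapted))  # encoded protein does not
```

The check on the last line reflects the point of the adaptation: only the nucleotide sequence is altered, while the encoded polypeptide stays the same.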
Antibodies
[00183] In another aspect, the present invention relates to an isolated binding agent capable of selectively binding to a polypeptide of the present invention. Suitable binding agents may be selected from an antibody, an antigen binding fragment, or a binding partner. In one embodiment, the binding agent selectively binds to an amino acid sequence selected from Tables 1A-1C, including to any fragment of any of the above sequences comprising at least one antibody binding epitope.
[00184] According to the present invention, the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay. For example, when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art including enzyme immunoassays (e.g., ELISA, immunoblot assays, etc.).
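The criterion in paragraph [00184], binding that is statistically significantly higher than the background control, can be illustrated with a short calculation. The sketch below uses an exact one-sided permutation test on replicate well readings; the choice of test, the 0.05 threshold and the absorbance values are assumptions made for illustration, since the disclosure does not prescribe a particular statistical method.

```python
# Illustrative sketch only: exact one-sided permutation test comparing
# antigen-containing wells against antigen-free background wells.
from itertools import combinations
from statistics import mean


def permutation_p_value(signal, background):
    """P-value for 'signal mean exceeds background mean' over all relabelings."""
    observed = mean(signal) - mean(background)
    pooled = list(signal) + list(background)
    n_signal = len(signal)
    total = 0
    at_least_as_extreme = 0
    for chosen in combinations(range(len(pooled)), n_signal):
        chosen_set = set(chosen)
        group = [pooled[i] for i in chosen]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if mean(group) - mean(rest) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical absorbance readings, four replicate wells each.
with_antigen = [1.21, 1.35, 1.28, 1.30]
without_antigen = [0.11, 0.09, 0.14, 0.10]

p = permutation_p_value(with_antigen, without_antigen)
print(p, "selective" if p < 0.05 else "not distinguishable from background")
```

With four replicates per group there are only 70 possible relabelings, so the smallest attainable p-value is 1/70; more replicates would be needed to support a stricter threshold.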
[00185] Antibodies are characterized in that they comprise immunoglobulin domains and as such, they are members of the immunoglobulin superfamily of proteins. An antibody of the invention includes polyclonal and monoclonal antibodies, divalent and monovalent antibodies, bi- or multi-specific antibodies, serum containing such antibodies, antibodies that have been purified to varying degrees, and any functional equivalents of whole antibodies. Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees. Whole antibodies of the present invention can be polyclonal or monoclonal. Alternatively, functional equivalents of whole antibodies, such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab')2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention. Methods for the generation and production of antibodies are well known in the art.
[00186] Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). Non-antibody polypeptides, sometimes referred to as binding partners, may be designed to bind specifically to a protein of the invention. Examples of the design of such polypeptides, which possess a prescribed ligand specificity are given in Beste et al., (Proc. Nat'l Acad. Sci. 96:1898-1903, 1999). In one embodiment, a binding agent of the invention is immobilized on a substrate such as: artificial membranes, organic supports, biopolymer supports and inorganic supports such as for use in a screening assay.
[00187] In some embodiments, antibodies and binding agents specifically binding to polypeptides of the present invention may be produced and used even in the absence of knowledge of the precise biological function and/or protein activity of the polypeptide. Such antibodies and binding agents may be useful, for example, as diagnostic, classification, and/or research tools.
COMPOSITIONS AND USES
[00188] In another aspect, the present invention relates to a composition comprising one or more polypeptides or polynucleotides of the present invention. In one embodiment, the compositions are enriched in such a polypeptide. The term "enriched" indicates that the biological activity (e.g., biomass degradation or processing) of the composition has been increased, e.g., with an enrichment factor of at least 1.1. The composition may comprise a polypeptide of the present invention as the major component, e.g., a mono-component composition. Alternatively, the composition may comprise multiple enzymatic activities (e.g., those described herein).
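Paragraph [00188] does not state how the enrichment factor is calculated. One plausible reading, offered here purely as an assumption, is the ratio of the biological activity of the enriched composition to that of a reference composition measured under the same assay conditions. The function names and activity values below are hypothetical.

```python
# Illustrative sketch only: enrichment factor taken as a simple activity ratio.
# This definition is an assumption; the disclosure gives only the 1.1 threshold.


def enrichment_factor(activity_enriched, activity_reference):
    """Ratio of enriched-composition activity to reference activity."""
    if activity_reference <= 0:
        raise ValueError("reference activity must be positive")
    return activity_enriched / activity_reference


def is_enriched(activity_enriched, activity_reference, threshold=1.1):
    """True when the ratio meets the stated minimum enrichment factor."""
    return enrichment_factor(activity_enriched, activity_reference) >= threshold


# Hypothetical activities in arbitrary units from the same assay.
print(is_enriched(6.0, 5.0))   # ratio 1.2
print(is_enriched(5.2, 5.0))   # ratio 1.04
```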
[00189] The polypeptide compositions may be prepared in accordance with methods known in the art and may be in the form of a liquid or a dry composition. For instance, the polypeptide composition may be in the form of a granulate or a microgranulate. The polypeptide to be included in the composition may be stabilized in accordance with methods known in the art. Examples are given below of preferred uses of the polypeptide compositions of the present invention. The dosage of the polypeptide composition of the invention and other conditions under which the composition is used may be determined on the basis of methods known in the art.
[00190] In another aspect, the present invention relates to the use of the polypeptides (e.g., enzymes) of the present invention in a number of industrial and other processes. Despite the long-term experience obtained with these processes, there remains a need for improved polypeptides and enzymes featuring one or more significant advantages over those presently used. Depending on the specific application, these advantages can include aspects such as lower production costs, higher specificity towards the substrate, greater synergies with existing enzymes, less antigenic effect, fewer undesirable side activities, higher yields when produced in a suitable microorganism, more suitable pH and temperature ranges, better properties of the final product, and food grade or kosher aspects. In various embodiments, the present invention seeks to provide one or more of these advantages, or others.
Biomass processing or degradation
[00191] In another aspect, the polypeptides of the present invention may be used in new or improved methods for enzymatically degrading or converting plant cell wall polysaccharides from biomass into various useful products. In addition to cellulose and hemicellulose, plant cell walls contain associated pectins and lignins, the removal of which by enzymes of the current invention can improve accessibility to cellulases and hemicellulases, or which can themselves be converted to useful products. Therefore the polypeptides of the present invention may be used to degrade biomass or pretreated biomass to sugars. These sugars may be used as such or may be, for example, fermented into ethanol.
[00192] Usually, biomass must be subjected to pre-treatment in order to make the cellulose more accessible. Accordingly, in one embodiment, polypeptides of the present invention may be used in improved methods for the processing of pretreated biomass. Pretreatment technologies may involve chemical, physical, or biological treatments. Examples of pre-treatment technologies include but are not limited to: steam explosion; ammonia; acid hydrolysis; alkaline hydrolysis; solvent extraction; crushing; milling; etc.
[00193] One example of a product produced from biomass is bioethanol. Bioethanol is usually produced by the fermentation of glucose to ethanol by yeasts such as Saccharomyces cerevisiae; in addition to ethanol, other chemicals may be synthesized starting from glucose. Ethanol, today, is produced mostly from sugars or starches, obtained from sugar cane, fruits and grains. In contrast, cellulosic ethanol is obtained from cellulose, the main component of wood, straw and most other plant material. Sources of biomass for cellulosic ethanol production comprise agricultural residues (e.g., leftover crop materials from stalks, leaves, and husks of corn plants), forestry wastes (e.g., chips and sawdust from lumber mills, dead trees, and tree branches), energy crops (e.g., dedicated fast-growing trees and grasses such as switch grass), municipal solid waste (e.g., household garbage and paper products), and food processing and other industrial wastes (e.g., black liquor, paper manufacturing by-products, etc.).
[00194] Plant biomass is a mixture of plant polysaccharides, including cellulose, hemicelluloses, and pectin, together with the structural polymer, lignin. Glucose is released from cellulose by the action of mixtures of enzymes, including: endoglucanases, exoglucanases (cellobiohydrolases 1 and 2) and beta-glucosidases. Efficient large-scale conversion of cellulosic materials by such mixtures may require the full complement of enzymes, and can be enhanced by the addition of enzymes that attack the other plant cell wall components (e.g., hemicelluloses, pectins, and lignins), as well as chemical linkages between these components. Hence, polypeptides of the present invention that are highly expressed, or have high specific activity, stability, or resistance to inhibitors may improve the efficiency of the process, and lower enzyme costs. It would be an advantage to the art to improve the degradation and conversion of plant cell wall polysaccharides by composing cellulase mixtures using cellulase enzymes with such properties. Furthermore, polypeptides of the present invention that are able to function at extremes of pH and temperature are desirable, both since improved enzyme robustness decreases costs, and because enzymes that function at high temperature will allow high processing temperatures under high substrate consistency conditions that decrease viscosity and thus improve yields.
[00195] Glycoside hydrolases from the family GH61 are known to stimulate the activity of cellulase cocktails on lignocellulosic substrates and are thus considered to exhibit cellulase-enhancing activity (Harris et al., Biochemistry 49, 3305 (2010)). Enhancement of cellulase cocktail efficiency by GH61 proteins of the present invention may contribute to lowering the costs of cellulase enzymes used for the production of glucose from plant cell biomass, as described above. GH61 (glycoside hydrolase family 61 or sometimes referred to as EGIV) proteins are oxygen-dependent polysaccharide monooxygenases (PMOs) according to the latest literature. Often in the literature, these proteins are mentioned as enhancing the action of cellulases on lignocellulose substrates. GH61 was originally classified as an endoglucanase, based on the measurement of very weak endo-1,4-β-D-glucanase activity in one family member. The term "GH61" as used herein, is to be understood as a family of enzymes, which share common conserved sequence portions and foldings to be classified in family 61 of the well-established CAZY GH classification system (http://www.cazy.org/GH61.html). The glycoside hydrolase family 61 is a member of the family of glycoside hydrolases EC 3.2.1. GH61 is used herein as being part of the cellulases.
[00196] Enzymatic hydrolysis of plant hemicellulose yields 5-carbon sugars that either may be fermented to ethanol by some species of yeast, or converted to other types of chemical products. Enzymatic deconstruction of hemicellulose is also known to improve the accessibility of plant cell wall cellulose to cellulase enzymes for the production of glucose from lignocellulosic materials. Hemicellulase enzymes of the present invention that enhance glucose production from lignocellulose would find utility in the bioethanol industry and in other processes that rely on glucose or pentose streams from lignocellulose.
[00197] Lignin is composed of methoxylated phenyl-propane units linked by ether linkages and carbon-carbon bonds. The chemical composition of lignin may, depending on species, include guaiacyl, 4-hydroxyphenyl, and syringyl groups. Enzymatic modification of lignin by the polypeptides of the present invention can be used for the production of structural materials from plant biomass, or alternatively improve the accessibility of plant cellulose and hemicelluloses to cellulase enzymes for the release of glucose from biomass as described above. Enzymes that degrade the lignin component of lignocellulose include lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases (Vicuna et al., 2000, Molecular Biotechnology 14: 173-176; Broda et al., 1996, Molecular Microbiology 19: 923-932). In some embodiments, polypeptides of the present invention may also, in certain instances, be active in the decolorization of industrial dyes, and thus useful for the treatment and detoxification of chemical wastes.
[00198] In another embodiment, pectin-degrading polypeptides of the present invention can also enhance the action of cellulases on plant biomass by improving the accessibility of cellulase to the cellulose component of lignocellulose.
[00199] In another embodiment, polypeptides of the present invention may also be useful in other applications for hydrolyzing non-starch polysaccharide (NSP).
[00200] In another embodiment, esterases of the present invention can be useful in the bioenergy industry such as for the production of biodiesel and hydrolysis of hemicellulose.
[00201] In another embodiment, the present invention relates to methods for degrading or converting a cellulose-containing material, comprising: treating the cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity.
[00202] In another embodiment, the present invention relates to methods for producing a fermentation product, comprising: (a) saccharifying a cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity; (b) fermenting the saccharified cellulose-containing material of step (a) with one or more fermenting microorganisms to produce the fermentation product; and (c) recovering the fermentation product from the fermentation.
Food product industry
[00203] In one embodiment, the present invention relates to methods for preparing a food product comprising incorporating into the food product an effective amount of a polypeptide of the present invention. This can improve one or more properties of the food product relative to a food product in which the polypeptide is not incorporated. The phrase "incorporated into the food product" is defined herein as adding a polypeptide of the present invention to the food product, to any ingredient from which the food product is to be made, and/or to any mixture of food ingredients from which the food product is to be made. In other words, a polypeptide of the present invention may be added in any step of the food product preparation and may be added in one, two or more steps. The polypeptide of the present invention is added to the ingredients of a food product which can then be treated by methods including cooking, boiling, drying, frying, steaming or baking as is known in the art.
[00204] At least in the context of food products, the term "effective amount" is defined herein as an amount of the polypeptide (e.g., enzyme) of the present invention that is sufficient for providing a measurable effect on at least one property of interest of the food product. The term "improved property" is defined herein as any property of a food product which is improved by the action of a polypeptide (e.g., enzyme) of the present invention relative to a food product in which the polypeptide is not incorporated. The improved property may be determined by comparison of a food product prepared with and without addition of a polypeptide of the present invention. Organoleptic qualities may be evaluated using procedures well established in the food industry, and may include, for example, the use of a panel of trained taste-testers.
[00205] The polypeptides of the present invention may be prepared in any form suitable for the use in question, e.g., in the form of a dry powder, agglomerated powder, or granulate, in particular a non-dusting granulate, liquid, in particular a stabilized liquid, or protected enzyme such as described in WO01/11974 and WO02/26044. Granulates and agglomerated powders may be prepared by conventional methods, e.g., by spraying the enzyme according to the invention onto a carrier in a fluid-bed granulator. The carrier may consist of particulate cores having a suitable particle size. The carrier may be soluble or insoluble, e.g., a salt (such as NaCl or sodium sulphate), sugar (such as sucrose or lactose), sugar alcohol (such as sorbitol), starch, rice, corn grits, or soy. In an embodiment, the polypeptide of the present invention (and/or additional polypeptides/enzymes) may be contained in slow-release formulations. Methods for preparing slow-release formulations are well known in the art. Adding nutritionally acceptable stabilizers such as sugar, sugar alcohol, or another polyol, and/or lactic acid or another organic acid according to established methods may, for instance, stabilize liquid enzyme preparations.
[00206] In another embodiment, polypeptides of the present invention may also be incorporated in yeast-comprising compositions such as disclosed in EP-A-0619947, EP-A-0659344 and WO02/49441.
[00207] In another embodiment, one or more additional polypeptides/enzymes may be incorporated into a food product of the present invention. The additional enzyme may be of any origin, including mammalian and plant, and preferably of microbial (bacterial, yeast or fungal) origin and may be obtained by techniques conventionally used in the art. Enzymes may conveniently be produced in microorganisms. Microbial enzymes are available from a variety of sources; Bacillus species are a common source of bacterial enzymes, whereas fungal enzymes are commonly produced in Aspergillus species.
[00208] In specific embodiments, additional polypeptides/enzymes include starch degrading enzymes, xylanases, oxidizing enzymes, fatty material splitting enzymes, or protein-degrading, modifying or crosslinking enzymes. Starch degrading enzymes include endo-acting enzymes such as alpha-amylase, maltogenic amylase, pullulanase or other debranching enzymes, and exo-acting enzymes that cleave off glucose (amyloglucosidase), maltose (beta-amylase), maltotriose, maltotetraose and higher oligosaccharides. Suitable xylanases are for instance xylanases, pentosanases, hemicellulase, arabinofuranosidase, glucanase, cellulase, cellobiohydrolase, beta-glucosidase, and others. Oxidizing enzymes are for instance glucose oxidase, hexose oxidase, pyranose oxidase, sulfhydryl oxidase, lipoxygenase, laccase, polyphenol oxidases and others. Fatty material splitting enzymes are for instance triacylglycerol lipases, phospholipases (such as A1, A2, B, C and D) and galactolipases. Protein degrading, modifying or crosslinking enzymes are for instance endo-acting proteases (serine proteases, metalloproteases, aspartyl proteases, thiol proteases), exo-acting peptidases that cleave off one amino acid, or a dipeptide, tripeptide, etc., from the N-terminal (aminopeptidases) or C-terminal (carboxypeptidases) ends of the polypeptide chain, asparagine or glutamine deamidating enzymes such as deamidase and peptidoglutaminase, or crosslinking enzymes such as transglutaminase.
[00209] In other embodiments, additional polypeptides/enzymes can include: amylases, such as alpha-amylase (which can be useful for providing sugars that are fermentable by yeast) or beta-amylase; cyclodextrin glucanotransferase; peptidase (e.g., an exopeptidase, which can be useful in flavour enhancement); transglutaminase; lipase (which can be useful for the modification of lipids present in the food or food constituents); phospholipase; cellulase; hemicellulase; protein disulfide isomerase; peroxidase; laccase; or an oxidase (e.g., glucose oxidase, hexose oxidase, aldose oxidase, pyranose oxidase, lipoxygenase or L-amino acid oxidase).
[00210] In another embodiment, esterases of the present invention have a number of applications in the food industry including, but not limited to, degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
[00211] When one or more additional enzyme activities are to be added in accordance with the methods of the present invention, these activities may be added separately or together with the polypeptide according to the invention.
Detergent industry
[00212] In another aspect, polypeptides of the present invention can be useful in the detergent industry, e.g., for removal of carbohydrate-based stains from soiled laundry. Enzymes are used in detergents to improve their efficacy in removing most types of dirt. In some embodiments, esterases such as lipases of the present invention are particularly useful for removing fats and lipids.
Feed industry
[00213] In another aspect, polypeptides of the present invention can be useful in the feed enzyme industry, e.g., for increasing nutritional quality, digestibility and/or absorption of animal feed.
[00214] Feed enzymes have an important role to play in current farming systems, as they can increase the digestibility of nutrients, leading to greater efficiency in the production of animal products such as meat and eggs. At the same time, they can play a role in minimizing the environmental impact of increased animal production.
[00215] Non-starch polysaccharides (NSP) can increase the viscosity of the digesta which can, in turn, decrease nutrient availability and animal performance.
[00216] Endoxylanases and phytases are the best-known feed-enzyme products. Phytase enzymes hydrolyse phytic acid and release inorganic phosphate, thereby avoiding the need to add inorganic phosphates to the diet and reducing phosphorus excretion. Addition of xylanases to feed has also been shown to have positive effects on animal growth. Adding specific nutrients to feed improves animal digestion and thereby reduces feed costs. Many feed additives are currently in use, and new concepts are continuously being developed. Use of specific enzymes, such as non-starch carbohydrate degrading enzymes, could break down fiber, releasing energy as well as increasing protein digestibility due to better accessibility of the protein when the fiber is broken down. In this way, the feed cost could come down and the protein levels in the feed could also be reduced.
[00217] Non-starch polysaccharides (NSPs) are also present in virtually all feed ingredients of plant origin. NSPs are poorly utilized and can, when solubilized, exert adverse effects on digestion. Exogenous enzymes can contribute to a better utilization of these NSPs and as a consequence reduce any anti-nutritional effects. Accordingly, in a particular embodiment, hemicellulases and other polysaccharide-active polypeptides/enzymes of the present invention can be used for this purpose in cereal-based diets for poultry and, to a lesser extent, for pigs and other species.
[00218] In some embodiments, esterases of the present invention are useful in the feed industry such as for reducing the amount of phosphate in feed.
Pulp and paper
[00219] In another embodiment, xylanases of the present invention can be useful in the pulp and paper industry, e.g., for prebleaching of kraft pulp. Xylanases have been found to be most effective for that purpose. Xylanases attract increasing scientific and commercial attention due to applications in the pulp and paper industry for removal of hemicellulose from dissolving pulps or for enhancement of the bleachability of pulp and, thus, reduction of the use of environmentally harmful bleaching chemicals. A similar application of xylanases for pulp prebleaching is an already well-established technology and has greatly stimulated research on hemicellulases in the past decade. Although lignin-active peroxidases of the present invention may also be active in modification of lignin and hence have bleaching properties, such enzymes are generally less attractive for bleaching due to the need to use and recycle expensive redox mediators.
[00220] In a related embodiment, polypeptides such as xylanases of the present invention can be used to pre-bleach pulp to reduce the amount of bleaching chemicals needed to obtain a given brightness. It is suggested that xylanase depolymerises xylan blocks and increases accessibility or helps liberation of residual lignin by releasing xylan-chromophore fragments. When added to brownstock prior to bleaching, polypeptides such as xylanases of the present invention can save on bleaching chemicals. The enzymes hydrolyze surface xylans and are able to break linkages between hemicellulose and lignin. Other polypeptides (e.g., hemicellulase active enzymes) of the present invention which can break these linkages can function effectively in bleaching or pre-bleaching of pulp, and thus such uses are also within the scope of the present invention.
[00221] In some embodiments, esterases of the present invention are useful for the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
Other uses
[00222] In another embodiment, polypeptides such as xylanases of the present invention can be used in antibacterial formulations, as well as in pharmaceutical products such as throat lozenges, toothpastes, and mouthwash.
[00223] Chitin is a beta-(1,4)-linked polymer of N-acetyl D-glucosamine (GlcNAc), found as a structural polysaccharide in fungal cell walls as well as in the exoskeleton of arthropods and the outer shell of crustaceans. Approximately 75% of the total weight of shellfish is considered waste, and a large proportion of the material making up the waste is chitin. Accordingly, in one embodiment, polypeptides such as chitin-degrading enzymes of the present invention are useful in the modification and degradation of chitin, allowing the production of chitin-derived material, such as chitooligosaccharides and N-acetyl D-glucosamine, from chitin waste. In another embodiment, polypeptides such as chitinase enzymes of the present invention can be useful as antifungal agents.
[00224] In another embodiment, polypeptides of the present invention can be used in the textile industry (e.g., for the treatment of textile substrates). More particularly, cellulases (e.g., endo-, exocellulases and cellobiohydrolases) have gained importance in the treatment of cellulose-containing fibers. During the washing of indigo-dyed denim textiles, enzymatic treatment by a polypeptide of the present invention can be used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans or other suitable fabrics. Polypeptides of the present invention can also improve the softness/feel of such fabrics. When used in textile detergent compositions, enzymes of the present invention can enhance cleaning ability or act as a softening agent. In another embodiment, polypeptides such as cellulases of the present invention can be used in combination with polymeric agents in processes for providing a localized variation in the color density of fibers.
[00225] In another embodiment, polypeptides of the present invention can be used in the waste treatment industry (e.g., for changing the characteristics of the waste to become more amenable to further treatment and/or for bio-conversion to value-added products). Polypeptides such as lipases, cellulases, amylases, and proteases of the present invention can be used in addition to microorganisms to break down polymeric substances like proteins, polysaccharides and lipids, thereby facilitating this process.
[00226] In another embodiment, polypeptides of the present invention can be used in industries such as biocatalysis; sewage treatment; cleaning up oil pollution; the synthesis of fragrances; and enhancing the recovery of oil (e.g., during drilling).
[00227] Other uses of the polynucleotides and polypeptides of the present invention would be apparent to a person of skill in the art in view of the sequences and biological activities disclosed herein. These other uses, even though not explicitly mentioned here, are nevertheless within the scope of the present invention.
Diagnostic, classification and research tools
[00228] In another embodiment, the polynucleotides, polypeptides and antibodies of the present invention can be useful as diagnostic and classification tools. In this regard, it would be within the capacities of a person of skill in the art to search existing sequence databases and perform a phylogenetic analysis based on the nucleic acid and amino acid sequences disclosed herein. Furthermore, designing hybridization probes or primers that are specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived) would be within the grasp of a skilled person, in view of the sequence information disclosed herein. Similarly, a skilled person would be able to select an epitope of a polypeptide of the present invention which is specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived) and generate an antibody or binding agent that binds specifically thereto.
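By way of illustration only, the selection of candidate probe or primer regions specific to one organism can be sketched computationally. The following minimal Python sketch assumes that a target sequence and a set of background sequences (e.g., retrieved from a sequence database) are available as plain strings. The function names `reverse_complement`, `kmers` and `candidate_probes`, and the default oligonucleotide length of 20, are hypothetical choices made for this example and are not part of the disclosure. Exact-match filtering is a simplification; a practical design would also consider near matches, melting temperature and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def kmers(seq: str, k: int) -> set:
    """Return the set of all substrings of length k found in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(target: str, background: list, k: int = 20) -> list:
    """List k-mers of the target that occur in no background sequence.

    Both strands of every background sequence are checked, so a returned
    k-mer has no exact match to the background on either strand.
    """
    excluded = set()
    for seq in background:
        excluded |= kmers(seq, k)
        excluded |= kmers(reverse_complement(seq), k)

    target = target.upper()
    probes = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in excluded and kmer not in probes:
            probes.append(kmer)  # keep order of first appearance in the target
    return probes
```

For example, `candidate_probes("ACGTTGCA", ["ACGTT"], k=4)` keeps only the 4-mers of the toy target that are absent from both strands of the toy background sequence.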
[00229] Such tools are useful, for example, in diagnostic methods for detecting the presence or absence of a particular organism (e.g., the organism from which the sequences disclosed herein were derived) in a sample; as research tools (e.g., for designing and producing microarrays for studying fungal gene expression); and for rapidly classifying an organism of interest based on the detection of a sequence or polypeptide specific for that organism. The skilled person would recognize that knowledge of the precise (biological) function or protein activity of a polypeptide of the present invention is not absolutely necessary for the aforementioned tools to be useful for diagnostic, research, or classification purposes. Sequences that are particularly useful in this regard are the genomic, coding and amino acid sequences corresponding to the polypeptides of the present invention annotated as "unknown" in Tables 1A-1C (as well as their corresponding exons and introns defined in Tables 2A-2C, where available). These sequences show little sequence identity with those in the art and thus can be useful as markers for identifying the organisms from which the sequences of the present invention were derived. The skilled person would know how to search various sequence databases to design specific hybridization oligonucleotides (e.g., probes and primers), as well as produce antibodies that specifically bind to the aforementioned sequences.
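As a purely illustrative aid, the degree of sequence identity between a sequence of interest and a database sequence can be estimated once the two have been aligned. The minimal Python sketch below assumes the alignment has already been produced by a separate tool and is supplied as two equal-length strings with "-" marking gaps. The function name `percent_identity` and the choice of denominator (all alignment columns except those that are gaps in both sequences) are assumptions made for this example; other conventions exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

A low value returned for every database hit would be consistent with a sequence showing little identity with those in the art, subject to the alignment method and identity convention used.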
[00230] In some embodiments, the present invention relates to a method for identifying and/or classifying an organism (e.g., a fungal species) based on a biological sample, the method comprising detecting the presence or absence of any one of the polynucleotides or polypeptides of the present invention (e.g., those recited in the preceding paragraph) and determining that said organism is present or classifying said organism based on the presence of the polynucleotide or polypeptide. In some embodiments, the detecting step can be carried out using one or more oligonucleotides or antibodies of the present invention. In some embodiments, the detecting step can be carried out by performing an amplification and/or hybridization reaction.
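The detecting step can also be illustrated in silico, for instance when the biological sample has been sequenced. The following minimal Python sketch assumes the sample is available as a nucleotide string and that one or more marker oligonucleotide sequences have been chosen beforehand. The names `find_marker` and `organism_present` are hypothetical, and exact string matching stands in for an amplification or hybridization reaction, which in practice tolerates some mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def find_marker(sample: str, marker: str) -> list:
    """Return (strand, position) pairs for exact marker matches in the sample.

    Positions are 0-based on the sample as given; "-" means the reverse
    complement of the marker was found.
    """
    sample = sample.upper()
    marker = marker.upper()
    queries = [("+", marker)]
    rc = reverse_complement(marker)
    if rc != marker:  # avoid reporting palindromic markers twice
        queries.append(("-", rc))

    hits = []
    for strand, query in queries:
        start = sample.find(query)
        while start != -1:
            hits.append((strand, start))
            start = sample.find(query, start + 1)
    return hits


def organism_present(sample: str, markers: list) -> bool:
    """Report presence if at least one marker is detected in the sample."""
    return any(find_marker(sample, m) for m in markers)
```

Requiring a single marker hit is a permissive design choice for this example; a stricter classifier could require several independent markers before concluding that the organism is present.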
[00231] In Tables 1A-1C below, the skilled person will recognize that although the precise protein activity of a polypeptide of the present invention may not be known per se (e.g., in the case of proteins of the present invention labelled as "unknown" in Tables 1A-1C), the polypeptide may nevertheless be useful for carrying out an industrial process (e.g., cellulase-enhancing, cellulose-degrading, hemicellulose-degrading, cellulolysis-enhancing, lignocellulolysis-enhancing, and other biological functions listed in Tables 1A-1C). In this regard, proteins labelled herein as "unknown" comprise proteins whose precise enzymatic activities may not be deducible from sequence comparisons, but that are nevertheless identified as interesting targets for industrial applications for other reasons (e.g., their expression is induced by growth under certain conditions such as in the presence of cellulosic and/or lignocellulosic biomass).
Table 1A. Biomass degrading genes and polypeptides of Thermoascus aurantiacus
Gene ID in provisional application Nos. 61/714,496 and 61/714,999 | Target ID | Annotation in provisional application Nos. 61/714,496 and 61/714,999 | Updated annotation | Function | Protein activity | CAZy family | CBM | Provisional application SEQ ID NO: (genomic, coding, amino acid) | PCT application SEQ ID NO: (genomic, coding, amino acid)
Theau2p4 carbohydrate esterase carbohydrate- unknown carbohydrate esterase CE16 40 41 42 14 214 414 000765 CE16 modifying
Theau2p4 Theau2p4 cellulose- endoglucanase endoglucanase GH5 cellulase GH5 43 44 45 15 215 415 000766 _000766 degrading
Theau2p4 protein
Aminopeptidase 2 Aminopeptidase 2 protease 46 47 48 16 216 416 000896 hydrolysis
Theau2p4 THEAU 3 Extracellular exo-alpha-(1- hemicellulose- unknown arabinofuranosidase GH43 49 50 51 17 217 417 000921 _00101 >5)-L-arabinofuranosidase degrading
Theau2p4 Subtilisin-like serine protein
Alkaline protease 2 protease 52 53 54 18 218 418 001159 protease pepC hydrolysis
Arabinogalactan Probable arabinogalactan
Theau2p4 Theau2p4 galactan- arabinogalactan endo- endo-1,4-beta- endo-1,4-beta- GH53 55 56 57 19 219 419 001291 _001291 degrading 1,4-beta-galactosidase
galactosidase galactosidase A
Uncharacterized
Theau2p4_ serine Pheromone-processing protein
protease 58 59 60 20 220 420 001344 carboxypeptidase carboxypeptidase kex1 hydrolysis
F41C3.5
Theau2p4 Aspartic-type Aspartic-type protein
protease 61 62 63 21 221 421 001376 endopeptidase ctsD endopeptidase ctsD hydrolysis
Theau2p4 Theau2p4 Expansin family cellulase-
Allergen Asp f 7 expansin 64 65 66 22 222 422 001424 _001424 protein enhancing
Theau2p4
Phospholipase C 3 Phospholipase C 3 lipid-degrading lipase 67 68 69 23 223 423 001559
uncharacterized
Theau2p4 THEAU 2 lignocellulolysis
unknown unknown lignocellulose-induced 70 71 72 24 224 424 001685 _04356 -enhancing
protein
Theau2p4 protein
Vacuolar protease A Vacuolar protease A protease 73 74 75 25 225 425 001741 hydrolysis
Theau2p4 THEAU 2 chitin- chitinase chitinase GH18 chitinase GH18 76 77 78 26 226 426 001760 03823 degrading
( pp p I Theau24 THEAU 1 Extracellular endo alha hemicellulose ypppp p Trietidletidase sedl rotea 1-
yppp p I Theau24 Probable trietidrotein-
gg pp pp Liase B Liase B liidderadin liase 1-
Peptidase M20 Probable
Theau2p4 protein
domain-containing carboxypeptidase protease 268 269 270 90 290 490 007404 hydrolysis
protein C757.05c AO090003000058
Theau2p4 THEAU 1 exo- exo-polygalacturonase pectin- exo-polygalacturonase GH28 271 272 273 91 291 491 007769 _00014 polygalacturonase GH28 degrading
Theau2p4 THEAU 3 chitin- unknown Beta-hexosaminidase beta-hexosaminidase GH20 274 275 276 92 292 492 007847 _00048 degrading
Putative
Theau2p4 Theau2p4 pectinesterase/pecti pectin-
Probable pectinesterase A pectinesterase CE8 277 278 279 93 293 493 007913 _007913 nesterase inhibitor degrading
38
alpha-
Theau2p4
Alpha-galactosidase Alpha-galactosidase galactoside- alpha-galactosidase GH27 280 281 282 94 294 494 007928
degrading
Theau2p4 Glucan 1,3-beta- Exo-1,3-beta-glucanase glucan- exo-1,3-beta-
GH55 283 284 285 95 295 495 008005 glucosidase GH55 degrading glucanase
Theau2p4 Theau2p4 lignin-
Laccase-1 Laccase-1 laccase AA1 286 287 288 96 296 496 008027 _008027 degrading
Uncharacterized
Theau2p4 hemicellulose- 30.6 kDa protein in unknown CE4 acetylxylan esterase CE4 289 290 291 97 297 497 008280 modifying
fumA 3'region
Theau2p4_ THEAU_2 galactan- beta-galactosidase beta-galactosidase GH35 beta-galactosidase GH35 292 293 294 98 298 498 008290 00061 degrading
Theau2p4 Hemolytic Non-hemolytic
lipid-degrading lipase 295 296 297 99 299 499 008334 phospholipase C phospholipase C
Theau2p4 THEAU 1 Probable carbohydrate-
Probable glycosidase crf1 glycosidase GH16 298 299 300 100 300 500 008354 _00062 glycosidase crf1 modifying
Theau2p4 THEAU 1 Endo-1,4-beta- tomatin
tomatinase GH10 tomatinase GH10 301 302 303 101 301 501 008423 00021 xylanase C degrading
1 pp yy 009884etidase sed2 hdrolsis ypppp Trietidletidase sed2 1- yp I Theau24 carbohdrat y p I Theau24 Uncharacterized carbohdrat
pp I Theau24 Theau24 Killer toxin subunits chitin-
uncharacterized
Theau2p4 lignocellulolysis
unknown lignocellulose-induced 130 330 530 _000077 -enhancing
protein
Theau2p4 15-hydroxyprostaglandin
unknown dehydrogenase 131 331 531 _001886 dehydrogenase [NAD(+)]
uncharacterized
Theau2p4 lignocellulolysis
unknown lignocellulose-induced 132 332 532 _002223 -enhancing
protein
Theau2p4
unknown unknown unknown 133 333 533 _004121
Theau2p4 Bifunctional solanapyrone FAD-dependent
unknown AA7 134 334 534 _005752 synthase oxidoreductase
alpha-N-
Theau2p4 Uncharacterized oligosaccharid
acetylgalactosaminida GH109 135 335 535 _006038 oxidoreductase C513.06c e-modifying
se
uncharacterized
Theau2p4 lignocellulolysis
unknown lignocellulose-induced 136 336 536 _006946 -enhancing
protein
Theau2p4
possible adhesin cell adhesion adhesin 137 337 537 _007415
uncharacterized
Theau2p4 lignocellulolysis
unknown lignocellulose-induced 138 338 538 _008246 -enhancing
protein
uncharacterized
Theau2p4 lignocellulolysis
unknown lignocellulose-induced 139 339 539 _009600 -enhancing
protein
Theau2p4 Dehydrogenase/reductase
unknown dehydrogenase 140 340 540 _010333 SDR family member 7B
THEAU 1 galactan- beta-galactosidase GH35 beta-galactosidase GH35 141 341 541
00006 degrading
Table 1B. Biomass degrading genes and polypeptides of Myceliophthora fergusii (Corynascus thermophilus)
Gene ID in provisional application No. 61/714,485 | Target ID | Annotation in provisional application No. 61/714,485 | Updated annotation | Function | Protein activity | CAZy family | CBM | Provisional application SEQ ID NO: (genomic, coding, amino acid) | PCT application SEQ ID NO: (genomic, coding, amino acid)
Corth2p4 0
Lipase 1 unknown unknown unknown 1 2 3 601 890 1179 00016
Corth2p4 0 CORTH 1 cellulase-enhancing polysaccharide cellulose- polysaccharide
AA9 4 5 6 602 891 1180 00017 02562 protein monooxygenase degrading monooxygenase
Corth2p4 0 Stress response protein lipid-
Lipase 4 lipase 7 8 9 603 892 1181 00019 ishl degrading
Corth2p4 0
Lipase 1 unknown unknown unknown 10 11 12 604 893 1182 00067
Corth2p4 0
Lipase 4 unknown unknown unknown 13 14 15 605 894 1183 00073
arabinoxylan
Corth2p4 0 Corth2p4 0 alpha- hemicellulose CBM
arabinofuranosidase arabinofuranosidase GH62 16 17 18 606 895 1184 00169 00169 arabinofuranosidase -degrading 1
GH62
Corth2p4 0 Corth2p4 0 glucan-
Beta-glucanase Beta-glucanase cellulase GH16 19 20 21 607 896 1185 00317 00317 degrading
Corth2p4 0 Corth2p4 0 pectin methylesterase pectin-
Pectinesterase pectinesterase CE8 22 23 24 608 897 1186 00319 00319 CE8 degrading
Corth2p4 0 CORTH 1 cellulose- cellobiohydrolase cellobiohydrolase GH7 cellobiohydrolase GH7 25 26 27 609 898 1187 00449 01922 degrading
Corth2p4 0 Corth2p4 0 chitin- hexosaminidase hexosaminidase GH20 hexosaminidase GH20 28 29 30 610 899 1188 00539 00539 degrading
Corth2p4 0 Corth2p4 0 Probable Probable glycosidase glucan- glycosidase GH16 31 32 33 611 900 1189 00543 00543 glycosidase CRH1 CRH1 degrading
Corth2p4 0 Corth2p4 0 Acetylxylan Acetylxylan esterase 2 hemicellulose
acetylxylan esterase CE5 34 35 36 612 901 1190 00894 00894 esterase 2 CE5 -degrading
Corth2p4 0 Corth2p4 0 cellulose- beta-glucosidase beta-glucosidase GH3 beta-glucosidase GH3 37 38 39 613 902 1191 00923 00923 degrading
Corth2p4 Corth2p4 exo-1,3-beta- galactan- exo-1,3-beta- CBM
unknown GH43 40 41 42 614 903 1192 000941 000941 galactanase GH43 degrading galactanase 35
Corth2p4 0 Corth2p4 0 Probable beta- Beta-galactosidase galactan- beta-galactosidase GH35 121 122 123 641 930 1219 02765 02765 galactosidase B GH35 degrading
Corth2p4 0 CORTH 1 cellulase-enhancing polysaccharide cellulose- polysaccharide
AA9 124 125 126 642 931 1220 02774 02029 protein monooxygenase degrading monooxygenase
Corth2p4 0 Homoaconitase,
unknown unknown homoaconitase 127 128 129 643 932 1221 02835 mitochondrial
Corth2p4 0
unknown unknown GT31 unknown unknown GT31 130 131 132 644 933 1222 02845
Corth2p4 0 Corth2p4 0 cellulase-enhancing Polysaccharide cellulose- polysaccharide CBM
AA9 133 134 135 645 934 1223 02847 02847 protein monooxygenase GH61 degrading monooxygenase 1
Corth2p4 0 Adrenodoxin homolog,
unknown unknown ferredoxin 136 137 138 646 935 1224 02850 mitochondrial
Corth2p4 0 protein 50s ribosomal
unknown 50S ribosomal protein L1 139 140 141 647 936 1225 02856 synthesis protein 11
Corth2p4 0
unknown unknown unknown unknown 142 143 144 648 937 1226 02858
Corth2p4 0 Corth2p4 0 cellulose-
endoglucanase endoglucanase GH5 cellulase GH5 145 146 147 649 938 1227 02886 02886 degrading
Corth2p4 0 CORTH 1 cellulase-enhancing polysaccharide cellulose- polysaccharide
AA9 148 149 150 650 939 1228 02887 02430 protein monooxygenase degrading monooxygenase
Corth2p4 0 Corth2p4 0 cellulase-enhancing polysaccharide cellulose- polysaccharide CBM
AA9 151 152 153 651 940 1229 02961 02961 protein monooxygenase degrading monooxygenase 1
Corth2p4 0 Corth2p4 0 Manganese lignin- manganese
Manganese peroxidase 3 AA2 154 155 156 652 941 1230 03055 03055 peroxidase 3 degrading peroxidase
Putative
Corth2p4 0 CORTH 1 rhamnogalacturonan pectin- rhamnogalacturona
rhamnogalacturona PL4 157 158 159 653 942 1231 03209 00099 lyase PL4 degrading n lyase
se
Corth2p4 0 Corth2p4 0 cellulose- cellobiohydrolase cellobiohydrolase GH6 cellobiohydrolase GH6 160 161 162 654 943 1232 03296 03296 degrading
Corth2p4 0 Corth2p4 0 Cellobiose Cellobiose lignocellulose- cellobiose
AA8 163 164 165 655 944 1233 03311 03311 dehydrogenase dehydrogenase degrading dehydrogenase
Corth2p4 0 6-hydroxy-D-nicotine
unknown unknown oxidase AA7 241 242 243 681 970 1259 04480 oxidase
Corth2p4 0 Corth2p4 0 galactan-
Beta-galactosidase Beta-galactosidase beta-galactosidase GH2 244 245 246 682 971 1260 04560 04560 degrading
Corth2p4 0 CORTH 1 cellulase-enhancing polysaccharide cellulose- polysaccharide CBM
AA9 247 248 249 683 972 1261 04562 02324 protein monooxygenase degrading monooxygenase 1
Corth2p4 Corth2p4 Acetylxylan acetylxylan esterase hemicellulos acetylxylan CBM
CE5 250 251 252 684 973 1262 004688 004688 esterase CE5 e-degrading esterase 1
Corth2p4 0 Corth2p4 0 hemicellulose
beta-mannanase Beta-mannanase GH5 beta-mannanase GH5 253 254 255 685 974 1263 04710 04710 -degrading
uncharacterized
Corth2p4 0 Corth2p4 0 glucan- GH13 CBM
unknown lignocellulose-induced beta-glucanase 256 257 258 686 975 1264 04769 04769 degrading 1 1
protein GH131
Corth2p4 0 Corth2p4 0 hemicellulose
Feruloyl esterase B Feruloyl esterase B feruloyl esterase CE1 259 260 261 687 976 1265 04823 04823 -modifying
uncharacterized
Corth2p4 0 Corth2p4 0 lignocellulolys
unknown unknown lignocellulose- 262 263 264 688 977 1266 04956 04956 is-enhancing
induced protein
Corth2p4 0 UPF0357 protein
unknown unknown unknown 265 266 267 689 978 1267 05030 YCL012C
Corth2p4 0
unknown unknown unknown unknown 268 269 270 690 979 1268 05160
Corth2p4 0 Corth2p4 0 Killer toxin subunits chitin- CBM
chitinase chitinase GH18 271 272 273 691 980 1269 05265 05265 alpha/beta degrading 18
uncharacterized
Corth2p4 0 Corth2p4 0 lignocellulolys
unknown unknown lignocellulose- 274 275 276 692 981 1270 05268 05268 is-enhancing
induced protein
Corth2p4 0 Corth2p4 0 sugar-
Aldose 1-epimerase Aldose 1-epimerase aldose epimerase 277 278 279 693 982 1271 05378 05378 modifying
arabinoxylan
Corth2p4 0 Corth2p4 0 Putative beta- hemicellulose
arabinofuranohydrolase arabinofuranosidase GH43 280 281 282 694 983 1272 05438 05438 xylosidase -degrading
GH43
Corth2p4 0 Corth2p4 0 Venom carbohydrate-
Carboxylesterase carboxylesterase CE10 283 284 285 695 984 1273 05522 05522 carboxylesterase-6 modifying
NAD dependent Putative uncharacterized
Corth2p4 0 Corth2p4 0
epimerase/dehydrat oxidoreductase unknown oxidoreductase 286 287 288 696 985 1274 05615 05615
ase YDR541 C
Corth2p4 0 CORTH 1 hemicellulose
xylanase xylanase GH11 xylanase GH11 289 290 291 697 986 1275 05803 02668 -degrading
Corth2p4 0 CORTH 1 cellulase-enhancing polysaccharide cellulose- polysaccharide
AA9 292 293 294 698 987 1276 06047 00634 protein monooxygenase degrading monooxygenase
Corth2p4 0 Corth2p4 0 sugar-
Aldose 1-epimerase Aldose 1-epimerase aldose epimerase 295 296 297 699 988 1277 06086 06086 modifying
Corth2p4_0 Corth2p4_0 Acetylxylan hemicellulose
acetylxylan esterase CE1 acetylxylan esterase CE1 298 299 300 700 989 1278 06093 06093 esterase A -degrading
Corth2p4 0 Corth2p4 0 cellulose- CBM
cellobiohydrolase cellobiohydrolase GH7 cellobiohydrolase GH7 301 302 303 701 990 1279 06231 06231 degrading 1
Corth2p4 0 Corth2p4 0 cellulase-enhancing polysaccharide cellulose- polysaccharide
AA9 304 305 306 702 991 1280 06280 06280 protein monooxygenase degrading monooxygenase
Corth2p4 0 Corth2p4 0 Sim1 adhesin-like GH13
adhesin GH132 cell adhesion adhesin 307 308 309 703 992 1281 06392 06392 protein 2
Corth2p4 0 Corth2p4 0 arabinogalactanase galactan- arabinogalactanase arabinogalactanase GH53 310 311 312 704 993 1282 06416 06416 GH53 degrading
Corth2p4 0 CORTH 1 cellulose- CBM
cellobiohydrolase cellobiohydrolase GH6 cellobiohydrolase GH6 313 314 315 705 994 1283 06508 02586 degrading 1
Corth2p4 0 Corth2p4 0 Expansin-like cellulase- unknown expansin 316 317 318 706 995 1284 06585 06585 protein enhancing
Corth2p4 0 Corth2p4 0 carbohydrate- carbohydrate
unknown unknown CE16 CE16 319 320 321 707 996 1285 06624 06624 modifying esterase
Bifunctional
Corth2p4 0 Corth2p4 0
xylanase/deacetylas unknown unknown unknown 361 362 363 721 1010 1299 07216 07216
e
Cytochrome c oxidase
Corth2p4 0 Corth2p4 0 Chitooligosaccharid cytochrome c
assembly protein cox-16, 364 365 366 722 1011 1300 07256 07256 e deacetylase oxidase
mitochondrial
Corth2p4 0 Corth2p4 0
Chitin deacetylase 1 unknown unknown unknown 367 368 369 723 1012 1301 07314 07314
Probable endo-
Corth2p4 0 Corth2p4 0 1,3(4)-beta- unknown unknown 370 371 372 724 1013 1302 07317 07317 glucanase
AO090023000083
Corth2p4 0 Corth2p4 0 Probable Sporulation-specific sporulation-specific
unknown 373 374 375 725 1014 1303 07324 07324 glycosidase CRH1 protein 2 protein 2
Corth2p4 0 Corth2p4 0 exo- DnaJ-related protein
dnaj-related protein 376 377 378 726 1015 1304 07336 07336 glucosaminidase SCJ1
Corth2p4 0 Corth2p4 0 Probable alpha-
unknown unknown unknown 379 380 381 727 1016 1305 07337 07337 galactosidase B
Low molecular weight
Corth2p4 0 Corth2p4 0 dephosphoryl
alpha-galactosidase phosphotyrosine protein phosphatase 382 383 384 728 1017 1306 07352 07352 ation
phosphatase
Corth2p4_0 Corth2p4_0 Uncharacterized protein
galactanase unknown unknown 385 386 387 729 1018 1307 07363 07363 YBR096W
uncharacterized
Corth2p4_0 Corth2p4_0 lignocellulolys
unknown unknown lignocellulose- 388 389 390 730 1019 1308 07365 07365 is-enhancing
induced protein
uncharacterized
Corth2p4 0 Corth2p4 0 lignocellulolys
unknown unknown lignocellulose- 391 392 393 731 1020 1309 07371 07371 is-enhancing
induced protein
Corth2p4 0 CORTH 1 hemicellulose
Feruloyl esterase B feruloyl esterase CE1 feruloyl esterase CE1 394 395 396 732 1021 1310 07378 01019 -modifying
Corth2p4 0 Corth2p4 0 Glucan 1,3-beta- unknown unknown unknown 397 398 399 733 1022 1311 07382 07382 glucosidase
Corth2p4 0 Corth2p4 0
laminarinase unknown unknown unknown 400 401 402 734 1023 1312 07391 07391
Corth2p4 0 Corth2p4 0 hemicellulose CBM
xylanase xylanase GH11 xylanase GH11 403 404 405 735 1024 1313 07404 07404 -degrading 1
Corth2p4 0 Corth2p4 0 cellulose- cellobiohydrolase endoglucanase GH6 cellulase GH6 406 407 408 736 1025 1314 07434 07434 degrading
Corth2p4 0 Corth2p4 0
laminarinase unknown unknown unknown 409 410 411 737 1026 1315 07436 07436
Corth2p4 0 Corth2p4 0
laminarinase unknown unknown unknown 412 413 414 738 1027 1316 07450 07450
Corth2p4 0 Corth2p4 0 Snake venom 5'- nucleic acid
beta-glucuronidase nucleotidase 415 416 417 739 1028 1317 07458 07458 nucleotidase degradation
Corth2p4 0 Corth2p4 0 Alpha-L-fucosidase
unknown unknown unknown 418 419 420 740 1029 1318 07464 07464 2
Corth2p4 0 Corth2p4 0 dephosphoryl
Acid phosphatase Acid phosphatase acid phosphatase 421 422 423 741 1030 1319 07465 07465 ation
Corth2p4 0 Corth2p4 0
Aldose 1-epimerase unknown unknown unknown 424 425 426 742 1031 1320 07466 07466
Corth2p4 0 Corth2p4 0
Aldose 1-epimerase unknown unknown unknown 427 428 429 743 1032 1321 07477 07477
uncharacterized
Corth2p4 0 Corth2p4 0 lignocellulolys
unknown unknown lignocellulose- 430 431 432 744 1033 1322 07526 07526 is-enhancing
induced protein
arabinoxylan
Corth2p4 Corth2p4 Putative beta- hemicellulos arabino- arabinofurano- GH43 433 434 435 745 1034 1323 007532 007532 xylosidase e-degrading furanosidase hydrolase GH43
Corth2p4 0 Corth2p4 0 Probable mitochondrial lipid-
Lipase 2 lipase 436 437 438 746 1035 1324 07540 07540 chaperone BCS1-B degrading
Corth2p4 0 Corth2p4 0
Vacuolar protease A unknown unknown unknown 508 509 510 770 1059 1348 07778 07778
Corth2p4 0 Corth2p4 0 Embryonic protein
Bubble protein protease 511 512 513 771 1060 1349 07779 07779 pepsinogen hydrolysis
Corth2p4 0 Corth2p4 0 Subtilisin-I ike protein
Pisatin demettiylase protease 514 515 516 772 1061 1350 07784 07784 protease 6 hydrolysis
Corth2p4 0 Corth2p4 0 Serine-type tRNA-specific adenosine protein
protease 517 518 519 773 1062 1351 07794 07794 carboxypeptidase F deaminase 2 hydrolysis
Metallocarboxypepti
Corth2p4 0 Corth2p4 0 tRNA wybutosine- protein
dase A-like protein protease 520 521 522 774 1063 1352 07810 07810 synthesizing protein 2 hydrolysis
MCYG_01475
Corth2p4 0 Corth2p4 0 Tripeptidyl- Uncharacterized calcium-
unknown unknown 523 524 525 775 1064 1353 07811 07811 peptidase sed2 binding protein C613.03
Corth2p4 0 Corth2p4 0 Phosphate-repressible protein
Aspergillopepsin-2 protease 526 527 528 776 1065 1354 07816 07816 acid phosphatase hydrolysis
Corth2p4 0 Corth2p4 0 Leucine protein
Pirin protease 529 530 531 777 1066 1355 07818 07818 aminopeptidase 1 hydrolysis
Corth2p4 0 Corth2p4 0 Putative serine Box C/D snoRNA protein protein
protease 532 533 534 778 1067 1356 07820 07820 protease K12H4.7 1 hydrolysis
Corth2p4 0 Corth2p4 0 Cuticle-degrading
unknown unknown unknown 535 536 537 779 1068 1357 07872 07872 protease
Corth2p4 0 Corth2p4 0 Minor extracellular
unknown unknown unknown 538 539 540 780 1069 1358 07887 07887 protease vpr
Corth2p4 0 Corth2p4 0 Metallocarboxypepti DNA-3-methyladenine protein
protease 541 542 543 781 1070 1359 07910 07910 dase A glycosylase hydrolysis
Corth2p4 0 Corth2p4 0 protein
Podospora pepsin Prohibitin-1 protease 544 545 546 782 1071 1360 07920 07920 hydrolysis
Corth2p4 0 Corth2p4 0 Subtilisin-like
unknown unknown unknown 547 548 549 783 1072 1361 07927 07927 proteinase Spm1
Corth2p4 0 Corth2p4 0 Putative serine
unknown unknown unknown 550 551 552 784 1073 1362 07995 07995 protease F56F10.1
Corth2p4 0 Corth2p4 0 probable leucine
unknown AA9 unknown unknown AA9 553 554 555 785 1074 1363 08040 08040 aminopeptidase 2
Corth2p4 0 Corth2p4 0 Probable leucine protein
Alkaline phosphatase H protease 556 557 558 786 1075 1364 08044 08044 aminopeptidase 2 hydrolysis
Corth2p4 0 Corth2p4 0 protein
Candidapepsin-3 unknown protease 559 560 561 787 1076 1365 08057 08057 hydrolysis
Gamma-
Corth2p4 0 Corth2p4 0 Protein transport protein protein
glutamyltranspeptid protease 562 563 564 788 1077 1366 08065 08065 sec39 hydrolysis
ase 2
Corth2p4 0 Corth2p4 0
Endothiapepsin unknown unknown unknown 565 566 567 789 1078 1367 08077 08077
Probable aspartic-
Corth2p4 0 Corth2p4 0
type endopeptidase unknown unknown unknown 568 569 570 790 1079 1368 08080 08080
OPSB
Extracellular Probable
Corth2p4 0 Corth2p4 0 protein
metalloprotease mannosyltransferase protease GT15 571 572 573 791 1080 1369 08085 08085 hydrolysis
Pa_2_14170 KTR4
Corth2p4 0 Corth2p4 0 Cuticle-degrading
unknown unknown unknown 574 575 576 792 1081 1370 08088 08088 protease
Corth2p4 0 Corth2p4 0 Putative dipeptidase
unknown unknown unknown 577 578 579 793 1082 1371 08104 08104 CPC735_014430
Corth2p4 0 Corth2p4 0 Putative fungistatic protein
Aminopeptidase Y protease AA5 580 581 582 794 1083 1372 08111 08111 metabolite hydrolysis
Corth2p4 0 Corth2p4 0 Sim1 adhesin-like GH13
adhesin GH132 unknown adhesin 583 584 585 795 1084 1373 08180 08180 protein 2
Corth2p4 0 Corth2p4 0
Tyrosinase unknown unknown unknown 586 587 588 796 1085 1374 08209 08209
Corth2p4 0 Corth2p4 0
Tyrosinase unknown unknown unknown 589 590 591 797 1086 1375 08225 08225
Corth2p4 0 Corth2p4 0 pectin-
Pectin lyase B pectin lyase PL1 pectin lyase PL1 592 593 594 798 1087 1376 08348 08348 degrading
CORTH 1 Polysaccharide cellulose- polysaccharide
AA9 811 1100 1389
03053 monooxygenase GH61 degrading monooxygenase
CORTH 1 hemicellulose
Xylanase GH11 xylanase GH11 812 1101 1390 03145 -degrading
Corth2p4 0 carbohydrate- carbohydrate
unknown CE3 CE3 813 1102 1391 00138 modifying esterase
Corth2p4 0 Leucine aminopeptidase protein
protease 814 1103 1392 00172 1 hydrolysis
Corth2p4 0 Putative fungistatic
unknown glyoxal oxidase AA5 815 1104 1393 00477 metabolite
Corth2p4 0 Probable triacylglycerol lipid- lipase 816 1105 1394 00527 lipase C1450.16c degrading
Corth2p4 0 Uncharacterized
unknown oxidoreductase 817 1106 1395 00837 oxidoreductase YusZ
Corth2p4 0 Uncharacterized
unknown oxidoreductase 818 1107 1396 00838 oxidoreductase DUE
Corth2p4 0 beta-glucuronidase hemicellulose
beta-glucuronidase GH79 819 1108 1397 00858 GH79 -degrading
Corth2p4 0
Choline dehydrogenase unknown dehydrogenase AA3 820 1109 1398 00887
Corth2p4 0
unknown AA7 unknown unknown AA7 821 1110 1399 00911
Corth2p4 0
Choline dehydrogenase unknown dehydrogenase AA3 822 1111 1400 01004
Corth2p4 0 Glucan 1,3-beta- glucan- glucan 1,3-beta-
GH55 823 1112 1401 01018 glucosidase degrading glucosidase
Corth2p4 0 protein
Dipeptidyl peptidase 4 protease CE10 824 1113 1402 01022 hydrolysis
Corth2p4 0 Carboxypeptidase Y protein
protease 825 1114 1403 01303 homolog A hydrolysis
Corth2p4 0 chitin- CBM
Chitin deacetylase 1 chitin deacetylase CE4 826 1115 1404 01516 degrading 18
Corth2p4 0
unknown GH16 unknown unknown GH16 827 1116 1405 01647
Corth2p4 0 Putative fungistatic
unknown unknown AA5 AA5 828 1117 1406 01670 metabolite
Corth2p4 0 Subtilisin-like protease protein
protease 829 1118 1407 01758 CPC735_066880 hydrolysis
Probable isoaspartyl
Corth2p4 0 protein
peptidase/L- protease 830 1119 1408 01806 hydrolysis
asparaginase 3
Corth2p4 0 protein
Vacuolar protease A protease 831 1120 1409 02063 hydrolysis
Corth2p4 0 protein
Embryonic pepsinogen protease 832 1121 1410 02115 hydrolysis
Corth2p4 0 protein
Subtilisin-like protease 6 protease 833 1122 1411 02197 hydrolysis
Putative sterigmatocystin
Corth2p4 0 lignin- biosynthesis peroxidase peroxidase 834 1123 1412 02351 degrading
stcC
Corth2p4 0 exo-glucosaminidase chitin- exo-
GH2 835 1124 1413 02355 GH2 degrading glucosaminidase
Corth2p4 0
Retinol dehydrogenase 8 unknown dehydrogenase 836 1125 1414 02374
Corth2p4 0 Uncharacterized
unknown oxidoreductase 837 1126 1415 02375 oxidoreductase DUE
Corth2p4 0 L-sorbose 1- unknown dehydrogenase AA3 838 1127 1416 02376 dehydrogenase
Corth2p4 0 glucan-
Laminarinase GH55 laminarinase GH55 839 1128 1417 02387 degrading
Corth2p4 0 Serine-type protein
protease 840 1129 1418 02514 carboxypeptidase F hydrolysis
Corth2p4 0 6-hydroxy-D-nicotine
unknown oxidase AA7 841 1130 1419 02676 oxidase
Corth2p4 0 hemicellulose
galactanase GH5 galactanase GH30 842 1131 1420 02800 -degrading
Metallocarboxypeptidase
Corth2p4 0 protein
A-like protein protease 843 1132 1421 02811 hydrolysis
MCYG_01475
Corth2p4 0
unknown AA7 unknown oxidase AA7 844 1133 1422 02819
Corth2p4 0 Tripeptidyl-peptidase protein
protease 845 1134 1423 02880 sed2 hydrolysis
Corth2p4 0 Copper-containing nitrite copper-containing
unknown AA1 846 1135 1424 02924 reductase nitrite reductase
Corth2p4 0 exo-1,3-beta-glucanase cellulose- exo-1,3-beta-
GH55 847 1136 1425 03097 GH55 degrading glucanase
Corth2p4 0
L-ascorbate oxidase unknown oxidase AA1 848 1137 1426 03178
alpha-
Corth2p4 0 alpha-galactosidase
galactoside- alpha-galactosidase GH27 849 1138 1427 03187 GH27
degrading
Corth2p4 0 protein
Aspergillopepsin-2 protease 850 1139 1428 03191 hydrolysis
Corth2p4 0 Leucine aminopeptidase protein
protease 851 1140 1429 03207 1 hydrolysis
Corth2p4 0 lipid-
Lipase 2 lipase 852 1141 1430 03244 degrading
Corth2p4 0 Putative serine protease protein
protease 853 1142 1431 03359 K12H4.7 hydrolysis
Corth2p4 0 Cuticle-degrading protein
protease 854 1143 1432 03416 protease hydrolysis
Corth2p4 0 chitin- chitin deacetylase CE4 chitin deacetylase CE4 855 1144 1433 03463 degrading
Corth2p4 0 Minor extracellular protein
protease 856 1145 1434 03491 protease vpr hydrolysis
Corth2p4 0 Metallocarboxypeptidase protein
protease 857 1146 1435 03507 A hydrolysis
Corth2p4 0 lipid-
Lysophospholipase lipase 858 1147 1436 03877 degrading
Corth2p4 0 protein
Podospora pepsin protease 859 1148 1437 04001 hydrolysis
Corth2p4 0 Putative oxidoreductase
unknown oxidoreductase 860 1149 1438 04341 C1F5.03c
Corth2p4 0 Subtilisin-like proteinase protein
protease 861 1150 1439 04483 Spm1 hydrolysis
Corth2p4 0 Putative serine protease protein
protease 862 1151 1440 04551 F56F10.1 hydrolysis
Corth2p4 0 Probable serine-O- serine-o-
CE1 863 1152 1441 04617 acetyltransferase cys2 acetyltransferase
Corth2p4 0 probable leucine protein
protease 864 1153 1442 04685 aminopeptidase 2 hydrolysis
alpha-
Corth2p4 0 alpha-galactosidase
galactoside- alpha-galactosidase GH27 865 1154 1443 04746 GH27
degrading
Corth2p4 0 Probable leucine protein
protease 866 1155 1444 04754 aminopeptidase 2 hydrolysis
Corth2p4 0 chitin- chitin deacetylase CE4 chitin deacetylase CE4 867 1156 1445 04949 degrading
Corth2p4 0 protein
Candidapepsin-3 protease 868 1157 1446 04962 hydrolysis
Corth2p4 0 Putative dipeptidase protein
protease 883 1172 1461 08031 CPC735_014430 hydrolysis
Corth2p4 0 lipid-
Lipase lipase 884 1173 1462 08108 degrading
Corth2p4 0 Tetrahydrocannabinolic
unknown oxidase AA7 885 1174 1463 08349 acid synthase
Corth2p4 0 lipid-
Lipase 4 lipase CE10 886 1175 1464 08439 degrading
Corth2p4 0 protein
Aminopeptidase Y protease 887 1176 1465 08463 hydrolysis
Putative
Corth2p4 0 protein
metallocarboxypeptidase protease 888 1177 1466 08481 hydrolysis
ecm14
Corth2p4 0 carbohydrate-
Alpha-L-fucosidase 2 alpha-l-fucosidase GH95 889 1178 1467 08774 modifying
Table 1C. Biomass degrading genes and polypeptides of Pseudocercosporella herpotrichoides
Gene ID in prov. appn. 61/714,493 | Target ID | Annotation in Provisional application Nos. 61/714,493 | Updated annotation | Function | Protein activity | CAZy family | CBM of interest | Provisional application SEQ ID NO: (Genomic, Coding, Amino acid) | PCT application SEQ ID NO: (Genomic, Coding, Amino acid)
Psehe2p4 Carboxypeptidase protein
Carboxypeptidase A4 protease 1 2 3 1468 1992 2516 000017 A4 hydrolysis
Psehe2p4 PSEHE 1 avenacin- avenacinase avenacinase GH3 avenacinase GH3 4 5 6 1469 1993 2517 000071 _00155 degrading
Psehe2p4 Tripeptidyl- Tripeptidyl-peptidase protein
protease 7 8 9 1470 1994 2518 000072 peptidase sed1 sed1 hydrolysis
Psehe2p4 Psehe2p4 cellulase- possible expansin possible expansin expansin 10 11 12 1471 1995 2519 000075 _000075 enhancing
Extracellular Extracellular
Psehe2p4 protein
metalloproteinase metalloproteinase protease 13 14 15 1472 1996 2520 000102 hydrolysis
MEP MEP
Psehe2p4 Psehe2p4 Cellobiose Cellobiose lignocellulose- cellobiose
AA8 16 17 18 1473 1997 2521 000189 _000189 dehydrogenase dehydrogenase degrading dehydrogenase
Probable zinc Probable zinc
Psehe2p4 protein
metalloprotease metalloprotease protease 19 20 21 1474 1998 2522 000259 hydrolysis
MGG_02107 MGG_02107
Psehe2p4_
adhesin possible adhesin cell adhesion adhesin 22 23 24 1475 1999 2523 000334
Glucan endo-1,3-
Psehe2p4 PSEHE 1 Glucan endo-1,3-beta- glucan- glucan endo-1,3- beta-glucosidase GH16 25 26 27 1476 2000 2524 000496 _00051 glucosidase A1 degrading beta-glucosidase
A1
Psehe2p4 Probable serine Probable serine protein
protease 28 29 30 1477 2001 2525 000553 protease EDA2 protease EDA2 hydrolysis
Psehe2p4 Psehe2p4 pigment-
Tyrosinase Tyrosinase tyrosinase 31 32 33 1478 2002 2526 000555 _000555 producing
O-
O-
Psehe2p4_ methylsterigmatoc
methylsterigmatocystin unknown oxidoreductase 34 35 36 1479 2003 2527 000672 ystin
oxidoreductase
oxidoreductase
Psehe2p4 Psehe2p4 pectin pectin methylesterase
pectin-degrading pectinesterase CE8 37 38 39 1480 2004 2528 000753 _000753 methylesterase CE8
Psehe2p4 Psehe2p4 Probable pectate lyase
pectate lyase pectin-degrading pectate lyase PL1 40 41 42 1481 2005 2529 000830 _000830 B
Psehe2p4 Carboxypeptidase S1 Carboxypeptidase S1 protein
Psehe2p4
unknown PL20 unknown PL20 pectin-degrading glucuronan lyase PL20 130 131 132 1511 2035 2559 001930
Putative Putative
Psehe2p4 hemicellulose- carbohydrate
endoglucanase X endoglucanase X CE3 133 134 135 1512 2036 2560 002052 degrading esterase
(Fragment) (Fragment)
Psehe2p4 PSEHE 1 Beta-galactosidase galactan- beta-galactosidase beta-galactosidase GH35 136 137 138 1513 2037 2561 002084 _00215 GH35 degrading
Psehe2p4 hemicellulose-
feruloyl esterase feruloyl esterase CE1 feruloyl esterase 139 140 141 1514 2038 2562 002120 modifying
Psehe2p4 Leucine Leucine protein
protease 142 143 144 1515 2039 2563 002206 aminopeptidase 2 aminopeptidase 2 hydrolysis
Psehe2p4 PSEHE 1 Probable hemicellulose- unknown GH43 unknown GH43 145 146 147 1516 2040 2564 002217 _00213 arabinase degrading
Polysaccharide
Psehe2p4 PSEHE 1 polysaccharide cellulose- Polysaccharide CBM
monooxygenase AA9 148 149 150 1517 2041 2565 002235 _00277 monooxygenase degrading monooxygenase 1
GH61
Probable endo-
Probable endo-1,3(4)-
Psehe2p4 1,3(4)-beta- glucan- endo-1,3(4)-beta- beta-glucanase GH16 151 152 153 1518 2042 2566 002237 glucanase degrading glucanase
AFUA_2G14360
NFIA_089530
Extracellular Extracellular
Psehe2p4 protein
metalloprotease metalloprotease protease 154 155 156 1519 2043 2567 002240 hydrolysis
GLRG_06286 GLRG_06286
Psehe2p4 Psehe2p4 acetylxylan hemicellulose- feruloyl esterase CE1 feruloyl esterase CE1 157 158 159 1520 2044 2568 002244 _002244 esterase modifying
Psehe2p4 Psehe2p4 acetylxylan acetylxylan esterase hemicellulose- acetylxylan
CE1 160 161 162 1521 2045 2569 002248 _002248 esterase CE1 degrading esterase
Disintegrin and Disintegrin and
Psehe2p4 metalloproteinase metalloproteinase protein
protease 163 164 165 1522 2046 2570 002301 domain-containing domain-containing hydrolysis
protein B protein B
Psehe2p4 PSEHE 1 hemicellulose- beta-xylosidase Beta-xylosidase GH3 beta-xylosidase GH3 166 167 168 1523 2047 2571 002328 _00158 degrading
Psehe2p4 Cuticle-degrading Cuticle-degrading protein
protease 169 170 171 1524 2048 2572 002369 protease protease hydrolysis
carbohydrate-
Psehe2p4 Psehe2p4 possible carbohydrate- carbohydrate- GH11 CBM
endoglucanase binding 172 173 174 1525 2049 2573 002396 _002396 binding cytochrome oxidizing 4 1
cytochrome
Psehe2p4
Galactose oxidase Galactose oxidase sugar-modifying galactose oxidase AA5 175 176 177 1526 2050 2574 002405
Psehe2p4 Psehe2p4 Aldose 1-
Aldose 1-epimerase sugar-modifying aldose epimerase 178 179 180 1527 2051 2575 002493 _002493 epimerase
Psehe2p4 Psehe2p4 avenacin- avenacinase avenacinase GH3 avenacinase GH3 181 182 183 1528 2052 2576 002836 _002836 degrading
Psehe2p4
adhesin possible adhesin cell adhesion adhesin 184 185 186 1529 2053 2577 002845
Psehe2p4 PSEHE 1 cellulose- endoglucanase endoglucanase GH7 cellulase GH7 187 188 189 1530 2054 2578 002957 _00302 degrading
uncharacterized
Psehe2p4 Psehe2p4 lignocellulolysis- unknown unknown lignocellulose- 190 191 192 1531 2055 2579 002998 _002998 enhancing
induced protein
FAD-linked FAD-linked
Psehe2p4_
oxidoreductase oxidoreductase unknown oxidoreductase AA7 193 194 195 1532 2056 2580 003002
DDB_G0289697 DDB_G0289697
Psehe2p4 protein
Polyporopepsin Polyporopepsin protease 196 197 198 1533 2057 2581 003003 hydrolysis
Psehe2p4 Carboxypeptidase protein
Carboxypeptidase Y protease 199 200 201 1534 2058 2582 003054 Y hydrolysis
Psehe2p4 Psehe2p4 cellulase- possible expansin possible expansin expansin 202 203 204 1535 2059 2583 003148 _003148 enhancing
Psehe2p4 PSEHE 1 Probable chitinase CBM
Chitinase GH18 chitin-degrading chitinase GH18 205 206 207 1536 2060 2584 003282 00103 2 18
Psehe2p4 Psehe2p4
unknown unknown unknown unknown 208 209 210 1537 2061 2585 003292 _003292
Psehe2p4 Psehe2p4 Xylosidase/arabino Xylosidase/arabinosid hemicellulose- xylosidase GH43 211 212 213 1538 2062 2586 003293 _003293 sidase ase degrading
Probable glucan
Psehe2p4 Probable glucan 1,3- glucan- glucan 1,3-beta-
1,3-beta- GH5 214 215 216 1539 2063 2587 003294 beta-glucosidase A degrading glucosidase
glucosidase A
Psehe2p4 Glucan 1,3-beta- Glucan 1,3-beta- glucan- glucan 1,3-beta-
GH55 217 218 219 1540 2064 2588 003310 glucosidase glucosidase degrading glucosidase
Psehe2p4 Probable leucine Probable leucine protein
protease 220 221 222 1541 2065 2589 003317 aminopeptidase 2 aminopeptidase 2 hydrolysis
Gamma- Gamma-
Psehe2p4 protein
glutamyltranspepti glutamyltranspeptidas protease 223 224 225 1542 2066 2590 003333 hydrolysis
dase 2 e 2
Psehe2p4_ Psehe2p4 cellulose- endoglucanase endoglucanase GH5 cellulase GH5 226 227 228 1543 2067 2591 003364 _003364 degrading
Psehe2p4 Lysophospholipas
Lysophospholipase 1 lipid-degrading lipase 229 230 231 1544 2068 2592 003365 e 1
Psehe2p4
Lipase B Lipase B lipid-degrading lipase 232 233 234 1545 2069 2593 003395
Psehe2p4 PSEHE 1 hemicellulose- xylanase xylanase GH11 xylanase GH11 235 236 237 1546 2070 2594 003402 _00013 degrading
Psehe2p4 Psehe2p4
Laccase-1 Laccase-1 lignin-degrading laccase AA1 238 239 240 1547 2071 2595 003411 _003411
Psehe2p4 Psehe2p4 cellulase- possible expansin possible expansin expansin 241 242 243 1548 2072 2596 003431 _003431 enhancing
Psehe2p4 Vacuolar protease protein
Vacuolar protease A protease 244 245 246 1549 2073 2597 003565 A hydrolysis
Uncharacterized
Psehe2p4 Psehe2p4
RING finger unknown GH93 unknown unknown GH93 247 248 249 1550 2074 2598 003598 _003598
protein C328.02
uncharacterized
Psehe2p4 PSEHE 1 lignocellulolysis- unknown unknown lignocellulose- 367 368 369 1590 2114 2638 005567 J0210 enhancing
induced protein
Psehe2p4 Tripeptidyl- Tripeptidyl-peptidase protein
protease 370 371 372 1591 2115 2639 005678 peptidase sed1 sed1 hydrolysis
Iron transport
Psehe2p4 Psehe2p4 multicopper
multicopper multicopper oxidase unknown AA1 373 374 375 1592 2116 2640 005727 _005727 oxidase
oxidase FET3
Psehe2p4 PSEHE 1 beta-galactosidase galactan- beta-galactosidase beta-galactosidase GH35 376 377 378 1593 2117 2641 005853 _00216 GH35 degrading
uncharacterized
Psehe2p4 Psehe2p4 lignocellulolysis- unknown unknown lignocellulose- 379 380 381 1594 2118 2642 005919 _005919 enhancing
induced protein
Psehe2p4 Psehe2p4 galactan- galactanase galactanase GH5 galactanase GH5 382 383 384 1595 2119 2643 005939 _005939 degrading
Psehe2p4 Psehe2p4 polysaccharide polysaccharide cellulose- Polysaccharide
AA9 385 386 387 1596 2120 2644 005961 _005961 monooxygenase monooxygenase degrading monooxygenase
unknown
Psehe2p4_
(tyrosinase unknown unknown unknown 388 389 390 1597 2121 2645 005982
domain)
Psehe2p4 Psehe2p4 CBM
Chitotriosidase-1 Chitotriosidase-1 chitin-degrading chitotriosidase GH18 391 392 393 1598 2122 2646 006015 _006015 18
Psehe2p4 Psehe2p4 Cellobiose Cellobiose lignocellulose- cellobiose CBM
AA8 394 395 396 1599 2123 2647 006036 _006036 dehydrogenase dehydrogenase degrading dehydrogenase 1
Putative
Psehe2p4 PSEHE 1 cellulose- CBM
endoglucanase endoglucanase GH45 cellulase GH45 397 398 399 1600 2124 2648 006039 _00246 degrading 1
type K
uncharacterized
Psehe2p4 Psehe2p4 lignocellulolysis- unknown unknown lignocellulose- 400 401 402 1601 2125 2649 006057 _006057 enhancing
induced protein
Psehe2p4 PSEHE 1 polysaccharide polysaccharide cellulose
Psehe2p4 galactan- CBM
galactanase galactanase GH5 galactanase GH30 484 485 486 1629 2153 2677 006752 degrading 1
uncharacterized
Psehe2p4 PSEHE 1 lignocellulolysis- unknown unknown lignocellulose- 487 488 489 1630 2154 2678 006753 J 1345 enhancing
induced protein
Psehe2p4 Psehe2p4 Versatile Versatile peroxidase versatile
lignin-degrading AA2 490 491 492 1631 2155 2679 006770 _006770 peroxidase VPL1 VPL1 peroxidase
Psehe2p4 alpha- alpha-galactosidase galactan- alpha-
GH27 493 494 495 1632 2156 2680 006785 galactosidase GH27 degrading galactosidase
Psehe2p4 PSEHE 1 endo-1,4-beta- hemicellulose- CBM
xylanase xylanase GH10 496 497 498 1633 2157 2681 006789 _00010 xylanase degrading
1
Psehe2p4 Psehe2p4 Acetylxylan Acetylxylan esterase 2 hemicellulose- acetylxylan
CE5 499 500 501 1634 2158 2682 006868 _006868 esterase 2 CE5 degrading esterase
Psehe2p4 Leucine Leucine protein
protease 502 503 504 1635 2159 2683 006883 aminopeptidase 2 aminopeptidase 2 hydrolysis
Psehe2p4
Lipase Lipase lipid-degrading lipase CE2 505 506 507 1636 2160 2684 006910
Psehe2p4 Psehe2p4 CBM
chitinase Chitinase GH18 chitin-degrading chitinase GH18 508 509 510 1637 2161 2685 006912 _006912 18
Psehe2p4 Minor extracellular Minor extracellular protein
protease 511 512 513 1638 2162 2686 006917 protease vpr protease vpr hydrolysis
Extracellular
Psehe2p4 Extracellular protein
metalloproteinase protease 514 515 516 1639 2163 2687 006949 metalloproteinase 1 hydrolysis
1
Psehe2p4 Psehe2p4
cutinase cutinase CE5 cutin-degrading cutinase CE5 517 518 519 1640 2164 2688 007006 _007006
Psehe2p4 PSEHE 1 polysaccharide polysaccharide cellulose- Polysaccharide
AA9 520 521 522 1641 2165 2689 007013 _00278 monooxygenase monooxygenase degrading monooxygenase
Psehe2p4 PSEHE 1 hemicellulose-
Beta-galactosidase Beta-galactosidase beta-galactosidase GH2 523 524 525 1642 2166 2690 007034 00111 degrading
unknown
Psehe2p4
(hydrophobin hydrophobin; unknown hydrophobin 526 527 528 1643 2167 2691 007048
domain)
Extracellular Extracellular
Psehe2p4 protein
metalloprotease metalloprotease protease 529 530 531 1644 2168 2692 007061 hydrolysis
SMAC_06893 SMAC_06893
Psehe2p4 Dipeptidyl protein
Dipeptidyl peptidase 4 protease CE10 532 533 534 1645 2169 2693 007064 peptidase 4 hydrolysis
Psehe2p4 endo- unknown unknown unknown 535 536 537 1646 2170 2694 007099 polygalacturonase
Psehe2p4 GH11
unknown GH114 unknown GH114 unknown unknown 538 539 540 1647 2171 2695 007114 4
Psehe2p4 Psehe2p4
Cutinase Cutinase cutin-degrading cutinase CE5 541 542 543 1648 2172 2696 007236 _007236
Psehe2p4_ Psehe2p4 hemicellulose- xylanase xylanase GH30 xylanase GH30 544 545 546 1649 2173 2697 007460 _007460 degrading
Psehe2p4 unknown (CBM18 CBM
unknown CBM18 chitin-binding unknown 547 548 549 1650 2174 2698 007479 domain) 18
Psehe2p4 PSEHE 1 beta-mannosidase hemicellulose-
beta-mannosidase beta-mannosidase GH2 550 551 552 1651 2175 2699 007493 _00121 GH2 degrading
Psehe2p4 Glucan 1,3-beta- Glucan 1,3-beta- glucan- glucan 1,3-beta-
GH55 553 554 555 1652 2176 2700 007711 glucosidase glucosidase degrading glucosidase
Psehe2p4 exo-arabinanase hemicellulose- exo-arabinanase exo-arabinanase GH93 556 557 558 1653 2177 2701 007746 GH93 degrading
Psehe2p4 PSEHE 1 hemicellulose- xylanase xylanase GH10 xylanase GH10 559 560 561 1654 2178 2702 007749 _00012 degrading
Probable aspartic-
Probable aspartic-type
Psehe2p4 type protein
endopeptidase protease 562 563 564 1655 2179 2703 007756 endopeptidase hydrolysis
AFUA_3G01220
AFUA_3G01220
Psehe2p4 PSEHE 1 hemicellulose- beta-mannanase beta-mannanase GH5 beta-mannanase GH5 565 566 567 1656 2180 2704 007774 _00143 degrading
uncharacterized
Psehe2p4 Psehe2p4 lignocellulolysis- unknown unknown lignocellulose- 568 569 570 1657 2181 2705 007781 _007781 enhancing
induced protein
Psehe2p4 Psehe2p4 hemicellulose- CBM
feruloyl esterase feruloyl esterase CE1 feruloyl esterase CE1 571 572 573 1658 2182 2706 007789 _007789 modifying 1
Psehe2p4 Psehe2p4 Beta- beta-
Beta-hexosaminidase chitin-degrading GH20 574 575 576 1659 2183 2707 007799 _007799 hexosaminidase hexosaminidase
Psehe2p4 PSEHE 1 hemicellulose- CBM
xylanase xylanase GH10 xylanase GH10 577 578 579 1660 2184 2708 007835 _00007 degrading
1
Psehe2p4 Cuticle-degrading Cuticle-degrading protein
protease 580 581 582 1661 2185 2709 007838 protease protease hydrolysis
Psehe2p4 unknown (CBM1
unknown unknown unknown 583 584 585 1662 2186 2710 007840 domain)
Psehe2p4 Psehe2p4 tomatin
Tomatinase beta-glucosidase GH3 tomatinase GH3 586 587 588 1663 2187 2711 007853 _007853 degrading
Probable aspartic-
Probable aspartic-type
Psehe2p4 type protein
endopeptidase protease 589 590 591 1664 2188 2712 007869 endopeptidase hydrolysis
AFUA_3G01220
AFUA_3G01220
carbohydrate-
Psehe2p4 L-sorbosone
binding unknown dehydrogenase 592 593 594 1665 2189 2713 007924 dehydrogenase
cytochrome
Psehe2p4_ protein
Candidapepsin-1 Candidapepsin-1 protease 595 596 597 1666 2190 2714 007996 hydrolysis
Psehe2p4 Psehe2p4 Endo-1,4-beta- Endo-1,4-beta- hemicellulose- endo-1,4-beta- CBM
GH10 598 599 600 1667 2191 2715 008182 _008182 xylanase F3 xylanase F3 degrading xylanase 1
Psehe2p4 Psehe2p4
Pectinesterase Pectinesterase pectin-degrading pectinesterase CE8 601 602 603 1668 2192 2716 008195 008195
Polysaccharide
Psehe2p4 PSEHE 1 polysaccharide cellulose- Polysaccharide
monooxygenase AA9 718 719 720 1707 2231 2755 009648 _00274 monooxygenase degrading monooxygenase
GH61
Psehe2p4 protein
Acid protease Acid protease protease 721 722 723 1708 2232 2756 009676 hydrolysis
Psehe2p4 Psehe2p4 hemicellulose- feruloyl esterase feruloyl esterase CE1 feruloyl esterase CE1 724 725 726 1709 2233 2757 009703 _009703 modifying
Psehe2p4 PSEHE 1 mixed-link mixed-link glucanase glucan- mixed-link
GH16 727 728 729 1710 2234 2758 009785 _00044 glucanase GH16 degrading glucanase
Psehe2p4
adhesin possible adhesin cell adhesion adhesin 730 731 732 1711 2235 2759 009815
Psehe2p4 unknown (CBM18 CBM
unknown CBM18 chitin-binding unknown 733 734 735 1712 2236 2760 009827 domain) 18
Psehe2p4 Psehe2p4 pigment-
Tyrosinase Tyrosinase tyrosinase 736 737 738 1713 2237 2761 009866 _009866 producing
Polysaccharide
Psehe2p4 PSEHE 1 polysaccharide cellulose- Polysaccharide CBM
monooxygenase AA9 739 740 741 1714 2238 2762 009871 _00273 monooxygenase degrading monooxygenase 1
GH61
Psehe2p4
adhesin possible adhesin cell adhesion adhesin 742 743 744 1715 2239 2763 009915
arabinoxylan arabinoxylan
Psehe2p4_ PSEHE_1 hemicellulose- arabinofuranosidas CBM
arabinofuranosidas arabinofuranosidase GH62 745 746 747 1716 2240 2764 009954 _00287 degrading e 1
e GH62
Psehe2p4 PSEHE 1 hemicellulose- CBM
xylanase xylanase GH30 xylanase GH30 748 749 750 1717 2241 2765 009971 _00179 degrading 1
Psehe2p4 PSEHE 1 polysaccharide polysaccharide cellulose- Polysaccharide
AA9 751 752 753 1718 2242 2766 010022 _00272 monooxygenase monooxygenase degrading monooxygenase
Psehe2p4 hemicellulose- unknown GH5 unknown GH5 beta-mannanase GH5 754 755 756 1719 2243 2767 010189 degrading
Psehe2p4 hemicellulose- arabinofuranosidas GH12 CBM
cellobiohydrolase unknown GH127 757 758 759 1720 2244 2768 010198 degrading e 7 1
Peptidase M20 Peptidase M20
Psehe2p4_ domain-containing domain-containing protein
protease 838 839 840 1747 2271 2795 011346 protein protein hydrolysis
SMAC_03666.2 SMAC_03666.2
Psehe2p4 Psehe2p4 hemicellulose- feruloyl esterase feruloyl esterase CE1 feruloyl esterase CE1 841 842 843 1748 2272 2796 011739 _011739 modifying
Psehe2p4 PSEHE 1
hexosaminidase hexosaminidase GH20 chitin-degrading hexosaminidase GH20 844 845 846 1749 2273 2797 011748 _00122
Psehe2p4 PSEHE 1 cellulose- beta-glucosidase beta-glucosidase GH3 beta-glucosidase GH3 847 848 849 1750 2274 2798 011768 _00147 degrading
uncharacterized
Psehe2p4 PSEHE 1 lignocellulolysis- unknown unknown lignocellulose- 850 851 852 1751 2275 2799 011815 _01595 enhancing
induced protein
Psehe2p4 carbohydrate- carbohydrate
unknown CE3 unknown CE3 CE3 853 854 855 1752 2276 2800 011857 modifying esterase
Psehe2p4 protein
Aminopeptidase Y Aminopeptidase Y protease 856 857 858 1753 2277 2801 011891 hydrolysis
Probable Probable
Psehe2p4_ PSEHE_1 endo- endopolygalacturo endopolygalacturonas pectin-degrading GH28 859 860 861 1754 2278 2802 011924 _00138 polygalacturonase
nase D e D
unknown
Psehe2p4
(tyrosinase unknown unknown unknown 862 863 864 1755 2279 2803 011941
domain)
Psehe2p4
unknown CE3 unknown CE3 unknown unknown CE3 865 866 867 1756 2280 2804 012058
Psehe2p4 Psehe2p4
cutinase cutinase CE5 cutin-degrading cutinase CE5 868 869 870 1757 2281 2805 012081 _012081
Psehe2p4 PSEHE 1 cellulose- cellobiohydrolase cellobiohydrolase GH7 cellobiohydrolase GH7 871 872 873 1758 2282 2806 012098 00303 degrading
Psehe2p4 mixed-link mixed-link glucanase glucan- mixed-link
GH16 874 875 876 1759 2283 2807 012173 glucanase GH16 degrading glucanase
unknown
Psehe2p4
(tyrosinase unknown unknown unknown 877 878 879 1760 2284 2808 012198
domain)
bifunctional alpha- bifunctional alpha-
Psehe2p4 Psehe2p4 hemicellulose- arabinofuranosidas CBM
arabinofuranosidas arabinofuranosidase/b GH43 880 881 882 1761 2285 2809 012204 _012204 degrading e 1
e/beta-xylosidase eta-xylosidase GH43
Psehe2p4 Psehe2p4
Cutinase Cutinase cutin-degrading cutinase CE5 883 884 885 1762 2286 2810 012229 _012229
Psehe2p4 PSEHE 1 hemicellulose- xylanase Xylanase GH11 xylanase GH11 886 887 888 1763 2287 2811 012296 _00025 degrading
Psehe2p4 Putative serine Putative serine protein
protease 889 890 891 1764 2288 2812 012303 protease K12H4.7 protease K12H4.7 hydrolysis
Psehe2p4 Psehe2p4
Laccase-2 Laccase-2 lignin-degrading laccase AA1 892 893 894 1765 2289 2813 012360 _012360
uncharacterized
Psehe2p4 Psehe2p4 lignocellulolysis- unknown unknown lignocellulose- 895 896 897 1766 2290 2814 012393 _012393 enhancing
induced protein
Neutral protease 2 Neutral protease 2
Psehe2p4_ protein
homolog homolog protease 898 899 900 1767 2291 2815 012444 hydrolysis
MGYG_02351 MGYG_02351
Psehe2p4 Psehe2p4
Laccase Laccase lignin-degrading laccase AA1 901 902 903 1768 2292 2816 012449 _012449
Gamma- Gamma-
Psehe2p4 protein
glutamyltranspepti glutamyltranspeptidas protease 904 905 906 1769 2293 2817 012508 hydrolysis dase 1 e 1
Psehe2p4 Psehe2p4 Cellobiose Cellobiose lignocellulose- cellobiose
AA8 907 908 909 1770 2294 2818 012526 _012526 dehydrogenase dehydrogenase degrading dehydrogenase
Psehe2p4 Psehe2p4
pectate lyase pectate lyase PL3 pectin-degrading pectate lyase PL3 910 911 912 1771 2295 2819 012650 _012650
Psehe2p4 protein
Alkaline protease 1 Alkaline protease 1 protease 913 914 915 1772 2296 2820 012670 hydrolysis
p p I Psehe24 Probable leucine Probable leucinerotein
Alpha-N- Extracellular exo-
Psehe2p4 PSEHE 1 hemicellulose- arabinofuranosidas
arabinofuranosidas alpha-(1->5)-L- GH43 994 995 996 1799 2323 2847 013521 _00230 degrading e
e 2 arabinofuranosidase
Psehe2p4 PSEHE 1 tomatin
Tomatinase beta-glucosidase GH3 tomatinase GH3 997 998 999 1800 2324 2848 013631 _00151 degrading
alpha- alpha-
Psehe2p4 Psehe2p4 hemicellulose- arabinofuranosidas
arabinofuranosidas arabinofuranosidase GH51 1000 1001 1002 1801 2325 2849 013641 _013641 degrading e
e GH51
Psehe2p4 Psehe2p4
pectate lyase pectate lyase PL3 pectin-degrading pectate lyase PL3 1003 1004 1005 1802 2326 2850 013699 _013699
Psehe2p4
Lipase 5 Lipase 5 lipid-degrading lipase 1006 1007 1008 1803 2327 2851 013729
Psehe2p4 Psehe2p4 acetylxylan Acetylxylan esterase 2 hemicellulose- acetylxylan
CE5 1009 1010 1011 1804 2328 2852 013733 _013733 esterase CE5 degrading esterase
Psehe2p4_ Leucine Leucine protein
protease 1012 1013 1014 1805 2329 2853 013736 aminopeptidase 2 aminopeptidase 2 hydrolysis
unknown
Psehe2p4
(tyrosinase unknown unknown unknown 1015 1016 1017 1806 2330 2854 013741
domain)
Psehe2p4 Tripeptidyl- Tripeptidyl-peptidase protein
protease 1018 1019 1020 1807 2331 2855 013788 peptidase sed3 sed3 hydrolysis
Putative Putative
Psehe2p4
lysophospholipase lysophospholipase lipid-degrading lipase 1021 1022 1023 1808 2332 2856 013807
C1450.09c C1450.09c
Psehe2p4_
adhesin possible adhesin cell adhesion adhesin 1024 1025 1026 1809 2333 2857 013817
Psehe2p4 Catalase-
Catalase-peroxidase 2 lignin-degrading peroxidase AA2 1027 1028 1029 1810 2334 2858 013868 peroxidase 2
unknown
Psehe2p4
(tyrosinase unknown unknown unknown 1030 1031 1032 1811 2335 2859 013891
domain)
yyppppppp p I Psehe24 Trietidl Trietidletidaserotein--
Extracellular Extracellular
Psehe2p4 protein
metalloprotease metalloprotease protease 1117 1118 1119 1840 2364 2888 014979 hydrolysis
GLRG_06286 GLRG_06286
Psehe2p4 unknown (CBM18 CBM
unknown CBM18 chitin-binding unknown 1120 1121 1122 1841 2365 2889 014982 domain) 18
Psehe2p4 PSEHE 1 alpha-glucosidase starch- alpha-glucosidase alpha-glucosidase GH31 1123 1124 1125 1842 2366 2890 015038 _00186 GH31 degrading
Psehe2p4 PSEHE 1 hemicellulose- xylanase xylanase GH11 xylanase GH11 1126 1127 1128 1843 2367 2891 015055 _00026 degrading
Psehe2p4 PSEHE 1 exo-beta-1,3- exo-beta-1,3- galactan- exo-beta-1,3- CBM
GH43
galactanase galactanase GH43 degrading galactanase 35 1129 1130 1131 1844 2368 2892 015097 _00241
Psehe2p4
beta-glucuronidase unknown GH79 unknown unknown GH79 1132 1133 1134 1845 2369 2893 015098
Psehe2p4 Tripeptidyl- Tripeptidyl-peptidase protein
protease 1135 1136 1137 1846 2370 2894 015106 peptidase sed2 sed2 hydrolysis
Psehe2p4 Carboxypeptidase Carboxypeptidase protein
protease 1138 1139 1140 1847 2371 2895 015166 cpdS cpdS hydrolysis
Probable endo-
Psehe2p4 PSEHE 1 Probable endo-1,4- hemicellulose- endo-1,4-beta- 1,4-beta-xylanase GH11 1141 1142 1143 1848 2372 2896 015201 _00027 beta-xylanase A degrading xylanase
A
Psehe2p4 exo-arabinanase hemicellulose- exo-arabinanase exo-arabinanase GH93 1144 1145 1146 1849 2373 2897 015235 GH93 degrading
Psehe2p4 PSEHE 1
chitinase chitinase GH18 chitin-degrading chitinase GH18 1147 1148 1149 1850 2374 2898 015287 _00104
Psehe2p4 Psehe2p4 carbohydrate- carbohydrate
unknown CE16 unknown CE16 CE16 1150 1151 1152 1851 2375 2899 015306 _015306 modifying esterase
Psehe2p4 PSEHE 1 cellulose- endoglucanase Endoglucanase GH7 cellulase GH7 1153 1154 1155 1852 2376 2900 015332 _00305 degrading
Psehe2p4 GH13
adhesin possible adhesin cell adhesion adhesin 1156 1157 1158 1853 2377 2901 015386 2
Psehe2p4 PSEHE 1 cellulose- avenacinase beta-glucosidase GH3 beta-glucosidase GH3 1234 1235 1236 1879 2403 2927 016290 _00161 degrading
Psehe2p4 Psehe2p4 Cellobiose Cellobiose lignocellulose- cellobiose
AA8 1237 1238 1239 1880 2404 2928 016342 _016342 dehydrogenase dehydrogenase degrading dehydrogenase
Psehe2p4 Probable serine Probable serine protein
protease 1240 1241 1242 1881 2405 2929 016344 protease EDA2 protease EDA2 hydrolysis
Psehe2p4 Psehe2p4
cutinase cutinase CE5 cutin-degrading cutinase CE5 1243 1244 1245 1882 2406 2930 016355 _016355
Psehe2p4 PSEHE 1 cellulose- beta-glucosidase beta-glucosidase GH3 beta-glucosidase GH3 1246 1247 1248 1883 2407 2931 016450 _00177 degrading
Psehe2p4 PSEHE 1 alpha- alpha-glucuronidase hemicellulose- alpha-
GH67 1249 1250 1251 1884 2408 2932 016467 _00294 glucuronidase GH67 degrading glucuronidase
Psehe2p4 carbohydrate- carbohydrate
unknown CE2 unknown CE2 CE2 1252 1253 1254 1885 2409 2933 016549 modifying esterase
Psehe2p4 Psehe2p4 Cellobiose cellobiose lignocellulose- cellobiose
AA8 1255 1256 1257 1886 2410 2934 016575 _016575 dehydrogenase dehydrogenase degrading dehydrogenase
arabinoxylan arabinoxylan
Psehe2p4 PSEHE 1 hemicellulose- arabinofuranosidas CBM
arabinofuranohydr arabinofuranosidase GH62 1258 1259 1260 1887 2411 2935 016577 _00288 degrading e 1
olase GH62
PSEHE 1 cellulose- endoglucanase GH12 endoglucanase GH12 1888 2412 2936
_00031 degrading
Probable endo-1,3(4)-
PSEHE 1 glucan- endo-1,3(4)-beta- beta-glucanase GH16 1889 2413 2937
_00048 degrading glucanase
AFUA_2G14360
PSEHE 1 Glucan endo-1,3-beta- glucan- glucan endo-1,3-
GH16 1890 2414 2938
_00052 glucosidase A1 degrading beta-glucosidase
PSEHE 1 glucan- unknown GH16 unknown GH16 1891 2415 2939
_00068 degrading
PSEHE 1 CBM
Chitinase GH18 chitin-degrading chitinase GH18 1892 2416 2940
00078 18
Psehe2p4
unknown AA5 unknown oxidase AA5 1935 2459 2983 _006754
Psehe2p4
unknown AA5 unknown oxidase AA5 1936 2460 2984 _007154
Psehe2p4
unknown AA5 unknown oxidase AA5 1937 2461 2985 _007693
Psehe2p4
unknown CE1 unknown hydrolase CE1 1938 2462 2986 _007787
Alcohol
Psehe2p4
dehydrogenase unknown dehydrogenase AA3 1939 2463 2987 _007788
[acceptor]
Psehe2p4
unknown AA5 unknown oxidase AA5 1940 2464 2988 _007986
Psehe2p4 gluconolactone
Gluconolactonase gluconolactonase 1941 2465 2989 _008326 hydrolyzing
Psehe2p4 6-hydroxy-D-nicotine
unknown oxidase AA7 1942 2466 2990 _008561 oxidase
Psehe2p4 possible pyranose pyranose
sugar-modifying AA3
dehydrogenase dehydrogenase 1943 2467 2991 _008695
Bifunctional
Psehe2p4
solanapyrone unknown oxidase AA7 1944 2468 2992 _008967
synthase
Psehe2p4
unknown AA5 unknown oxidase AA5 1945 2469 2993 _009175
Psehe2p4
Lipase 2 lipid-degrading lipase CE10 1946 2470 2994 _009354
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1947 2471 2995 _009382
synthase
Psehe2p4
unknown AA5 unknown oxidase AA5 1948 2472 2996 _009407
Sterol-4-alpha-
Psehe2p4 carboxylate 3- unknown dehydrogenase 1949 2473 2997 _009561 dehydrogenase,
decarboxylating
Psehe2p4
unknown AA8 unknown unknown AA8 1950 2474 2998 _009889
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1951 2475 2999 _009911
synthase
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1952 2476 3000 _009986
synthase
Psehe2p4 CBM
unknown CBM18 chitin-binding unknown 1953 2477 3001 _010107 18
Psehe2p4
Galactose oxidase sugar-modifying galactose oxidase AA5 1954 2478 3002 _010176
Psehe2p4
unknown AA5 unknown oxidase AA5 1955 2479 3003 _010868
Zinc-type alcohol
Psehe2p4
dehydrogenase-like unknown dehydrogenase 1956 2480 3004 _011021
protein C337.11
Psehe2p4 Choline
choline oxidizing dehydrogenase AA3 1957 2481 3005 _011125 dehydrogenase
Psehe2p4 multicopper
multicopper oxidase unknown AA1 1958 2482 3006 _011128 oxidase
Psehe2p4
unknown AA2 unknown unknown AA2 1959 2483 3007 _011155
Psehe2p4
unknown AA8 unknown unknown AA8 1960 2484 3008 _011217
Psehe2p4
Galactose oxidase sugar-modifying galactose oxidase AA5 1961 2485 3009 _011757
Psehe2p4 3-oxosteroid 1- unknown dehydrogenase 1962 2486 3010 _011892 dehydrogenase
Psehe2p4 possible pyranose pyranose
sugar-modifying AA3 1963 2487 3011 _012059 dehydrogenase dehydrogenase
Psehe2p4
Lipase 4 lipid-degrading lipase CE10 1964 2488 3012 _012110
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1965 2489 3013 _012149
synthase
Psehe2p4
unknown AA5 unknown oxidase AA5
_012330 1966 2490 3014
Psehe2p4
unknown AA5 unknown unknown AA5 1967 2491 3015 _012434
Psehe2p4
unknown AA5 unknown oxidase AA5 1968 2492 3016 _012692
Psehe2p4
Lipase 1 lipid-degrading lipase CE10 1969 2493 3017 _012773
Psehe2p4 possible pyranose pyranose
sugar-modifying AA3
_012776 dehydrogenase dehydrogenase 1970 2494 3018
Psehe2p4 glucan- unknown GH16 unknown GH16 1971 2495 3019 _012789 degrading
Psehe2p4
unknown AA5 unknown unknown AA5 1972 2496 3020 _012850
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1973 2497 3021 _013228
synthase
Psehe2p4 possible pyranose pyranose sugar-modifying AA3 1974 2498 3022 _013234 dehydrogenase dehydrogenase
Psehe2p4
unknown AA7 unknown unknown AA7 1975 2499 3023 _013293
Psehe2p4
unknown AA8 unknown unknown AA8 1976 2500 3024 _013346
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1977 2501 3025 _013730
synthase
Psehe2p4
Lipase 2 lipid-degrading lipase CE10 1978 2502 3026 _013983
Psehe2p4 Choline
choline oxidizing dehydrogenase AA3 1979 2503 3027 _014250 dehydrogenase
Psehe2p4
Lipase 1 lipid-degrading lipase CE10
_015161 1980 2504 3028
Psehe2p4
unknown AA7 unknown unknown AA7 1981 2505 3029 _015225
Probable endo-1,3(4)-
Psehe2p4 glucan- endo-1,3(4)-beta- beta-glucanase GH16 1982 2506 3030 _015389 degrading glucanase
An02g00850
Psehe2p4
unknown CE1 unknown hydrolase CE1 1983 2507 3031 _015423
Psehe2p4 (+)-neomenthol
unknown dehydrogenase 1984 2508 3032 _015455 dehydrogenase
Psehe2p4 Choline
unknown dehydrogenase AA3 1985 2509 3033 _015750 dehydrogenase
Psehe2p4
unknown AA5 unknown oxidase AA5 1986 2510 3034 _015831
Psehe2p4 cellulase- possible expansin expansin 1987 2511 3035 _015995 enhancing
Psehe2p4 Alcohol alcohol-oxidizing dehydrogenase 1988 2512 3036 _016051 dehydrogenase 2
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1989 2513 3037 _016372
synthase
Psehe2p4
Lipase 1 lipid-degrading lipase CE10 1990 2514 3038 _016408
Bifunctional
Psehe2p4
solanapyrone unknown oxidoreductase AA7 1991 2515 3039 _016534
synthase
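By way of non-limiting illustration, the family codes that appear in the table above (for example GH43, CE5, AA7, PL3 and CBM18) can be split into a class prefix and a family number. The following is a minimal sketch only; the function names `split_cazy_code` and `describe` are hypothetical and do not form part of the application, and the class names are the standard CAZy class designations rather than text taken from the table.

```python
import re

# Standard CAZy class prefixes (an assumption added for illustration;
# the table itself only lists the family codes).
CAZY_CLASSES = {
    "GH": "glycoside hydrolase",
    "GT": "glycosyltransferase",
    "PL": "polysaccharide lyase",
    "CE": "carbohydrate esterase",
    "AA": "auxiliary activity",
    "CBM": "carbohydrate-binding module",
}


def split_cazy_code(code):
    """Split a family code such as 'GH43' into ('GH', 43)."""
    match = re.fullmatch(r"(CBM|GH|GT|PL|CE|AA)(\d+)", code.strip())
    if match is None:
        raise ValueError(f"not a CAZy family code: {code!r}")
    return match.group(1), int(match.group(2))


def describe(code):
    """Return a readable label, e.g. 'carbohydrate esterase family 5'."""
    prefix, number = split_cazy_code(code)
    return f"{CAZY_CLASSES[prefix]} family {number}"


if __name__ == "__main__":
    # Family codes copied from rows of the table above.
    for code in ["GH43", "CE5", "AA7", "PL3", "CBM18"]:
        print(code, "->", describe(code))
```

Entries with no family code (for example the rows annotated only as "protease" or "lipase") would raise `ValueError` in this sketch and would need to be handled separately.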
Table 2A. List of genes of Thermoascus aurantiacus with reference to exon boundaries
Theau2p4_009685 112 1626 1..144, 197..571, 619..678, 736..805, 874..1409, 1477..1547, 1614..1626
Theau2p4_009693 113 2539 1..1118, 1205..1876, 1930..2539
Theau2p4_009847 114 1335 1..140, 196..766, 866..1149, 1236..1335
Theau2p4_009884 115 1964 1..109, 176..1164, 1218..1964
Theau2p4_009904 116 1799 1..184, 439..487, 545..669, 727..745, 796..885, 949..1042, 1098..1154, 1202..1438, 1497..1799
Theau2p4_009921 117 1215 1..1215
Theau2p4_009960 118 1903 1..167, 214..1903
Theau2p4_009976 119 1749 1..1749
Theau2p4_009991 120 2805 1..94, 164..473, 555..685, 965..2805
Theau2p4_009995 121 2771 1..131, 184..1137, 1187..1617, 1675..1910, 1986..2771
Theau2p4_010014 122 846 1..846
Theau2p4_010047 123 2137 1..700, 771..814, 875..1074, 1155..1194, 1259..1383, 1459..1654, 1730..2039, 2103..2137
Theau2p4_010112 124 1081 1..503, 589..839, 951..1081
Theau2p4_010152 125 2112 1..228, 323..356, 421..580, 635..1214, 1266..1426, 1480..1613, 1664..2112
Theau2p4_010206 126 1119 1..1119
Theau2p4_010274 127 1725 1..42, 96..1232, 1300..1725
Theau2p4_010295 128 4428 1..247, 300..329, 383..506, 563..597, 650..856, 914..1214, 1276..1474, 1525..1568, 1628..1836, 1908..2111, 2171..2403, 2473..2717, 2769..2998, 3045..3426, 3500..3593, 3656..4045, 4105..4428
Theau2p4_010308 129 1584 1..141, 192..566, 616..675, 728..806, 891..1426, 1477..1584
Theau2p4_000077 130 852 1..852
Theau2p4_001886 131 979 1..31, 89..547, 609..979
Theau2p4_002223 132 1527 1..1431, 1489..1527
Theau2p4_004121 133 7162 1..95, 244..482, 551..1074, 2593..2737, 2802..2916, 2982..3127, 3190..3295, 3349..4113, 6775..6858, 6925..7162
Theau2p4_005752 134 1912 1..493, 554..667, 726..808, 871..1125, 1204..1232, 1309..1504, 1589..1912
Theau2p4_006038 135 1179 1..1179
Theau2p4_006946 136 300 1..102, 169-300
Theau2p4_007415 137 832 1..93, 191-832
Theau2p4_008246 138 676 1..15, 116-676
Theau2p4_009600 139 554 1..391, 481..554
Theau2p4_010333 140 986 1..363, 423..986
THEAU_1_00006 141 3488 1..135, 195..664, 725..991, 1043..1058, 1253..1387, 1442..2105, 2158..3488
THEAU_1_00013 142 755 1..114, 177..755
THEAU_1_00024 143 3050 1..60, 184..326, 383..399, 595..1011, 1065..1839, 1898..2759, 2820..3050
THEAU_1_00035 144 1390 1..935, 1001..1179, 1243..1271, 1331..1390
THEAU_1_00067 145 2526 1..500, 705..2526
THEAU_1_00071 146 1370 1..176, 248..407, 464..1178, 1243..1370
THEAU_1_00079 147 1486 1..288, 353..719, 781..908, 1046..1486
THEAU_1_00082 148 1584 1..141, 222..566, 728..806, 891..1426, 1477..1584
THEAU_1_00086 149 1190 1..330, 663..1190
THEAU_1_00105 150 3007 1..184, 333..372, 423..646, 702..3007
THEAU_1_00136 151 861 1..241, 306..346, 410..458, 522..643, 712..861
THEAU_1_00137 152 2328 1..976, 1085..2328
THEAU_1_00146 153 712 1..573, 635..712
THEAU_1_00157 154 4380 1..199, 252..281, 335..458, 515..549, 602..1166, 1228..1426, 1477..1520, 1580..1788, 1860..2063, 2123..2355, 2425..2669, 2721..2950, 3036..3378, 3452..3545, 3608..3997, 4057..4380
THEAU_1_00158 155 2539 1..1118, 1205..1852, 1930..2539
THEAU_1_00162 156 1358 1..56, 112..506, 565..1358
THEAU_1_00169 157 1192 1..740, 895..1192
THEAU_2_00575 159 1545 1..97, 154..708, 764..1161, 1234..1545
THEAU_2_03692 160 724 1..234, 287..724
THEAU_2_03943 161 1439 1..121, 193..630, 691..1439
THEAU_2_04686 162 1486 1..288, 353..719, 781..980, 1046..1486
THEAU_2_04829 163 1802 1..241, 306..346, 410..458, 522..643, 712..857, 920..1008, 1080..1098, 1207..1233, 1313..1403, 1502..1567, 1710..1802
THEAU_2_05270 164 1486 1..199, 280..1133, 1229..1486
THEAU_3_00027 165 4380 1..199, 252..281, 335..458, 515..549, 602..808, 866..1166, 1228..1426, 1477..1520, 1580..1788, 1860..2063, 2123..2355, 2425..2669, 2721..2950, 3036..3378, 3452..3545, 3608..3997, 4057..4380
THEAU_3_00028 166 1584 1..141, 192..566, 728..806, 891..1426, 1477..1584
THEAU_3_00035 167 1698 1..30, 79..136, 215..269, 335..852, 910..1035, 1129..1381, 1486..1589, 1649..1698
THEAU_3_00056 171 737 1..307, 367..737
THEAU_3_00067 172 3248 1..202, 356..448, 547..654, 756..832, 892..999, 1085..1263, 1336..1404, 1466..1681, 1733..2030, 2095..2312, 2433..3151, 3223..3248
THEAU_3_00119 173 447 1..330, 418..447
THEAU_3_00124 174 3736 1..101, 154..1039, 1734..2583, 2649..3736
THEAU_3_00137 175 772 1..772
Theau2p4_000277 176 2058 1..174, 233..685, 744..1221, 1286..2058
Theau2p4_000298 177 5789 1..264, 327..495, 567..970, 1025..1637, 1731..1786, 2957..3115, 4161..4309, 4877..5286, 5365..5789
Theau2p4_000508 178 1606 1..229, 322..533, 583..758, 831..1212, 1277..1606
Theau2p4_002044 179 1414 1..871, 928..1304, 1373..1414
Theau2p4_002591 180 1173 1..1173
Theau2p4_003352 181 1262 1..19, 88..342, 580..1262
Theau2p4_003545 182 1926 1..319, 370..973, 1031..1650, 1715..1926
Theau2p4_003768 183 2087 1..364, 442..1078, 1140..1756, 1870..2087
Theau2p4_005004 184 1915 1..747, 815..1915
Theau2p4_005186 185 581 1..346, 399..475, 528..581
Theau2p4_005905 186 1902 1..1902
Theau2p4_006609 187 1492 1..318, 375..746, 828..924, 978..1066, 1133..1492
Theau2p4_006879 188 1302 1..1302
Theau2p4_006995 189 1982 1..317, 421..621, 678..1743, 1812..1982
Theau2p4_007118 190 1862 1..270, 326..512, 568..768, 792..1169, 1267..1862
Theau2p4_007772 191 1154 1..34, 130..543, 626..856, 934..1154
Theau2p4_008179 192 3242 1..87, 189..287, 381..495, 598..714, 797..879, 945..1226, 1294..1541, 2024..2183, 2376..2589, 2834..3021, 3084..3242
Theau2p4_008330 193 1965 1..446, 520..1352, 1430..1848, 1951..1965
Theau2p4_008937 194 1777 1..78, 128..545, 593..654, 706..1136, 1186..1400, 1464..1777
Theau2p4_009039 195 1464 1..1027, 1121..1464
Theau2p4_009110 196 1213 1..237, 296..1020, 1081..1213
Theau2p4_009447 197 513 1..232, 328..389, 487..513
Theau2p4_009769 198 1758 1..228, 302..652, 712..1758
Theau2p4_009843 199 1453 1..491, 598..1453
Theau2p4_010122 200 569 1..220, 369..445, 522..569
Theau2p4_000002 1 965 1..184, 241..965
Theau2p4_000003 2 1081 1..79, 134..299, 368..704, 776..1081
Theau2p4_000203 3 1978 1..207, 269..1978
Theau2p4_000253 4 1226 1..234, 287..706, 768..1226
Theau2p4_000278 5 1386 1..1386
Theau2p4_000294 6 2174 1..343, 403..507, 562..678, 728..851, 907..922, 974..1090, 1142..1192, 1248..1401, 1470..1528, 1596..2174
Theau2p4_000305 7 1981 1..99, 150..366, 423..1326, 1401..1699, 1778..1885, 1971..1981
Theau2p4_000410 8 1782 1..175, 238..337, 394..918, 996..1782
Theau2p4_000537 9 1382 1..80, 136..530, 589..1382
Theau2p4_000728 10 2526 1..2526
Theau2p4_000729 11 1658 1..144, 213..1658
Theau2p4_000749 12 1583 1..357, 430..612, 715..984, 1044..1434, 1519..1583
Theau2p4_000762 13 3208 1..438, 499..512, 566..1179, 1231..1382, 1435..1500, 1687..2306, 2395..3208
Theau2p4_000765 14 1182 1..69, 243..265, 463..585, 671..926, 1021..1182
Theau2p4_000766 15 1309 1..109, 164..222, 278..430, 496..560, 634..731, 783..1309
Theau2p4_000896 16 3048 1..347, 492..2862, 2920..3048
Theau2p4_000921 17 1370 1..176, 248..407, 464..1178, 1258..1370
Theau2p4_001159 18 1600 1..372, 432..1370, 1427..1600
Theau2p4_001291 19 2569 1..330, 405..1150, 1570..1799, 1860..2095, 2183..2569
Theau2p4_001344 20 1427 1..215, 323..1427
Theau2p4_001376 21 1526 1..216, 273..1526
Theau2p4_001424 22 792 1..792
Theau2p4_001559 23 1753 1..102, 168..581, 642..818, 873..911, 966..1174, 1237..1386, 1441..1753
Theau2p4_001685 24 1176 1..233, 283..368, 425..831, 892..1176
Theau2p4_001741 25 1405 1..102, 187..337, 401..1055, 1117..1405
Theau2p4_001760 26 1947 1..387, 476..576, 639..892, 953..982, 1036..1085, 1151..1211, 1283..1947
Theau2p4_001903 27 1244 1..833, 898..1082, 1144..1244
Theau2p4_001952 28 2616 1..2616
Theau2p4_002242 29 3050 1..60, 184..326, 383..427, 483..535, 595..1011, 1065..1839, 1898..2759, 2820..3050
Theau2p4_002307 30 1492 1..294, 359..725, 787..986, 1052..1492
Theau2p4_002505 31 1107 1..64, 189..710, 812..1107
Theau2p4_002538 32 1446 1..114, 177..732, 788..1446
Theau2p4_002630 33 1399 1..112, 193..1046, 1142..1399
Theau2p4_002751 34 1380 1..1380
Theau2p4_002827 35 1402 1..386, 445..1208, 1272..1402
Theau2p4_003017 36 1376 1..320, 380..657, 719..967, 1039..1376
Theau2p4_003130 37 2329 1..144, 245..759, 810..893, 947..1056, 1106..1248, 1297..1797, 1878..2010, 2073..2329
Theau2p4_003291 38 1168 1..92, 155..237, 304..553, 648..897, 974..1168
Theau2p4_003665 39 1545 1..97, 154..708, 764..1545
Theau2p4_003765 40 1192 1..806, 895..1192
Theau2p4_003819 41 1665 1..1665
Theau2p4_003847 42 2406 1..2406
Theau2p4_003855 43 360 1..360
Theau2p4_003935 44 1650 1..337, 410..1650
Theau2p4_004016 45 5651 1..212, 2644..2828, 2879..2885, 2936..3016, 3067..3290, 3346..5651
Theau2p4_004039 46 2776 1..783, 832..1083, 1199..2776
Theau2p4_004099 47 2178 1..209, 273..536, 593..825, 879..999, 1056..2178
Theau2p4_004148 48 1997 1..180, 250..652, 710..771, 1056..1997
Theau2p4_004180 49 1031 1..309, 363..1031
Theau2p4_004192 50 2296 1..247, 316..437, 507..761, 842..936, 996..1196, 1249..1370, 1433..1766, 1837..2296
Theau2p4_004325 51 1815 1..1815
Theau2p4_004359 52 3796 1..867, 932..1913, 1965..3283, 3342..3539, 3617..3796
Theau2p4_004375 53 1708 1..472, 531..1708
Theau2p4_004420 54 2710 1..337, 400..843, 901..2092, 2162..2254, 2320..2710
Theau2p4_004914 55 788 1..309, 378..788
Theau2p4_004983 56 871 1..104, 220..871
Theau2p4_004984 57 1444 1..612, 686..1444
Theau2p4_005064 58 3255 1..235, 389..481, 580..687, 789..865, 925..1032, 1118..1296, 1369..1437, 1499..1714, 1766..2063, 2128..2366, 2466..3255
Theau2p4_005082 59 1789 1..166, 221..366, 423..843, 896..1621, 1686..1789
Theau2p4_005099 60 2067 1..144, 211..2067
Theau2p4_005217 61 1430 1..105, 164..602, 665..1133, 1202..1430
Theau2p4_005273 62 2116 1..195, 256..311, 369..479, 537..596, 649..819, 872..997, 1054..2116
Theau2p4_005292 63 1337 1..109, 176..241, 310..323, 382..509, 611..1337
Theau2p4_005423 64 2848 1..817, 929..2218, 2634..2848
Theau2p4_005464 65 2169 1..99, 179..311, 410..550, 868..957, 1056..1358, 1443..1627, 1726..2169
Theau2p4_005467 66 2328 1..2328
Theau2p4_005484 67 1079 1..88, 151..316, 383..716, 774..1079
Theau2p4_005497 68 2037 1..158, 213..478, 531..729, 789..2037
Theau2p4_005525 69 2349 1..141, 194..708, 770..853, 912..1021, 1087..1229, 1285..1791, 1853..1985, 2093..2349
Theau2p4_005544 70 1270 1..815, 881..1059, 1123..1151, 1211..1270
Theau2p4_005617 71 1404 1..1404
Theau2p4_005791 72 1735 1..1199, 1259..1568, 1643..1735
Theau2p4_005940 73 1022 1..159, 217..235, 294..319, 388..529, 598..1022
Theau2p4_006016 74 1840 1..1496, 1549..1840
Theau2p4_006023 75 2085 1..937, 988..1633, 1701..1918, 2006..2085
Theau2p4_006100 76 1529 1..92, 162..454, 526..1132, 1205..1529
Theau2p4_006102 77 1609 1..42, 91..148, 227..281, 347..864, 922..1047, 1141..1420, 1498..1609
Theau2p4_006123 78 1777 1..332, 399..1678, 1743..1777
Theau2p4_006125 79 2546 1..194, 255..547, 614..1053, 1104..1687, 1752..1954, 2006..2149, 2227..2362, 2453..2546
Theau2p4_006573 80 1005 1..350, 447..814, 944..1005
Theau2p4_006767 81 766 1..456, 543..594, 654..766
Theau2p4_006768 82 1471 1..573, 632..826, 906..1323, 1404..1471
Theau2p4_006771 83 1331 1..123, 221..456, 526..949, 1032..1331
Theau2p4_006847 84 1798 1..360, 413..1798
Theau2p4_007017 85 1278 1..1278
Theau2p4_007034 86 2104 1..63, 125..313, 368..2104
Theau2p4_007080 87 1398 1..1398
Theau2p4_007243 88 1374 1..125, 187..234, 296..931, 996..1374
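By way of non-limiting illustration, the exon boundaries listed in Table 2A can be read programmatically. The following is a minimal sketch, assuming that each boundary is written as start..end in 1-based inclusive coordinates and that the number preceding the boundaries is the total sequence length; neither assumption is stated in the table itself, and the helper names `parse_exons`, `spliced_length` and `introns` are hypothetical.

```python
import re


def parse_exons(boundaries):
    """Turn '1..320, 380..657' into [(1, 320), (380, 657)] (inclusive)."""
    pairs = re.findall(r"(\d+)\.\.(\d+)", boundaries)
    return [(int(start), int(end)) for start, end in pairs]


def spliced_length(exons):
    """Total number of positions covered by the exons."""
    return sum(end - start + 1 for start, end in exons)


def introns(exons):
    """Gaps between consecutive exons, as inclusive coordinate pairs."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


if __name__ == "__main__":
    # Row for Theau2p4_003017 in Table 2A (listed length 1376).
    exons = parse_exons("1..320, 380..657, 719..967, 1039..1376")
    print(spliced_length(exons))  # 1185
    print(introns(exons))         # [(321, 379), (658, 718), (968, 1038)]
```

For this row the exon positions (1185) and the implied intron positions (59 + 61 + 71 = 191) sum to 1376, which agrees with the listed length under the assumptions above.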
Table 2B. List of genes of Myceliophthora fergusii (Corynascus thermophilus) with reference to exon boundaries
Corth2p4_000319 608 1014 1..1014
Corth2p4_000449 609 1785 1..159, 291..749, 844..1051, 1146..1500, 1593..1785
Corth2p4_000539 610 2133 1..156, 286..1452, 1702..2133
Corth2p4_000543 611 1416 1..109, 177..1099, 1183..1416
Corth2p4_000894 612 905 1..121, 193..317, 385..642, 726..905
Corth2p4_000923 613 2319 1..1795, 1913..2319
Corth2p4_000941 614 1774 1..90, 202..554, 654..726, 797..1045, 1184..1774
Corth2p4_000946 615 1439 1..152, 233..381, 460..1439
Corth2p4_000957 616 1131 1..275, 714..1131
Corth2p4_001006 617 1048 1..489, 596..1048
Corth2p4_001013 618 1515 1..80, 201..601, 786..1246, 1390..1515
Corth2p4_001028 619 1393 1..147, 209..1393
Corth2p4_001043 620 2858 1..72, 176..330, 411..799, 862..2858
Corth2p4_001049 621 1125 1..243, 324..1001, 1072..1125
Corth2p4_001075 622 2942 1..43, 109..293, 352..1785, 1860..2002, 2060..2151, 2235..2467, 2586..2942
Corth2p4_001772 623 1509 1..198, 274..1509
Corth2p4_001812 624 2631 1..2631
Corth2p4_001856 625 1378 1..197, 265..427, 508..580, 660..1378
Corth2p4_001899 626 3464 1..312, 442..2176, 2269..3464
Corth2p4_001901 627 966 1..966
Corth2p4_001916 628 1611 1..393, 445..530, 616..871, 967..1611
Corth2p4_001919 629 944 1..55, 142..483, 555..692, 796..944
Corth2p4_002030 630 858 1..440, 527..623, 688..858
Corth2p4_002330 631 888 1..55, 155..511, 602..888
Corth2p4_002332 632 1268 1..272, 458..1049, 1194..1268
Corth2p4_002333 633 1020 1..693, 829..1020
Corth2p4_002390 634 1539 1..179, 247..365, 443..483, 552..638, 717..907, 1005..1539
Corth2p4_002412 635 745 1..248, 328..745
Corth2p4_002448 636 1101 1..1101
Corth2p4_002537 637 1641 1..548, 650..824, 990..1263, 1364..1641
Corth2p4_002730 638 1785 1..1785
Corth2p4_002748 639 936 1..936
Corth2p4_002753 640 1032 1..98, 294..861, 931..1032
Corth2p4_002765 641 3369 1..847, 980..1939, 2009..3369
Corth2p4_002774 642 902 1..568, 660..718, 816..902
Corth2p4_002835 643 2557 1..15, 78..375, 513..2557
Corth2p4_002845 644 1461 1..576, 655..1461
Corth2p4_002847 645 1011 1..1011
Corth2p4_002850 646 668 1..257, 326..668
Corth2p4_002856 647 969 1..245, 306..969
Corth2p4_005268 692 1223 1..796, 982..1223
Corth2p4_005378 693 1348 1..847, 945..1348
Corth2p4_005438 694 1689 1..1689
Corth2p4_005522 695 2337 1..2337
Corth2p4_005615 696 1435 1..22, 92..112, 351..581, 675..1435
Corth2p4_005803 697 795 1..281, 390..795
Corth2p4_006047 698 821 1..371, 452..821
Corth2p4_006086 699 1579 1..129, 236..335, 433..759, 864..1380, 1486..1579
Corth2p4_006093 700 990 1..990
Corth2p4_006231 701 1643 1..409, 487..1643
Corth2p4_006280 702 1383 1..196, 257..1383
Corth2p4_006392 703 1397 1..1218, 1350..1397
Corth2p4_006416 704 1136 1..599, 686..1136
Corth2p4_006508 705 1662 1..92, 177..646, 760..1344, 1400..1662
Corth2p4_006585 706 840 1..840
Corth2p4_006624 707 1092 1..964, 1022..1092
Corth2p4_006704 708 1464 1..413, 530..761, 826..1059, 1132..1464
Corth2p4_006771 709 1398 1..52, 234..587, 1106..1398
Corth2p4_006772 710 742 1..394, 459..742
Corth2p4_006773 711 2461 1..52, 184..1032, 1147..1268, 1378..1471, 1555..1604, 1670..1748, 1807..2039, 2131..2173, 2247..2461
Corth2p4_006798 712 1955 1..567, 644..780, 849..1061, 1412..1601, 1697..1827, 1898..1955
Corth2p4_006802 713 435 1..435
Corth2p4_006831 714 3130 1..335, 2446..3130
Corth2p4_006884 715 1037 1..126, 210..303, 393..736, 873..1037
Corth2p4_006909 716 1281 1..373, 497..1281
Corth2p4_006985 717 895 1..443, 516..736, 816..895
Corth2p4_006986 718 2561 1..200, 359..2561
Corth2p4_007037 719 2397 1..397, 461..2397
Corth2p4_007041 720 1990 1..92, 187..244, 320..1990
Corth2p4_007216 721 608 1..30, 332..482, 580..608
Corth2p4_007256 722 449 1..267, 369..449
Corth2p4_007314 723 1697 1..452, 520..605, 670..1697
Corth2p4_007317 724 268 1..152, 250..268
Corth2p4_007324 725 1349 1..52, 127..979, 1043..1349
Corth2p4_007336 726 1534 1..78, 145..233, 394..1534
Corth2p4_007337 727 465 1..53, 318..465
Corth2p4_007352 728 631 1..145, 202..293, 359..631
Corth2p4_007363 729 1164 1..204, 286..1164
Corth2p4_007365 730 543 1..543
Corth2p4_007371 731 1450 1..230, 343..491, 569..711, 767..935, 1008..1450
Corth2p4_007378 732 1081 1..162, 230..343, 473..1081
Corth2p4_007382 733 1302 1..1302
Corth2p4_007391 734 423 1..423
Corth2p4_007404 735 1000 1..263, 394..1000
Corth2p4_007434 736 1235 1..926, 1019..1235
Corth2p4_007436 737 600 1..62, 180..600
Corth2p4_007450 738 438 1..438
Corth2p4_007458 739 1754 1..1097, 1166..1754
Corth2p4_007464 740 1922 1..662, 1460..1922
Corth2p4_007465 741 1025 1..178, 271..409, 491..1025
Corth2p4_007466 742 1125 1..739, 842..1125
Corth2p4_007477 743 892 1..23, 679..892
Corth2p4_007526 744 599 1..312, 393..599
Corth2p4_007532 745 1680 1..1680
Corth2p4_007540 746 1731 1..1731
Corth2p4_007543 747 877 1..215, 331..877
Corth2p4_007546 748 1800 1..1800
Corth2p4_007576 749 584 1..275, 365..584
Corth2p4_007591 750 1675 1..190, 258..1261, 1357..1548, 1616..1675
Corth2p4_007613 751 1002 1..1002
Corth2p4_007616 752 1240 1..307, 444..1240
Corth2p4_007633 753 828 1..239, 294..603, 675..734, 796..828
Corth2p4_007649 754 2013 1..34, 134..440, 501..2013
Corth2p4_007651 755 878 1..18, 166..259, 607..878
Corth2p4_007660 756 1367 1..552, 618..1367
Corth2p4_007662 757 1612 1..1029, 1092..1262, 1373..1612
Corth2p4_007682 758 2150 1..295, 366..496, 567..833, 882..1408, 1523..2150
Corth2p4_007690 759 496 1..334, 381..496
Corth2p4_007710 760 673 1..304, 402..673
Corth2p4_007717 761 393 1..393
Corth2p4_007723 762 1073 1..566, 629..942, 1003..1073
Corth2p4_007737 763 525 1..525
Corth2p4_007742 764 744 1..744
Corth2p4_007750 765 444 1..81, 148..444
Corth2p4_007751 766 1662 1..15, 185..260, 331..447, 508..595, 661..798, 864..1331, 1398..1662
Corth2p4_007755 767 357 1..357
Corth2p4_007756 768 882 1..145, 236..882
Corth2p4_007769 769 1839 1..1839
Corth2p4_007778 770 1745 1..272, 1733..1745
Corth2p4_007779 771 372 1..186, 283..372
Corth2p4_007784 772 1718 1..205, 297..382, 472..1467, 1530..1718
Corth2p4_007794 773 1750 1..663, 730..1513, 1581..1750
Corth2p4_007810 774 1560 1..1560
Corth2p4_000858 819 2114 1..148, 207..837, 895..1186, 1268..1454, 1672..2114
Corth2p4_000887 820 2404 1..134, 189..266, 562..682, 765..891, 953..1417, 1502..2277, 2375..2404
Corth2p4_000911 821 2189 1..109, 206..587, 689..2189
Corth2p4_001004 822 1960 1..1226, 1288..1960
Corth2p4_001018 823 2727 1..2727
Corth2p4_001022 824 2899 1..69, 333..1067, 1112..2899
Corth2p4_001303 825 1760 1..502, 598..1760
Corth2p4_001516 826 1654 1..272, 334..406, 480..647, 714..809, 872..1654
Corth2p4_001647 827 1227 1..1227
Corth2p4_001670 828 2589 1..571, 661..1822, 2499..2589
Corth2p4_001758 829 2853 1..97, 218..793, 1633..1862, 1930..2853
Corth2p4_001806 830 1185 1..1185
Corth2p4_002063 831 1316 1..99, 158..314, 382..1316
Corth2p4_002115 832 1463 1..260, 410..1463
Corth2p4_002197 833 1525 1..594, 749..902, 1011..1525
Corth2p4_002351 834 1110 1..88, 178..415, 535..806, 923..1110
Corth2p4_002355 835 2715 1..2715
Corth2p4_002374 836 1001 1..216, 257..500, 613..1001
Corth2p4_002375 837 1139 1..224, 340..549, 662..819, 910..1139
Corth2p4_002376 838 2465 1..357, 504..545, 630..1506, 1627..2079, 2162..2203, 2299..2322, 2419..2465
Corth2p4_002387 839 2564 1..306, 380..466, 529..837, 900..2564
Corth2p4_002514 840 1962 1..199, 270..445, 533..807, 885..1962
Corth2p4_002676 841 1824 1..254, 372..1824
Corth2p4_002800 842 1455 1..1455
Corth2p4_002811 843 2131 1..328, 427..659, 729..837, 914..1195, 1328..1515, 2069..2131
Corth2p4_002819 844 2126 1..344, 424..829, 937..1827, 1953..2126
Corth2p4_002880 845 2750 1..962, 1022..1040, 1740..1930, 1995..2066, 2144..2750
Corth2p4_002924 846 1313 1..150, 219..1313
Corth2p4_003097 847 2795 1..79, 136..207, 297..370, 437..926, 1020..2007, 2108..2175, 2278..2795
Corth2p4_003178 848 2089 1..258, 316..1810, 1893..2089
Corth2p4_003187 849 1638 1..112, 212..306, 371..434, 575..1638
Corth2p4_003191 850 1357 1..411, 560..585, 692..750, 899..959, 1072..1357
Corth2p4_003207 851 1616 1..383, 500..672, 828..1250, 1356..1420, 1533..1616
Corth2p4_003244 852 1185 1..220, 317..1064, 1125..1185
Corth2p4_003359 853 1903 1..824, 901..1097, 1195..1342, 1450..1903
Corth2p4_003416 854 1572 1..441, 559..1233, 1366..1572
Corth2p4_003463 855 933 1..320, 395..762, 857..933
Corth2p4_003491 856 2815 1..120, 174..524, 617..2815
Corth2p4_003507 857 1609 1..22, 74..89, 157..278, 363..707, 816..1097, 1284..1609
Corth2p4_003877 858 2013 1..2013
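As an illustration only, the rows of these exon-boundary tables can be read programmatically. The sketch below assumes that each row holds a gene identifier, a SEQ ID NO, a sequence length, and a comma-separated list of exon ranges written as 1-based inclusive start..end pairs. The function names (parse_exons, introns, exonic_length, check_row) are hypothetical helpers introduced for this example and are not part of the disclosure. The consistency checks reflect a pattern that rows in these tables generally follow (first exon starting at position 1, last exon ending at the listed length), not a stated requirement.

```python
def parse_exons(spec):
    """Parse a boundary string such as '1..964, 1022..1092' into (start, end) tuples."""
    exons = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = part.split("..")
        exons.append((int(start), int(end)))
    return exons


def introns(exons):
    """Return the gaps between consecutive exons as (start, end) tuples."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


def exonic_length(exons):
    """Total number of positions covered by exons (coordinates are inclusive)."""
    return sum(end - start + 1 for start, end in exons)


def check_row(length, exons):
    """List inconsistencies between a row's length and its exon ranges (heuristic)."""
    if not exons:
        return ["no exon ranges"]
    problems = []
    if exons[0][0] != 1:
        problems.append("first exon does not start at 1")
    if exons[-1][1] != length:
        problems.append("last exon does not end at the listed length")
    for start, end in exons:
        if start > end:
            problems.append(f"range {start}..{end} is reversed")
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start <= prev_end:
            problems.append(f"range starting at {next_start} overlaps the previous exon")
    return problems


# Example row taken from Table 2B above.
row = "Corth2p4_006624 707 1092 1..964, 1022..1092"
gene_id, seq_id_no, length, spec = row.split(maxsplit=3)
exons = parse_exons(spec)
print(gene_id, exons, introns(exons), exonic_length(exons), check_row(int(length), exons))
```

A check of this kind is also a convenient way to spot transcription damage in a row, since a reversed or overlapping range usually indicates a misread digit or separator.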
Table 2C. List of genes of Pseudocercosporella herpotrichoides with reference to exon boundaries
Psehe2p4_000102 1472 2224 1..360, 407..798, 847..2224
Psehe2p4_000189 1473 2478 1..174, 223..341, 394..725, 789..1228, 1276..2478
Psehe2p4_000259 1474 1684 1..476, 561..617, 676..744, 808..1684
Psehe2p4_000334 1475 3934 1..1202, 1255..2201, 2244..2890, 2963..3934
Psehe2p4_000496 1476 958 1..204, 254..958
Psehe2p4_000553 1477 1805 1..255, 314..431, 482..1020, 1069..1174, 1225..1805
Psehe2p4_000555 1478 1490 1..188, 243..433, 544..612, 684..837, 887..904, 992..1490
Psehe2p4_000672 1479 2000 1..121, 192..446, 504..1208, 1284..1365, 1424..2000
Psehe2p4_000753 1480 1142 1..612, 714..1142
Psehe2p4_000830 1481 1036 1..111, 188..470, 540..1036
Psehe2p4_000998 1482 2608 1..103, 159..245, 389..584, 658..2608
Psehe2p4_001071 1483 831 1..49, 98..377, 453..831
Psehe2p4_001117 1484 1423 1..93, 153..440, 488..555, 604..1423
Psehe2p4_001149 1485 1105 1..234, 331..518, 608..721, 773..843, 915..1105
Psehe2p4_001182 1486 6708 1..318, 369..506, 559..722, 773..856, 913..999, 1076..1115, 1168..1260, 1319..1410, 1517..1551, 1605..1646, 1796..1983, 2039..2313, 2436..2720, 2880..3049, 3144..3405, 3463..3483, 3539..3584, 3641..3661, 3721..3756, 3815..3871, 4010..4155, 4227..4445, 4522..4653, 4712..4773, 4885..5262, 6524..6708
Psehe2p4_001184 1487 1982 1..110, 168..316, 381..533, 582..746, 805..966, 1020..1181, 1235..1387, 1452..1610, 1667..1823, 1877..1906, 1955..1982
Psehe2p4_001188 1488 4635 1..470, 518..844, 897..978, 1085..1728, 2060..2091, 2652..3021, 3081..4635
Psehe2p4_001205 1489 1188 1..1188
Psehe2p4_001256 1490 1230 1..355, 428..1230
Psehe2p4_001258 1491 1777 1..432, 479..1777
Psehe2p4_001268 1492 1213 1..201, 260..961, 1031..1213
Psehe2p4_001286 1493 1496 1..122, 175..288, 337..458, 506..1057, 1118..1184, 1376..1496
Psehe2p4_001308 1494 1371 1..275, 326..498, 548..775, 827..1371
Psehe2p4_001323 1495 1461 1..231, 284..1370, 1439..1461
Psehe2p4_001405 1496 1076 1..151, 201..403, 455..634, 693..1076
Psehe2p4_001406 1497 1038 1..107, 159..317, 367..877, 931..1038
Psehe2p4_001418 1498 3248 1..106, 154..298, 344..542, 592..715, 763..879, 926..979, 1027..1337, 1456..1499, 1549..2232, 2279..2481, 2533..2971, 3019..3059, 3106..3189, 3238..3248
Psehe2p4_001498 1499 4161 1..75, 148..652, 708..767, 871..978, 1093..1125, 1175..1549, 1677..1902, 1992..2104, 2155..2259, 2355..2476, 2558..2764, 2815..2938, 2989..3421, 3505..3596, 3668..3694, 3794..3872, 3939..4161
Psehe2p4_001518 1500 425 1..163, 241..326, 384..425
Psehe2p4_001526 1501 1557 1..91, 148..1103, 1177..1557
Psehe2p4_001533 1502 3696 1..57, 121..204, 328..483, 533..600, 646..677, 726..1521, 1670..1680, 1734..2777, 2846..3696
Psehe2p4_001564 1503 2975 1..84, 228..431, 481..840, 894..1194, 1288..2975
Psehe2p4_001666 1504 1696 1..493, 543..1696
Psehe2p4_003310 1540 7782 1..302, 420..651, 703..753, 808..910, 957..1018, 1068..1193, 1241..1406, 1462..1527, 1581..1666, 1714..1897, 2054..2282, 2381..2436, 2495..2664, 2778..3611, 3663..3789, 3980..4153, 4219..4430, 4523..4648, 4717..4833, 4936..5000, 5052..5132, 5203..5659, 5713..6000, 6050..6377, 6432..6770, 6826..6969, 7061..7782
Psehe2p4_003317 1541 1608 1..99, 153..470, 523..1608
Psehe2p4_003333 1542 2187 1..4, 63..229, 279..330, 383..457, 504..640, 688..748, 795..893, 940..988, 1039..1094, 1141..1659, 1712..2187
Psehe2p4_003364 1543 1348 1..60, 110..151, 200..314, 369..628, 674..1075, 1121..1348
Psehe2p4_003365 1544 1641 1..1641
Psehe2p4_003395 1545 1500 1..292, 348..826, 880..1500
Psehe2p4_003402 1546 891 1..296, 411..891
Psehe2p4_003411 1547 2383 1..228, 275..330, 386..408, 458..605, 666..836, 886..1196, 1252..1322, 1372..1890, 1962..2129, 2188..2237, 2293..2383
Psehe2p4_003431 1548 768 1..138, 381..500, 589..686, 756..768
Psehe2p4_003565 1549 1307 1..256, 306..1138, 1191..1307
Psehe2p4_003598 1550 3750 1..138, 190..565, 683..1272, 1846..2209, 2294..3750
Psehe2p4_003661 1551 585 1..271, 341..441, 511..585
Psehe2p4_003736 1552 1288 1..231, 345..558, 630..1288
Psehe2p4_003795 1553 2034 1..249, 302..425, 477..757, 803..975, 1028..1115, 1170..1275, 1334..1470, 1591..2034
Psehe2p4_003855 1554 1643 1..394, 447..613, 669..1046, 1103..1245, 1305..1476, 1569..1643
Psehe2p4_003856 1555 3260 1..284, 352..533, 600..955, 1690..2486, 2549..3260
Psehe2p4_003869 1556 1170 1..240, 353..527, 596..1170
Psehe2p4_003921 1557 1163 1..120, 198..650, 729..1163
Psehe2p4_003984 1558 1391 1..1305, 1353..1391
Psehe2p4_004027 1559 2290 1..252, 308..425, 479..625, 675..884, 937..1063, 1118..1140, 1193..1225, 1283..1307, 1359..1760, 1812..1959, 2030..2290
Psehe2p4_004041 1560 1330 1..162, 332..402, 453..749, 825..1083, 1208..1330
Psehe2p4_004080 1561 882 1..261, 313..395, 456..882
Psehe2p4_004116 1562 1204 1..212, 286..1204
Psehe2p4_004241 1563 1453 1..210, 263..426, 478..650, 702..1127, 1198..1262, 1318..1362, 1415..1453
Psehe2p4_004321 1564 1213 1..241, 290..797, 850..1213
Psehe2p4_004337 1565 2857 1..1001, 1096..1183, 1255..2746, 2799..2857
Psehe2p4_004389 1566 945 1..659, 711..945
Psehe2p4_004411 1567 1472 1..527, 636..699, 759..1472
Psehe2p4_004451 1568 843 1..843
Psehe2p4_004458 1569 2129 1..775, 832..2129
Psehe2p4_004583 1570 870 1..88, 142..357, 417..716, 770..870
Psehe2p4_004584 1571 1027 1..111, 161..1027
Psehe2p4_004691 1572 1384 1..267, 355..471, 561..706, 840..1034, 1090..1384
Psehe2p4_004760 1573 915 1..64, 149..238, 287..535, 594..820, 871..915
Psehe2p4_004761 1574 1648 1..73, 123..800, 858..895, 952..1458, 1511..1648
Psehe2p4_004762 1575 1175 1..49, 96..189, 345..599, 648..685, 736..822, 872..990, 1038..1175
Psehe2p4_004763 1576 1529 1..73, 122..235, 289..366, 417..923, 976..1282, 1338..1402, 1453..1529
Psehe2p4_004781 1577 1417 1..91, 159..431, 491..826, 901..1099, 1207..1417
Psehe2p4_004842 1578 3576 1..307, 356..1095, 1142..1367, 1446..1946, 1996..2064, 2407..2537, 2597..2691, 2796..3576
Psehe2p4_004850 1579 2063 1..1447, 1570..2063
Psehe2p4_004865 1580 2186 1..161, 215..381, 447..602, 711..1336, 1391..1593, 1641..1742, 1792..1832, 1897..2186
Psehe2p4_004935 1581 2175 1..190, 243..418, 510..570, 622..831, 884..890, 941..992, 1046..1615, 1670..1728, 1785..1942, 1994..2175
Psehe2p4_004955 1582 1206 1..147, 197..601, 661..1206
Psehe2p4_005081 1583 2206 1..141, 188..300, 350..1016, 1067..1282, 1335..1386, 1434..1802, 1866..2206
Psehe2p4_005234 1584 1320 1..84, 131..943, 997..1320
Psehe2p4_005266 1585 1212 1..673, 728..1212
Psehe2p4_005269 1586 1111 1..60, 113..224, 280..438, 932..957, 1013..1111
Psehe2p4_005342 1587 1018 1..308, 357..666, 722..777, 830..938, 986..1018
Psehe2p4_005404 1588 933 1..933
Psehe2p4_005530 1589 1391 1..123, 179..250, 298..1318, 1369..1391
Psehe2p4_005567 1590 848 1..150, 261..419, 474..672, 736..848
Psehe2p4_005678 1591 2392 1..212, 269..532, 582..947, 994..1086, 1196..1427, 1482..1699, 1759..2185, 2240..2392
Psehe2p4_005727 1592 5930 1..99, 177..537, 588..717, 766..828, 888..1074, 1123..1163, 1215..1534, 1585..1605, 1654..1729, 1784..1911, 1962..1990, 2041..2442, 2491..2931, 3051..3519, 3570..4121, 4232..4257, 4368..4551, 4668..5820, 5879..5930
Psehe2p4_005853 1593 3345 1..114, 174..747, 801..1033, 1089..1191, 1248..1971, 2026..3119, 3170..3345
Psehe2p4_005919 1594 1341 1..156, 242..493, 573..785, 855..1209, 1298..1341
Psehe2p4_005939 1595 1335 1..408, 469..642, 694..1335
Psehe2p4_005961 1596 981 1..319, 376..925, 975..981
Psehe2p4_005982 1597 1367 1..206, 261..423, 515..1157, 1246..1367
Psehe2p4_006015 1598 4984 1..199, 292..499, 578..737, 819..875, 958..1039, 1134..1407, 1468..1533, 1605..1749, 1867..1957, 2081..2178, 2273..2588, 2654..2705, 2812..3137, 3208..3331, 3376..3506, 3673..3840, 3904..4027, 4075..4155, 4328..4416, 4599..4984
Psehe2p4_006036 1599 3257 1..45, 102..169, 214..362, 408..622, 671..825, 872..1416, 1468..2741, 2795..3151, 3249..3257
Psehe2p4_006039 1600 1099 1..91, 185..540, 617..1099
Psehe2p4_006057 1601 2690 1..376, 425..530, 587..831, 883..918, 1172..1344, 1397..1681, 1736..1815, 1867..2193, 2243..2690
Psehe2p4_006060 1602 797 1..325, 382..797
Psehe2p4_006078 1603 1322 1..116, 175..328, 386..471, 534..849, 906..1322
Psehe2p4_006093 1604 3475 1..384, 792..1161, 1208..2074, 2124..3242, 3303..3475
Psehe2p4_006096 1605 2318 1..168, 216..294, 346..357, 408..425, 475..526, 574..711, 762..1620, 1676..1714, 1776..1793, 1844..2032, 2142..2318
Psehe2p4_006125 1606 1054 1..451, 504..1054
Psehe2p4_006144 1607 2094 1..217, 269..712, 896..2094
Psehe2p4_006206 1608 982 1..88, 144..203, 347..427, 488..640, 711..982
Psehe2p4_006243 1609 1731 1..65, 117..547, 639..777, 908..1229, 1286..1413, 1506..1731
Psehe2p4_006280 1610 2053 1..146, 252..419, 569..1026, 1077..2053
Psehe2p4_006329 1611 1483 1..615, 677..798, 850..1483
Psehe2p4_006376 1612 1302 1..146, 202..288, 401..636, 692..1302
Psehe2p4_006377 1613 922 1..217, 320..511, 565..803, 878..922
Psehe2p4_006439 1614 522 1..522
Psehe2p4_006441 1615 1473 1..115, 169..606, 653..1473
Psehe2p4_006539 1616 1672 1..807, 884..1672
Psehe2p4_006555 1617 1606 1..348, 398..565, 617..787, 839..1084, 1145..1282, 1346..1606
Psehe2p4_006561 1618 2578 1..86, 210..934, 985..1154, 1202..1419, 1545..1628, 1735..1856, 1907..2362, 2448..2578
Psehe2p4_006601 1619 1510 1..747, 854..1273, 1343..1510
Psehe2p4_006607 1620 1431 1..54, 386..515, 562..769, 852..1431
Psehe2p4_006617 1621 3324 1..155, 202..376, 427..532, 728..1155, 1204..1473, 1629..1996, 2059..2087, 2218..2491, 2595..3324
Psehe2p4_006658 1622 866 1..281, 485..866
Psehe2p4_006666 1623 843 1..843
Psehe2p4_006671 1624 1043 1..108, 156..901, 950..1043
Psehe2p4_006725 1625 2041 1..276, 403..1127, 1174..2041
Psehe2p4_006727 1626 1753 1..306, 434..563, 649..727, 851..1390, 1522..1753
Psehe2p4_006735 1627 5247 1..227, 281..420, 515..534, 604..983, 1037..1135, 1208..1348, 1406..1427, 1515..1523, 1634..1759, 1822..1866, 1978..2394, 2483..2548, 2609..2823, 2921..3001, 3061..3136, 3195..3222, 3352..3491, 3564..3641, 3765..3861, 3928..4042, 4098..4311, 4470..4539, 4589..4700, 4814..4963, 5040..5247
Psehe2p4_006736 1628 2057 1..225, 275..894, 966..1817, 1976..2057
Psehe2p4_006752 1629 2530 1..211, 295..1027, 1166..1293, 1390..1926, 2358..2530
Psehe2p4_006753 1630 2421 1..418, 471..1834, 1882..2421
Psehe2p4_006770 1631 1409 1..516, 635..1099, 1161..1409
Psehe2p4_006785 1632 1566 1..186, 290..347, 461..1566
Psehe2p4_006789 1633 1430 1..328, 384..544, 594..1430
Psehe2p4_006868 1634 1004 1..92, 191..300, 420..544, 602..695, 748..1004
Psehe2p4_006883 1635 1622 1..99, 157..477, 524..899, 946..1622
Psehe2p4_006910 1636 2126 1..465, 534..1310, 1371..2126
Psehe2p4_006912 1637 1943 1..125, 182..348, 403..522, 574..901, 946..1072, 1123..1210, 1260..1409, 1469..1730, 1790..1943
Psehe2p4_006917 1638 5433 1..123, 182..540, 678..1068, 1183..1240, 1380..2376, 2436..2537, 2727..3257, 3344..3472, 3592..3639, 4294..4516, 4619..4713, 4899..5433
Psehe2p4_006949 1639 2379 1..363, 412..806, 853..980, 1069..1237, 1353..1528, 1646..2141, 2196..2379
Psehe2p4_007006 1640 868 1..386, 434..609, 690..868
Psehe2p4_007013 1641 810 1..403, 471..709, 766..810
Psehe2p4_007034 1642 2225 1..298, 353..569, 621..805, 860..1248, 1304..1486, 1543..1686, 1740..1870, 1928..2225
Psehe2p4_007048 1643 1042 1..160, 240..322, 445..539, 721..1042
Psehe2p4_007061 1644 1197 1..307, 361..527, 583..630, 701..796, 878..1056, 1125..1197
Psehe2p4_007064 1645 2506 1..15, 78..385, 441..562, 612..847, 893..2506
Psehe2p4_007099 1646 2160 1..262, 334..465, 516..796, 845..2049, 2127..2160
Psehe2p4_007114 1647 1053 1..1053
Psehe2p4_007236 1648 462 1..462
Psehe2p4_007460 1649 1631 1..312, 377..503, 559..724, 780..1185, 1263..1631
Psehe2p4_007479 1650 430 1..192, 251..430
Psehe2p4_007493 1651 2995 1..196, 243..309, 367..1309, 1360..1459, 1515..2995
Psehe2p4_007711 1652 7847 1..126, 180..240, 326..558, 617..1051, 1182..1468, 1576..2381, 2520..3029, 3156..3828, 3949..4052, 4193..4671, 4818..5730, 5841..6214, 6318..6693, 6826..7847
Psehe2p4_007746 1653 1211 1..221, 278..1211
Psehe2p4_007749 1654 1404 1..281, 360..594, 691..836, 891..1404
Psehe2p4_007756 1655 1638 1..349, 497..815, 868..1232, 1301..1638
Psehe2p4_007774 1656 1244 1..252, 298..1149, 1203..1244
Psehe2p4_007781 1657 1290 1..426, 514..962, 1017..1189, 1250..1290
Psehe2p4_007789 1658 1106 1..91, 139..1106
Psehe2p4_007799 1659 2352 1..97, 145..617, 676..2352
Psehe2p4_007835 1660 1357 1..79, 140..203, 286..548, 603..1357
Psehe2p4_007838 1661 1271 1..282, 336..1271
Psehe2p4_007840 1662 430 1..60, 128..430
Psehe2p4_007853 1663 2559 1..145, 204..579, 627..1506, 1574..1708, 1771..2559
Psehe2p4_007869 1664 2446 1..288, 443..518, 735..933, 1061..1150, 1222..1280, 1406..1657, 1824..2271, 2365..2446
Psehe2p4_007924 1665 1534 1..309, 367..756, 828..1189, 1303..1534
Psehe2p4_007996 1666 1313 1..290, 342..517, 574..1313
Psehe2p4_008182 1667 2124 1..88, 145..208, 254..273, 353..886, 944..984, 1039..1087, 1140..1261, 1317..1344, 1406..1523, 1603..1831, 1900..1965, 2032..2124
Psehe2p4_008195 1668 7371 1..151, 208..4107, 4219..4249, 4298..5196, 5246..7371
Psehe2p4_008221 1669 1263 1..408, 460..647, 737..955, 1003..1175, 1223..1263
Psehe2p4_008231 1670 2003 1..262, 311..539, 654..934, 984..1304, 1356..1844, 1951..2003
Psehe2p4_008232 1671 826 1..107, 162..711, 761..826
Psehe2p4_008247 1672 1231 1..411, 459..625, 682..886, 948..1146, 1200..1231
Psehe2p4_008261 1673 1729 1..168, 304..781, 864..1729
Psehe2p4_008333 1674 1508 1..198, 255..1508
Psehe2p4_008382 1675 2075 1..195, 246..2075
Psehe2p4_008408 1676 1203 1..55, 117..526, 577..1203
Psehe2p4_008429 1677 912 1..89, 161..288, 338..537, 619..912
Psehe2p4_008513 1678 2151 1..265, 319..1676, 1726..2151
Psehe2p4_008544 1679 2478 1..284, 337..512, 568..663, 822..842, 898..938, 995..1143, 1190..1257, 1305..1955, 2006..2478
Psehe2p4_008592 1680 2456 1..351, 402..560, 611..753, 809..1157, 1226..1335, 1382..1918, 2047..2222, 2275..2456
Psehe2p4_008632 1681 1309 1..603, 656..1309
Psehe2p4_008639 1682 1626 1..105, 161..499, 551..830, 886..963, 1014..1108, 1159..1626
Psehe2p4_008643 1683 1610 1..325, 381..616, 687..798, 848..953, 1020..1072, 1129..1251, 1300..1610
Psehe2p4_008683 1684 873 1..873
Psehe2p4_008684 1685 2441 1..464, 512..723, 772..2441
Psehe2p4_008685 1686 2124 1..163, 236..2124
Psehe2p4_008737 1687 1241 1..96, 141..1241
Psehe2p4_008806 1688 1901 1..433, 548..620, 671..862, 983..1901
Psehe2p4_008849 1689 6616 1..240, 357..482, 646..1473, 1610..2069, 2128..2216, 2283..2444, 2513..2734, 2845..3057, 3136..3274, 3371..3828, 3947..3989, 4117..4346, 4461..5258, 5483..6616
Psehe2p4_008863 1690 1147 1..129, 189..448, 496..1147
Psehe2p4_008927 1691 3411 1..2768, 2817..3411
Psehe2p4_009068 1692 447 1..169, 250..341, 406..447
Psehe2p4_009097 1693 1237 1..49, 105..1237
Psehe2p4_009119 1694 2674 1..55, 103..155, 278..329, 488..503, 551..667, 728..823, 876..1005, 1126..1180, 1230..1283, 1367..1648, 1707..1918, 2005..2142, 2192..2674
Psehe2p4_009134 1695 1547 1..297, 358..1106, 1181..1547
Psehe2p4_009386 1696 1239 1..1239
Psehe2p4_009394 1697 1984 1..307, 732..1984
Psehe2p4_009415 1698 1637 1..64, 117..464, 517..1637
Psehe2p4_009482 1699 971 1..264, 311..829, 885..971
Psehe2p4_009537 1700 1446 1..356, 429..814, 866..1446
Psehe2p4_009550 1701 1599 1..66, 114..155, 202..304, 350..972, 1021..1599
Psehe2p4_009567 1702 2332 1..497, 549..754, 802..1315, 1362..1683, 1730..2332
Psehe2p4_009596 1703 1254 1..490, 552..579, 624..1254
Psehe2p4_009599 1704 1335 1..1335
Psehe2p4_009606 1705 1272 1..183, 230..976, 1039..1272
Psehe2p4_009642 1706 573 1..573
Psehe2p4_009648 1707 1007 1..101, 158..720, 814..863, 918..1007
Psehe2p4_009676 1708 1342 1..287, 342..517, 573..1342
Psehe2p4_009703 1709 867 1..867
Psehe2p4_009785 1710 990 1..990
Psehe2p4_009815 1711 2617 1..421, 469..758, 818..2617
Psehe2p4_009827 1712 2030 1..454, 513..630, 698..926, 1040..1170, 1262..2030
Psehe2p4_009866 1713 2063 1..209, 264..397, 544..572, 630..785, 905..1014, 1071..1107, 1269..1483, 1592..1726, 1830..1936, 1990..2063
Psehe2p4_009871 1714 1253 1..237, 286..642, 696..1111, 1160..1253
Psehe2p4_009915 1715 2715 1..2715
Psehe2p4_009954 1716 1222 1..76, 126..1222
Psehe2p4_009971 1717 1604 1..578, 629..1604
Psehe2p4_010022 1718 2146 1..190, 251..927, 995..2146
Psehe2p4_010189 1719 1401 1..196, 249..539, 663..850, 901..983, 1086..1181, 1287..1401
Psehe2p4_010198 1720 3762 1..155, 263..1200, 1327..2329, 3004..3033, 3084..3174, 3346..3762
Psehe2p4_010283 1721 844 1..413, 465..685, 765..844
Psehe2p4_010297 1722 858 1..86, 138..858
Psehe2p4_010308 1723 857 1..425, 478..698, 775..857
Psehe2p4_010309 1724 923 1..178, 241..359, 425..832, 888..923
Psehe2p4_010316 1725 4102 1..304, 360..418, 484..1259, 1360..1391, 1439..1578, 1641..1658, 1732..2058, 2124..2145, 2203..2362, 2433..2559, 2610..2659, 2710..2838, 2897..3074, 3127..3287, 3352..3830, 3912..4102
Psehe2p4_010340 1726 687 1..687
Psehe2p4_010356 1727 1193 1..348, 463..637, 739..1193
Psehe2p4_010386 1728 2239 1..853, 917..981, 1042..1937, 2032..2239
Psehe2p4_010421 1729 1301 1..238, 292..849, 964..1230, 1282..1301
Psehe2p4_010425 1730 3502 1..108, 166..652, 699..3502
Psehe2p4_010489 1731 2962 1..146, 191..362, 408..436, 484..580, 628..689, 1003..1260, 1306..2962
Psehe2p4_010506 1732 1058 1..261, 408..483, 535..933, 991..1058
Psehe2p4_010618 1733 2359 1..415, 483..517, 594..751, 868..1349, 1434..1562, 1642..1779, 2007..2161, 2246..2359
Psehe2p4_010626 1734 867 1..360, 424..514, 566..867
Psehe2p4_010710 1735 2768 1..250, 307..402, 448..906, 954..1102, 1154..2489, 2554..2768
Psehe2p4_010796 1736 1661 1..225, 277..1161, 1212..1661
Psehe2p4_010880 1737 1211 1..324, 380..1000, 1110..1211
Psehe2p4_010887 1738 3294 1..318, 377..700, 752..859, 910..1017, 1068..1175, 1233..1340, 1398..1505, 1592..1699, 1748..2269, 2321..2861, 2918..3294
Psehe2p4_010982 1739 1325 1..603, 702..1325
Psehe2p4_011013 1740 1116 1..417, 481..1116
Psehe2p4_011016 1741 1100 1..448, 499..1100
Psehe2p4_011043 1742 1384 1..278, 334..1384
Psehe2p4_011050 1743 787 1..145, 206..414, 485..787
Psehe2p4_011051 1744 1450 1..75, 132..264, 318..1450
Psehe2p4_011052 1745 1527 1..565, 684..739, 787..1527
Psehe2p4_011060 1746 1803 1..216, 268..311, 362..447, 494..955, 1019..1189, 1346..1803
Psehe2p4_011346 1747 1247 1..1012, 1063..1247
Psehe2p4_011739 1748 1112 1..88, 149..661, 781..1112
Psehe2p4_011748 1749 2031 1..786, 950..1549, 1615..2031
Psehe2p4_011768 1750 3002 1..121, 236..415, 466..645, 695..887, 933..990, 1041..1119, 1166..1365, 1421..2704, 2778..3002
Psehe2p4_011815 1751 1127 1..391, 442..1127
Psehe2p4_011857 1752 801 1..444, 520..801
Psehe2p4_011891 1753 1633 1..250, 308..1452, 1508..1633
Psehe2p4_011924 1754 6734 1..359, 407..788, 872..1199, 2137..2245, 2293..4819, 4882..5901, 5951..6086, 6151..6734
Psehe2p4_011941 1755 1162 1..803, 859..1162
Psehe2p4_012058 1756 5943 1..230, 287..326, 392..442, 496..582, 632..667, 725..876, 943..1038, 1092..1715, 1800..2080, 2202..2448, 2603..3143, 3261..3354, 3477..3812, 3869..4028, 4159..4185, 4233..4366, 4522..4596, 4650..4688, 4744..4850, 4953..5145, 5217..5241, 5312..5471, 5534..5667, 5721..5943
Psehe2p4_012081 1757 814 1..263, 317..501, 561..611, 678..814
Psehe2p4_012098 1758 1395 1..597, 646..1395
Psehe2p4_012173 1759 1236 1..132, 179..420, 472..904, 961..1098, 1162..1236
Psehe2p4_012198 1760 1528 1..188, 238..345, 422..476, 525..637, 686..850, 911..1187, 1262..1528
Psehe2p4_012204 1761 1680 1..580, 695..1680
Psehe2p4_012229 1762 958 1..524, 584..741, 792..958
Psehe2p4_012296 1763 922 1..152, 280..405, 461..644, 704..922
Psehe2p4_012303 1764 1786 1..93, 148..685, 735..1786
Psehe2p4_012360 1765 2539 1..55, 106..158, 361..415, 587..602, 653..769, 821..916, 965..1149, 1199..1287, 1349..1598, 1649..1872, 1928..2539
Psehe2p4_012393 1766 483 1..483
Psehe2p4_012444 1767 1207 1..202, 253..538, 592..1207
Psehe2p4_012449 1768 2094 1..400, 450..501, 555..608, 657..1493, 1545..2094
Psehe2p4_012508 1769 2040 1..162, 211..218, 310..1347, 1504..1517, 1568..2040
Psehe2p4_012526 1770 747 1..287, 438..747
Psehe2p4_012650 1771 836 1..325, 375..626, 673..836
Psehe2p4_012670 1772 1390 1..205, 307..413, 460..693, 743..1390
Psehe2p4_012717 1773 1701 1..1701
Psehe2p4_012730 1774 899 1..339, 389..618, 671..757, 806..899
Psehe2p4_012751 1775 1363 1..93, 190..388, 454..796, 931..1067, 1179..1363
Psehe2p4_012904 1776 1623 1..241, 304..625, 723..1623
Psehe2p4_012912 1777 1647 1..126, 185..514, 587..898, 955..1092, 1141..1647
Psehe2p4_012937 1778 1616 1..65, 111..124, 202..582, 631..649, 699..1018, 1135..1616
Psehe2p4_012986 1779 1083 1..266, 325..598, 673..1083
Psehe2p4_013023 1780 1030 1..385, 439..651, 711..1030
Psehe2p4_013032 1781 1182 1..1182
Psehe2p4_013033 1782 1201 1..371, 424..1201
Psehe2p4_013034 1783 1832 1..467, 513..1050, 1104..1115, 1172..1264, 1333..1525, 1660..1832
Psehe2p4_013053 1784 1821 1..167, 239..322, 425..439, 492..756, 827..1242, 1362..1480, 1571..1821
Psehe2p4_013070 1785 3717 1..444, 551..984, 1053..1246, 1301..1367, 1440..1908, 1955..2540, 2585..3717
Psehe2p4_013071 1786 1434 1..249, 318..515, 563..844, 892..1434
Psehe2p4_013084 1787 924 1..153, 202..924
Psehe2p4_013091 1788 2488 1..604, 651..1348, 1392..1470, 1519..1634, 1679..2488
Psehe2p4_013099 1789 1797 1..1797
Psehe2p4_013175 1790 1947 1..674, 798..1947
Psehe2p4_013271 1791 1724 1..238, 306..797, 847..1724
Psehe2p4_013294 1792 1085 1..33, 82..499, 628..1085
Psehe2p4_013375 1793 873 1..350, 402..769, 824..873
Psehe2p4_013421 1794 1665 1..606, 661..1665
Psehe2p4_013424 1795 1761 1..134, 212..930, 1042..1517, 1591..1761
Psehe2p4_013488 1796 2454 1..92, 144..201, 253..457, 510..648, 707..786, 927..998, 1048..1083, 1131..1417, 1467..1673, 1722..1997, 2051..2105, 2168..2454
Psehe2p4_013508 1797 943 1..461, 518..675, 741..943
Psehe2p4_013512 1798 1203 1..1203
Psehe2p4_013521 1799 1130 1..399, 483..1130
Psehe2p4_013631 1800 2991 1..404, 468..594, 714..831, 885..991, 1166..1380, 1501..1572, 1632..1930, 1985..2991
Psehe2p4_013641 1801 2372 1..337, 384..1034, 1104..2028, 2092..2286, 2336..2372
Psehe2p4_013699 1802 1109 1..406, 471..568, 642..803, 855..1109
Psehe2p4_013729 1803 1471 1..1003, 1116..1471
Psehe2p4_013733 1804 794 1..330, 388..567, 621..794
Psehe2p4_013736 1805 1606 1..99, 165..551, 604..794, 844..1606
Psehe2p4_013741 1806 1508 1..212, 284..392, 446..596, 692..1021, 1075..1267, 1322..1508
Psehe2p4_013788 1807 1770 1..1770
Psehe2p4_013807 1808 1662 1..1662
Psehe2p4_013817 1809 1971 1..48, 97..1971
Psehe2p4_013868 1810 2693 1..292, 364..518, 594..1098, 1150..1293, 1349..1588, 1644..1966, 2040..2693
Psehe2p4_013891 1811 1345 1..236, 290..539, 587..768, 820..1345
Psehe2p4_013915 1812 2563 1..55, 104..368, 414..921, 1039..1322, 1465..1493, 1566..1833, 1860..1999, 2082..2563
Psehe2p4_013958 1813 1013 1..76, 176..265, 359..627, 759..1013
Psehe2p4_013975 1814 1806 1..384, 607..1806
Psehe2p4_013986 1815 2031 1..903, 952..1728, 1786..2031
Psehe2p4_014048 1816 1566 1..73, 119..421, 475..571, 689..1142, 1213..1566
Psehe2p4_014092 1817 1395 1..335, 384..664, 736..894, 956..1395
Psehe2p4_014097 1818 2315 1..120, 343..424, 491..733, 904..1095, 1205..1304, 1388..1788, 1963..2315
Psehe2p4_014109 1819 2741 1..468, 627..949, 1093..1794, 1862..2741
Psehe2p4_014161 1820 2121 1..701, 746..875, 921..1148, 1195..1638, 1693..2121
Psehe2p4_014177 1821 1062 1..82, 132..318, 387..1062
Psehe2p4_014187 1822 1662 1..215, 298..440, 484..1290, 1464..1511, 1568..1662
Psehe2p4_014214 1823 768 1..177, 237..448, 495..768
Psehe2p4_014242 1824 1928 1..225, 276..498, 546..1807, 1857..1928
Psehe2p4_014245 1825 1353 1..242, 291..540, 590..771, 828..1353
Psehe2p4_014311 1826 1134 1..85, 141..213, 267..1134
Psehe2p4_014317 1827 1805 1..296, 345..386, 430..689, 738..849, 947..1541, 1611..1805
Psehe2p4_014335 1828 852 1..238, 351..366, 420..852
Psehe2p4_014386 1829 1375 1..350, 414..510, 560..693, 758..826, 882..1285, 1344..1375
Psehe2p4_014392 1830 1621 1..210, 261..408, 503..521, 574..746, 798..1049, 1186..1359, 1422..1531, 1583..1621
Psehe2p4_014428 1831 1992 1..253, 334..422, 473..827, 886..1040, 1102..1211, 1283..1370, 1426..1540, 1596..1625, 1736..1992
Psehe2p4_014498 1832 1412 1..49, 97..748, 797..1412
Psehe2p4_014533 1833 861 1..861
Psehe2p4_014613 1834 1285 1..385, 645..795, 937..1285
Psehe2p4_014716 1835 1306 1..250, 356..1051, 1107..1306
Psehe2p4_014769 1836 986 1..521, 579..787, 841..986
Psehe2p4_014848 1837 878 1..473, 635..878
Psehe2p4_014872 1838 1713 1..106, 155..336, 384..626, 673..755, 834..952, 1001..1713
Psehe2p4_014874 1839 964 1..151, 203..350, 466..964
Psehe2p4_014979 1840 1062 1..319, 386..594, 665..930, 984..1062
Psehe2p4_014982 1841 1164 1..1164
Psehe2p4_015038 1842 2945 1..223, 275..1240, 1288..2945
Psehe2p4_015055 1843 727 1..266, 319..727
Psehe2p4_015097 1844 2155 1..72, 151..303, 428..639, 711..763, 823..857, 924..1080, 1218..1353, 1476..1566, 1637..2048, 2118..2155
Psehe2p4_015098 1845 1650 1..134, 184..303, 349..488, 540..647, 707..1650
Psehe2p4_015106 1846 1885 1..941, 1000..1885
Psehe2p4_015166 1847 2043 1..172, 221..258, 385..565, 612..656, 720..1031, 1079..1103, 1166..1691, 1789..2043
Psehe2p4_015201 1848 1030 1..245, 295..1030
Psehe2p4_015235 1849 1362 1..458, 513..1362
Psehe2p4_015287 1850 1628 1..255, 314..414, 464..960, 1069..1628
Psehe2p4_015306 1851 1205 1..137, 195..765, 819..1205
Psehe2p4_015332 1852 1123 1..850, 939..1123
Psehe2p4_015386 1853 1611 1..369, 424..656, 702..1611
Psehe2p4_015399 1854 1825 1..253, 310..700, 751..1627, 1679..1735, 1796..1825
Psehe2p4_015402 1855 2686 1..371, 422..510, 603..1002, 1135..2686
Psehe2p4_015409 1856 1068 1..70, 177..240, 299..968, 1033..1068
Psehe2p4_015417 1857 1818 1..1818
Psehe2p4_015493 1858 1804 1..106, 161..607, 674..1243, 1299..1804
Psehe2p4_015540 1859 1300 1..212, 258..420, 473..1300
Psehe2p4_015589 1860 1111 1..101, 155..303, 354..550, 611..1111
Psehe2p4_015597 1861 8143 1..72, 207..294, 608..809, 859..872, 928..1000, 1049..1299, 1460..1511, 1563..1656, 1756..1896, 2041..2141, 2208..2622, 2676..2968, 3018..4261, 4341..4609, 4665..4744, 4797..4876, 4984..5961, 6039..6086, 6134..6485, 6586..6646, 6782..6989, 7099..7130, 7198..7290, 7388..7592, 7645..7676, 7794..7872, 7926..8143
Psehe2p4_015625 1862 1028 1..75, 126..463, 540..821, 893..1028
Psehe2p4_015734 1863 603 1..603
Psehe2p4_015769 1864 1156 1..253, 319..412, 480..862, 930..1156
Psehe2p4_015773 1865 1449 1..97, 153..190, 240..392, 439..541, 595..620, 672..765, 813..841, 888..968, 1030..1266, 1375..1449
Psehe2p4_015785 1866 1770 1..1770
Psehe2p4_015833 1867 1808 1..196, 243..916, 974..1466, 1513..1808
Psehe2p4_015969 1868 1561 1..309, 401..682, 797..1561
Psehe2p4_016049 1869 2690 1..175, 224..697, 750..849, 900..2009, 2060..2152, 2202..2395, 2446..2690
Psehe2p4_016058 1870 2648 1..386, 479..672, 718..868, 916..1759, 1904..1980, 2072..2648
Psehe2p4_016100 1871 2832 1..120, 173..229, 285..376, 450..683, 808..1837, 1887..2132, 2187..2269, 2322..2829
Psehe2p4_016122 1872 1235 1..411, 469..665, 716..937, 1013..1235
Psehe2p4_016142 1873 1959 1..316, 367..497, 551..817, 865..1959
Psehe2p4_016156 1874 2751 1..121, 171..366, 433..490, 545..649, 720..1075, 1151..1210, 1281..1579, 1661..2088, 2207..2238, 2295..2390, 2478..2751
Psehe2p4_016158 1875 540 1..374, 453..540
Psehe2p4_016171 1876 1536 1..62, 138..342, 395..608, 699..1019, 1073..1536
Psehe2p4_016253 1877 2388 1..121, 172..641, 697..2388
Psehe2p4_016278 1878 936 1..166, 217..313, 370..590, 647..936
Psehe2p4_016290 1879 2808 1..130, 181..579, 628..798, 955..1341, 1389..1432, 1483..2808
Psehe2p4_016342 1880 926 1..133, 241..660, 709..926
Psehe2p4_016344 1881 2846 1..264, 317..434, 487..764, 811..983, 1036..1123, 1172..1277, 1334..1913, 2471..2510, 2781..2846
Psehe2p4_016355 1882 789 1..192, 245..465, 528..789
Psehe2p4_016450 1883 2832 1..236, 331..428, 479..552, 600..662, 718..769, 821..888, 939..1095, 1158..1204, 1257..1661, 1716..1745, 1794..1958, 2014..2832
Psehe2p4_016467 1884 2529 1..2529
Psehe2p4_016549 1885 1350 1..218, 269..962, 1024..1350
Psehe2p4_016575 1886 1774 1..118, 170..270, 320..1774
Psehe2p4_016577 1887 1243 1..73, 129..1243
PSEHE_1_00031 1888 844 1..401, 465..685, 765..844
PSEHE_1_00048 1889 1164 1..374, 432..483, 543..968, 1033..1164
PSEHE_1_00052 1890 937 1..183, 233..937
PSEHE_1_00068 1891 1230 1..1230
PSEHE_1_00078 1892 1943 1..125, 182-348, 403-522, 574-829, 946-1072, 1123-1210, 1260-1409, 1469-1730, 1790-1943
PSEHE_1_00090 1893 4984 1..199, 292-499, 578..737, 819-875, 958-1039, 1155-1407, 1468-1533, 1605-1749, 1867-1957, 2081-2178, 2273-2588, 2657-2705, 2812-3137, 3208-3331, 3451-3506, 3673-3840, 3904-4027, 4328-4416, 4599-4984
PSEHE_1_00098 1894 7580 1..246, 296-309, 365-437, 486..736, 897-948, 1000-1093, 1193-1333, 1478-1578, 1645-2059, 2113-2405, 2455-3698, 3778-4046, 4102-4181, 4234-4313, 4421-5398, 5476-5523, 5571-5922, 6023-6083, 6219-6426, 6536-6567, 6635-6727, 6825..7029, 7082..7113, 7231..7309, 7363..7580
PSEHE_1_00116 1895 2778 1..1001, 1096..1183, 1255..2778
PSEHE_1_00136 1896 1594 1..336, 386..553, 605..775, 827..1072, 1133..1270, 1334..1594
PSEHE_1_00167 1897 2924 1..124, 177..296, 348..518, 636..719, 772..861, 920..1013, 1151..1840, 1889..2924
PSEHE_1_00172 1898 2559 1..145, 204..579, 627..2559
PSEHE_1_00176 1899 2817 1..105, 158..214, 270..361, 435..668, 793..1822, 1872..2117, 2172..2254, 2307..2817
PSEHE_1_00180 1900 1693 1..312, 377..503, 559..724, 780..1185, 1263..1580, 1661..1693
PSEHE_1_00185 1901 3498 1..229, 278..1017, 1064..1289, 1368..1868, 1918..1986, 2329..2459, 2519..2613, 2718..3498
PSEHE_1_00207 1902 2029 1..238, 298..551, 653..935, 1002..1175, 1299..2029
PSEHE_1_00214 1903 4270 1..1959, 3407..4270
PSEHE_1_00218 1904 3375 1..162, 212..279, 325..356, 405..1200, 1306..1325, 1413..2456, 2525..3375
PSEHE_1_00231 1905 1100 1-369, 453-1100
PSEHE_1_00237 1906 1759 1..127, 246-1759
PSEHE_1_00239 1907 1384 1..267, 561..706, 840-1034, 1090-1384
PSEHE_1_00268 1908 1344 1..275, 326-498, 548..775, 827..1344
PSEHE_1_00279 1909 1075 1..319, 376-925, 982..1075
PSEHE_1_00282 1910 1102 1..92, 146-294, 345-541, 602-1102
PSEHE_1_00283 1911 1099 1..89, 143-291, 342-538, 599-1099
PSEHE_1_00286 1912 1197 1..61, 107..236, 271..384, 433-554, 602..1197
Psehe2p4_000181 1913 1692 1..1692
Psehe2p4_000245 1914 4134 1..1125, 1173-2002, 2050-2372, 2425-2526, 2613-3413, 3482-4134
Psehe2p4_000633 1915 2180 1..316, 367..2180
Psehe2p4_001093 1916 1466 1..14, 82..164, 220-1466
Psehe2p4_001771 1917 2483 1..244, 291-331, 383..705, 816-1018, 1080-1127, 1186-1509, 1604-1637, 1685-2483
Psehe2p4_001880 1918 2237 1..190, 254-537, 584..740, 788-1252, 1311-1403, 1457-2151, 2208-2237
Psehe2p4_003169 1919 932 1-521, 638-932
Psehe2p4_003301 1920 1896 1..270, 337-842, 897..1896
Psehe2p4_003443 1921 3027 1..198, 259-526, 578-658, 705..735, 792-879, 928-1082, 1130-1694, 1753-2110, 2780-3027
Psehe2p4_003999 1922 685 1-305, 448-685
Psehe2p4_004005 1923 1706 1..203, 256-360, 423-573, 621..752, 808-880, 930-1010, 1063-1706
Psehe2p4_004118 1924 1924 1..192, 245-342, 396-483, 558-599, 648-829, 885-965, 1018-1924
Psehe2p4_004210 1925 2007 1..771, 823-2007
Psehe2p4_004433 1926 1770 1..1770
Psehe2p4_004485 1927 1320 1..1320
Psehe2p4_005051 1928 1335 1..1335
Psehe2p4_005345 1929 2146 1..131, 227-554, 611-1145, 1207-1655, 1724-2146
Psehe2p4_005817 1930 1323 1..383, 459..921, 976..1323
Psehe2p4_005890 1931 2932 1..175, 229..256, 305..474, 608..699, 750..879, 927..1207, 1252..1465, 1512..1604, 1651..2324, 2372..2932
Psehe2p4_006190 1932 2008 1..482, 548..1784, 1886..2008
Psehe2p4_006511 1933 1392 1..1392
Psehe2p4_006676 1934 4313 1..278, 364..564, 644..1203, 1280..1662, 1723..2023, 2537..2609, 3437..4313
Psehe2p4_006754 1935 4785 1..301, 358..550, 621..751, 803..978, 1026..1863, 1913..2651, 2700..4785
Psehe2p4_007154 1936 1851 1..353, 402..1851
Psehe2p4_007693 1937 5280 1..472, 527..626, 673..812, 860..958, 1015..1063, 1115..1254, 1375..3064, 3123..3349, 3400..3413, 3470..3501, 3549..4081, 4130..4218, 4267..5280
Psehe2p4_007787 1938 1125 1..1125
Psehe2p4_007788 1939 1862 1..84, 141..1862
Psehe2p4_007986 1940 1392 1..305, 450..637, 704..1098, 1216..1392
Psehe2p4_008326 1941 1427 1..69, 212..1124, 1237..1427
Psehe2p4_008561 1942 1650 1..373, 427..630, 686..1650
Psehe2p4_008695 1943 2126 1..196, 252..284, 339..598, 649..784, 836..1285, 1335..2050, 2097..2126
Psehe2p4_008967 1944 1717 1..342, 398..847, 900..1438, 1495..1717
Psehe2p4_009175 1945 6306 1..107, 155..239, 356..521, 571..1230, 1349..2283, 2394-2652, 2718-2841, 2919-3891, 3942-4398, 4448-4553, 4666-5309, 5370-5405, 5459-6037, 6113-6306
Psehe2p4_009354 1946 2111 1..765, 818-1138, 1188-1433, 1489-1675, 1744-2111
Psehe2p4_009382 1947 1743 1..252, 320-533, 583-605, 669-1037, 1090-1743
Psehe2p4_009407 1948 1512 1..51, 100-170, 233-360, 450-776, 845-951, 1056-1317, 1373-1512
Psehe2p4_009561 1949 1297 1..213, 268-593, 649-1057, 1115-1180, 1235-1297
Psehe2p4_009889 1950 1698 1..90, 152-371, 449-1698
Psehe2p4_009911 1951 1816 1..219, 295-730, 786-1493, 1548-1816
Psehe2p4_009986 1952 1977 1..201, 309..702, 760-864, 924-967, 1013-1375, 1432-1977
Psehe2p4_010107 1953 1097 1..167, 228-356, 422-1097
Psehe2p4_010176 1954 2001 1..2001
Psehe2p4_010868 1955 6339 1..45, 101-329, 438-627, 678-817, 868-954, 1003-1569, 1623-1970, 2194-2287, 2342-2696, 2841-3159, 3350-3721, 3772-4354, 4405-4538, 4585-5606, 5665-6339
Psehe2p4_011021 1956 1032 1..1032
Psehe2p4_011125 1957 2079 1..130, 198..794, 921..1337, 1395-1979, 2033-2079
Psehe2p4_011128 1958 2026 1..330, 384..450, 498-1164, 1218-1590, 1686-1879, 1933-2026
Psehe2p4_011155 1959 2098 1..269, 317..434, 481-531, 583..764, 812..864, 923-956, 1000-1454, 1509-1741, 1790-2098
Psehe2p4_011217 1960 1323 1-491, 556-1118, 1178-1323
Psehe2p4_011757 1961 2294 1..1293, 1365-2294
Psehe2p4_011892 1962 2034 1..192, 1340-1489, 1708-2034
Psehe2p4_012059 1963 2084 1..183, 239-305, 427..2084
Psehe2p4_012110 1964 1854 1..1410, 1468..1597, 1697..1854
Psehe2p4_012149 1965 2080 1..183, 396..471, 591..929, 977..1475, 1557..1762, 1815..2080
Psehe2p4_012330 1966 5433 1..134, 185..1341, 1386..4580, 4652..5433
Psehe2p4_012434 1967 2638 1..79, 129..462, 509..625, 671..1203, 1269..2036, 2322..2442, 2545..2638
Psehe2p4_012692 1968 1267 1..561, 610..954, 1004..1267
Psehe2p4_012773 1969 2493 1..159, 263..492, 557..797, 956..1185, 1230..1349, 1407..1516, 1640..1762, 1884..2024, 2076..2166, 2232..2493
Psehe2p4_012776 1970 2085 1..193, 243..526, 613..705, 758..1217, 1268..2085
Psehe2p4_012789 1971 1783 1..258, 320..719, 771..942, 1047..1219, 1273..1435, 1525..1783
Psehe2p4_012850 1972 4716 1..39, 93..184, 232..445, 518..594, 705..1813, 1860..2028, 2149..2282, 2330..2522, 2591..2774, 2872..3028, 3083..4716
Psehe2p4_013228 1973 1943 1..90, 155..757, 809..989, 1037..1415, 1465..1620, 1721..1943
Psehe2p4_013234 1974 2104 1..80, 138..258, 313..608, 658..2104
Psehe2p4_013293 1975 3708 1..359, 406..820, 869..1931, 2834..2913, 3025..3708
Psehe2p4_013346 1976 924 1..230, 345..682, 743..924
Psehe2p4_013730 1977 1625 1..408, 466..574, 694..1625
Psehe2p4_013983 1978 1798 1..162, 212..1798
Psehe2p4_014250 1979 1980 1..77, 126..312, 363..1108, 1158..1980
Psehe2p4_015161 1980 2070 1..308, 363..442, 520..835, 966..1402, 1457..2070
Psehe2p4_015225 1981 1670 1..317, 461..1670
Psehe2p4_015389 1982 2664 1..144, 236..472, 566-591, 925-1267, 1385-1482, 1567-2585, 2651-2664
Psehe2p4_015423 1983 996 1..996
Psehe2p4_015455 1984 813 1..28, 80..813
Psehe2p4_015750 1985 2180 1..184, 233-348, 402..572, 623..749, 799-1251, 1307-2076, 2127-2180
Psehe2p4_015831 1986 1917 1..519, 609..1382, 1483-1917
Psehe2p4_015995 1987 408 1..408
Psehe2p4_016051 1988 2308 1..151, 1042-1089, 1146-1436, 1492-1653, 1709-1993, 2067-2308
Psehe2p4_016372 1989 1923 1..132, 204..746, 796-1567, 1658-1923
Psehe2p4_016408 1990 2035 1..180, 285-490, 601-920, 990-2035
Psehe2p4_016534 1991 1946 1..126, 178-598, 661..722, 776-1341, 1408-1619, 1681-1946
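The coordinate lists in the table above can be processed programmatically. The following is a minimal sketch, assuming that each comma-separated range is a 1-based, inclusive exon position on the genomic sequence and that `..` and `-` are equivalent range separators; the function names are illustrative and are not part of the original disclosure.

```python
import re

def parse_exons(coords: str) -> list[tuple[int, int]]:
    """Parse a string such as '1..196, 243..916' into [(1, 196), (243, 916)]."""
    exons = []
    for part in coords.split(","):
        # Accept either '..' or '-' as the range separator.
        start, end = re.split(r"\.\.|-", part.strip())
        exons.append((int(start), int(end)))
    return exons

def coding_length(exons: list[tuple[int, int]]) -> int:
    """Total number of bases covered by the exons (inclusive coordinates)."""
    return sum(end - start + 1 for start, end in exons)

def introns(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Gaps between consecutive exons, as inclusive coordinate pairs."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]

# Row for Psehe2p4_015833 (SEQ ID NO: 1867, length 1808) from the table above.
exons = parse_exons("1..196, 243..916, 974..1466, 1513..1808")
spliced = coding_length(exons)
gaps = introns(exons)
```

Under these assumptions, the spliced length of a row can be checked for divisibility by three as a quick sanity test of a transcribed coordinate list.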
[00232] The present invention is illustrated in further detail by the following non-limiting examples.
EXAMPLES
Example 1: Fermentation of the organism
Materials & Methods
[00233] In general, for each species, starter mycelium was grown in rich medium (either mycological broth or yeast malt broth, the latter being indicated with YM) and then washed with water. The starter was then used to inoculate different liquid media or solid substrates, and the resulting mycelium was used for RNA extraction and library construction.
[00234] The medium recipes and the solid substrates are given below, each with a referenced source where available, together with a table (Table 3) listing the media variations, since in some cases the basic recipes of the referenced source were altered depending on the species grown. This is followed by a summary of the growth conditions used for the specific species in the examples.
A. Mycological broth
Per liter: 10 g soytone, 40 g D-glucose, 1 mL Trace Element solution, Double-distilled water;
Adjust pH to 5.0 with hydrochloric acid (HCl) and bring volume to 1 L with double-distilled water.
Trace Element Solution contains 2 mM Iron(II) sulphate heptahydrate (FeSO4·7H2O), 1 mM Copper(II) sulphate pentahydrate (CuSO4·5H2O), 5 mM Zinc sulphate heptahydrate (ZnSO4·7H2O), 10 mM Manganese sulphate monohydrate (MnSO4·H2O), 5 mM Cobalt(II) chloride hexahydrate (CoCl2·6H2O), 0.5 mM Ammonium molybdate tetrahydrate ((NH4)6Mo7O24·4H2O), and 95 mM Hydrochloric acid (HCl) dissolved in double-distilled water.
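The molar concentrations listed for the Trace Element Solution can be converted into weighed masses as sketched below. This is an illustration only: the molar masses are approximate standard reference values that do not appear in the original text, and the final concentrations assume the 1 mL per liter addition given in the medium recipes.

```python
# Approximate molar masses in g/mol (standard reference values, NOT from the source).
MOLAR_MASS = {
    "FeSO4.7H2O": 278.01,
    "CuSO4.5H2O": 249.69,
    "ZnSO4.7H2O": 287.56,
    "MnSO4.H2O": 169.02,
    "CoCl2.6H2O": 237.93,
    "(NH4)6Mo7O24.4H2O": 1235.86,
}

# Stock concentrations in mM, as listed for the Trace Element Solution.
STOCK_MM = {
    "FeSO4.7H2O": 2.0,
    "CuSO4.5H2O": 1.0,
    "ZnSO4.7H2O": 5.0,
    "MnSO4.H2O": 10.0,
    "CoCl2.6H2O": 5.0,
    "(NH4)6Mo7O24.4H2O": 0.5,
}

DILUTION = 1000.0  # 1 mL of stock added per 1 L of medium

def grams_per_liter(conc_mm: float, molar_mass: float) -> float:
    """Mass of salt (g) needed per liter of stock at the given mM concentration."""
    return conc_mm / 1000.0 * molar_mass

# Grams of each salt per liter of stock solution.
stock_g_per_l = {s: grams_per_liter(STOCK_MM[s], MOLAR_MASS[s]) for s in STOCK_MM}

# Final concentration of each salt in the medium, in micromolar.
final_um = {s: STOCK_MM[s] * 1000.0 / DILUTION for s in STOCK_MM}

for salt in STOCK_MM:
    print(f"{salt}: {stock_g_per_l[salt]:.3f} g/L stock, {final_um[salt]:.1f} uM in medium")
```

On this reading, the base medium would contain about 10 μM manganese sulphate and 1 μM copper sulphate from the trace element stock.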
B. Yeast-Malt broth (YM)
(Reference: ATCC medium No. 200)
Per liter: 3 g yeast extract, 3 g malt extract, 5 g peptone, 10 g D-glucose, Double-distilled water to 1 L.
C. Trametes Defined Medium (TDM)
(Reference: Reid and Paice, "Effect of Residual lignin type and amount on biological bleaching of kraft pulp by Trametes versicolor". Applied and Environmental Microbiology 60: 1395-1400, 1994.)
Per liter: 10 g D-glucose, 0.75 g L-Asparagine monohydrate, 0.68 g Potassium phosphate monobasic (KH2PO4), 0.25 g Magnesium sulphate heptahydrate (MgSO4·7H2O), 15 mg Calcium chloride dihydrate (CaCl2·2H2O), 100 μg Thiamine hydrochloride, 1 mL Trace Element solution, 0.5 g Tween™ 80, Double-distilled water;
Adjust pH to 5.5 with 3 M potassium hydroxide and bring volume to 1 L with double-distilled water.
Table 3. Variations of TDM media used for library construction
Variation Description
TDM-1 Medium was prepared as in basic recipe described above.
TDM-2 Quantity of asparagine monohydrate was reduced to 0.15 g.
TDM-3 Manganese sulphate monohydrate was omitted from the medium.
TDM-4 The quantity of manganese sulphate monohydrate was raised to 0.2 mM final concentration in the medium.
TDM-5 The quantity of copper (II) sulphate pentahydrate was raised to 20 μΜ.
TDM-6 Glucose was replaced with 10 g per liter of cellulose (Solka-Floc, 200FCC)
TDM-7 Glucose was replaced with 10 g per liter of xylan from birchwood (Sigma Cat. # X-0502)
TDM-8 Glucose was replaced with 10 g per liter of wheat bran1.
TDM-9 Glucose was replaced with 10 g per liter of citrus pectin (Sigma Cat. # P-9135).
TDM-10 Tween™ 80 was omitted from the medium.
TDM-11 The double-distilled water was replaced with Whitewater2 collected from peroxide bleaching (which occurs during the manufacture of fine paper).
TDM-12 The double-distilled water was replaced with Whitewater2 collected from newsprint manufacture.
TDM-13 Glucose was replaced with 5 g per liter of ground hardwood kraft pulp3.
TDM-14 The medium's pH was raised to 7.5.
TDM-15 The strain was incubated at 5°C above its optimum growth temperature.
TDM-16 The strain was incubated at 10°C below its optimum growth temperature.
TDM-17 One half of the double-distilled water was replaced with Whitewater from newsprint manufacture. Glucose was omitted.
TDM-18 Potassium phosphate monobasic was replaced with 5 mM phytic acid from rice (Sigma Cat. # P3168).
TDM-19 Asparagine monohydrate was increased to 4 g per liter.
TDM-20 Asparagine monohydrate was increased to 4 g per liter and glucose was replaced with 2% fructose.
TDM-21 Asparagine monohydrate was increased to 4 g per liter; 100 mL of double-distilled water was replaced with 100 mL kerosene4. Glucose was omitted.
TDM-22 Asparagine monohydrate was increased to 4 g per liter; 100 mL of double-distilled water was replaced with 100 mL hexadecane (Sigma cat. # H0255). Glucose was omitted.
TDM-23 Asparagine monohydrate was increased to 4 g per liter; one half of the double-distilled water was replaced with 25% Whitewater from newsprint manufacture plus 25% white water from peroxide bleaching. Glucose was omitted.
TDM-24 Asparagine monohydrate was increased to 4 g per liter and the quantity of manganese sulphate monohydrate was raised to 0.2 mM final concentration in the medium.
TDM-25 Asparagine monohydrate was increased to 4 g per liter and manganese sulphate monohydrate was omitted from the medium.
TDM-26 Asparagine monohydrate was increased to 4 g per liter; and potassium phosphate monobasic was replaced with 5 mM phytic acid from rice (Sigma Cat. # P3168).
TDM-27 Glucose was replaced with 10g per liter of olive oil (Sigma cat. # 01514)
TDM-28 One half of the double-distilled water was replaced with Whitewater from peroxide bleaching. Glucose was omitted.
TDM-29 Glucose was replaced with 10 g per liter of tallow.
TDM-30 Glucose was replaced with 10 g per liter of yellow grease.
TDM-31 Glucose was replaced with 10 g per liter of defined lipid (Sigma cat. # L0288).
TDM-32 Glucose was replaced with 50 g per liter of D-xylose.
TDM-33 Glucose was replaced with 20 g per liter of glycerol and 20ml per liter of ethanol.
TDM-34 Glucose was reduced to 1 g per liter and 10 g per liter of bran was added.
TDM-35 Glucose was reduced to 1g per liter and 10 g per liter of pectin (Sigma Cat. # P-9135) was added.
TDM-36 Glucose was replaced with 10 g per liter of biodiesel.
TDM-37 Glucose was replaced with 10 g per liter of soy feedstock.
TDM-38 Glucose was replaced with 10g per liter of locust bean gum (Sigma cat # G0753).
TDM-39 One half of double-distilled water was replaced with a 1:1 ratio of Whitewater from newsprint manufacture and white water from peroxide bleaching. Glucose was omitted.
TDM-40 The medium's pH was raised to 8.5.
TDM-41 One half of double-distilled water was replaced with Whitewater from peroxide bleaching; plus yeast extract was added to 1 g per liter. Glucose was omitted.
TDM-42 Glucose was replaced with 5 g per liter of yellow grease and 5 g per liter of soy feedstock
TDM-43 Glucose was replaced with 20g per liter of fructose.
TDM-44 Glucose was replaced with 10 g per liter of cellulose (Solka-Floc, 200FCC) plus 1 g per liter of sophorose.
TDM-45 The medium's pH was raised to 8.84.
1 Food grade wheat bran sourced from the supermarket was used.
2 All Whitewaters were sourced from Quebec paper mills by PAPRICAN on the Applicant's behalf.
3 Hardwood kraft pulp was sourced from Quebec paper mills by PAPRICAN on the Applicant's behalf.
4 Kerosene was sourced from a general hardware store.
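Because each entry in Table 3 is described as a change to the basic TDM recipe, the variations can be represented as overrides applied to a base dictionary. The sketch below is illustrative only: the key names and the small subset of variations shown are choices made for this example, and replacing a carbon source is modelled here as setting glucose to zero and adding a new entry.

```python
# Basic TDM recipe per liter (values from the recipe above; key names are illustrative).
BASE_TDM = {
    "D-glucose_g": 10.0,
    "L-asparagine_monohydrate_g": 0.75,
    "KH2PO4_g": 0.68,
    "MgSO4.7H2O_g": 0.25,
    "CaCl2.2H2O_mg": 15.0,
    "thiamine_HCl_ug": 100.0,
    "trace_element_solution_mL": 1.0,
    "Tween80_g": 0.5,
    "pH": 5.5,
}

# A few variations from Table 3, expressed as overrides of the base recipe.
VARIATIONS = {
    "TDM-1": {},
    "TDM-2": {"L-asparagine_monohydrate_g": 0.15},
    "TDM-10": {"Tween80_g": 0.0},
    "TDM-14": {"pH": 7.5},
    "TDM-32": {"D-glucose_g": 0.0, "D-xylose_g": 50.0},
}

def recipe(name: str) -> dict:
    """Return a new dict: the base recipe with the named variation applied."""
    result = dict(BASE_TDM)  # copy so the base recipe is never modified
    result.update(VARIATIONS[name])
    return result

tdm2 = recipe("TDM-2")
tdm32 = recipe("TDM-32")
```

Variations that change incubation temperature or substitute whitewater for water (for example TDM-15 or TDM-11) would need additional fields and are left out of this sketch.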
D. Asparagine Salts Medium (AS):
(Reference: Ikeda et al., "Laccase and Melanization in Clinically Important Cryptococcus Species Other Than Cryptococcus neoformans". Journal of Clinical Microbiology 40: 1214-1218, 2002.)
Per liter: 3.0 g D-glucose, 1.0 g L-Asparagine monohydrate, 3.0 g KH2PO4, 0.5 g MgSO4·7H2O, 1 mg Thiamine.
Table 4: Variations of AS media used for library construction
E. Solid substrates used:
SS-1 5 g Wheat Bran.
SS-2 5 g Wheat bran plus 5 mL defined lipid.
SS-3 5 g Oat bran (food grade, sourced from supermarket).
[00235] The Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides strains were each grown according to the methods described above under the following growth conditions: TDM-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -13, -14, -15, -39; YM, whereby the following optimal growth temperature was used: 25°C.
[00236] The strains carrying the recombinant genes were grown according to the methods described above under the following growth conditions: minimal medium as described in Kafer et al., (1977, Adv. Genet. 19:33-131) except that the salt concentrations were raised ten-fold and the glucose concentration was 150 grams per liter, at 30°C.
Example 2: Genome sequencing and assembly
[00237] Genomic DNA was isolated from mycelium when the growth culture had reached the mid log phase. Genomic DNA was sequenced using the Roche 454 Titanium technology (http://www.454.com) to a genome coverage of over 20-fold according to the instructions of the manufacturer. The sequences were assembled using the Newbler and Celera assemblers (http://sourceforge.net/apps/mediawiki/wgs-assembler).
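The "over 20-fold" coverage figure relates total sequenced bases to genome size. A minimal sketch of that arithmetic follows; the genome size and mean read length used in the example are hypothetical placeholders, not values from the original text.

```python
import math

def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average coverage = total sequenced bases / genome size."""
    return total_bases / genome_size

def reads_needed(genome_size: int, target_coverage: float, mean_read_length: int) -> int:
    """Number of reads required to reach the target average coverage."""
    return math.ceil(genome_size * target_coverage / mean_read_length)

# Hypothetical example: a 40 Mb genome sequenced with 400-base reads.
GENOME_SIZE = 40_000_000   # assumption, for illustration only
MEAN_READ_LENGTH = 400     # assumption, for illustration only

n_reads = reads_needed(GENOME_SIZE, 20, MEAN_READ_LENGTH)
coverage = fold_coverage(n_reads * MEAN_READ_LENGTH, GENOME_SIZE)
```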
Example 3: Building the cDNA libraries
[00238] Total RNA was isolated from fungal cells or mycelia when the growth cultures had reached the late log phase. The mycelia were collected by filtration through Miracloth and washed with water by filtration. The mycelia were patted dry using paper towels, frozen in liquid nitrogen, and stored at -80°C. To extract total RNA, the frozen mycelia or cells were ground to a fine powder in liquid nitrogen using a mortar and pestle. Approximately 1-1.5 g of frozen fungal powder was dissolved in 10 mL of TRIzol® reagent and RNA was extracted according to the manufacturer's protocol (Invitrogen Life Sciences, Cat. #15596-018). Following extraction, the RNA was dissolved in DEPC-treated water at 1-1.5 mg/mL.
[00239] The PolyATtract® mRNA Isolation Systems (Promega, Cat. #Z5300) was used to isolate poly(A)+RNA. In general, equal amounts of total RNA extracted from up to ten culture conditions were pooled. One milligram of total RNA was used for isolation of poly(A)+RNA according to the protocol provided by the manufacturer. The purified poly(A)+RNA was dissolved in DEPC-treated water at 200-500 μg/mL.
[00240] Five micrograms of poly(A)+RNA were used for the construction of the cDNA library. Double-stranded cDNA was synthesized using the ZAP-cDNA® Synthesis Kit (Stratagene, Cat. #200400) according to the manufacturer's protocol with the following modifications. An anchored oligo(dT) linker-primer was used in the first-strand synthesis reaction to force the primer to anneal to the beginning of the poly(A) tail of the mRNA. The anchored oligo(dT) linker-primer has the sequence:
5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTTVN-3' (SEQ ID NO: 3040)
where V is A, C, or G and N is A, C, G, or T. A second modification was made by adding trehalose at a final concentration of 0.6 M and betaine at a final concentration of 2 M in the buffer of the first-strand synthesis reaction to promote full-length synthesis. Following synthesis and size fractionation, fractions of double-stranded cDNA with sizes longer than 600 bp were pooled. The pooled cDNA was cloned directionally into the plasmid vector BlueScript KS+® (Stratagene) or a modified BlueScript KS+ vector that contained Gateway® (Invitrogen) recombination sites. The cDNA library was transformed into E. coli strain XL10-Gold ultracompetent cells (Stratagene, Cat. #Z00315) for propagation.
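The degenerate positions V and N at the 3' end of the linker-primer mean that the oligonucleotide is a mixture of distinct sequences. The sketch below enumerates them using the base definitions given above (V = A, C, or G; N = A, C, G, or T); the helper name `expand` is illustrative.

```python
from itertools import product

# Degenerate base codes as defined in the text for this primer.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}

# Anchored oligo(dT) linker-primer, SEQ ID NO: 3040, written 5' to 3'.
PRIMER = "GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTTVN"

def expand(seq: str) -> list[str]:
    """List every concrete sequence represented by a degenerate sequence."""
    return ["".join(bases) for bases in product(*(CODES[b] for b in seq))]

variants = expand(PRIMER)
print(len(variants))
```

Because the penultimate base is never T, none of the variants can pair with adenines at both of the last two positions, which is consistent with the stated purpose of anchoring the primer at the start of the poly(A) tail.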
[00241] Bacterial cells carrying cDNA clones were grown on LB agar containing the antibiotic ampicillin, to select for plasmid-bearing bacteria, and X-gal and IPTG, to screen for the presence of cDNA inserts using the blue/white system. The white bacterial colonies, those carrying cDNA inserts, were transferred by a colony-picking robot to 384-well MTP for replication and storage. Clones that were to be analyzed by sequencing were transferred to 96-well deep blocks using liquid-handling robots. The bacteria were cultured at 37°C with shaking at 150 rpm. After 24 hours of growth, plasmid DNA from the cDNA clones was prepared by alkaline lysis and sequenced from the 5' end using ABI 3730xl DNA analyzers (Applied Biosystems). The chromatograms obtained following single-pass sequencing of the cDNA clones were processed using Phred (available at http://www.phrap.org) to assign sequence quality values, Lucy as described in Chou and Holmes (2001, Bioinformatics, 17(12): 1093-1104) to remove vector and low quality sequences, and Phrap (available at http://www.phrap.org/) to assemble overlapping sequences derived from the same gene into contigs.
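Phred quality values are defined on a logarithmic scale, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The sketch below shows that conversion together with a deliberately simplified quality trim that keeps the longest run of bases at or above a threshold. The trimming routine is an illustration only and is not the algorithm implemented by Lucy; the threshold of 20 is an assumption.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality value to an error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def longest_high_quality_run(quals: list[int], min_q: int = 20) -> tuple[int, int]:
    """Return (start, end) of the longest contiguous run with quality >= min_q.

    The interval is half-open, so read[start:end] is the retained region.
    """
    best = (0, 0)
    run_start = None
    # Append a sentinel below any valid quality so a trailing run is closed.
    for i, q in enumerate(list(quals) + [-1]):
        if q >= min_q:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return best

# Hypothetical quality values for a short read.
example_quals = [5, 30, 35, 40, 10, 25, 25]
kept = longest_high_quality_run(example_quals)
```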
Example 4: Annotations
[00242] An in-house automated annotation pipeline was used to predict genes in the assembled genome sequence. The analysis pipeline relied in part on the ab initio tool Genemark® (http://exon.biology.gatech.edu/) for prediction. It also used the predictor Augustus (http://augustus.gobics.de/) trained on de novo assembled sequences and orthologous sequences for gene finding. Sequence similarity searches against the mycoCLAP® (http://cubique.fungalgenomics.ca/mycoCLAP/) and NCBI non-redundant databases were performed with BLASTX as described in Altschul et al., (1997) (Nucleic Acids Res. 25(17): 3389-3402). Biomass-degrading enzymes possess conserved domains. We used the domains available at the European Bioinformatics Institute (www.ebi.ac.uk/Tools/InterProScan/) to assist in the identification of target enzymes.
[00243] Proteins targeted to the extracellular space by the classical secretory pathway possess an N-terminal signal peptide, composed of a central hydrophobic core surrounded by N- and C-terminal hydrophilic regions. We used Phobius (available at http://phobius.cgb.ki.se) and SignalP® version 3 (available at http://www.cbs.dtu.dk/services/SignalP) to recognize the presence of signal peptides encoded by the cDNA clones. The tools TargetP® (available at http://www.cbs.dtu.dk/services/TargetP) and Big-PI Fungal Predictor (available at http://mendel.imp.ac.at/gpi/fungi_server.html) were used to remove sequences that encode proteins which are targeted to the mitochondria or bound to the cell wall. Finally, sequences predicted to encode soluble secreted proteins by these automated tools were analyzed manually. Clones that comprise full-length cDNAs which are predicted to encode soluble secreted proteins were sequenced completely. For genes identified from the genome sequence, oligonucleotide primers specific to the target genes were designed and used to PCR-amplify the target genes from double-stranded cDNA or genomic DNA. The PCR-amplified products were cloned into an appropriate expression vector for protein production in host cells. The genomic, coding and polypeptide sequences were assigned SEQ ID NOs, annotations, general functions, protein activities, and CAZy family classifications, as summarized in Tables 1A-1C. Where appropriate, carbohydrate-binding modules (CBMs) of particular interest for the degradation of biomass were also listed in Tables 1A-1C.
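The selection logic described in this paragraph (keep sequences with a predicted signal peptide, then remove those predicted to be mitochondrial or cell-wall bound) can be sketched as a simple filter. Everything below is hypothetical scaffolding: the record fields stand in for results already parsed from the prediction tools, and whether a signal peptide call from one tool or from both was required is not stated in the text, so it is exposed as a parameter.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    """Hypothetical per-sequence summary of the tool outputs."""
    clone_id: str
    signal_peptide_phobius: bool
    signal_peptide_signalp: bool
    mitochondrial: bool   # e.g. from a TargetP-style prediction
    gpi_anchored: bool    # e.g. from a GPI-anchor prediction

def is_candidate_secreted(p: Prediction, require_both: bool = False) -> bool:
    """True if the sequence passes the automated filters for a soluble secreted protein."""
    if require_both:
        has_signal = p.signal_peptide_phobius and p.signal_peptide_signalp
    else:
        has_signal = p.signal_peptide_phobius or p.signal_peptide_signalp
    return has_signal and not p.mitochondrial and not p.gpi_anchored

# Hypothetical records, for illustration only.
records = [
    Prediction("clone_A", True, True, False, False),
    Prediction("clone_B", True, False, False, True),
    Prediction("clone_C", False, False, False, False),
    Prediction("clone_D", False, True, False, False),
]
candidates = [r.clone_id for r in records if is_candidate_secreted(r)]
strict = [r.clone_id for r in records if is_candidate_secreted(r, require_both=True)]
```

Sequences that pass such a filter would still be subject to the manual analysis described above.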
Example 5: Assays for characterization of polypeptides
[00244] Polypeptides of the present invention may be additionally cloned into an expression vector, expressed and characterized (e.g., in sugar release assays) for activity relating to their ability to break down and/or process biomass as described in WO/2012/92676, WO/2012/130950, and WO/2012/130964 using appropriate substrates (e.g., acid pre-treated corn stover, hot water treated washed wheat straw, or hot water treated washed corn fiber substrate). Soluble sugars that are released can be analyzed for example by proton NMR.
[00245] A number of assays may be used to characterize the polypeptides of the present invention. Selected non-limiting examples of such assays are described and/or referenced below. Of course, other assays not explicitly mentioned or referenced here may also be used, and the expression "can be" used below is intended to reflect this possibility. Furthermore, a person of skill in the art would be able to modify or adapt these and other assays, as necessary, to characterize a particular polypeptide.
• Acetylxylan esterase CE5. Polypeptides of the present invention having this activity can be characterized as described in Water et al., Appl Environ Microbiol. (2012), 78(10): 3759-62; or Yang et al., International Journal of Molecular Sciences (2010), 11 (12): 5143-5151.
• Adhesin protein Mad1. Polypeptides of the present invention having this activity can be characterized for example as described in Wang and St Leger, Eukaryot. Cell (2007), 6(5): 808-816.
• Adhesin. Polypeptides of the present invention having this activity (reviewed in Dranginis et al., Microbiology and Molecular Biology Reviews (2007), 71 (2): 282-294) can be characterized using techniques well known in the art (e.g. adhesion assays).
• Aldose 1-epimerase (mutarotase, aldose mutarotase). Polypeptides of the present invention having this activity can be characterized as described in Timson and Reece, FEBS Letters (2003), 543(1-3): 21-24; and Villalobo et al., Exp. Parasitol. (2005) 110(3): 298-302.
• Allergen Asp f 15. Polypeptides of the present invention having this activity can be characterized as described in Bowyer et al., Medical Mycology (2007), 45(1): 17-26.
• Alpha-arabinofuranosidase. Polypeptides of the present invention having this activity can be characterized for example as described by Poutanen et al. (Appl. Microbiol. Biotechnol. 1988, 28, 425-432) using 5 mM p-nitrophenyl alpha-L-arabinofuranoside as substrate. The reactions may be carried out in 50 mM citrate buffer at pH 6.0, 40°C with a total reaction time of 30 min. The reaction is stopped by adding 0.5 ml of 1 M sodium carbonate and the liberated p-nitrophenol is measured at 405 nm. Activity is expressed in U/ml. Furthermore, arabinofuranosidases may also be useful in animal feed compositions to increase digestibility. Corn arabinoxylan is heavily di-substituted with arabinose. In order to facilitate xylan degradation, it is advantageous to remove as many of the arabinose substituents as possible. The in vitro degradation of arabinoxylans in a corn-based diet supplemented with a polypeptide of the present invention having alpha-arabinofuranosidase activity and a commercial xylanase can be studied in an in vitro digestion system, as described in WO/2006/114094.
• Alpha-fucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,637,490; in Zielke et al., J. Lab. Clin. Med. (1972), 79:164; or using commercially available kits (e.g., Alpha-L-Fucosidase (AFU) Assay Kit, Cat. No. BQ082A-EALD, BioSupplyUK).
• Alpha-galactosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2010/0273235 A1. Briefly, a synthetic substrate, 4-nitrophenyl-α-D-galactoside, is used and the release of p-nitrophenol is followed at a wavelength of 405 nm in a reaction buffer containing 100 mM sodium phosphate, 50 mM sodium chloride, pH 6.8 at 26°C.
• Alpha-glucuronidase GH67. Polypeptides of the present invention having this activity can be characterized for example as described in Lee et al., J Ind Microbiol Biotechnol. (2012), 39(8): 1245-51; or Nagy et al., J. Bacteriol. (2002), 184: 4925-4929.
• Aminopeptidase Y. Polypeptides of the present invention having this activity can be characterized for example as described in Yasuhara et al., J. Biol. Chem. (1994) 269(18): 13644-50.
• Arabinogalactanase. Polypeptides of the present invention having this activity can be characterized for example as described in Yamamoto and Emi, Methods in Enzymology (1988), 160: 719-725.
• Arabinoxylan arabinofuranohydrolase (AXH) GH43. Polypeptides of the present invention having this activity can be characterized for example as described in Yoshida et al., Journal of Bacteriology (2010), 192(20): 5424-5436.
• Arabinoxylan arabinofuranosidase GH62. Polypeptides of the present invention having this activity can be characterized for example as described in Sakamoto et al., Applied Microbiology and Biotechnology (2011), 90(1): 137-146.
• Aspartic protease. Polypeptides of the present invention having this activity can be characterized for example as described in Tacco et al., Med. Mycol. (2009), 47(8): 845-854; or in Hu et al., Journal of Biomedicine and Biotechnology (2012), 2012:728975.
• Aspartic-type endopeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Tjalsma et al., J. Biol. Chem. (1999), 274: 28191-28197.
• Aspergillopepsin-2. Polypeptides of the present invention having this activity can be characterized for example as described in Huang et al., Journal of Biological Chemistry (2000), 275(34): 26607-14.
• Avenacinase. Polypeptides of the present invention having this activity can be characterized for example as described in Kwak et al., Phytopathology (2010), 100(5): 404-14; or in Bowyer et al., Science (1995), 267(5196): 371-4.
• Beta-galactosidase. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (e.g., β-Galactosidase Enzyme Assay System with Reporter Lysis Buffer, Cat. No. E2000, Promega).
• Beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication number US 2012/0023626 A1; or in US patent No. 8,309,338.
• Beta-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO/2007/019442; or by using a commercially available kit (e.g., Beta-Glucosidase Assay Kit, Cat. No. KA1611, Abnova Corp).
• Beta-glucuronidase GH79. Polypeptides of the present invention having this activity can be characterized for example as described in Eudes et al., Plant Cell Physiology (2008), 49(9): 1331-41; or Michikawa et al., Journal of Biological Chemistry (2012), 287: 14069-14077.
• Beta-mannanase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application No. EP 2261359 A1; or in PCT application publication No. WO2008009673A2.
• Beta-mannosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Park et al., N. Biotechnol. (2011), 28(6): 639-48; Duffaud et al., Appl Environ Microbiol. (1997), 63(1): 169-77; or in Fliedrova et al., Protein Expr Purif. (2012), 85(2): 159-64.
• Beta-xylosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Wagschal et al., Applied and Environmental Microbiology (2005), 71(9): 5318-5323; or Shao et al., Appl Environ Microbiol. (2011), 77(3): 719-726.
• Bifunctional xylanase/deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in Cepeljnik et al., Folia Microbiol. (2006), 51(4): 263-267; US patent application publication No. US 2012/0028306 A1; US patent No. 7,759,102; or PCT application publication No. WO 2006/078256 A2; or Grozinger and Schreiber, Chem Biol. (2002), 9(1): 3-16.
• Carbohydrate-binding cytochrome. Polypeptides of the present invention having this activity can be characterized for example as described in Yoshida et al., Appl Environ Microbiol. (2005) 71(8): 4548-4555.
• Carboxypeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2007/0160711 A1; or in PCT application publication No. WO 1998/014599A1.
• Cellobiohydrolase GH6. Polypeptides of the present invention having this activity can be characterized for example as described in Takahashi et al., Applied and Environmental Microbiology (2010), 76(19): 6583-6590.
• Cellobiohydrolase GH7. Polypeptides of the present invention having this activity can be characterized for example as described in Segato et al., Biotechnology for Biofuels (2012), 5:21; or Baumann et al., Biotechnol. for Biofuels (2011), 4:45.
• Cellobiose dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Schou et al., Biochem. J. (1998), 330: 565-571; or Baminger et al., J. Microbiol Methods. (1999), 35(3): 253-9.
• Chitin deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application No. EP 0610320 B1.
• Chitinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 7,087,810.
• Chitooligosaccharide deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in John et al., Proc Natl Acad Sci USA (1993), 90(2): 625-9.
• Chitotriosidase-1. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 6,057,142.
• Cholinesterase. Polypeptides of the present invention having this activity can be characterized for example as described in Abass Askar et al., Canadian Journal Veterinary Research (2011), 75(4): 261-270; or Catia et al., PLoS One (2012), 7(3): e33975.
• Cutinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0028318 A1 ; or in Chen et al., J. Biol Chem. (2008), 283(38): 25854-62.
• Cytochrome P450. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (e.g., P450-Glo™ Assays, Promega); or as described in Walsky and Obach, Drug Metab Dispos. (2004), 32(6): 647-60.
• Dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Mayer and Arnold, J. Biomol. Screen. (2002), 7(2): 135-140.
• Endo-1,3(4)-beta-glucanase (laminarinase). Polypeptides of the present invention having this activity can be characterized for example as described in Akiyama et al., J Plant Physiol. (2009), 166(16): 1814- 25; or Hua et al., Biosci Biotechnol Biochem. (2011), 75(9): 1807-12.
• Endo-1,4-beta-xylanase. Polypeptides of the present invention having this activity can be characterized for example as described in Song et al., Enzyme and Microbial Technology (2013). 52(3): 170-176.
• Endo-1,5-alpha-arabinanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent publication No. US 2012/0270263. More particularly, this assay of arabinase activity is based on colorimetric determination of the resulting increase in reducing groups using a 3,5-dinitrosalicylic acid reagent. Enzyme activity can be calculated from the relationship between the concentration of reducing groups, as arabinose equivalents, and absorbance at 540 nm. The assay is generally carried out at pH 3.5, but it can be performed at different pH values for the additional characterization and specification of enzymes. Polypeptides of the present invention having this activity can also be characterized for example as described in Hong et al., Biotechnol Lett. (2009), 31(9): 1439-43.
• Endo-1,6-beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Bryant et al., Fungal Genet Biol. (2007), 44(8): 808-17; or in Oyama et al., Biosci Biotechnol Biochem. (2006), 70(7): 1773-5.
• Endochitinase. Polypeptides of the present invention having this activity can be characterized for example as described in Wen et al., Biotechnol. Applied Biochem. (2002), 35: 213-219.
• Endoglucanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 8,063,267.
• Endoglycoceramidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,795,765; or US patent application publication No. US 2009/0170155 A1.
• Endo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application publication Nos. EP1614748 A1 and EP1114165 A1.
Endo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 1994/014952 A1 ; or in European patent application publication No. EP1614748 A1.
Endo-rhamnogalacturonase GH28. Polypeptides of the present invention having this activity can be characterized for example as described in Sprockett et al., Gene (2011), 479(1-2): 29-36; or An et al., Carbohydrate Research (1994), 264(1): 83-96.
Exo-1,3-beta-galactanase GH43. Polypeptides of the present invention having this activity can be characterized for example as described in Ichinose et al., Appl Environ Microbiol. (2006), 72(5): 3515-3523.
Exo-1,3-beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in O'Connell et al., Appl Microbiol Biotechnol. (2011), 89(3): 685-96; or Santos et al., J Bacteriol. (1979), 139(2): 333-338.
Exo-1,4-beta-xylosidase. Polypeptides of the present invention having this activity can be characterized for example as described in La Grange et al., Applied and Environmental Microbiology (2001), 67(12): 5512-5519.
Exo-arabinanase. Polypeptides of the present invention having this activity can be characterized for example as described in Tatsuji Sakamoto and Thibault, Appl Environ Microbiol. (2001), 67(7): 3319-3321.
Exoglucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Creuzet et al., FEMS Microbiology Letters (1983), 20(3): 347-350; or Kruus et al., Journal of Bacteriology (1995), 177(6): 1641-1644.
Exo-glucosaminidase GH2. Polypeptides of the present invention having this activity can be characterized for example as described in Tanaka et al., Journal of Bacteriology (2003), 185(17): 5175- 5181.
Exo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in Dong and Wang, BMC Biochem. (2011), 12: 51.
Exo-rhamnogalacturonase GH28. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,811,291.
Expansin. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 2005/030965 A2; or in US patent No. 7,001,743.
Expansin-like protein 1. Polypeptides of the present invention having this activity can be characterized for example as described in Lee et al., Molecules and Cells (2010), 29(4): 379-85.
Feruloyl esterase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 2009/076122 A1.
Galactanase GH5. Polypeptides of the present invention having this activity can be characterized for example as described in Ichinose et al., Applied and Environmental Microbiology (2008), 74(8): 2379-2383.
Gamma-glutamyltranspeptidase 2. Polypeptides of the present invention having this activity can be characterized for example as described in Rossi et al., PLoS One (2012), 7(2): e30543.
Glucan 1,3-beta-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Boonvitthya et al., Biotechnol Lett (2012), 34(10): 1937-43.
Glycosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 8,119,383.
Hephaestin-like protein 1. Polypeptides of the present invention having this activity can be characterized for example as described for oxidoreductases.
Hexosaminidase. Polypeptides of the present invention having this activity can be characterized for example as described in Wendeler and Sandhoff, Glycoconj J. (2009), 26(8):945-952.
Hydrophobin. Polypeptides of the present invention having this activity can be characterized for example as described in Bettini et al., Canadian Journal of Microbiology (2012), 58(8): 965-972; or Niu et al., Amino Acids. (2012), 43(2): 763-71.
Iron transport multicopper oxidase FET3. Polypeptides of the present invention having this activity can be characterized for example as described in Askwith et al., Cell (1994), 76: 403-10; or De Silva et al., J. Biol. Chem. (1995) 270: 1098-1101.
Laccase. Polypeptides of the present invention having this activity can be characterized for example as described in Dedeyan et al., Appl Environ Microbiol. (2000), 66(3): 925-929.
Laminarinase GH55. Polypeptides of the present invention having this activity can be characterized for example as described in Ishida et al., J Biol Chem. (2009), 284(15): 10100-10109; or Kawai et al., Biotechnol Lett. (2006), 28(6): 365-71.
L-Ascorbate oxidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent Nos. 5,612,208 and 5,180,672.
L-carnitine dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Aurich et al., Biochim Biophys Acta. (1967), 139(2): 505-7; or US patent No. 5,156,966.
Leucine aminopeptidase 1. Polypeptides of the present invention having this activity can be characterized for example as described in Beattie et al., Biochem. J. (1987), 242: 281-283.
Licheninase (beta-D-glucan 4-glucanohydrolase). Polypeptides of the present invention having this activity can be characterized for example as described in Tang et al., J Agric Food Chem. (2012), 60(9): 2354-61.
Lipase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent Nos. 7,662,602 and 7,893,232.
L-sorbosone dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Shinjoh et al., Applied and Environment Microbiology (1995), 61 (2): 413-420.
Lysophospholipase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,965,422.
Metallocarboxypeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Tayyab et al., J Biosci Bioeng. (2011), 111(3): 259-65; or Song et al., J Biol Chem. (1997), 272(16): 10543-50.
Methylenetetrahydrofolate dehydrogenase [NAD(+)]. Polypeptides of the present invention having this activity can be characterized for example as described in Wohlfarth et al., J Bacteriol. (1991), 173(4): 1414-1419.
Mixed-link glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Clark et al., Carbohydr Res. (1978), 61 : 457-477.
Multicopper oxidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0094335 A1.
Mutanase. Polypeptides of the present invention having this activity can be characterized for example as described in Pleszczynska, Biotechnol Lett. (2010), 32(11): 1699-1704; or WO 1998/000528 A1.
N-acetylglucosaminidase GH18. Polypeptides of the present invention having this activity can be characterized for example as described in Murakami et al., Glycobiology (2013), e-pub: Feb.22, PMID: 23436287; or in US patent application publication No. US20120258089 A1.
NADPH-cytochrome P450 reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Guengerich et al., Nat Protoc. (2009), 4(9): 1245-51.
Non-hemolytic phospholipase C. Polypeptides of the present invention having this activity can be characterized for example as described in Weingart and Hooke, Curr Microbiol. (1999), 38(4): 233-8; or Korbsrisate et al., J Clin Microbiol. (1999), 37(11): 3742-5.
Oxidase. Polypeptides of the present invention having this activity can be characterized for example using a number of commercially available kits [e.g., Amplex® Red Galactose/Galactose Oxidase Kit (A22179) and Amplex® Red Glucose/Glucose Oxidase Assay Kit (Molecular Probes/Invitrogen); Cytochrome C Oxidase Assay Kit (Cat. No. CYTOCOX1-1KT; Sigma-Aldrich); Xanthine Oxidase Assay Kit (ab102522, Abcam); Lysyl Oxidase Activity Assay Kit (ab112139, Abcam); Glucose Oxidase Assay Kit (ab138884, Abcam); Monoamine oxidase B (MAOB) Specific Activity Assay Kit (ab109912, Abcam)].
Oxidoreductase. Polypeptides of the present invention having this activity can be characterized for example as described in Hommes et al., Anal Chem. (2013), 85(1): 283-291.
Para-nitrobenzyl esterase. Polypeptides of the present invention having this activity can be characterized for example as described in Moore and Arnold, Nat Biotechnol. (1996), 14(4): 458-67.
Pectate lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Wang et al., BMC Biotechnology (2011), 11 : 32.
Pectin methylesterase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 1997/031102 A1.
Pectinesterase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,053,232.
Penicillopepsin. Polypeptides of the present invention having this activity can be characterized for example as described in Cao et al., Protein Sci. (2000), 9(5): 991-1001 ; or Hofmann et al., Biochemistry. (1984), 14;23(4): 635-43.
Peroxidase. Polypeptides of the present invention having this activity can be characterized for example using a number of commercially available kits [e.g., Amplex® Red Hydrogen Peroxide/Peroxidase Assay Kit (Molecular Probes/Invitrogen); Peroxidase Activity Assay Kit (Cat. No. K772-100; BioVision); QuantiChrom™ Peroxidase Assay Kit (Cat. No. DPOD-100, BioAssay Systems)].
Phospholipase C. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (Amplex® Red Phosphatidylcholine-Specific Phospholipase C Assay Kit, Molecular Probes/Invitrogen).
Polysaccharide monooxygenase. Polypeptides of the present invention having this activity can be characterized for example as described in Kittl et al., Biotechnol Biofuels. (2012), 5(1): 79; Phillips et al., ACS Chem Biol (2011), 6(12): 1399-1406; or Wu et al., J. Biol. Chem (2013), 288(18): 12828-39. Polysaccharide monooxygenases, sometimes referred to functionally as "cellulase-enhancing proteins", generally belong to the enzyme class GH61 and are reported to cleave polysaccharides with the insertion of oxygen.
Protease. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2005/0010037 A1.
Putative exoglucanase type C (1,4-beta-cellobiohydrolase; beta-glucancellobiohydrolase; exocellobiohydrolase I). Polypeptides of the present invention having this activity can be characterized for example as described in Dai et al., Applied Biochemistry and Biotechnology (1999), 79, Issue 1-3: 689-699.
Rhamnogalacturonan lyase PL4. Polypeptides of the present invention having this activity can be characterized for example as described in Mutter et al., Plant Physiol. (1998), 117: 153-163; or de Vries, Appl. Microbiol Biotechnol. (2003), 61 : 10-20.
Rodlet protein. Polypeptides of the present invention having this activity can be characterized for example as described in Yang et al., Biopolymers (2013), 99(1): 84-94.
Serine-type carboxypeptidase F. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 6,379,913.
Swollenin. Polypeptides of the present invention having this activity can be characterized for example as described in Jager et al., Biotechnol Biofuels. (2011), 4: 33; or Saloheimo et al., Eur J Biochem. (2002), 269(17): 4202-11.
Tyrosinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2011/0311693 A1.
Unsaturated rhamnogalacturonyl hydrolase YteR. Polypeptides of the present invention having this activity can be characterized for example as described in Itoh et al., Biochem Biophys Res Commun. (2006), 347(4): 1021-9; or Itoh et al., J Mol Biol. (2006), 360(3): 573-85.
Xylan alpha-1,2-glucuronidase. Polypeptides of the present invention having this activity can be characterized for example as described in Ishihara, M. and Shimizu, K., "alpha-(1->2)-Glucuronidase in the enzymatic saccharification of hardwood xylan: Screening of alpha-glucuronidase producing fungi." Journal Mokuzai Gakkaishi, (1988) 34: 58-64.
Xylanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0028306 A1 ; US patent No. 7,759,102; or PCT application publication No. WO 2006/078256 A2.
Xyloglucanase GH12. Polypeptides of the present invention having this activity can be characterized for example as described in Master et al., Biochem. J. (2008), 411(1): 161-170.
Xyloglucan-specific endo-beta-1,4-glucanase A. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application publication No. EP0972016 B1 ; in US patent No. 6,077,702; Damasio et al., Biochim Biophys Acta. (2012), 1824(3): 461- 7; or Wong et al., Appl Microbiol Biotechnol. (2010), 86(5): 1463-71.
Xylosidase/arabinosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Whitehead and Cotta, Curr Microbiol. (2001), 43(4): 293-8; or Xiong et al., Journal of Experimental Botany (2007), 58(11): 2799-2810.
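By way of illustration only, the calculation mentioned in the endo-1,5-alpha-arabinanase entry above (activity derived from the relationship between reducing groups, as arabinose equivalents, and absorbance at 540 nm) can be sketched as follows. The standard-curve slope, the unit definition (µmol reducing groups per minute per mL of enzyme) and all numerical values are assumptions made for this sketch and are not taken from the cited assay.

```python
def reducing_sugar_umol(a540_sample, a540_blank, slope_per_umol):
    """Convert a blank-corrected A540 reading into umol arabinose equivalents.

    slope_per_umol is the slope of a hypothetical arabinose standard curve
    (absorbance units per umol in the assay tube), assumed linear through zero.
    """
    if slope_per_umol <= 0:
        raise ValueError("standard-curve slope must be positive")
    return (a540_sample - a540_blank) / slope_per_umol


def activity_units_per_ml(a540_sample, a540_blank, slope_per_umol,
                          minutes, enzyme_ml):
    """Activity as umol reducing groups released per minute per mL enzyme."""
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("incubation time and enzyme volume must be positive")
    umol = reducing_sugar_umol(a540_sample, a540_blank, slope_per_umol)
    return umol / (minutes * enzyme_ml)


if __name__ == "__main__":
    # Illustrative readings only, not measured data.
    print(activity_units_per_ml(0.65, 0.05, 0.30, minutes=10, enzyme_ml=0.1))
```

The same function can be reused at pH values other than 3.5, provided a standard curve is prepared under the corresponding conditions.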
Example 6: General Molecular Biology Procedures
[00246] Standard molecular cloning techniques such as DNA isolation, gel electrophoresis, enzymatic restriction modifications of nucleic acids, E. coli transformation, etc., were performed as described by Sambrook et al., 1989 (Molecular cloning: A laboratory manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York) and Innis et al., 1990 (PCR protocols: a guide to methods and applications, Academic Press, San Diego). Primers were prepared by IDT (Integrated DNA Technologies). Sanger DNA sequencing was performed using an Applied Biosystems 3730xl DNA Analyzer at the Innovation Centre (Genome Quebec), McGill University in Montreal.
Example 7: Construction of pGBFIN49 expression plasmids
[00247] Genes of interest were cloned into the expression vector pGBFIN-49. This vector is a derivative of pGBFIN-41 that contains the A. niger glaA promoter, A. niger TrpC terminator, A. nidulans gpdA promoter, the phleomycin resistance gene, A. niger glaA terminator and an E. coli backbone. Figure 1 represents a schematic map of pGBFIN-49 and the complete nucleotide sequence is presented as SEQ ID NO: 3041. Details of the construction of pGBFIN-49 are as follows:
1. TtrpC terminator PCR amplification (0.7kb):
[00248] TtrpC terminator was PCR amplified using purified pGBFIN33 plasmid as a template. The following primers and PCR program were used:
Primer-3: 5'-GTCCGTCGCCGTCCTTCAccgccggtccgacg-3' (SEQ ID NO: 3042)
Primer-4: 5'-GCGGCCGGCGTATTGGGTGttacggagc-3' (SEQ ID NO: 3043)
[00249] Primer-4 is entirely specific to the TtrpC 3' end. Primer-3 was designed to suit the LIC cloning strategy but also to keep the TtrpC sequence as close as possible to the original sequence. To do so, five adenines were replaced by thymines (underlined).
PCR master mix:
pGBFIN33 1 μL (5-10 ng)
Primer-3 (10 mM) 1 μL
Primer-4 (10 mM) 1 μL
dNTPs (2 mM) 5 μL
HF Buffer (5x) 10 μL
Phusion DNA pol. 0.5 μL
Nuclease-free water 31.5 μL
Total 50 μL
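As a hedged illustration of the arithmetic behind such a master mix, the following sketch computes the water volume that brings the listed components to 50 μL and the final concentration of a diluted stock. The function and dictionary names are hypothetical; the component volumes are those listed for the TtrpC amplification above.

```python
def water_to_volume(components_ul, total_ul=50.0):
    """Return the water volume (uL) needed to bring the mix to total_ul."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used


def final_concentration(stock, volume_ul, total_ul=50.0):
    """Concentration (same unit as stock) after diluting volume_ul to total_ul."""
    return stock * volume_ul / total_ul


# Volumes (uL) of the non-water components of the TtrpC master mix.
TTRPC_MIX_UL = {
    "pGBFIN33 template": 1.0,
    "Primer-3": 1.0,
    "Primer-4": 1.0,
    "dNTPs (2 mM)": 5.0,
    "HF Buffer (5x)": 10.0,
    "Phusion DNA pol.": 0.5,
}

if __name__ == "__main__":
    print("water (uL):", water_to_volume(TTRPC_MIX_UL))
    print("dNTPs (mM):", final_concentration(2.0, 5.0))
    print("buffer (x):", final_concentration(5.0, 10.0))
```

Under these inputs the water volume agrees with the 31.5 μL listed, the dNTPs end up at 0.2 mM and the buffer at 1x.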
[00250] PCR program: 1 x 98°C, 2 min; 25 x (98°C, 30 sec; 68°C, 30 sec; 72°C, 1 min); 72°C, 7 min.
[00251] Reaction conditions: 5 μL of the PCR reaction was separated by electrophoresis on a 1.0% agarose gel and the remainder was purified using QIAEX II™ gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.
2. pGBFIN41 vector PCR amplification (8.3 kb):
[00252] Vector backbone was PCR amplified using pGBFIN41 as a template. Primers were designed outside of the ccdA region (not included in pGBFIN49). The following primers and PCR program were used:
Primer-2: 5'-CACCCAATACGCCGGCCGCgcttccagacagctc-3' (SEQ ID NO: 3044)
Primer-1C: 5'-GGTGTTTTGTTGCTGGGGAtgaagctcaggctctcagttgcgtc-3' (SEQ ID NO: 3045)
[00253] Primer-2 contains a PgpdA-specific region and an extra sequence specific to the TtrpC 3' end (also included in Primer-4). Primer-1C was designed to suit the LIC cloning strategy but also to keep the PglaA region as close as possible to the original sequence. To do so, three thymines were replaced by adenines (underlined).
PCR master mix:
pGBFIN41 1 μL (50 ng)
Primer-2 (10 mM) 1 μL
Primer-1C (10 mM) 1 μL
dNTPs (2 mM) 5 μL
HF Buffer (5x) 10 μL
Phusion DNA pol. 0.5 μL
DMSO 1 μL
Nuclease-free water 30.5 μL
Total 50 μL
[00254] PCR program: 1 x 98°C, 3 min; 10x (98°C, 30 sec ; 68°C, 30 sec, 72°C, 5 min); 20 x (98°C, 30 sec, 68°C, 30 sec, 72°C, 5 min + 10 sec/cycle); 72°C, 10 min.
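For illustration, the staged program above, including the extension step that lengthens by 10 sec per cycle, can be represented as data and its total programmed hold time computed. This is a minimal sketch: the class names are hypothetical, the increment is assumed to apply from the second cycle of the stage onward (thermocyclers differ on this point), and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass
class Step:
    temp_c: float
    seconds: int
    increment_s: int = 0  # extra seconds added per successive cycle (assumption)


@dataclass
class Stage:
    cycles: int
    steps: list


def hold_time_seconds(program):
    """Sum the programmed hold times of all steps, ignoring ramp times."""
    total = 0
    for stage in program:
        for cycle in range(stage.cycles):
            for step in stage.steps:
                total += step.seconds + cycle * step.increment_s
    return total


# The long-range program: 98 C 3 min; 10 cycles; 20 cycles with a growing
# extension step; final extension at 72 C for 10 min.
LONG_RANGE = [
    Stage(1, [Step(98, 180)]),
    Stage(10, [Step(98, 30), Step(68, 30), Step(72, 300)]),
    Stage(20, [Step(98, 30), Step(68, 30), Step(72, 300, increment_s=10)]),
    Stage(1, [Step(72, 600)]),
]

if __name__ == "__main__":
    seconds = hold_time_seconds(LONG_RANGE)
    print(f"{seconds} s = {seconds / 60:.1f} min (excluding ramping)")
```

Under these assumptions the programmed hold time is 13480 s, i.e. roughly 3 h 45 min before ramping is taken into account.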
[00255] Reaction conditions: 5 μL of the PCR reaction was separated on a 0.5% agarose gel and the remainder was purified using QIAEX II™ gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.
3. pGBFIN41 + TtrpC overlap-extension PCR:
[00256] Overlap-extension / long-range PCR was performed to: a) fuse the two PCR pieces together; b) add an SfoI restriction site to re-circularize the vector. No primers were used in the overlap-extension stage. Primer-11 and Primer-12 were used for the long-range PCR reaction.
Primer-11: 5'-CACCGGCGCCGTCCGTCGCCGTCCTTC-3' (SEQ ID NO: 3046)
Primer-12: 5'-ACGGCGCCGGTGTTTTGTTGCTGGGGATG-3' (SEQ ID NO: 3047)
[00257] Primer-11 is specific to the LIC tag located on the TtrpC terminator, while Primer-12 is specific to the LIC tag located on the PglaA region. The SfoI restriction site sequence is underlined above.
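Because the underlining of the restriction site does not survive in plain text, the site can be located programmatically. The sketch below assumes the SfoI recognition sequence GGCGCC and uses the Primer-11 and Primer-12 sequences given above; the helper name is hypothetical.

```python
SFOI_SITE = "GGCGCC"  # SfoI recognition sequence (palindromic)

PRIMER_11 = "CACCGGCGCCGTCCGTCGCCGTCCTTC"
PRIMER_12 = "ACGGCGCCGGTGTTTTGTTGCTGGGGATG"


def site_positions(seq, site=SFOI_SITE):
    """Return 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


if __name__ == "__main__":
    for name, primer in (("Primer-11", PRIMER_11), ("Primer-12", PRIMER_12)):
        for pos in site_positions(primer):
            # Show the primer with the site set off by brackets.
            marked = f"{primer[:pos]}[{primer[pos:pos + 6]}]{primer[pos + 6:]}"
            print(name, marked)
```

In each primer the portion 3' of the site corresponds to the 5' end of Primer-3 and of Primer-1C, respectively, consistent with the description of the LIC tags.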
[00258] A standard PCR master mix was prepared to perform overlap-extension PCR using pGBFIN41 and TtrpC purified PCR products as templates. No primers were added.
Overlap-extension master mix:
TtrpC 1 μL
pGBFIN41 9 μL
Buffer GC (5x) 10 μL
dNTPs (2 mM) 5 μL
Phusion DNA pol. 0.5 μL
Nuclease-free water 24.5 μL
Total 50 μL
[00259] PCR program - overlap (no primers): 1x 98°C, 2 min; 5x (98°C, 15 sec; 58°C, 30 sec; 72°C, 5 min), 5x (98°C, 15 sec; 63°C, 30 sec; 72°C, 5 min), 5x (98°C, 15 sec; 68°C, 30 sec; 72°C, 5 min); 72°C, 10 min.
[00260] The overlap-extension PCR product was then purified on a QIAEX II™ column and 5 μL of the purified reaction was used as template DNA for the long-range PCR step with Primer-11 and Primer-12.
PCR master mix:
Overlap product 5 μL
Primer-11 (10 mM) 1 μL
Primer-12 (10 mM) 1 μL
dNTPs (2 mM) 5 μL
HF Buffer (5x) 10 μL
Phusion DNA pol. 0.5 μL
DMSO 1 μL
Nuclease-free water 26.5 μL
Total 50 μL
[00261] PCR program - Long range: 1x 98°C, 3 min; 10x (98°C, 30 sec ; 68°C, 30 sec ; 72°C, 5 min); 20 x (98°C, 30 sec ; 68°C, 30 sec ; 72°C, 5 min + 10 sec/cycle); 72°C, 10 min.
[00262] Reaction conditions: 5 μL of the PCR reaction was separated on a 0.5% agarose gel and the remainder was purified using QIAEX II™ gel Extraction kit and resuspended in nuclease-free water. Then, SfoI digestion was performed and the digested product was purified using the QIAEX II gel extraction kit, following the procedure described by the manufacturer.
4. Ligation:
[00263] 100 ng of the purified digested fragment was ligated to itself using 1 μL of T4 DNA Ligase (New England Biolabs, M0202), and incubated at 16°C overnight. Enzyme inactivation was performed at 65°C for 10 minutes. Then, 10 μL of ligation product was transformed into DH5α E. coli competent cells and plated on 2xYT agar containing 100 μg/mL ampicillin. DNA extraction was performed on single colonies the next day. Restriction analysis and sequencing were done to confirm the structure.
Example 8: Cloning of Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides genes in E. coli
[00264] Cloning genes of interest in the pGBFIN-49 expression vector was performed using the ligation-independent cloning (LIC) method according to Aslanidis, C., de Jong, P. (1990) Nucleic Acids Research Vol. 18 No. 20, 6069-6074.
[00265] Coding sequences from genes of interest were amplified by PCR using primers containing LIC tags, which are homologous to PglaA and TtrpC sequences in the pGBFIN-49 cloning vector, fused to sequences homologous to the coding sequences of the gene of interest, with either genomic DNA or cDNA as template. The primers have the following sequences:
Forward primer: 5'-CCCCAGCAACAAAACACCTCAGCAATG...15-20 nucleotides specific to each gene to be cloned (SEQ ID NO: 3048)
Reverse primer: 5'-GAAGGACGGCGACGGACTTCA...15-20 nucleotides specific to each gene to be cloned (SEQ ID NO: 3049)
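The primer layout above (a fixed LIC tag followed by 15-20 gene-specific nucleotides) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: it assumes that the ATG at the end of the forward tag serves as the start codon, that the TCA at the end of the reverse tag corresponds to a TGA stop codon on the coding strand, and that the gene-specific parts are taken immediately after the start codon and immediately before the stop codon. The function names and the 18-nucleotide default are hypothetical.

```python
FWD_TAG = "CCCCAGCAACAAAACACCTCAGCAATG"  # ends with ATG (assumed start codon)
REV_TAG = "GAAGGACGGCGACGGACTTCA"        # ends with TCA (assumed TGA stop, reverse strand)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_STOP_CODONS = {"TAA", "TAG", "TGA"}


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def lic_primers(cds, specific_len=18):
    """Build hypothetical forward/reverse LIC primers for a coding sequence."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if cds[-3:] not in _STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    body = cds[3:-3]  # codons between start and stop
    if len(body) < specific_len:
        raise ValueError("coding sequence too short for the requested length")
    forward = FWD_TAG + body[:specific_len]
    reverse = REV_TAG + revcomp(body[-specific_len:])
    return forward, reverse


if __name__ == "__main__":
    # Made-up coding sequence, for demonstration only.
    demo_cds = "ATG" + "AAACCCGGGTTT" * 3 + "TGA"
    fwd, rev = lic_primers(demo_cds)
    print("Forward:", fwd)
    print("Reverse:", rev)
```

In practice the gene-specific length would be adjusted within the 15-20 nucleotide range, for example to balance the melting temperatures of the two primers.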
PCR mix consists of the following components:
Template (gDNA or cDNA, 1-10 ng/μL) 1 μL
5X Phusion HF Buffer (Finnzymes™) 10 μL
2 mM dNTPs 5 μL
LIC primer (F+R) mix (10 mM) 0.5 μL
Phusion DNA Polymerase (Finnzymes™) 0.5 μL
DMSO 1.5 μL
H2O 31.5 μL
TOTAL 50 μL
[00266] PCR amplification was carried out with following conditions:
[00267] Following PCR, 90 μL Milli-Q™ water was added to each sample and the mix was purified using a Multiscreen PCR96 Filter Plate (Millipore) according to the manufacturer's instructions. The PCR product was eluted from the filter in 25 μL 10 mM Tris-HCl pH 8.0.
[00268] Expression vector pGBFIN-49 was PCR amplified using primers with following sequences:
Forward primer: 5'-GTCCGTCGCCGTCCTTCACCG-3' (SEQ ID NO: 3050)
Reverse primer: 5'-GGTGTTTTGTTGCTGGGGATGAAGC-3' (SEQ ID NO: 3051)
(Primers are located on either side of the SfoI restriction site.)
PCR mix consists of the following components:
pGBFIN-49 plasmid DNA (10 ng/μL) 2 μL
5X Phusion HF Buffer (Finnzymes™) 20 μL
2 mM dNTPs 10 μL
LIC primer mix (F+R) (10 mM) 2 μL
Phusion DNA Polymerase (Finnzymes™) 1.5 μL
DMSO 3 μL
H2O 61.5 μL
TOTAL 100 μL
[00269] PCR amplification was carried out with following conditions:
[00270] Following PCR, 1 μL of DpnI was added to the PCR mix and digestion was performed overnight at 37°C. Digested PCR product was purified using the Qiaquick™ PCR purification kit (Qiagen) according to the manufacturer's instructions.
[00271] Obtained PCR fragments were treated with T4 DNA polymerase in the presence of dTTP to create single stranded tails at the ends of the PCR fragments. The single stranded tails of the PCR fragment are complementary to those of the vector, thus permitting non-covalent bi-molecular associations, e.g., circularization between molecules.
[00272] The reaction mix of the T4 DNA polymerase treatment of the pGBFIN-49 PCR fragment consisted of the following components:
Purified pGBFIN-49 PCR fragment 600 ng
10X NEB Buffer 2 2 μL
25 mM dTTP 2 μL
DTT 100 μM 0.8 μL
T4 DNA Polymerase 3 U/μL 1 μL
H2O Up to 20 μL
TOTAL 20 μL
[00273] The reaction mix of T4 DNA polymerase treatment of the Gene of Interest (GOI) PCR fragment consisted of the following components:
Purified GOI PCR 5 μL
10X NEB Buffer 2 2 μL
25 mM dATP 2 μL
DTT 100 μM 0.8 μL
T4 DNA Polymerase 3 U/μL 1 μL
H2O 9.2 μL
TOTAL 20 μL
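The complementarity of the single-stranded tails can be checked with a short sketch. It relies on the general behaviour of T4 DNA polymerase in LIC, stated here as an assumption: with a single dNTP present, the 3'→5' exonuclease trims the 3' end of each strand until the first position at which that nucleotide can be re-incorporated, leaving a 5' overhang on the opposite strand. The vector ends (treated with dTTP) and insert tags (treated with dATP) are the primer sequences given above; the function names are hypothetical.

```python
VECTOR_FWD = "GTCCGTCGCCGTCCTTCACCG"         # SEQ ID NO: 3050
VECTOR_REV = "GGTGTTTTGTTGCTGGGGATGAAGC"     # SEQ ID NO: 3051
INSERT_FWD_TAG = "CCCCAGCAACAAAACACCTCAGCAATG"  # tag of SEQ ID NO: 3048
INSERT_REV_TAG = "GAAGGACGGCGACGGACTTCA"        # tag of SEQ ID NO: 3049

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Base on the 5'-end strand that pairs with the supplied nucleotide.
_PAIRS_WITH = {"dATP": "T", "dTTP": "A", "dCTP": "G", "dGTP": "C"}


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def five_prime_overhang(end_seq, dntp):
    """Single-stranded 5' tail left on end_seq after T4 polymerase treatment.

    end_seq is read 5'->3' from the fragment terminus. The opposite strand is
    trimmed until the supplied dNTP can be re-added, i.e. opposite the first
    base in end_seq that pairs with it.
    """
    stop = end_seq.find(_PAIRS_WITH[dntp])
    if stop == -1:
        raise ValueError("no stopping base found in the supplied sequence")
    return end_seq[:stop]


if __name__ == "__main__":
    v_fwd = five_prime_overhang(VECTOR_FWD, "dTTP")
    v_rev = five_prime_overhang(VECTOR_REV, "dTTP")
    i_fwd = five_prime_overhang(INSERT_FWD_TAG, "dATP")
    i_rev = five_prime_overhang(INSERT_REV_TAG, "dATP")
    print("vector/insert tails anneal (TtrpC side):", v_fwd == revcomp(i_rev))
    print("vector/insert tails anneal (PglaA side):", v_rev == revcomp(i_fwd))
```

Under these assumptions the two vector tails (17 and 18 nucleotides) are exact reverse complements of the corresponding insert tails, which is what permits annealing without ligase.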
[00275] Following T4 DNA polymerase treatment, 2 μL of pGBFIN-49 vector and 4 μL of the GOI were mixed and incubated at room temperature, allowing annealing of the GOI fragment with the pGBFIN-49 vector fragment. The bi-molecular forms are used to transform E. coli. Plasmid DNA of resulting transformants was isolated and verified by sequence analyses for correct amplification and cloning of the gene of interest.
Example 9: Transformation of Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides gene expression cassettes into A. niger
[00276] As host strain for enzyme production, A. niger GBA307 was used. Construction of A. niger GBA307 is described in WO 2011/009700.
[00277] Transformation of A. niger was performed essentially according to the method described by Tilburn, J. et al. (1983) Gene 26, 205-221 and Kelly, J. & Hynes, M. (1985) EMBO J., 4, 475-479 with the following modifications:
- Spores were grown for 16-24 hours at 30°C in a rotary shaker at 250 rpm in Aspergillus minimal medium. Aspergillus minimal medium contains per liter: 6 g NaNO3; 0.52 g KCl; 1.52 g KH2PO4; 1.12 ml 4 M KOH; 0.52 g MgSO4·7H2O; 10 g glucose; 1 g casamino acids; 22 mg ZnSO4·7H2O; 11 mg H3BO3; 5 mg FeSO4·7H2O; 1.7 mg CoCl2·6H2O; 1.6 mg CuSO4·5H2O; 5 mg MnCl2·2H2O; 1.5 mg Na2MoO4·2H2O; 50 mg EDTA; 2 mg riboflavin; 2 mg thiamine-HCl; 2 mg nicotinamide; 1 mg pyridoxine-HCl; 0.2 mg pantothenic acid; 4 μg biotin; 10 ml Penicillin (5000 IU/mL)/Streptomycin (5000 μg/mL) solution (Invitrogen);
- Glucanex 200G (Novozymes) was used for the preparation of protoplasts;
- After protoplast formation (2-3 hours) 10 mL TB layer (per liter: 109.32 g sorbitol; 100 mL 1 M Tris-HCl pH 7.5) was pipetted gently on top of the protoplast suspension. After centrifugation for 10 min at 4330 x g at 4°C in a swinging bucket rotor, the protoplasts on the interface were transferred to a fresh tube and washed with STC buffer (1.2 M sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl2). The protoplast suspension was centrifuged for 10 min at 1560 x g in a swinging bucket rotor and resuspended in STC buffer at a concentration of 10^8 protoplasts/mL;
- To 200 μL of the protoplast suspension, 20 μL ATA (0.4 M aurintricarboxylic acid), the DNA dissolved in 10 μL TE buffer (10 mM Tris-HCl pH 7.5, 0.1 mM EDTA), and 100 μL of a PEG solution (20% PEG 4000 (Merck), 0.8 M sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl2) were added;
- After incubation of the DNA-protoplast suspension for 10 min at room temperature, 1.5 ml PEG solution (60% PEG 4000 (Merck), 10 mM Tris-HCl pH 7.5, 50 mM CaCl2) was added slowly, with repeated mixing of the tubes. After incubation for 20 min at room temperature, suspensions were diluted with 5 ml 1.2 M sorbitol, mixed by inversion and centrifuged for 10 min at 2770 x g at room temperature. The protoplasts were resuspended gently in 1 mL 1.2 M sorbitol and plated onto selective regeneration medium consisting of Aspergillus minimal medium without riboflavin, thiamine-HCl, nicotinamide, pyridoxine, pantothenic acid, biotin, casamino acids and glucose, supplemented with 150 μg/mL phleomycin (Invitrogen), 0.07 M NaNO3 and 1 M sucrose, and solidified with 2% bacteriological agar #1 (Oxoid, England). After incubation for 5-10 days at 30°C, single transformants were isolated on PDA (Potato Dextrose Agar, Difco) supplemented with 150 μg/mL phleomycin in 96-well MTP. After 5-7 days of growth at 30°C, single transformants were used for MTP fermentation.
Example 10: Aspergillus niger microtiter plate fermentation
[00278] 96-well microtiter plates (MTP) with sporulated Aspergillus niger strains were used to harvest spores for MTP fermentations. To do this, 100 μL water was added to each well and, after resuspending the mixture, 40 μL of spore suspension was used to inoculate 2 mL A. niger medium (70 g/L glucose·H2O, 10 g/L yeast extract, 10 g/L (NH4)2SO4, 2 g/L K2SO4, 2 g/L KH2PO4, 0.5 g/L MgSO4·7H2O, 0.5 g/L ZnSO4·7H2O, 0.2 g/L CaCl2, 0.01 g/L MnSO4·H2O, 0.05 g/L FeSO4·7H2O, 0.002 g/L Na2MoO4·2H2O, 0.25 g/L Tween™-80, 10 g/L citric acid, 30 g/L MES; pH 5.5 adjusted with 4 M NaOH) in a 24-well MTP. In the MTP fermentations for strains expressing GH61 proteins (e.g., polysaccharide monooxygenases), 30 μM CuSO4 was included in the media. The MTPs were incubated in a humidity shaker (Infors) at 34°C, 550 rpm and 80% humidity for 6 days. Plates were centrifuged and supernatants were harvested.
Example 11 : Aspergillus niger shake flask fermentation
[00279] Approximately 1x10^6 - 1x10^7 spores were inoculated in 20 mL pre-culture medium containing Maltose 30 g/L; Peptone (from casein) 10 g/L; Yeast extract 5 g/L; KH2PO4 1 g/L; MgSO4·7H2O 0.5 g/L; ZnCl2 0.03 g/L; CaCl2 0.02 g/L; MnSO4·4H2O 0.01 g/L; FeSO4·7H2O 0.3 g/L; Tween™-80 3 g/L; pH 5.5. After growing overnight at 34°C in a rotary shaker, 10-15 mL of the growing culture was inoculated in 100 mL main culture containing Glucose·H2O 70 g/L; Peptone (from casein) 25 g/L; Yeast extract 12.5 g/L; K2SO4 2 g/L; KH2PO4 1 g/L; MgSO4·7H2O 0.5 g/L; ZnCl2 0.03 g/L; CaCl2 0.02 g/L; MnSO4·1H2O 0.009 g/L; FeSO4·7H2O 0.003 g/L; pH 5.6. Note: for GH61 (e.g., polysaccharide monooxygenase) enzymes the culture media were supplemented with 10 μM CuSO4.
[00280] Main cultures were grown until all glucose was consumed as measured with Combur Test N strips (Roche), which was mostly the case after 4-7 days of growth. Culture supernatants were harvested by centrifugation for 10 minutes at 5000 x g followed by germ-free filtration of the supernatant over 0.2 μm PES filters (Nalgene).
Example 12: Protein concentration determination with TCA-biuret method
[00281] Concentrated protein samples (supernatants) were diluted with water to a concentration between 2 and 8 mg/mL. Bovine serum albumin (BSA) dilutions (0, 1, 2, 5, 8 and 10 mg/mL) were made and included as samples to generate a calibration curve. 1 mL of each diluted protein sample was transferred into a 10-mL tube containing 1 mL of a 20% (w/v) trichloroacetic acid solution in water and mixed thoroughly. Subsequently, the tubes were incubated on ice water for one hour and centrifuged for 30 minutes at 4°C and 6000 rpm. The supernatant was discarded and pellets were dried by inverting the tubes on a tissue and letting them stand for 30 minutes at room temperature. Next, 4 mL BioQuant Biuret reagent mix was added to the pellet in the tube and the pellet was solubilized upon mixing. Next, 1 mL water was added to the tube, and the tube was mixed thoroughly and incubated at room temperature for 30 minutes. The absorption of the mixture was measured at 546 nm with a water sample used as a blank measurement, and the protein concentration was calculated via the BSA calibration line.
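The calibration step can be sketched as a least-squares line through the BSA standards. This is only an illustration: the absorbance readings are hypothetical placeholders rather than measured data, and the function names are not part of the described method.

```python
# Minimal sketch: linear BSA calibration for the TCA-biuret assay.
# The A546 readings below are hypothetical, not values from this document.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_mg_per_ml(a546, slope, intercept, dilution_factor=1.0):
    """Read a sample absorbance off the calibration line and undo the dilution."""
    return (a546 - intercept) / slope * dilution_factor


bsa_standards = [0, 1, 2, 5, 8, 10]                      # mg/mL, as listed in the text
a546_standards = [0.00, 0.05, 0.10, 0.25, 0.40, 0.50]    # hypothetical readings
slope, intercept = fit_line(bsa_standards, a546_standards)

# A sample diluted 3-fold before the assay, with a hypothetical reading of 0.20
sample_conc = protein_mg_per_ml(0.20, slope, intercept, dilution_factor=3.0)
```

Samples whose reading falls outside the range spanned by the standards would normally be re-diluted rather than extrapolated.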
Example 13: Microtiter plate (MTP) sugar-release activity assay
[00282] For each (hemi-)cellulase assay, the stored samples were analyzed twice according to the following procedure. 100 µL sample and 100 µL of a (hemi-)cellulase base mix [1.75 mg/g DM TEC-210 or a 3-enzyme mix at a total dosage of 3.5 mg/g DM consisting of 0.5 mg/g DM BG (14% of total protein 3E mix), 1.6 mg/g DM CBHI (47% of total protein 3E mix) and 1.4 mg/g DM CBHII (39% of total protein 3E mix)] were transferred to two suitable vials: one vial containing 800 µL 2.5% (w/w) dry matter of the acid pre-treated corn stover substrate in a 50 mM citrate buffer, buffered at pH 4.5. The other vial served as a blank, in which the 800 µL 2.5% (w/w) dry matter acid pre-treated corn stover substrate suspension was replaced by 800 µL 50 mM citrate buffer, buffered at pH 4.5. The assay samples were incubated for 72 hrs at 65°C. After incubation of the assay samples, a fixed volume of an internal standard solution, containing maleic acid (20 g/L), EDTA (40 g/L) and DSS (0.5 g/L), was added. After centrifugation, the supernatant of the samples is lyophilized overnight; subsequently, 100 µL D2O is added to the dried residue, which is lyophilized once more. The dried residue is dissolved in 600 µL of D2O.
[00283] The amount of sugar released is based on the signal between 4.65 and 4.61 ppm, relative to DSS, and is determined by means of 1D 1H NMR operating at a proton frequency of 500 MHz, using a pulse program without water suppression, at a temperature of 27°C.
[00284] The (hemi)-cellulase enzyme solution may contain residual sugars. Therefore, the results of the assay are corrected for the sugar content measured after incubation of the enzyme solution.
Example 14: Sugar-release activity assays: labscale, incubation with shaking
[00285] A. niger strains expressing Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides clones were grown in shake flasks, as described above (Example 11), in order to obtain greater amounts of material for further testing. The fermentation supernatants (volume between 40 and 80 mL) were concentrated using a 10-kDa spin filter to a volume of approximately 5 mL. Subsequently, the protein concentration in the concentrated supernatant was determined via a TCA-biuret method, as described above in Example 12. The (hemi-)cellulase activity of these protein samples was tested in an assay where the supernatants were spiked on top of an enzyme base mix in the presence of 10% (w/w) acid pretreated corn stover (aCS). 'To spike' or 'spiking' of a supernatant or an enzyme indicates, in this context, the addition of a supernatant or an enzyme to a (hemi-)cellulase base mix. The feedstock solution was prepared via the dilution of a concentrated feedstock solution with water. Subsequently, the pH was adjusted to pH 4.5 with a 4 M NaOH solution. The proteins were spiked based on dosage; the concentrated supernatant samples were added in a final concentration of 2 mg/g DM to the base enzyme mix (TEC-210, 5 mg/g DM) in a total volume of 10 mL at a feedstock concentration of 10% aCS (w/w) in a 30-mL centrifuge bottle (Nalgene Oakridge). All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described below.
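Dosing "based on dosage" amounts to converting a mg protein per g dry matter target into a pipetting volume. The sketch below assumes the 10 mL slurry weighs roughly 10 g (density near 1 g/mL, which the text does not state) and uses a hypothetical protein concentration for the concentrated supernatant; the function name is illustrative.

```python
# Minimal sketch: volume of concentrated supernatant needed for a target dosage.
# Assumptions (not from the text): slurry density ~1 g/mL; 4 mg/mL protein in the sample.

def spike_volume_ml(dose_mg_per_g_dm, slurry_mass_g, dm_fraction, protein_mg_per_ml):
    """Return the sample volume (mL) that delivers the requested dose."""
    dry_matter_g = slurry_mass_g * dm_fraction        # g dry matter in the bottle
    protein_needed_mg = dose_mg_per_g_dm * dry_matter_g
    return protein_needed_mg / protein_mg_per_ml


# 2 mg/g DM spike into 10 g of slurry at 10% (w/w) dry matter
volume_ml = spike_volume_ml(2.0, 10.0, 0.10, protein_mg_per_ml=4.0)
```

With these inputs the bottle holds about 1 g of dry matter, so the spike corresponds to 2 mg of protein.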
Example 15: Soluble sugar analysis by HPLC
[00286] The sugar content of the samples after enzymatic hydrolysis was analyzed using a High-Performance Liquid Chromatography System (Agilent 1100) equipped with a refractive index detector (Agilent 1260 Infinity). The separation of the sugars was achieved by using a 300 x 7.8 mm Aminex HPX-87P (Bio-Rad cat. no. 125-0098) column; pre-column: Micro guard Carbo-P (Bio-Rad cat. no. 125-0119); mobile phase was HPLC grade water; flow rate of 0.6 mL/min and a column temperature of 85°C. The injection volume was 10 µL.
[00287] The samples were diluted with HPLC grade water to a maximum of 10 g/L glucose and filtered by using a 0.2 µm filter (Afridisc LC25 mm syringe filter PVDF membrane). The glucose was identified and quantified according to the retention time, which was compared to the external glucose standard (D-(+)-Glucose Sigma cat. no: G7528) at 0.2, 0.4, 1.0 and 2.0 g/L.
Example 16: Protein Activity Assays
16.1 Alpha-arabino(furano)sidase activity assay
[00288] This assay measures the ability of α-arabino(furano)sidases to remove the alpha-L-arabinofuranosyl residues from substituted xylose residues. Single and double substituted oligosaccharides are prepared by incubating wheat arabinoxylan (WAX medium viscosity; 2 mg/mL; Megazyme, Bray, Ireland) in 50 mM acetate buffer pH 4.5 with an appropriate amount of endo-xylanase (Aspergillus awamori; F.J.M. Kormelink, Carbohydrate Research, 249 (1993) 355-367) for 48 hours at 50°C to produce a sufficient amount of arabinoxylo-oligosaccharides. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g. The supernatant is used for further experiments. Degradation of the arabinoxylan is followed by High Performance Anion Exchange Chromatography (HPAEC).
[00289] The enzyme is added to the single and double substituted arabinoxylo-oligosaccharides (endo-xylanase treated WAX) in a dosage of 10 mg protein/g substrate in 50 mM sodium acetate buffer, which is then incubated at 65°C for 24 hours. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g and diluted 10-fold. Release of arabinose from the arabinoxylo-oligosaccharides is analyzed by HPAEC.
[00290] The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH. Arabinose release is quantified by an arabinose standard (Sigma) and compared to a sample where no enzyme was added.
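The elution programme described above (gradient, wash, equilibration) can be written as a function of run time. The sketch assumes the gradient is linear and that the changes to wash and equilibration are instantaneous steps; neither detail is stated in the text, and the function name is illustrative.

```python
# Minimal sketch: sodium acetate concentration (mM, in 0.1 M NaOH) during one HPAEC run.
# Assumptions: linear gradient, instantaneous step changes between segments.

def sodium_acetate_mm(t_min, gradient_end_mm=400.0, gradient_min=40.0,
                      wash_mm=1000.0, wash_min=5.0, equilibration_min=15.0):
    total = gradient_min + wash_min + equilibration_min
    if t_min < 0 or t_min > total:
        raise ValueError("time outside the programme")
    if t_min <= gradient_min:
        # 0 -> gradient_end_mm over the gradient segment
        return gradient_end_mm * t_min / gradient_min
    if t_min <= gradient_min + wash_min:
        return wash_mm          # washing step
    return 0.0                  # equilibration in 0.1 M NaOH only


profile = [sodium_acetate_mm(t) for t in (0, 20, 40, 42, 50)]
```

The same function covers the 0-17.8 mM, 0-20 min programme used for the beta-xylosidase assay by passing `gradient_end_mm=17.8, gradient_min=20.0`.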
16.2 Beta-xylosidase activity assay
[00291] This assay measures the release of xylose by the action of beta-xylosidase on xylobiose. Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows. 4.1 g of anhydrous sodium acetate or 6.8 g of sodium acetate * 3H2O is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 4.5.
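As a rough cross-check of this recipe, the stated masses can be converted to molarities and the Henderson-Hasselbalch equation used to estimate the mixing ratio that should land near pH 4.5. The pKa of 4.76 (acetic acid at 25°C) and the molar masses are standard values supplied here, not taken from the text; the estimate ignores activity effects, so titrating to the measured pH as described remains the actual procedure.

```python
# Minimal sketch: sanity-check the acetate buffer recipe.
# Assumed constants (not from the text): pKa 4.76; molar masses in g/mol.

PKA_ACETIC_ACID = 4.76
MOLAR_MASS = {
    "sodium acetate": 82.03,
    "sodium acetate trihydrate": 136.08,
    "acetic acid": 60.05,
}


def molarity(grams_per_litre, compound):
    return grams_per_litre / MOLAR_MASS[compound]


def fraction_solution_a(target_ph, pka=PKA_ACETIC_ACID):
    """Volume fraction of Solution A (acetate) when A and B are equimolar."""
    ratio = 10 ** (target_ph - pka)      # [acetate] / [acetic acid]
    return ratio / (1.0 + ratio)


conc_a = molarity(4.1, "sodium acetate")     # Solution A
conc_b = molarity(3.0, "acetic acid")        # Solution B
frac_a = fraction_solution_a(4.5)
```

Both solutions come out at about 0.05 M, and the estimate suggests roughly one part Solution A to two parts Solution B.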
[00292] Xylobiose was purchased from Sigma and a 100 µg/mL solution in sodium acetate buffer, pH 4.5, was prepared. The assay is performed as detailed below.
[00293] The enzyme is added to the substrate in a dosage of 10, 5 or 1 mg protein/g substrate, which is then incubated at 62-65°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. Samples are appropriately diluted and the release of xylose is analyzed by High Performance Anion Exchange Chromatography.
[00294] The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-20 min, 0-17.8 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH.
[00295] In case interfering compounds are present that complicate xylose quantification, the analysis is performed by running isocratically on H2O for 30 min (0.5 M NaOH is added post-column at 0.1 mL/min for detection), followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min H2O.
[00296] Standards of xylose and xylobiose (Sigma) are used for identification and quantification of the substrate and product formed by the enzyme.
16.3a Alpha-glucuronidase activity assay
[00297] The following example illustrates an assay to measure the alpha-glucuronidase activity towards glucuronoxylooligosaccharides. Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows: 4.1 g of anhydrous sodium acetate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
[00298] To determine the activity on small oligomers an aldouronic acid mixture containing aldotetraouronic, aldotriouronic and aldobiouronic acids (Megazyme) was used. The enzyme is added to this substrate in a dosage of 10 mg protein/g substrate, which is then incubated at 60°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of xylose, xylobiose and xylotriose is analyzed by High Performance Anion Exchange Chromatography.
[00299] The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH.
[00300] Standards of xylose, xylobiose and xylotriose (Sigma) are used to identify the xylooligomers released by the action of the enzyme that removes 4-O-methyl-glucuronic acid from these oligomers.
16.3b Acetyl-xylan esterase activity assay
[00301] Acetyl-xylan esterases are enzymes able to hydrolyze ester linked acetyl groups attached to the xylan backbone, releasing acetic acid. This assay measures the release of acetic acid by the action of acetyl xylan esterase on acid pretreated corn stover (aCS) that contains ester linked acetyl groups.
Determine the presence of acetyl groups in pCS
[00302] The aCS used contains ± 284 (± 5.5) μg acetic acid/ 20 mg pCS as determined according to the following method.
[00303] About 20 mg of aCS substrate was weighed in a 2 mL reaction tube and placed in an ice-water bath. Then 1 mL of 0.4 M NaOH in Millipore water/isopropanol (1:1) was added and the sample was thoroughly mixed. This was incubated on ice for 1 hour. Subsequently, the samples were mixed again and incubated for 2 additional hours at room temperature with occasional mixing. After this, the samples were centrifuged for 5 min at 12000 rpm and the supernatant was analyzed for acetic acid content by HPLC.
Enzyme incubations
[00304] Enzyme incubations were performed in citrate buffer (0.05 M, pH 4.5) which is prepared as follows; 147 g of tri-sodium citrate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 10.5 g citric acid monohydrate is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium citrate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
[00305] The aCS substrate is suspended in citrate buffer to obtain ± 20 mg/mL. The enzyme is added to the substrate in a dosage of 1 or 10 mg protein/g substrate, which is then incubated at 60°C for 24 hours with head-over-tail mixing. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of acetic acid is analyzed by HPLC.
[00306] As a blank sample the substrate is treated and incubated in the same way but then without the addition of enzyme.
[00307] The analysis is performed using an Ultimate 3000 system (Dionex) equipped with a Shodex RI detector and an Aminex HPX 87H column (7.8 mm ID x 300 mm) (BioRad). A flow rate of 0.6 mL/min is used with 5.0 mM H2SO4 as eluent for 30 minutes at a column temperature of 40°C. Acetic acid was used as a standard to quantify its release from pCS by the enzymes.
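One way to report the result is as the fraction of the alkali-releasable acetyl content that the enzyme liberates, after subtracting the no-enzyme blank. The sketch uses the approximate figures given above (284 µg acetic acid per 20 mg substrate, about 20 mg/mL substrate); the measured concentrations in the example are hypothetical, and expressing the result as a percentage is an assumption, not a step described in the text.

```python
# Minimal sketch: enzyme-released acetic acid as a percentage of total acetyl content.
# The HPLC readings in the example are hypothetical placeholders.

TOTAL_ACETIC_ACID_UG_PER_MG = 284.0 / 20.0   # from the alkaline extraction above


def percent_acetyl_released(sample_ug_per_ml, blank_ug_per_ml, substrate_mg_per_ml=20.0):
    released = sample_ug_per_ml - blank_ug_per_ml              # correct for the blank
    available = TOTAL_ACETIC_ACID_UG_PER_MG * substrate_mg_per_ml
    return 100.0 * released / available


example_percent = percent_acetyl_released(sample_ug_per_ml=100.0, blank_ug_per_ml=29.0)
```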
16.4 Endo-xylanase activity assay 1
[00308] Endoxylanases are enzymes able to hydrolyze β-1,4 bonds in the xylan backbone, producing short xylooligosaccharides. This assay measures the release of xylose and xylo-oligosaccharides by the action of xylanases on wheat arabinoxylan oligosaccharides (WAX) (Megazyme, Medium viscosity 29 cSt) and Beech Wood Xylan (Beech) (Sigma).
[00309] Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows; 4.1 g of anhydrous sodium acetate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
[00310] The substrates WAX and Beech are dissolved in sodium acetate buffer to obtain 2.0 mg/mL. The enzyme is added to the substrate in a dosage of 10 mg protein/g substrate, which is then incubated at 65°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of xylose and (arabino)xylan oligosaccharides is analyzed by High Performance Anion Exchange Chromatography.
[00311] As a blank sample the substrate is treated and incubated in the same way but then without the addition of enzyme.
[00312] The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH. Standards of xylose, xylobiose, xylotriose and xylotetraose (Sigma) are used to identify and quantify these oligomers released by the action of the enzyme.
16.5 Endo-xylanase activity assay 2
[00313] Endo-xylanases are enzymes able to hydrolyze beta-1,4 bonds in the xylan backbone, producing short xylooligosaccharides. This assay measures the release of xylose and xylo-oligosaccharides by the action of xylanases on wheat arabinoxylan oligosaccharides (WAX) (Megazyme, Medium viscosity 29 cSt).
[00314] Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows: 4.1 g of anhydrous sodium acetate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
[00315] The substrate WAX is dissolved in sodium acetate buffer to obtain 2.0 mg/mL. The enzyme is added to the substrate in a dosage of 1 mg protein/g substrate, which is then incubated at 65°C for 24 hours. During these 24 hours, samples are taken and the reaction is stopped by heating the samples for 10 minutes at 100°C.
[00316] The enzyme activity is demonstrated by using a reducing sugars assay (PAHBAH) as detection method.
[00317] Reagent A: 5 g of p-Hydroxybenzoic acid hydrazide (PAHBAH) is suspended in 60 mL water, 4.1 mL of concentrated hydrochloric acid is added and the volume is adjusted to 100 mL. Reagent B: 0.5 M sodium hydroxide.
Both reagents are stored at room temperature. Working Reagent: 10 mL of Reagent A is added to 40 mL of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses. Using the above reagents, the assay is performed as detailed below.
[00318] The assay is conducted in microtiter plate format. After incubation, 10 µL of each sample is added to a well and mixed with 150 µL working reagent. These solutions are heated at 70°C for 30 minutes or at 90°C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 405 nm. The standard curve is made by treating 10 µL of an appropriately diluted xylose solution the same way as the samples. The reducing ends formed due to the action of the enzyme are expressed as xylose equivalents.
[00319] Rasamsonia (Talaromyces) emersonii strain was deposited at CENTRAAL BUREAU VOOR SCHIMMELCULTURES, Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands in December 1964 having the Accession Number CBS 393.64.
[00320] Other suitable strains can be equally used in the present examples to show the effect and advantages of the invention. For example TEC-101 , TEC-147, TEC-192, TEC-201 or TEC-210 are suitable Rasamsonia strains which are described in WO 2011/000949. The "4E mix" or "4E composition" was used containing CBHI, CBHII, EG4 and BG (30wt%, 25wt%, 28wt% and 8wt%, respectively, as described in WO 2011/098577, wt% on dry matter protein).
[00321] Rasamsonia (Talaromyces) emersonii strain TEC-101 (also designated as FBG 101) was deposited at CENTRAAL BUREAU VOOR SCHIMMELCULTURES, Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands on 30th June 2010 having the Accession Number CBS 127450.
[00322] TEC-210 was fermented according to the inoculation and fermentation procedures described in WO 2011/000949.
[00323] The 4E mix (4 enzymes mixture or 4 enzyme mix) containing CBHI, CBHII, GH61 and BG (30%, 25%, 36% and 9%, respectively as described in WO 2011/098577) was used.
[00324] 3E mix (3 enzymes mixture or 3 enzyme mix) is spiked with a fourth enzyme to form the 4E mix.
16.6 Xyloglucanase activity assay
[00325] Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows: 4.1 g of anhydrous sodium acetate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
[00326] Tamarind xyloglucan is dissolved in sodium acetate buffer to obtain 2.0 mg/mL. The enzyme is added to the substrate in a dosage of 10 mg protein/g substrate, which is then incubated at 60°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The formation of lower molecular weight oligosaccharides is analyzed by High Performance size-exclusion Chromatography.
[00327] As a blank sample, the substrate is treated and incubated in the same way but then without the addition of enzyme.
[00328] The analysis is performed using High-performance size-exclusion chromatography (HPSEC) performed on three TSK-gel columns (6.0 mm x 15.0 cm per column) in series (SuperAW4000, SuperAW3000, SuperAW2500; Tosoh Bioscience), in combination with a PWXguard column (Tosoh Bioscience). Elution is performed at 55°C with 0.2 M sodium nitrate at 0.6 mL/min. The eluate was monitored using a Shodex RI-101 (Kawasaki) refractive index (RI) detector. Calibration was performed by using pullulans (Associated Polymer Labs Inc., New York, USA) with a molecular weight in the range of 0.18-788 kDa.
[00329] The enzyme samples (supernatants) used in Examples 16.7 to 16.17 and in the further protein characterizations below were prepared either on microtiter plate scale (Example 10) or shake flask scale (Example 11) from A. niger fermentations as described above. Supernatants were buffer-exchanged by repeated concentration using ultrafiltration followed by dilution with 10 mM sodium citrate buffer, pH 5.0. Microtiter plate scale samples were adjusted to the original volume of supernatant, while samples from shake flask cultures were adjusted to a final volume approximately 1/5 of the original supernatant volume.
16.7 Assay Protocol CU1 : Colorimetric assay for glycosidase or esterase activity, measuring release of 4-nitrophenol
[00330] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted enzyme sample is added to 30 µL of 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater, and the reaction is started by addition of 10 µL of preheated 5 mM substrate in water (see Table 5) to buffer and sample. Standards contain 10 µL of 4-nitrophenol (from 0 to 3 mM; 3 mM solution is made by dissolving 139 mg 4-nitrophenol in isopropyl alcohol and diluting 300 µL of resulting 100 mM solution to 10 mL in water) and 40 µL of reaction buffer. Sample blank contains 10 µL of enzyme sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate (see table) and 40 µL of reaction buffer. After appropriate incubation time, 50 µL of one of the following is added: [1] for 4-nitrophenyl acetate, 1 M HEPES buffer pH 8 in water; [2] for 4-nitrophenyl butyrate, 250 mM Na2CO3 in water; [3] for all other substrates, 1 M Na2CO3 in water. 80 µL is then transferred to a clear microtiter flat-bottomed plate, absorbance is read at 410 nm and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of 4-nitrophenol per minute at the specified pH and temperature. (Adapted from Holmsen et al., (1989) Methods in Enzymology, 169, 336-342.)
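The unit definition can be turned into a short calculation. The sketch below assumes all absorbances have already been zeroed against the 0 mM standard, so that subtracting both blanks does not remove the buffer background twice; the standard-curve slope and the readings are hypothetical, and the function names are illustrative.

```python
# Minimal sketch: enzyme activity (U/mL of undiluted sample) from an end-point reading.
# Assumption: readings are zeroed against the 0 mM standard; slope and readings are hypothetical.

def nitrophenol_nmol_per_well(conc_mm, volume_ul=10.0):
    """Amount of 4-nitrophenol in a standard well (mM x µL = nmol)."""
    return conc_mm * volume_ul


def units_per_ml(a_sample, a_sample_blank, a_substrate_blank,
                 slope_abs_per_nmol, minutes, dilution_factor, enzyme_ul=10.0):
    net_abs = a_sample - a_sample_blank - a_substrate_blank
    nmol_released = net_abs / slope_abs_per_nmol
    umol_per_min = nmol_released / 1000.0 / minutes       # 1 U = 1 µmol/min
    return umol_per_min / (enzyme_ul / 1000.0) * dilution_factor


top_standard_nmol = nitrophenol_nmol_per_well(3.0)        # 3 mM standard, 10 µL
activity = units_per_ml(0.50, 0.02, 0.03, slope_abs_per_nmol=0.03,
                        minutes=10.0, dilution_factor=100.0)
```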
Table 5: CU1-1
16.8 Assay Procedure CU2: Colorimetric assay for endo-glycanase activity, measuring copper (I) reduced by polysaccharide reducing ends
[00331] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted sample is added to 30 µL of either [1] 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) or [2] for enzymes that utilize calcium, 50 mM acetate-MOPS-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 10.45 g MOPS, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater. The reaction is started by addition of 10 µL of preheated substrate in water (see Table 6: CU-2.1) to buffer and sample. Standards contain 10 µL of 0 to 7.5 mM monosaccharide solution (see Table 6: CU-2.1) in water and 40 µL of reaction buffer. Enzyme sample blank contains 10 µL of sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate (see Table 6: CU-2.1) and 40 µL of reaction buffer. After appropriate incubation time, 10 µL is removed and added to another PCR plate containing 95 µL of BCA Reagent A (made by dissolving 0.543 g Na2CO3, 0.242 g NaHCO3 and 19 mg disodium 2,2'-bicinchoninate in water and diluting to 1 L) and 95 µL of BCA Reagent B (made by dissolving 12 mg CuSO4 and 13 mg L-serine in water and diluting to 1 L), sealed and incubated in a dry bath heater for 25 minutes at 80°C. The PCR plate is put on ice for 5 minutes, then 160 µL is transferred to a clear microtiter flat-bottomed plate, absorbance is read at 562 nm and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of monosaccharide-equivalent reducing ends per minute at the specified pH and temperature. (Adapted from Fox et al., (1991) Anal. Biochem., 195, 93-96.)
Colloidal chitin is prepared by mixing 10 g chitin from crab shell in 100 mL concentrated hydrochloric acid, stirring overnight at room temperature, then adding 1 L cold distilled water, filtering the resulting suspension through Whatman No. 1 paper, washing the retentate with distilled water until pH is greater than 4, determining dry weight by gravimetry and diluting to a 1% solution with distilled water. (Adapted from Shimahara et al., (1988) Methods in Enzymology 161, 417-423.)
Table 6: CU-2.1
16.9 Assay procedure CU3: UV assay for acetylesterase activity, measuring release of alpha-naphthol
[00332] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in dH2O, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 20 µL of diluted sample is added to 20 µL of 300 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 17.28 mL 99.7% glacial acetic acid, 20.52 mL 85% phosphoric acid, and 18.6 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a clear microtiter plate and preheated to appropriate temperature in the plate reader. The reaction is started by addition of 160 µL of 0.5 mM alpha-naphthyl acetate substrate solution in water (prepared by dissolving 46.55 mg of α-naphthyl acetate in 1 mL of acetone and then transferring to 499 mL of water), preheated to assay temperature in a dry block heater, to the buffer and enzyme sample. Standards contain 180 µL of 0 to 0.1 mM alpha-naphthol in water and 20 µL of reaction buffer. Blank contains 20 µL of reaction buffer, 20 µL of water and 160 µL of substrate solution. Absorbance is continuously monitored at 303 nm and compared to that of the standards. One unit is the amount of enzyme that produces one micromole of alpha-naphthol per minute under the specified conditions. (Adapted from Yuorno et al., (1981), Anal. Biochem. 115, 188-193).
16.10 Assay procedure CU4: Polarimetric assay for aldose 1-epimerase activity, measuring the rate increase of the mutarotation of alpha-D-glucose
[00333] 5 mM phosphate reaction buffer (prepared by dissolving 342 µL 85% phosphoric acid in water, adjusting pH to 5.0 with 1 M NaOH and diluting to 1 L) is preheated to 40°C. A Perkin-Elmer 341 polarimeter (USA) with sodium/halogen and mercury lamps is preheated to 40°C and blanked by measuring the optical rotation of polarized 578 nm light by 5 mL reaction buffer. 36 mg of alpha-D-glucose is dissolved in 10 mL of reaction buffer, then 60 µL of undiluted enzyme is added to 4.94 mL of the resulting solution and optical rotation is immediately measured in the polarimeter. Readings are recorded at 40°C every minute until equilibrium is reached. One unit is the amount of enzyme that converts one micromole of alpha-D-glucose to beta-D-glucose (calculated by determining the reaction's first-order rate constant less that of the blank) in one minute. (Adapted from Bailey et al., (1975), Methods in Enzymology 41, 471-484.)
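The first-order rate constant can be estimated from the polarimeter trace by a log-linear fit, since the rotation approaches its equilibrium value exponentially. The sketch below is one possible way to do this, not the procedure of the cited reference; the traces are synthetic, generated from made-up rate constants, and the function names are illustrative.

```python
# Minimal sketch: first-order rate constant from optical rotation readings.
# Model: alpha(t) - alpha_eq = (alpha_0 - alpha_eq) * exp(-k * t)
# All numbers are synthetic; readings equal to alpha_eq must be excluded (log of zero).
import math


def first_order_k(times_min, rotations, rotation_eq):
    """Return k (per minute) from the slope of ln|alpha_t - alpha_eq| versus time."""
    ys = [math.log(abs(a - rotation_eq)) for a in rotations]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
             / sum((t - mean_t) ** 2 for t in times_min))
    return -slope


def synthetic_trace(k, alpha_0, alpha_eq, times_min):
    return [alpha_eq + (alpha_0 - alpha_eq) * math.exp(-k * t) for t in times_min]


times = [0, 1, 2, 3, 4, 5]
k_enzyme = first_order_k(times, synthetic_trace(0.30, 0.40, 0.19, times), 0.19)
k_blank = first_order_k(times, synthetic_trace(0.05, 0.40, 0.19, times), 0.19)
k_net = k_enzyme - k_blank      # enzyme-attributable rate constant, per minute
```

Subtracting the blank removes the spontaneous mutarotation, which is what "less that of the blank" refers to in the unit definition.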
16.11 Assay procedure CU6: UV assay of lyase activity, measuring formation of unsaturated bonds
[00334] Enzyme sample is diluted in 50 mM acetate-MOPS-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 10.45 g MOPS, 3.10 g boric acid and 1.11 g calcium chloride in water, adjusting pH with 10 M NaOH and diluting to 1 L) and left to equilibrate for 30 minutes at room temperature. Reaction buffer is mixed in a 1:1 ratio with substrate solution (1% polygalacturonic acid in water or 0.75% Rhamnogalacturonan I from potato in water) and preheated to reaction temperature in a dry bath heater (if reaction temperature is greater than plate reader maximum temperature) or in a microtiter plate in plate reader. Reaction is started by addition of 10 µL of diluted enzyme sample to 240 µL of reaction buffer/substrate in a UV-transparent microtiter flat-bottomed plate. Blank contains 10 µL of reaction buffer added to 240 µL of reaction buffer/substrate solution. Absorbance at 235 nm is continuously monitored, and the molar absorptivity coefficient of unsaturated galacturonic acid is used to determine activity. One unit is the amount of enzyme that releases one micromole of unsaturated galacturonic acid equivalents per minute under the specified conditions. (Adapted from Hansen et al., (2001) J. AOAC International, 84, 1851-1854).
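Converting the absorbance slope into units follows the Beer-Lambert law. In the sketch below, the molar absorptivity and the optical path length are left as parameters because the text gives neither; the values used in the example are hypothetical placeholders, and the function name is illustrative. The 250 µL reaction volume and 10 µL enzyme volume are taken from the protocol above.

```python
# Minimal sketch: lyase activity (U/mL of undiluted sample) from dA235/dt.
# epsilon and path length are NOT given in the text; example values are hypothetical.

def lyase_units_per_ml(slope_abs_per_min, epsilon_per_m_cm, path_cm,
                       dilution_factor=1.0, reaction_ul=250.0, enzyme_ul=10.0):
    # Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l), in mol/L/min
    molar_per_min = slope_abs_per_min / (epsilon_per_m_cm * path_cm)
    # µmol/min formed in the well: (mol/L/min) * 1e6 µmol/mol * litres of reaction
    umol_per_min = molar_per_min * 1e6 * (reaction_ul * 1e-6)
    return umol_per_min / (enzyme_ul * 1e-3) * dilution_factor


activity = lyase_units_per_ml(0.05, epsilon_per_m_cm=5000.0, path_cm=0.5,
                              dilution_factor=10.0)
```

The blank slope would be subtracted from the sample slope before the conversion.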
16.12 Assay procedure CU7: Fluorescence assay, measuring release of 4-methylumbelliferone
[00335] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted sample is added to 30 µL of 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater. The reaction is started by addition of 10 µL of preheated 1 mM substrate in water (made by diluting 5.0 mg of 4-methylumbelliferyl cellobioside or 4-methylumbelliferyl lactoside in 10 mL water) to buffer and sample. Standards contain 10 µL of 4-methylumbelliferone (from 0 to 50 µM; 19.8 mg of 4-methylumbelliferone sodium salt is dissolved in 100 mL methanol and resulting solution is diluted 20X in water) and 40 µL of reaction buffer. Enzyme sample blank contains 10 µL of enzyme sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate and 40 µL of reaction buffer. After appropriate incubation time, 20 µL is removed and added to a black microtiter plate containing 180 µL of glycine/carbonate buffer, pH 10.7 (made by dissolving 10 g glycine and 8.8 g sodium carbonate in water, adjusting pH with 10 M NaOH and diluting to 1 L). The fluorescence of the wells is measured at 355 nm excitation, 460 nm emission and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of 4-methylumbelliferone per minute. (Adapted from van Tilbeurgh et al. (1988), Methods in Enzymology 160: 45-59.)
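The standard preparation can be checked with a short molarity calculation. The molar mass of 4-methylumbelliferone sodium salt (198.15 g/mol) is a value supplied here, not stated in the text, and the function name is illustrative.

```python
# Minimal sketch: concentration of the top 4-methylumbelliferone standard.
# Assumed constant (not from the text): molar mass of the sodium salt, 198.15 g/mol.

MU_SODIUM_SALT_G_PER_MOL = 198.15


def micromolar(mass_mg, molar_mass_g_per_mol, volume_ml, fold_dilution=1.0):
    millimoles = mass_mg / molar_mass_g_per_mol       # mg / (g/mol) = mmol
    molar = millimoles / volume_ml                    # mmol/mL = mol/L
    return molar * 1e6 / fold_dilution


stock_um = micromolar(19.8, MU_SODIUM_SALT_G_PER_MOL, 100.0)            # methanol stock
top_standard_um = micromolar(19.8, MU_SODIUM_SALT_G_PER_MOL, 100.0, 20.0)
pmol_per_well = top_standard_um * 10.0                                  # µM x µL = pmol
```

The stock comes out at about 1 mM, so the 20-fold dilution gives the 50 µM top standard, or about 500 pmol in a 10 µL aliquot.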
16.13 Assay procedure CU8: Spectrophotometric assay of acetylxylanesterase activity, measuring release of acetic acid
[00336] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in dH2O, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 40 µL of 1% acetylated xylan from birchwood (prepared by Swern oxidation in a fume hood according to Johnson et al., (1988), Methods in Enzymology 160, 551-560, taking appropriate precautions against toxic products released in the air and into water during dialysis) are added to 40 µL of 50 mM phosphate reaction buffer (prepared by dissolving 3.42 mL of 85% phosphoric acid in water, adjusting pH to 6.0 with 10 M NaOH and diluting to 1 L) in the wells of a 96-well PCR plate and preheated to the appropriate temperature in a dry block heater. The reaction is started by adding 20 µL of diluted sample to the wells containing substrate and reaction buffer. Standards contain 20 µL of 0 mg/mL to 1 mg/mL acetic acid in water, and 80 µL reaction buffer. Sample blank contains 20 µL of diluted enzyme sample, 40 µL of reaction buffer and 40 µL of water. Substrate blank contains 20 µL of substrate, 40 µL of reaction buffer and 40 µL of water. After appropriate incubation time, the plate is heated to 90°C for 5 minutes and centrifuged for 10 minutes at 1500 x g. The amount of acetic acid in the supernatant is then determined with the K-ACETAK™ kit by Megazyme; one unit is defined as the amount of enzyme required to release one micromole of acetic acid per minute under the specified conditions. (Adapted from Johnson et al., (1988), Methods in Enzymology 160, 551-560 and K-ACETAK™ assay kit procedure by Megazyme (Ireland)).
16.14 Assay procedure CU10: Colorimetric assay of cellobiose dehydrogenase, measuring reduction of DCIP
[00337] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted enzyme sample is added to 10 µL of 48 mM sodium fluoride (made by dissolving 2 mg NaF in 10 mL water), 10 µL of 3.6 mM 2,6-dichloroindophenol (DCIP, made by dissolving 9.6 mg in 10 mL water) and 80 µL of 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a clear microtiter flat-bottomed plate and preheated to the appropriate temperature in a dry bath heater. Reaction is started by addition of 110 µL of 360 mM lactose (made by dissolving 1.23 g lactose in 100 mL water). Blank contains 10 µL sample, 10 µL 48 mM NaF, 10 µL 3.6 mM DCIP, 80 µL reaction buffer and 110 µL water. Absorbance at 520 nm is continuously monitored and converted to the amount of DCIP reduced using the molar absorptivity coefficient of DCIP. One unit is the amount of enzyme that reduces one micromole of DCIP per minute under the specified assay conditions. (Adapted from Baminger et al., (2001), Appl Environ Microbiol, 67(4), 1766-1774.)
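The conversion from an absorbance slope to units follows the Beer-Lambert law. The sketch below uses the volumes stated in the procedure (10 µL of enzyme in a 220 µL reaction), but the molar absorptivity and the optical path length are not given in the text, so both are parameters here; the value 6.8 mM⁻¹ cm⁻¹ used in the example is an assumed illustrative figure that should be checked for the assay pH, and the 1 cm path is a placeholder for the actual well geometry.

```python
def cdh_units_per_ml(delta_abs_per_min, epsilon_per_mm_per_cm, path_cm,
                     reaction_volume_ml, enzyme_volume_ml,
                     dilution_factor=1.0):
    """U/mL of cellobiose dehydrogenase from the rate of DCIP bleaching.

    delta_abs_per_min is the blank-corrected decrease in A520 per minute,
    entered as a positive number.
    """
    # Beer-Lambert: concentration change (mM/min) = dA/dt / (epsilon * l)
    rate_mm_per_min = delta_abs_per_min / (epsilon_per_mm_per_cm * path_cm)
    # mM multiplied by mL gives micromoles
    umol_per_min = rate_mm_per_min * reaction_volume_ml
    return umol_per_min / enzyme_volume_ml * dilution_factor


# Assumed epsilon and path length; volumes taken from the procedure above.
example_activity = cdh_units_per_ml(
    delta_abs_per_min=0.068,
    epsilon_per_mm_per_cm=6.8,   # assumption, not from the source
    path_cm=1.0,                 # assumption, plate path lengths are shorter
    reaction_volume_ml=0.220,
    enzyme_volume_ml=0.010,
)
```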
16.15 Assay procedure CU12: UV assay of alpha-glucuronidase activity, measuring NADH
[00338] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. Directions for the Megazyme microplate assay kit are followed, except the reaction buffer used is 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L). (Adapted from K-AGLUA™ assay kit procedure by Megazyme (Ireland)).
16.16 Protein activity-temperature profiles
[00339] Temperature optima are determined by first determining the range of enzyme concentration that reproducibly displays initial velocity kinetics at 40°C and at the enzyme's optimal pH (see Example 16.17) in the appropriate assay. Enzyme is then diluted to an amount within this range, divided into aliquots, and, where possible, each aliquot is assayed simultaneously at the different temperatures (e.g., when reaction is incubated in a dry bath heater, then transferred to a plate reader for endpoint measurement). Where simultaneous measurements at different temperatures are impossible (e.g., when reaction is incubated in a plate reader for continuous measurement) activities are measured in sequence at different temperatures.
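Once each aliquot has been assayed, the optimum is simply the temperature with the highest measured activity, and a profile is usually expressed relative to that maximum. A minimal sketch, with hypothetical function names and made-up example activities:

```python
def temperature_optimum(activity_by_temp):
    """Temperature (°C) at which the highest activity was measured."""
    return max(activity_by_temp, key=activity_by_temp.get)


def relative_activity_profile(activity_by_temp):
    """Activity at each temperature as a percentage of the maximum."""
    peak = max(activity_by_temp.values())
    return {temp: 100.0 * act / peak for temp, act in activity_by_temp.items()}


# Illustrative data only: temperature in °C mapped to activity in U/mL.
measured = {40: 2.0, 50: 3.0, 60: 4.0, 70: 1.0}
optimum = temperature_optimum(measured)
profile = relative_activity_profile(measured)
```

The same two helpers apply unchanged to a pH series if the dictionary keys are pH values instead of temperatures.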
16.17 Determination of pH optima
[00340] pH optima are determined by first determining the range of enzyme concentration that reproducibly displays initial velocity kinetics at standard pH and temperature (according to Tables 15, 16 and 18) for the appropriate assay. Enzyme is then diluted to an amount within this range, divided into aliquots, and each aliquot is assayed simultaneously at the different pHs.
Example 17: Identification of genes that encode secreted proteins
[00341] Genes (and polypeptides) from the organisms Thermoascus aurantiacus (Theau), Myceliophthora fergusii (Corynascus thermophilus) (Corth), and Pseudocercosporella herpotrichoides (Psehe) were identified that, based on curation (described above, see Example 4), encoded a secreted protein. A list of these genes and polypeptides is shown in Tables 1A-1C.
Example 18: Improvement of thermophilic cellulase mixture by various proteins
[00342] The cellulase activity of THEAU_1_00024, THEAU_2_04931 and Corth2p4_001043 was further analyzed. The supernatants of the corresponding A. niger expressing shake flask fermentations were concentrated and spiked in a dosage of 0.45 mg/gDM on top of a base activity of a three enzyme base mix (4.55 mg/gDM composed of: CBHI at 1.25 mg/gDM, CBHII at 1.5 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS, as described above. As a negative control, the 3 enzyme base mix was also tested. All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1 D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above. Addition of these tested BG proteins showed increased sugar release as shown below.
Table 7: Effect of various proteins spiked on top of a 3E mix using aCS substrate
[00343] In a second experiment, the cellulase enhancing activity of a Thermoascus aurantiacus GH61 protein was further analysed. The supernatant of the A. niger expressing Theau2p4_004983 shake flask fermentation was concentrated and spiked in a dosage of 1.8 mg/gDM on top of a base activity of a three enzyme base mix (3.2 mg/gDM composed of: BG at 0.45 mg/gDM, CBHI at 1.5 mg/gDM and CBHII at 1.25 mg/gDM) at a feedstock concentration of 10% (w/w) aCS, as described above. As a negative control, the 3 enzyme base mix was also tested. All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1 D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above. Addition of this Thermoascus aurantiacus GH61 protein showed increased sugar release as shown below.
Table 8: Effect of GH61 protein Theau2p4_004983 spiked on top of a 3E mix using aCS substrate
| Target ID | SEQ ID NOs | Glucose (g/L) |
|---|---|---|
| Theau2p4_004983 | 56, 256, 456 | 37.7 |
| 3 enzyme mix | | 29.7 |
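The dosing arithmetic in Example 18 and the size of the effect in Table 8 can be checked with a few lines. This sketch reads every component dose in mg/gDM, which is the only reading under which the components add up to the stated base-mix totals of 4.55 and 3.2 mg/gDM. The helper names are hypothetical.

```python
def total_dose(mix_mg_per_gdm):
    """Sum of component doses in mg per g dry matter, rounded to 2 decimals."""
    return round(sum(mix_mg_per_gdm.values()), 2)


def percent_gain(spiked, control):
    """Relative increase of the spiked sample over the control, in percent."""
    return 100.0 * (spiked - control) / control


# Base mixes as described in the two experiments (mg/gDM).
first_base_mix = {"CBHI": 1.25, "CBHII": 1.5, "GH61": 1.8}
second_base_mix = {"BG": 0.45, "CBHI": 1.5, "CBHII": 1.25}

first_total = total_dose(first_base_mix)     # stated as 4.55 mg/gDM
second_total = total_dose(second_base_mix)   # stated as 3.2 mg/gDM

# Table 8: glucose with the GH61 spike versus the 3 enzyme mix alone (g/L).
glucose_gain = percent_gain(37.7, 29.7)
```

With the Table 8 values, the GH61 spike corresponds to roughly 27% more glucose than the base mix alone.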
Example 19: Characterization of thermophilic Thermoascus aurantiacus beta-xylosidases
[00344] The beta-xylosidase activity of some Thermoascus aurantiacus enzymes was analysed as described above. The supernatant of the A. niger shake flask fermentations was concentrated and assayed in two dosages for xylose release from xylobiose after incubation for 24 hours at pH 4.5 and 65°C. At least two enzymes showed significant xylose release from xylobiose as shown below.
Table 9: Effect of two Thermoascus aurantiacus beta-xylosidases on the release of xylose from xylobiose
Example 20: Characterization of thermophilic endo-xylanases
[00345] The endo-xylanase activity of several proteins of the present invention was analysed. The supernatants of the corresponding A. niger shake flask fermentations were concentrated and assayed for endo-xylanase activity on wheat arabinoxylan oligosaccharides and beech wood xylan as described above in endo-xylanase activity assay 1. The proteins THEAU_2_04829, CORTH_1_02177 and PSEHE_1_00010 were able to release xylose and xylose oligomers from the two substrates after incubation for 24 hours with 1% (w/w) enzyme dose at pH 4.5 and 65°C, as shown below.
Table 10: Effect of various proteins on release of xylose & xylose oligomers from Beech wood xylan & Wheat arabinoxylan
Amount released (µg/mg substrate)
| 1% (w/w) protein/substrate | SEQ ID NOs | Xylose | Xylobiose | Xylotriose | Xylotetraose |
|---|---|---|---|---|---|
| no enzyme | | 1.4 | 0.3 | 0.0 | 0.2 |
| THEAU_2_04829 | 163, 363, 563 | 127.1 | 362.2 | 16.4 | 0.8 |
| CORTH_1_02177 | 664, 953, 1242 | 82.6 | 356.1 | 3.0 | 0.4 |
| PSEHE_1_00010 | 1633, 2157, 2681 | 51.0 | 277.2 | 88.6 | 0.9 |
| no enzyme | | 0.5 | 0.0 | 0.0 | 0.1 |
| THEAU_2_04829 | 163, 363, 563 | 38.8 | 76.7 | 3.2 | 0.0 |
| CORTH_1_02177 | 664, 953, 1242 | 47.0 | 64.0 | 0.6 | 0.0 |
| PSEHE_1_00010 | 1633, 2157, 2681 | 54.7 | 88.0 | 1.3 | 0.0 |
[00346] In a second experiment, the endo-xylanase activity of various proteins was analysed as described above in endo-xylanase activity assay 2 (Example 16.5). The supernatants of the corresponding A. niger shake flask fermentations were concentrated and assayed for endo-xylanase activity by measuring reducing-end formation, expressed as xylose equivalents, after incubation of the enzymes at 0.1% (w/w) dose on wheat arabinoxylan for 24 hours at 65°C and pH 4.5. Proteins THEAU_2_04829, CORTH_1_02177 and PSEHE_1_00010 were able to release reducing sugars from the substrate as shown below.
Table 11: Effect of various proteins on the release of reducing sugars (reported as xylose equivalents) from Wheat arabinoxylan
Example 21: Characterization of thermophilic alpha-glucuronidases
[00347] The alpha-glucuronidase activity of several proteins was analysed. The supernatants of the corresponding A. niger shake flask fermentations were concentrated and assayed for alpha-glucuronidase activity on glucuronoxylo-oligosaccharides as described above. Several enzymes were able to release xylose, xylobiose and xylotriose by releasing the 4-O-methyl-glucuronic acid residue from these oligomers after incubation for 24 hours with 1% (w/w) enzyme dose at pH 4.5 and 60°C as shown below.
Table 12: Effect of alpha-glucuronidases on the release of xylose (X1), xylobiose (X2) and xylotriose (X3) from glucuronoxylooligomers
Example 22: Identification of various thermophilic arabino(furano)sidases
[00348] The arabino(furano)sidase activity of various enzymes was further analysed, as described above (Example 16.1). The supernatants of the corresponding A. niger shake flask fermentations were concentrated and assayed for arabinose release from wheat arabinoxylan, which was pre-digested by an endo-xylanase, after incubation for 24 hours at pH 4.5 and 65°C. Three enzymes showed increased arabinose release as shown below.
Table 13: Effect of various proteins on pre-digested wheat arabinoxylan substrate
Example 23: Characterization of thermophilic acetyl-xylan esterase
[00349] The acetyl-xylan esterase activity of CORTH2p4_004688 was further analyzed. The supernatant of this A. niger shake flask fermentation was concentrated and assayed for acetic acid release from acid pretreated corn stover as described above (Example 16.3). This protein was identified as an active acetyl xylan esterase because it was able to release acetic acid from the substrate, as shown below.
Table 14: Effect of CORTH2p4_004688 enzyme on release of acetic acid from pretreated corn stover
Example 27: Further characterization of expressed proteins from Thermoascus aurantiacus
[00350] The Thermoascus aurantiacus proteins THEAU_1_00022, THEAU_1_00024, THEAU_1_00067, THEAU_1_00078, THEAU_1_00078, THEAU_1_00119, THEAU_1_00155, THEAU_2_01901, THEAU_2_04829, THEAU_2_04931, THEAU_3_00028, Theau2p4_000253, Theau2p4_000766, Theau2p4_003130, Theau2p4_005525, Theau2p4_006768, Theau2p4_007017, and Theau2p4_008902 were further characterized using the assay protocols and assay conditions indicated in Table 15.
Example 28: Further characterization of expressed proteins from Myceliophthora fergusii (Corynascus thermophilus)
[00351] The Myceliophthora fergusii (Corynascus thermophilus) proteins CORTH_1_01285, CORTH_1_01910, CORTH_1_01922, CORTH_1_01923, CORTH_1_02205, CORTH_1_02799, CORTH_1_02834, Corth2p4_000317, Corth2p4_000894, Corth2p4_000894, Corth2p4_001043, Corth2p4_002886, Corth2p4_003311, Corth2p4_003344, Corth2p4_005378, Corth2p4_006231, Corth2p4_006773, Corth2p4_006798, Corth2p4_007365, and Corth2p4_008555 were further characterized using the assay protocols and assay conditions indicated in Table 16.
Example 29: Further characterization of expressed proteins from Pseudocercosporella herpotrichoides
[00352] The Pseudocercosporella herpotrichoides proteins PSEHE_1_00002, PSEHE_1_00122, PSEHE_1_00137, PSEHE_1_00176, PSEHE_1_00216, PSEHE_1_00218, PSEHE_1_00303, Psehe2p4_001564, and Psehe2p4_007460, were further characterized using the assay protocols and assay conditions indicated in Table 17.
Φ U, micromole product formed per minute under the indicated assay conditions
Table 16: Activities of expressed enzymes from Myceliophthora fergusii (Corynascus thermophilus)
| Target ID [SEQ ID NOs] | Assay protocol | Substrate | Standard assay conditions | Activity (U/mL) Φ at standard conditions | Fold increase over control* | pH optimum | Temperature optimum (°C) | Activity (U/mL) Φ at pH and temperature optima |
|---|---|---|---|---|---|---|---|---|
| CORTH_1_01285 [660, 949, 1238] | CU8 | acetylated xylan from beechwood 0.4% | pH 5, 40 °C, 15 min | 4.2 | na | 7 | 50 | 12.6 |
| CORTH_1_01910 [679, 968, 1257] | CU2 | Xylan from beechwood, 0.2% | pH 5, 40 °C, 30 min | 2.45 | 8.2 | 5 | | |
| CORTH_1_01922 [609, 898, 1187] | CU7 | 4-methylumbelliferyl beta-D-cellobioside, 0.2 mM | pH 5, 40 °C, 30 min | 0.042 | 28 | 5.5 | 60 | 0.015 |
| CORTH_1_01923 [809, 1098, 1387] | CU7 | 4-methylumbelliferyl beta-D-cellobioside, 0.2 mM | pH 5, 40 °C, 30 min | 0.046 | 31 | 5 | | |
| CORTH_1_02205 [659, 948, 1237] | CU2 | Carboxymethylcellulose, 0.2% | pH 5, 40 °C, 30 min | 17.5 | 58 | 5 | 55 | 31 |
| CORTH_1_02799 [635, 924, 1213] | CU2 | Xylan from beechwood, 0.2% | pH 5, 40 °C, 30 min | 1.9 | 6.4 | | | |
| CORTH_1_02834 [802, 1091, 1380] | CU2 | Carboxymethyl-linear arabinan, 0.2% | pH 5, 40 °C, 30 min | 1.3 | 16.25 | 5 | 60 | 2.55 |
| Corth2p4_000317 [607, 896, 1185] | CU2 | Lichenan, 0.2% | pH 5, 40 °C, 30 min | 2.1 | 23.4 | 4 | 40 | 1.4 |
| Corth2p4_000894 [612, 901, 1190] | CU8 | acetylated xylan from beechwood 0.4% | pH 5, 40 °C, 15 min | 5.8 | na | 7 | 50 | 12.6 |
| Corth2p4_000894 [612, 901, 1190] | CU3 | alpha-naphthyl acetate, 0.4 mM | pH 5, 30 °C, continuous | 4.9 | na | 6.5 | | |
| Corth2p4_001043 [620, 909, 1198] | CU1 | 4-nitrophenyl beta-D-glucopyranoside, 1 mM | pH 5, 40 °C, 30 min | 0.61 | 12.2 | 4 | 60 | 1.67 |
| Corth2p4_002886 [649, 938, 1227] | CU2 | Carboxymethylcellulose, 0.2% | pH 5, 40 °C, 30 min | 1.4 | 4.7 | 5.5 | 60 | 4.58 |
| Corth2p4_003311 [655, 944, 1233] | CU10 | Lactose, 30 mM | pH 5, 40 °C, continuous | 35 | na | 5 | | |
| Corth2p4_003344 [658, 947, 1236] | CU2 | Polygalacturonic acid, 0.1% | pH 5, 40 °C, 30 min | 6.7 | 17 | 10 | | |
| Corth2p4_005378 [693, 982, 1271] | CU4 | alpha-D-Glucose, 10 umol/mL | pH 5, 40 °C, continuous | 63 | na | | | |
| Corth2p4_006231 [701, 990, 1279] | CU7 | 4-methylumbelliferyl beta-D-cellobioside, 0.2 mM | pH 5, 40 °C, 30 min | 0.0285 | 19 | 5.5 | 55 | 0.044 |
| Corth2p4_006773 [711, 1000, 1289] | CU10 | Lactose, 30 mM | pH 5, 40 °C, continuous | 45 | na | | | |
| Corth2p4_006798 [712, 1001, 1290] | CU7 | 4-methylumbelliferyl beta-D-cellobioside, 0.2 mM | pH 5, 40 °C, 30 min | 0.016 | 11 | 6 | 55 | 0.0196 |
| Corth2p4_007365 [730, 1019, 1308] | CU6 | Polygalacturonic acid, 0.9% | pH 8, 40 °C, initial rate | 10.8 | 27 | 10 | 40 | 21.5 |
| Corth2p4_008555 [803, 1092, 1381] | CU2 | Xylan from beechwood, 0.2% | pH 5, 40 °C, 30 min | 41 | 137 | 6 | 50 | 18.2 |
* na, not applicable as control exhibited no detectable activity. Control is an equal volume of supernatant from a vector-only transformant
Φ U, micromole product formed per minute under the indicated assay conditions
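The "fold increase over control" column and its "na" entries can be reproduced with a small helper. This is a sketch under the assumption that fold increase is the sample activity divided by the activity of the vector-only control measured in the same assay; the control activities themselves are not reported, so the 0.3 U/mL used in the example is purely illustrative.

```python
def fold_increase(sample_u_per_ml, control_u_per_ml):
    """Sample activity divided by control activity.

    Returns None (reported as "na") when the control shows no detectable
    activity, since the ratio is undefined in that case.
    """
    if control_u_per_ml <= 0:
        return None
    return sample_u_per_ml / control_u_per_ml


def format_fold(value):
    """Render a fold increase the way the table footnote describes it."""
    return "na" if value is None else f"{value:.1f}"


# Hypothetical control activity of 0.3 U/mL, not a value from the table.
with_background = format_fold(fold_increase(2.45, 0.3))
no_background = format_fold(fold_increase(4.2, 0.0))
```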
Table 17: Activities of expressed enzymes from Pseudocercosporella herpotrichoides
| Target ID [SEQ ID NOs] | Assay protocol | Substrate | Standard assay conditions | Activity (U/mL) Φ at standard conditions | Fold increase over control* | pH optimum | Temperature optimum (°C) | Activity (U/mL) Φ at pH and temperature optima |
|---|---|---|---|---|---|---|---|---|
| PSEHE_1_00002 [1625, 2149, 2673] | CU1 | 4-nitrophenyl beta-D-glucopyranoside, 1 mM | pH 5, 40 °C, 30 min | 1.7 | 34 | 6 | 40 | 1.97 |
| PSEHE_1_00122 [1749, 2273, 2797] | CU1 | 4-nitrophenyl N-acetyl-beta-D-glucosaminide, 1 mM | pH 5, 40 °C, 30 min | 36 | 180 | 3.5 | 55 | 96.5 |
| PSEHE_1_00137 [1577, 2101, 2625] | CU2 | Polygalacturonic acid, sodium salt, 0.1% | pH 5, 40 °C, 30 min | 5.8 | 14 | 5.5 | 40 | 6.7 |
| PSEHE_1_00176 [1899, 2423, 2947] | CU1 | 4-nitrophenyl beta-D-xylopyranoside, 1 mM | pH 5, 40 °C, 30 min | 0.67 | 670 | 4.5 | 45 | 0.814 |
| PSEHE_1_00216 [1593, 2117, 2641] | CU1 | 4-nitrophenyl beta-D-galactopyranoside, 1 mM | pH 5, 40 °C, 30 min | 2.1 | 10.5 | 3.5 | 45 | 2.21 |
| PSEHE_1_00218 [1904, 2428, 2952] | CU1 | 4-nitrophenyl beta-D-galactopyranoside, 1 mM | pH 5, 40 °C, 30 min | 0.27 | 1.35 | 4 | 60 | 0.66 |
| PSEHE_1_00303 [1758, 2282, 2806] | CU7 | 4-methylumbelliferyl beta-D-cellobioside, 0.2 mM | pH 5, 40 °C, 30 min | 0.008 | 5.3 | 5 | 40 | 0.0075 |
| Psehe2p4_001564 [1503, 2027, 2551] | CU1 | 4-nitrophenyl beta-D-glucopyranoside, 1 mM | pH 5, 40 °C, 30 min | 1.825 | 36.5 | 5 | 55 | 2.96 |
| Psehe2p4_007460 [1649, 2173, 2697] | CU2 | Xylan from birchwood, 0.2% | pH 5, 40 °C, 30 min | 1.4 | 4.7 | | | |
* na, not applicable as control exhibited no detectable activity. Control is an equal volume of supernatant from a vector-only transformant
Φ U, micromole product formed per minute under the indicated assay conditions
Example 30: Determination of activity-temperature profiles
[00353] Activity-temperature profiles were determined according to the protocol in Example 16.16 for various proteins of the present invention (e.g., having an observed temperature optimum of 50°C or higher). Results are shown in Figures 2-8 for various proteins from Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), and Pseudocercosporella herpotrichoides, using the Assay Protocols and Assay Conditions indicated below in Tables 18-20.
Table 18: Activity-temperature profiles for various Thermoascus aurantiacus proteins
Table 19: Activity-temperature profiles for various Myceliophthora fergusii (Corynascus thermophilus) proteins
Table 20: Activity-temperature profiles for various Pseudocercosporella herpotrichoides proteins
[00354] Although the present invention has been described hereinabove by way of specific embodiments thereof, it can be modified, without departing from the spirit and nature of the subject invention as defined in the appended claims.
Claims
1. An isolated polypeptide which is:
(a) a polypeptide comprising the amino acid sequence of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039;
(b) a polypeptide comprising an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to the polypeptide defined in (a);
(c) a polypeptide comprising an amino acid sequence encoded by the nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514;
(d) a polypeptide comprising an amino acid sequence encoded by any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;
(e) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);
(f) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to a polynucleotide comprising the nucleic acid sequence defined in (c) or (d);
(g) a functional variant of the polypeptide defined in (a) comprising a substitution, deletion, and/or insertion at one or more residues; or
(h) a functional fragment of the polypeptide of any one of (a) to (g).
2. The isolated polypeptide of claim 1, wherein said polypeptide has a corresponding function and/or protein activity according to Tables 1A-1C.
3. The isolated polypeptide of claim 1 or 2 comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039.
4. The isolated polypeptide of any one of claims 1 to 3, wherein said polypeptide is a recombinant polypeptide.
5. The isolated polypeptide of any one of claims 1 to 4 obtainable from a fungus.
6. The isolated polypeptide of any one of claims 1 to 5, wherein said fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella.
7. The isolated polypeptide of any one of claims 1 to 6, wherein said fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides.
8. An antibody that specifically binds to the isolated polypeptide of any one of claims 1 to 7.
9. An isolated polynucleotide molecule encoding the polypeptide of any one of claims 1 to 7.
10. An isolated polynucleotide molecule which is:
(a) a polynucleotide molecule comprising a nucleic acid sequence encoding the polypeptide of any one of SEQ ID NOs: 401-600, 1179-1467, or 2516-3039;
(b) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 1-200, 601-889, or 1468-1991;
(c) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 201-400, 890-1178, or 1992-2514;
(d) a polynucleotide molecule comprising any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;
(e) a polynucleotide molecule comprising a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to any one of the polynucleotide molecules defined in (a) to (d); or
(f) a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotide molecules defined in (a) to (e).
11. The isolated polynucleotide molecule of claim 9 or 10 obtainable from a fungus.
12. The isolated polynucleotide molecule of claim 11, wherein said fungus is from the genus Thermoascus, Myceliophthora (Corynascus), or Pseudocercosporella.
13. The isolated polynucleotide molecule of claim 12, wherein said fungus is Thermoascus aurantiacus, Myceliophthora fergusii (Corynascus thermophilus), or Pseudocercosporella herpotrichoides.
14. A vector comprising a polynucleotide molecule as defined in any one of claims 9 to 13.
15. The vector of claim 14 further comprising a regulatory sequence operatively linked to said polynucleotide molecule for expression of same in a suitable host cell.
16. The vector of claim 15, wherein said suitable host cell is a bacterial cell.
17. The vector of claim 15, wherein said suitable host cell is a fungal cell.
18. The vector of claim 17, wherein said fungal cell is a filamentous fungal cell.
19. A recombinant host cell comprising the polynucleotide molecule as defined in any one of claims 9 to 13, or a vector as defined in any one of claims 14 to 18.
20. The recombinant host cell of claim 19, wherein said cell is a bacterial cell.
21. The recombinant host cell of claim 19, wherein said cell is a fungal cell.
22. The recombinant host cell of claim 21, wherein said fungal cell is a filamentous fungal cell.
23. A polypeptide obtainable by expressing the polynucleotide molecule of any one of claims 9 to 13, or the vector of any one of claims 14 to 18 in a suitable host cell.
24. A composition comprising the polypeptide of any one of claims 1 to 7 or 23, or the recombinant host cell of any one of claims 19 to 22.
25. The composition of claim 24 further comprising a suitable carrier.
26. The composition of claim 24 or 25 further comprising a substrate of said polypeptide.
27. The composition of claim 26, wherein said substrate is biomass.
28. A method for producing the polypeptide of any one of claims 1 to 7 or 23, said method comprising:
(a) culturing a strain comprising the polynucleotide molecule of any one of claims 9 to 13 or the vector of any one of claims 14 to 18 under conditions conducive for the production of said polypeptide; and
(b) recovering said polypeptide.
29. The method of claim 28, wherein said strain is a bacterial strain.
30. The method of claim 28, wherein said strain is a fungal strain.
31. The method of claim 30, wherein said fungal strain is a filamentous fungal strain.
32. A method for producing the polypeptide of any one of claims 1 to 7 or 23, said method comprising:
(a) culturing the recombinant host cell of any one of claims 19 to 22 under conditions conducive for the production of said polypeptide; and
(b) recovering said polypeptide.
33. A method for preparing a food product, said method comprising incorporating the polypeptide of any one of claims 1 to 7 or 23 during preparation of said food product.
34. The method of claim 33, wherein said food product is a bakery product.
35. Use of the polypeptide of any one of claims 1 to 7 or 23 for the preparation or processing of a food product.
36. The use of claim 33, wherein said food product is a bakery product.
37. The polypeptide of any one of claims 1 to 7 or 23 for use in the preparation or processing of a food product.
38. The polypeptide of claim 37, wherein said food product is a bakery product.
39. Use of the polypeptide of any one of claims 1 to 7 or 23 for the preparation of animal feed.
40. Use of the polypeptide of any one of claims 1 to 7 or 23 for increasing digestion or absorption of animal feed.
41. The use of claim 39 or 40, wherein said animal feed is a cereal-based feed.
42. The polypeptide of any one of claims 1 to 7 or 23 for the preparation of animal feed, or for increasing digestion or absorption of animal feed.
43. The polypeptide of claim 42, wherein said animal feed is a cereal-based feed.
44. Use of the polypeptide of any one of claims 1 to 7 or 23 for the production or processing of kraft pulp or paper.
45. The use of claim 44, wherein said processing comprises prebleaching.
46. The use of claim 44, wherein said processing comprises de-inking.
47. The polypeptide of any one of claims 1 to 7 or 23 for the production or processing of kraft pulp or paper.
48. The polypeptide of claim 47, wherein said processing comprises prebleaching or de-inking.
49. Use of the polypeptide of any one of claims 1 to 7 or 23 for processing lignin.
50. The polypeptide of any one of claims 1 to 7 or 23 for processing lignin.
51. Use of the polypeptide of any one of claims 1 to 7 or 23 for producing ethanol.
52. The polypeptide of any one of claims 1 to 7 or 23 for producing ethanol.
53. The use of any one of claims 35, 36, 40, 41, 44 to 46, 49 and 51 in conjunction with cellulose or a cellulase.
54. Use of the polypeptide of any one of claims 1 to 7 or 23 for treating textiles or dyed textiles.
55. The polypeptide of any one of claims 1 to 7 or 23 for treating textiles or dyed textiles.
56. Use of the polypeptide of any one of claims 1 to 7 or 23 for degrading biomass or pretreated biomass.
57. The polypeptide of any one of claims 1 to 7 or 23 for degrading biomass or pretreated biomass.
Applications Claiming Priority (8)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201261714493P | 2012-10-16 | 2012-10-16 | |
| US201261714496P | 2012-10-16 | 2012-10-16 | |
| US201261714485P | 2012-10-16 | 2012-10-16 | |
| US61/714,485 | 2012-10-16 | ||
| US61/714,493 | 2012-10-16 | ||
| US61/714,496 | 2012-10-16 | ||
| US201261714999P | 2012-10-17 | 2012-10-17 | |
| US61/714,999 | 2012-10-17 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014059541A1 true WO2014059541A1 (en) | 2014-04-24 |
Family
ID=50487381
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CA2013/050778 Ceased WO2014059541A1 (en) | 2012-10-16 | 2013-10-15 | Novel cell wall deconstruction enzymes of thermoascus aurantiacus, myceliophthora fergusii (corynascus thermophilus), and pseudocercosporella herpotrichoides, and uses thereof |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2014059541A1 (en) |
Cited By (28)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140304859A1 (en) * | 2011-12-15 | 2014-10-09 | Novozymes Inc. | Polypeptides Having Endoglucanase Activity and Polynucleotides Encoding Same |
| CN104334572A (en) * | 2012-07-02 | 2015-02-04 | 诺维信公司 | Polypeptides having xylanase activity and polynucleotides encoding same |
| EP2780450A4 (en) * | 2011-11-15 | 2015-09-09 | Novozymes Inc | Polypeptides having cellobiohydrolase activity and polynucleotides encoding same |
| EP2794872A4 (en) * | 2011-12-19 | 2015-12-09 | Novozymes Inc | Polypeptides having beta-glucosidase activity and polynucleotides encoding same |
| EP2867248A4 (en) * | 2012-07-02 | 2015-12-30 | Novozymes As | POLYPEPTIDES HAVING XYLANASE ACTIVITY AND POLYNUCLEOTIDES ENCODING SAME |
| EP2867247A4 (en) * | 2012-06-29 | 2016-03-09 | Novozymes As | POLYPEPTIDES HAVING ACTIVITY PROMOTING CELLULOLYSIS AND POLYNUCLEOTIDES ENCODING SAME |
| WO2016106432A3 (en) * | 2014-12-22 | 2016-10-06 | Novozymes A/S | Endoglucanase variants and polynucleotides encoding same |
| US9957492B2 (en) | 2012-06-29 | 2018-05-01 | Novozymes A/S | Polypeptides having cellulolytic enhancing activity and polynucleotides encoding same |
| EP3219797A4 (en) * | 2014-11-12 | 2018-05-30 | Riken | Cellulase activator and method for saccharifying lignocellulosic biomass by using same |
| WO2018185181A1 (en) * | 2017-04-04 | 2018-10-11 | Novozymes A/S | Glycosyl hydrolases |
| US10435731B2 (en) | 2013-07-10 | 2019-10-08 | Glykos Finland Oy | Multiple proteases deficient filamentous fungal cells and methods of use thereof |
| WO2019234295A1 (en) | 2018-06-05 | 2019-12-12 | Teknologian Tutkimuskeskus Vtt Oy | Beta glucosidase with high glucose tolerance, high thermal stability and broad ph activity spectrum |
| WO2019234294A1 (en) | 2018-06-05 | 2019-12-12 | Teknologian Tutkimuskeskus Vtt Oy | Beta glucosidase with high glucose tolerance, high thermal stability and broad ph activity spectrum |
| US10513724B2 (en) | 2014-07-21 | 2019-12-24 | Glykos Finland Oy | Production of glycoproteins with mammalian-like N-glycans in filamentous fungi |
| WO2020002575A1 (en) * | 2018-06-28 | 2020-01-02 | Novozymes A/S | Polypeptides having pectin lyase activity and polynucleotides encoding same |
| EP3578647A4 (en) * | 2016-10-28 | 2020-04-22 | Feed Research Institute Chinese Academy of Agricultural Sciences | THERMOPHILIC POLYGALACTURONASE ACID TEPG28A, AND CODING GENE AND APPLICATION THEREOF |
| WO2020206058A1 (en) | 2019-04-02 | 2020-10-08 | Novozymes A/S | Process for producing a fermentation product |
| EP3634145A4 (en) * | 2017-06-09 | 2021-03-10 | Novozymes A/S | POLYPEPTIDE, USE AND PROCESS FOR HYDROLYSIS OF PROTEIN |
| WO2021055395A1 (en) | 2019-09-16 | 2021-03-25 | Novozymes A/S | Polypeptides having beta-glucanase activity and polynucleotides encoding same |
| WO2021207687A1 (en) * | 2020-04-10 | 2021-10-14 | Liberty Biosecurity Llc | Polypeptide compositions and uses thereof |
| WO2022225915A1 (en) * | 2021-04-19 | 2022-10-27 | The Regents Of The University Of California | Inhibitory rna for the control of phytopathogens |
| WO2023137417A3 (en) * | 2022-01-17 | 2023-08-24 | University Of Washington | De novo designed luciferase |
| WO2023203080A1 (en) | 2022-04-20 | 2023-10-26 | Novozymes A/S | Process for producing free fatty acids |
| WO2023225459A2 (en) | 2022-05-14 | 2023-11-23 | Novozymes A/S | Compositions and methods for preventing, treating, supressing and/or eliminating phytopathogenic infestations and infections |
| CN117551195A (en) * | 2024-01-12 | 2024-02-13 | 杭州畅溪制药有限公司 | VHH nanobody targeting TSLP and application thereof |
| WO2024137250A1 (en) * | 2022-12-19 | 2024-06-27 | Novozymes A/S | Carbohydrate esterase family 3 (ce3) polypeptides having acetyl xylan esterase activity and polynucleotides encoding same |
| EP4291634A4 (en) * | 2021-02-10 | 2025-01-01 | Novozymes A/S | Polypeptides having pectinase activity, polynucleotides encoding same, and uses thereof |
| WO2025036987A1 (en) * | 2023-08-15 | 2025-02-20 | Novozymes A/S | Polypeptides having alkaline phosphatase activity for animal feed |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2004067709A2 (en) * | 2003-01-17 | 2004-08-12 | Elitra Pharmaceuticals, Inc. | Identification of essential genes of aspergillus fumigatus and methods of use |
- 2013-10-15: WO application PCT/CA2013/050778 (publication WO2014059541A1), status: Ceased
Non-Patent Citations (6)
| Title |
|---|
| DATABASE NCBI 26 March 2008 (2008-03-26), NIERMAN, W.C.: "Neosartorya fischeri NRRL 181 5'/3'-nucleotidase SurE family protein (NFIA 044430) partial mRNA", accession no. M 001267520.1 * |
| DATABASE NCBI 3 March 2011 (2011-03-03), SAWANO, T. ET AL.: "5'/3'-nucleotidase SurE family protein [Aspergillus oryzae RIB40]" * |
| DATABASE NCBI GENBANK "Acid phosphatase precursor [Aspergillus kawachii IFO 4308]", accession no. AA87042.1 * |
| DATABASE NCBI REFSEQ "Aspergillus fumigatus Af293 acid phosphatase (AFUA 4G01070), partial mRNA", accession no. M 741257 * |
| FUTAGAMI, T. ET AL.: "Genome sequence of the white koji mold Aspergillus kawachii IFO 4308, used for brewing the Japanese distilled spirit shochu", EUKARYOTIC CELL, vol. 10, 2011, pages 1586-1587 * |
| NIERMAN, W.C. ET AL.: "Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus", NATURE, vol. 438, 2005, pages 1151-1156 * |
Cited By (42)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2780450A4 (en) * | 2011-11-15 | 2015-09-09 | Novozymes Inc | Polypeptides having cellobiohydrolase activity and polynucleotides encoding same |
| US20140304859A1 (en) * | 2011-12-15 | 2014-10-09 | Novozymes Inc. | Polypeptides Having Endoglucanase Activity and Polynucleotides Encoding Same |
| US9771568B2 (en) | 2011-12-19 | 2017-09-26 | Novozymes, Inc. | Polypeptides having beta-glucosidase activity and polynucleotides encoding same |
| EP2794872A4 (en) * | 2011-12-19 | 2015-12-09 | Novozymes Inc | Polypeptides having beta-glucosidase activity and polynucleotides encoding same |
| US9957492B2 (en) | 2012-06-29 | 2018-05-01 | Novozymes A/S | Polypeptides having cellulolytic enhancing activity and polynucleotides encoding same |
| EP2867247A4 (en) * | 2012-06-29 | 2016-03-09 | Novozymes As | Polypeptides having activity promoting cellulolysis and polynucleotides encoding same |
| EP2867248A4 (en) * | 2012-07-02 | 2015-12-30 | Novozymes As | Polypeptides having xylanase activity and polynucleotides encoding same |
| CN104334572A (en) * | 2012-07-02 | 2015-02-04 | 诺维信公司 | Polypeptides having xylanase activity and polynucleotides encoding same |
| US10435731B2 (en) | 2013-07-10 | 2019-10-08 | Glykos Finland Oy | Multiple proteases deficient filamentous fungal cells and methods of use thereof |
| US10988791B2 (en) | 2013-07-10 | 2021-04-27 | Glykos Finland Oy | Multiple proteases deficient filamentous fungal cells and methods of use thereof |
| US10724063B2 (en) | 2013-07-10 | 2020-07-28 | Glykos Finland Oy | Multiple proteases deficient filamentous fungal cells and methods of use thereof |
| US10544440B2 (en) | 2013-07-10 | 2020-01-28 | Glykos Finland Oy | Multiple protease deficient filamentous fungal cells and methods of use thereof |
| US10513724B2 (en) | 2014-07-21 | 2019-12-24 | Glykos Finland Oy | Production of glycoproteins with mammalian-like N-glycans in filamentous fungi |
| EP3219797A4 (en) * | 2014-11-12 | 2018-05-30 | Riken | Cellulase activator and method for saccharifying lignocellulosic biomass by using same |
| WO2016106432A3 (en) * | 2014-12-22 | 2016-10-06 | Novozymes A/S | Endoglucanase variants and polynucleotides encoding same |
| EP3578647A4 (en) * | 2016-10-28 | 2020-04-22 | Feed Research Institute Chinese Academy of Agricultural Sciences | Thermophilic polygalacturonase acid TEPG28A, and coding gene and application thereof |
| US10920206B2 (en) * | 2016-10-28 | 2021-02-16 | Feed Research Institute, Chinese Academy Of Agricultural Sciences | Acidic thermophilic polygalacturonase TEPG28A, and encoding gene and application thereof |
| US20200199557A1 (en) * | 2016-10-28 | 2020-06-25 | Feed Research Institute, Chinese Academy Of Agricultural Sciences | Acidic thermophilic polygalacturonase tepg28a, and encoding gene and application thereof |
| WO2018185181A1 (en) * | 2017-04-04 | 2018-10-11 | Novozymes A/S | Glycosyl hydrolases |
| CN110651029A (en) * | 2017-04-04 | 2020-01-03 | 诺维信公司 | Glycosyl hydrolase |
| CN110651029B (en) * | 2017-04-04 | 2022-02-15 | 诺维信公司 | Glycosyl hydrolase |
| US11339355B2 (en) | 2017-04-04 | 2022-05-24 | Novozymes A/S | Glycosyl hydrolases |
| EP3634145A4 (en) * | 2017-06-09 | 2021-03-10 | Novozymes A/S | Polypeptide, use and process for hydrolysis of protein |
| US11946079B2 (en) | 2017-06-09 | 2024-04-02 | Novozymes A/S | Method for producing a protein hydrolysate using an endopeptidase and a carboxypeptidase |
| US11254919B2 (en) | 2017-06-09 | 2022-02-22 | Novozymes A/S | Polynucleotide encoding polypeptide having carboxypeptidase activity |
| US11371032B2 (en) | 2018-06-05 | 2022-06-28 | Teknologian Tutkimuskeskus Vtt Oy | Beta glucosidase with high glucose tolerance, high thermal stability and broad PH activity spectrum |
| WO2019234294A1 (en) | 2018-06-05 | 2019-12-12 | Teknologian Tutkimuskeskus Vtt Oy | Beta glucosidase with high glucose tolerance, high thermal stability and broad ph activity spectrum |
| WO2019234295A1 (en) | 2018-06-05 | 2019-12-12 | Teknologian Tutkimuskeskus Vtt Oy | Beta glucosidase with high glucose tolerance, high thermal stability and broad ph activity spectrum |
| WO2020002575A1 (en) * | 2018-06-28 | 2020-01-02 | Novozymes A/S | Polypeptides having pectin lyase activity and polynucleotides encoding same |
| WO2020206058A1 (en) | 2019-04-02 | 2020-10-08 | Novozymes A/S | Process for producing a fermentation product |
| WO2021055395A1 (en) | 2019-09-16 | 2021-03-25 | Novozymes A/S | Polypeptides having beta-glucanase activity and polynucleotides encoding same |
| US12275967B2 (en) | 2019-09-16 | 2025-04-15 | Novozymes A/S | Processes for producing fermentation products and compositions used therein |
| WO2021207687A1 (en) * | 2020-04-10 | 2021-10-14 | Liberty Biosecurity Llc | Polypeptide compositions and uses thereof |
| EP4291634A4 (en) * | 2021-02-10 | 2025-01-01 | Novozymes A/S | Polypeptides having pectinase activity, polynucleotides encoding same, and uses thereof |
| WO2022225915A1 (en) * | 2021-04-19 | 2022-10-27 | The Regents Of The University Of California | Inhibitory rna for the control of phytopathogens |
| WO2023137417A3 (en) * | 2022-01-17 | 2023-08-24 | University Of Washington | De novo designed luciferase |
| WO2023203080A1 (en) | 2022-04-20 | 2023-10-26 | Novozymes A/S | Process for producing free fatty acids |
| WO2023225459A2 (en) | 2022-05-14 | 2023-11-23 | Novozymes A/S | Compositions and methods for preventing, treating, supressing and/or eliminating phytopathogenic infestations and infections |
| WO2024137250A1 (en) * | 2022-12-19 | 2024-06-27 | Novozymes A/S | Carbohydrate esterase family 3 (ce3) polypeptides having acetyl xylan esterase activity and polynucleotides encoding same |
| WO2025036987A1 (en) * | 2023-08-15 | 2025-02-20 | Novozymes A/S | Polypeptides having alkaline phosphatase activity for animal feed |
| CN117551195A (en) * | 2024-01-12 | 2024-02-13 | 杭州畅溪制药有限公司 | VHH nanobody targeting TSLP and application thereof |
| CN117551195B (en) * | 2024-01-12 | 2024-04-09 | 杭州畅溪制药有限公司 | VHH nanobody targeting TSLP and application thereof |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US20150175980A1 (en) | Novel cell wall deconstruction enzymes of scytalidium thermophilum, myriococcum thermophilum, and aureobasidium pullulans, and uses thereof | |
| WO2014059541A1 (en) | Novel cell wall deconstruction enzymes of thermoascus aurantiacus, myceliophthora fergusii (corynascus thermophilus), and pseudocercosporella herpotrichoides, and uses thereof | |
| WO2014138983A1 (en) | Novel cell wall deconstruction enzymes of malbranchea cinnamomea, thielavia australiensis, and paecilomyces byssochlamydoides, and uses thereof | |
| WO2015109405A1 (en) | Novel cell wall deconstruction enzymes of chaetomium thermophilum, thermomyces stellatus, and corynascus sepedonium, and uses thereof | |
| WO2012130950A1 (en) | Novel cell wall deconstruction enzymes of talaromyces thermophilus and uses thereof | |
| DK2519630T3 (en) | METHOD OF TREATING CELLULOS MATERIAL AND CBHII / CEL6A ENZYMES THAT CAN BE USED THEREOF | |
| WO2012130964A1 (en) | Novel cell wall deconstruction enzymes of thermomyces lanuginosus and uses thereof | |
| WO2016090474A1 (en) | Novel cell wall deconstruction enzymes of chaetomium olivicolor, acremonium thermophilum, and myceliophthora hinnulea, and uses thereof | |
| WO2014110675A1 (en) | Novel cell wall deconstruction enzymes of amorphotheca resinae, rhizomucor pusillus, and calcarisporiella thermophila, and uses thereof | |
| WO2016090472A1 (en) | Novel cell wall deconstruction enzymes of remersonia thermophila (stilbella thermophila), melanocarpus albomyces, and lentinula edodes, and uses thereof | |
| WO2012092676A1 (en) | Novel cell wall deconstruction enzymes and uses thereof | |
| WO2016090473A1 (en) | Novel cell wall deconstruction enzymes of rhizomucor miehei, thermoascus thermophilus (dactylomyces thermophilus), and humicola hyalo thermophila, and uses thereof | |
| WO2012093149A2 (en) | Novel cell wall deconstruction enzymes and uses thereof | |
| WO2014140165A1 (en) | Cell wall deconstruction enzymes of paecilomyces byssochlamydoides and uses thereof | |
| WO2013182669A2 (en) | Novel cell wall deconstruction enzymes of myriococcum thermophilum and uses thereof | |
| WO2014060379A1 (en) | Cell wall deconstruction enzymes of myceliophthora fergusii (corynascus thermophilus) and uses thereof | |
| WO2017102540A1 (en) | Method for producing reducing sugar from lignocellulosic substrates | |
| CN108884481A (en) | β-glucosyl enzym and application thereof | |
| CN108368530A (en) | β-glucosyl enzym and application thereof | |
| EP4320258A1 (en) | Enzyme composition | |
| EP4320257A1 (en) | Enzyme composition | |
| CN108368529A (en) | β-glucosyl enzym and application thereof | |
| WO2014140167A1 (en) | Cell wall deconstruction enzymes of malbranchea cinnamomea and uses thereof | |
| WO2013182670A2 (en) | Novel cell wall deconstruction enzymes of scytalidium thermophilum and uses thereof | |
| WO2014060380A1 (en) | Cell wall deconstruction enzymes of thermoascus aurantiacus and uses thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13847093; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 13847093; Country of ref document: EP; Kind code of ref document: A1 |