
WO2020185865A1 - Compositions and methods for modulating trichomes, root hairs and secondary metabolites in Cannabaceae


Publication number
WO2020185865A1
Authority
WO
WIPO (PCT)
Prior art keywords
plant
polynucleotide
protein
sequences
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2020/022053
Other languages
French (fr)
Inventor
Jorn Gorlach
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AVANTGARDE LLC
Original Assignee
AVANTGARDE LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AVANTGARDE LLC
Publication of WO2020185865A1

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/63 - Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 - Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 - Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 - Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 - Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8262 - Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield involving plant development
    • A - HUMAN NECESSITIES
    • A01 - AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H - NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H6/00 - Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
    • A01H6/28 - Cannabaceae, e.g. cannabis
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K14/00 - Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 - Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants

Definitions

  • Trichomes are various outgrowths of the epidermis in plants including branched and unbranched hairs, vesicles, hooks, spines and stinging hairs. Trichomes are stalked, multicellular protruding structures that are considered important in the protection of plants against abiotic and biotic stresses. Trichomes have been categorized into types I to VII, with types I, IV, VI and VII as glandular trichome types and II, III, and V as non-glandular. Glandular trichomes, also referred to as secretory or peltate trichomes, are lipophilic glands composed of a group of secretory cells and a cuticle-enclosed cavity that fills with secondary metabolites.
  • Hop (Humulus lupulus) is a perennial, dioecious plant that belongs to the Cannabaceae family. "Hops" is the common term for the female inflorescences of hop plants, well known for their use in beer flavoring. These inflorescences develop into cones upon maturation. The lower parts of the inner surface of the bracts of mature female hop cones are covered with glandular trichomes, termed lupulin glands, which contain a number of terpenoid-related compounds (mono- and sesquiterpenes), bitter acids (prenylated polyketides), and prenylflavonoids, which are mainly used as flavoring in the beer brewing process. In this respect, methods for genetically modifying hop for production of trichomes on non-flowering parts such as leaves have been suggested (WO 2018/191398).
  • Cannabis (Cannabis indica) and hemp (Cannabis sativa) trichomes are the source of secondary metabolites including cannabinoids such as tetrahydrocannabinol (THC) and cannabidiol (CBD), as well as terpenes such as myrcene, pinene, caryophyllene, limonene, humulene and linalool.
  • Three types of trichomes are produced by cannabis: bulbous trichomes, capitate sessile trichomes and capitate-stalked trichomes, the latter of which are the most abundant and largest (50-100 µm).
  • This invention is a recombinant vector including : (i) a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof; or (ii) a polynucleotide capable of reducing the expression of a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof.
  • An isolated host cell, transgenic plant (e.g., of the genus Humulus or Cannabis) and transformed seed harboring said recombinant vector are also provided, as are methods for modulating trichomes, root hairs or levels of one or more secondary metabolites in a plant.
  • FIG. 1 provides an amino acid sequence comparison of trichome modulating polypeptides from Humulus sp. (HuTM1, SEQ ID NO: 1) and Cannabis sp. (CaTM1, SEQ ID NO: 2; CaTM2, SEQ ID NO: 3) and TCL1 polypeptide from Arabidopsis thaliana (AtCL1, SEQ ID NO: 4). Identical residues are indicated with "*" and similar amino acid residues are indicated with ":". The single R3 MYB domain is indicated with a thick line underneath.
  • The amino acid signature [D/E]Lx2[R/K]x3Lx6Lx3R that is required for interacting with R/B-like bHLH transcription factors is indicated by arrowheads above the amino acids.
  • The amino acids within the MYB domain that are involved in cell-to-cell movement of CPC are indicated by arrows above the amino acids.
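The bHLH-interaction signature described for FIG. 1 can be written as a regular expression. The following is a minimal sketch, assuming that "x" in the signature stands for any amino acid and that the number after each "x" is a repeat count. The function name and the toy sequence are hypothetical and are not taken from SEQ ID NO: 1-4.

```python
import re

# Signature [D/E]Lx2[R/K]x3Lx6Lx3R, assuming "x" = any residue and
# the number after "x" = how many residues are spanned.
BHLH_SIGNATURE = re.compile(r"[DE]L.{2}[RK].{3}L.{6}L.{3}R")


def find_bhlh_signature(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in BHLH_SIGNATURE.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence built only so that it contains the signature once.
    toy = "MSS" + "DLAARAAALAAAAAALAAAR" + "GK"
    print(find_bhlh_signature(toy))
```

A scan like this only flags candidate positions; whether a matched region actually mediates the interaction would still have to be tested experimentally.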
  • FIG. 2 provides the cDNA and deduced amino acid sequences of Humulus HuTM1.
  • FIG. 3 provides the nucleic acid sequence of the gene encoding Humulus HuTM1 (SEQ ID NO: 27).
  • Trichomes in members of the Cannabaceae family are sources of many of the beneficial compounds associated with the plants in this family. Accordingly, to enhance or increase the amounts of one or more beneficial compounds produced by plants in this family, the present invention provides compositions and methods for increasing the number, density, clustering, size and/or branching of trichomes in plants from the Cannabaceae family. More specifically, this invention relates to compositions and methods for modulating the expression or activity of one or more proteins involved in regulating the number, density, size, clustering or branching of trichomes in plants from the Cannabaceae family.
  • the Cannabaceae family includes members of the genera Cannabis, Humulus, Celtis, Pteroceltis, Aphananthe, Chaetachme, Gironniera, Lozanella, Trema and Parasponia.
  • the Cannabaceae family member is a species from the genus Cannabis or Humulus.
  • the Cannabaceae family member is Humulus lupulus (i.e., hop), Humulus japonicus (i.e., wild hop), Humulus yunnanensis (Chinese hop), or Cannabis sp.
  • the species C. sativa, C. ruderalis, C. afghanica and C. indica are referred to herein collectively as Cannabis sp.
  • trichome encompasses different types of trichomes, both glandular trichomes and/or non-glandular trichomes .
  • Trichome cells refer to the cells making up the trichome structure, such as the gland, or secretory cells, base cells and stalk, or stipe cells, extra-cellular cavity and cuticle cells. Trichomes can also be composed of one single cell.
  • modulating includes either a decrease or increase in expression or activity.
  • compositions of the invention include Cannabaceae trichome modulating polypeptides and polynucleotides, as well as expression cassettes, host cells, and transgenic plants containing the same .
  • a "trichome modulating polypeptide," "TM polypeptide" or "TM protein" is a polypeptide or protein that is involved in modulating trichome initiation and/or development.
  • Trichome modulating polypeptides of particular interest in the context of this invention are single repeat myb-like factors, also referred to as MYB3K or 3K MYB, from members of the family Cannabaceae.
  • the invention encompasses isolated or substantially purified polynucleotide or protein compositions .
  • An "isolated” or “purified” polynucleotide or protein, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment.
  • an isolated or purified polynucleotide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • an "isolated" polynucleotide is free of sequences (optimally protein encoding sequences) that naturally flank the polynucleotide (i.e., sequences located at the 5' and 3' ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived.
  • the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequence that naturally flank the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived.
  • a protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, 5% or 1% (by dry weight) of contaminating protein.
  • optimally culture medium represents less than about 30%, 20%, 10%, 5% or 1% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.
  • Fragments and variants of the disclosed polynucleotides and proteins encoded thereby are also encompassed by the present invention.
  • By "fragment" is intended a portion of the polynucleotide or a portion of the amino acid sequence and hence of the protein encoded thereby.
  • Fragments of a polynucleotide may encode protein fragments that retain the biological activity of the native protein.
  • fragments of a polynucleotide, which are useful as hybridization probes generally do not encode protein fragments retaining biological activity.
  • fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides and up to a full-length polynucleotide encoding a protein of the invention.
  • a fragment of a TM polynucleotide that encodes a biologically active portion of a TM protein of the invention will encode at least 15, 25, 30, 50, 100, 150, 200, 250, 300, 350, or 400 contiguous amino acids, or up to the total number of amino acids present in a full-length TM protein of the invention. Fragments of a TM polynucleotide that are useful as hybridization probes or PCR primers generally need not encode a biologically active portion of a TM protein. Thus, a fragment of a TM polynucleotide may encode a biologically active portion of a TM protein, or it may be a fragment that can be used as a hybridization probe or PCR primer using methods disclosed below .
  • a biologically active portion of a TM protein can be prepared by isolating a portion of one of the TM polynucleotides of the invention, expressing the encoded portion of the TM protein (e.g., by recombinant expression in vitro), and assessing the activity of the encoded portion of the TM protein.
  • Polynucleotides that are fragments of a TM nucleotide sequence include at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1100, or 1200 nucleotides, or up to the number of nucleotides present in a full-length TM polynucleotide disclosed herein .
  • a variant includes a deletion and/or addition of one or more nucleotides at one or more sites within the native polynucleotide and/or a substitution of one or more nucleotides at one or more sites in the native polynucleotide .
  • a "native" polynucleotide or polypeptide includes a naturally occurring nucleotide sequence or amino acid sequence, respectively.
  • conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the TM polypeptides of the invention.
  • Naturally occurring allelic variants such as these can be identified with the use of well-known molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques as outlined below.
  • Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still encode a TM protein of the invention.
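The idea of a conservative variant, a polynucleotide that differs from the native sequence but encodes the same polypeptide because of codon degeneracy, can be illustrated with a short check. This is a sketch under stated assumptions: the codon table is deliberately partial (only the codons used in the toy sequences), and the sequences and function names are hypothetical rather than taken from the disclosed TM sequences.

```python
# Partial standard-code codon table: only the codons used in the toy example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TAA": "*",
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_conservative_variant(native_cds: str, variant_cds: str) -> bool:
    """True if the sequences differ as DNA but encode the same polypeptide."""
    differs = native_cds.upper() != variant_cds.upper()
    return differs and translate(native_cds) == translate(variant_cds)


if __name__ == "__main__":
    native = "ATGGCTAAATAA"   # hypothetical: Met-Ala-Lys-stop
    variant = "ATGGCGAAGTAA"  # synonymous changes in codons 2 and 3
    print(translate(native), is_conservative_variant(native, variant))
```

A real analysis would use a complete codon table from an established library instead of the partial table shown here.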
  • variants of a particular polynucleotide of the invention will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters described elsewhere herein.
  • Variants of a particular polynucleotide of the invention can also be evaluated by comparison of the percent sequence identity between the polypeptide encoded by a variant polynucleotide and the polypeptide encoded by the reference polynucleotide. Percent sequence identity between any two polypeptides can be calculated using sequence alignment programs and parameters described elsewhere herein.
  • the percent sequence identity between the two encoded polypeptides is at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity.
  • "Variant" protein is intended to mean a protein derived from the native protein by deletion or addition of one or more amino acids at one or more sites in the native protein and/or substitution of one or more amino acids at one or more sites in the native protein.
  • Variant proteins encompassed by the present invention are biologically active, that is they continue to possess the desired biological activity of the native protein, that is, TM activity as described herein. Such variants may result from, for example, genetic polymorphism or from human manipulation .
  • Biologically active variants of a native TM protein of the invention will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence for the native protein as determined by sequence alignment programs and parameters described elsewhere herein .
  • a biologically active variant of a protein of the invention may differ from that protein by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2 or even 1 amino acid residue .
  • the upper limit of variation for an amino acid sequence of the invention which retains biological activity can be determined empirically, i.e., by testing variants in an assay for TM activity as described elsewhere herein.
  • a biologically active variant of a protein of the invention may differ from that protein by as much as 100 or 200 amino acids.
  • the proteins of the invention may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants and fragments of the TM proteins can be prepared by mutations in the DNA. Methods for mutagenesis and polynucleotide alterations are well known in the art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel, et al. (1987) Methods in Enzymol. 154:367-382; US 4,873,192; Walker & Gaastra, eds.
  • the genes and polynucleotides of the invention include both the naturally occurring sequences as well as mutant forms .
  • the proteins of the invention encompass both naturally occurring proteins as well as variations and modified forms thereof. Such variants will continue to possess the desired TM activity.
  • the mutations that will be made in the DNA encoding the variant must not place the sequence out of reading frame and optimally will not create complementary regions that could produce secondary mRNA structure . See, EP 0075444.
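The reading-frame requirement above can be checked mechanically. The sketch below assumes a hypothetical convention in which each planned insertion is recorded as a positive length and each deletion as a negative length, in nucleotides. It covers only the frame requirement, not the separate point about mRNA secondary structure.

```python
def keeps_reading_frame(indel_lengths: list[int]) -> bool:
    """True if every insertion (+n) or deletion (-n) spans a whole number of codons.

    Checking each change individually is stricter than checking the net length:
    a +1 insertion followed by a -1 deletion restores the frame downstream but
    still shifts every codon in between.
    """
    return all(n % 3 == 0 for n in indel_lengths)


if __name__ == "__main__":
    print(keeps_reading_frame([3, -6]))  # in-frame insertion and deletion
    print(keeps_reading_frame([1, -1]))  # frame is shifted between the two sites
```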
  • deletions, insertions, and substitutions of the protein sequences encompassed herein are not expected to produce radical changes in the characteristics of the protein. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays. That is, the activity can be evaluated by assaying for TM activity. TM activity can be assayed in a variety of ways, for example, by measuring modulation of hormone responses or of trichome density, clustering, size, length, branching, etc.
  • Variant polynucleotides and proteins also encompass sequences and proteins derived from a mutagenic and recombinogenic procedure such as DNA shuffling. With such a procedure, one or more different TM sequences can be manipulated to create a new TM polypeptide possessing the desired properties. In this manner, libraries of recombinant polynucleotides are generated from a population of related sequence polynucleotides comprising sequence regions that have substantial sequence identity and can be homologously recombined in vitro or in vivo.
  • sequence motifs encoding a domain of interest may be shuffled between the TM gene of the invention and other known TM genes to obtain a new gene coding for a protein with an improved property of interest.
  • Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri, et al. (1997) Nature Biotech. 15:436-438; Moore, et al. (1997) J. Mol. Biol. 272:336-347; Zhang, et al. (1997) Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri, et al. (1998) Nature 391:288-291; US 5,605,793 and US 5,837,458.
  • the polynucleotides of the invention can be used to isolate corresponding sequences from other organisms, particularly other plants. In this manner, methods such as PCR, hybridization, and the like can be used to identify such sequences based on their sequence homology to the sequences set forth herein. Sequences isolated based on their sequence identity to the entire TM sequences or the TM promoter sequences set forth herein or to variants and fragments thereof are encompassed by the present invention. Such sequences include sequences that are orthologs of the disclosed sequences . "Orthologs" is intended to mean genes derived from a common ancestral gene and which are found in different species as a result of speciation. Genes found in different species are considered orthologs when their nucleotide sequences and/or their encoded protein sequences share at least 60%, 70%,
  • Whereas TM polynucleotides and polypeptides from Humulus and Cannabis are provided herein, orthologs of said TM polynucleotides and polypeptides can be identified in Celtis, Pteroceltis, Aphananthe, Chaetachme, Gironniera, Lozanella, Trema and Parasponia using one or more of the approaches described below.
  • oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest.
  • Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook, et al. (1989) Molecular Cloning: A
  • PCR Protocols A Guide to Methods and Applications (Academic Press, New York); Innis & Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis & Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York) .
  • Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
  • In hybridization techniques, all or part of a known polynucleotide is used as a probe that selectively hybridizes to other corresponding polynucleotides present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen organism.
  • the hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as 32P, or any other detectable marker.
  • probes for hybridization can be made by labeling synthetic oligonucleotides based on the TM polynucleotides of the invention.
  • an entire TM polynucleotide sequence disclosed herein, or one or more portions thereof, may be used as a probe capable of specifically hybridizing to corresponding TM polynucleotides and messenger RNAs.
  • probes include sequences that are unique among TM polynucleotide sequences and are optimally at least about 10 nucleotides in length, and most optimally at least about 20 nucleotides in length.
  • Such probes may be used to amplify corresponding TM polynucleotides from a chosen plant by PCR.
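Candidate probes of the kind described above can be screened for uniqueness with a simple substring scan. This is a minimal sketch, assuming that "unique" means the probe sequence does not occur in any of a set of comparison sequences. The function name and the toy sequences are hypothetical, and the scan ignores reverse complements and mismatch-tolerant hybridization.

```python
def unique_probes(target: str, others: list[str], length: int = 20) -> list[str]:
    """Return subsequences of `target` of the given length absent from every sequence in `others`."""
    target = target.upper()
    others = [seq.upper() for seq in others]
    seen = set()
    probes = []
    for i in range(len(target) - length + 1):
        kmer = target[i:i + length]
        if kmer in seen:
            continue  # report each candidate once, in order of first appearance
        seen.add(kmer)
        if all(kmer not in other for other in others):
            probes.append(kmer)
    return probes


if __name__ == "__main__":
    # Toy sequences; a short length is used only so the example fits on one line.
    print(unique_probes("ACGTACGTTT", ["ACGTACGA"], length=4))
```

The default length of 20 follows the "most optimally at least about 20 nucleotides" guidance in the text.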
  • Hybridization techniques include hybridization screening of plated DNA libraries (either plaques or colonies; see, for example, Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, NY)).
  • Hybridization of such sequences may be carried out under stringent conditions.
  • By "stringent conditions" or "stringent hybridization conditions" is intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optimally less than 500 nucleotides in length.
  • Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 M to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
  • Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • Exemplary low stringency conditions include hybridization with a buffer solution of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50°C to 55°C.
  • Exemplary moderate stringency conditions include hybridization in 40% to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55°C to 60°C.
  • Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60°C to 65°C.
  • wash buffers may comprise about 0.1% to about 1% SDS .
  • Duration of hybridization is generally less than about 24 hours, usually about 4 hours to about 12 hours . The duration of the wash time will be at least a length of time sufficient to reach equilibrium.
  • The $T_m$ can be approximated as $T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,(\log M) + 0.41\,(\%\,\mathrm{GC}) - 0.61\,(\%\,\mathrm{form}) - 500/L$, where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
  • the $T_m$ is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe.
  • $T_m$ is reduced by about 1°C for each 1% of mismatching; thus, $T_m$, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with 90% identity are sought, the $T_m$ can be decreased 10°C.
  • stringent conditions are selected to be about 5°C lower than the thermal melting point ($T_m$) for the specific sequence and its complement at a defined ionic strength and pH.
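The melting-temperature formula and the two adjustments described above (about 1°C per 1% mismatch, and about 5°C below the melting temperature for stringent conditions) can be worked through numerically. This is a sketch that assumes "log" means the base-10 logarithm. The function names and the input values in the usage example are illustrative only.

```python
import math


def melting_temperature(molarity: float, percent_gc: float,
                        percent_formamide: float, length_bp: int) -> float:
    """Tm (deg C) = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L."""
    return (81.5
            + 16.6 * math.log10(molarity)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def adjusted_tm(tm: float, percent_identity: float) -> float:
    """Lower Tm by about 1 deg C for each 1% of mismatching."""
    return tm - (100.0 - percent_identity)


def stringent_temperature(tm: float) -> float:
    """Stringent conditions: about 5 deg C below Tm."""
    return tm - 5.0


if __name__ == "__main__":
    # Illustrative inputs: 1 M monovalent cation, 50% GC, 50% formamide, 500 bp hybrid.
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(round(tm, 1), round(adjusted_tm(tm, 90.0), 1), round(stringent_temperature(tm), 1))
```

With a 1 M cation concentration the logarithmic term vanishes, which makes the remaining terms easy to verify by hand.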
  • "sequence identity" and "percentage of sequence identity."
  • reference sequence is a defined sequence used as a basis for sequence comparison.
  • a reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
  • comparison window makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two polynucleotides .
  • the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100 nucleotides or longer.
  • a gap penalty is typically introduced and is subtracted from the number of matches.
  • Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, CA) ; the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the GCG Wisconsin Genetics Software Package,
  • the ALIGN program is based on the algorithm of Myers & Miller (1988) supra. A PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used with the ALIGN program when comparing amino acid sequences.
  • Gapped BLAST in BLAST 2.0 can be utilized as described in Altschul, et al. (1997) Nucleic Acids Res.
  • sequence identity/similarity values refer to the value obtained using GAP Version 10 using the following parameters: % identity and % similarity for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the BLOSUM62 scoring matrix; or any equivalent program thereof.
  • By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by GAP Version 10.
  • "sequence identity" or "identity" in the context of two polynucleotide or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window.
  • When percentage of sequence identity is used in reference to proteins, it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule.
  • When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution.
  • Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity".
  • Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1.
  • the scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, CA) .
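The partial-credit scoring of conservative substitutions described above can be sketched as follows. The residue groupings and the partial score of 0.5 are assumptions chosen for illustration. They are not the PC/GENE implementation and are not specified by the text. The two sequences are assumed to be pre-aligned, ungapped, and of equal length.

```python
# Hypothetical groups of chemically similar residues, for illustration only.
CONSERVATIVE_GROUPS = [set("DE"), set("KRH"), set("ILVM"), set("FYW"), set("ST"), set("NQ")]


def position_score(a: str, b: str, partial: float = 0.5) -> float:
    """1 for identical residues, `partial` for a conservative substitution, else 0."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return partial
    return 0.0


def percent_similarity(seq1: str, seq2: str, partial: float = 0.5) -> float:
    """Percent similarity over two pre-aligned, ungapped sequences of equal length."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(position_score(a, b, partial) for a, b in zip(seq1.upper(), seq2.upper()))
    return 100.0 * total / len(seq1)


if __name__ == "__main__":
    # D/E is conservative (0.5), K and L are identical (1 each), A/G scores 0.
    print(percent_similarity("DKLA", "EKLG"))
```

Setting `partial` to 0 reduces the calculation to plain percent identity, which makes the upward adjustment easy to see.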
  • percentage of sequence identity means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
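The percentage calculation defined above (matched positions divided by total positions in the comparison window, times 100) can be sketched for two sequences that have already been aligned. The gap character "-" and the function name are assumptions. Producing the optimal alignment itself is outside the scope of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Gap positions count toward the window length but never count as matches.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # 10 aligned positions: one mismatch (G/C) and one gap, so 8 matches.
    print(percent_identity("ACGTACGT-A", "ACGTACCTTA"))
```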
  • the invention further provides a plant having altered levels and/or activities of the TM polypeptides of the invention.
  • the plant of the invention has stably incorporated into its genome one or more of the TM polynucleotides described herein. In other embodiments, all or a portion of the TM polynucleotide in the plant has been deleted.
  • the term plant includes plant cells, plant protoplasts, plant cell tissue cultures from which a plant can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, fruit, cones, stalks, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced nucleic acid sequences.
  • a "subject plant" or "subject plant cell" is one in which genetic alteration, such as transformation, has been effected as to a gene of interest, or is a plant or plant cell which is descended from a plant or plant cell so altered and which comprises the alteration.
  • a "control” or “control plant” or “control plant cell” provides a reference point for measuring changes in the subject plant or plant cell.
  • a control plant or control plant cell may include, for example, (a) a wild-type plant or plant cell, i.e., of the same genotype as the starting material for the genetic alteration which resulted in the subject plant or subject plant cell; (b) a plant or plant cell of the same genotype as the starting material but which has been transformed with a null construct (i.e., with a construct which has no known effect on the trait of interest, such as a construct comprising a marker gene) ; (c) a plant or plant cell which is a non-transformed segregant among progeny of a subject plant or subject plant cell; (d) a plant or plant cell genetically identical to the subject plant or subject plant cell but which is not exposed to conditions or stimuli that would induce expression of the gene of interest; or (e) the subject plant or subject plant cell itself, under conditions in which the gene of interest is not expressed.
  • changes in TM mRNA or protein levels and/or changes in one or more traits such as trichome (and/or root hair) density, size, mass, clustering, length, number, or branching could be measured by comparing a subject plant or subject plant cell to a control plant or control plant cell .
  • polynucleotides of the present invention can be introduced and optionally expressed in a host cell such as bacteria, yeast, insect, mammalian, or preferably plant cells. It is expected that those of skill in the art are knowledgeable in the numerous systems available for the introduction of a polypeptide or a nucleotide sequence of the present invention into a host cell.
  • by "host cell" is meant a cell which harbors an exogenous nucleic acid sequence of the invention.
  • Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells.
  • the host cell is a monocotyledonous or dicotyledonous plant cell, more preferably a Cannabaceae plant cell, most preferably a Humulus lupulus or Cannabis plant cell.
  • polynucleotide is not intended to limit the present invention to polynucleotides comprising DNA.
  • polynucleotides can include ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides .
  • deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues .
  • the polynucleotides of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
  • the TM polynucleotide sequences of the invention can be provided in expression cassettes for expression in the organism of interest.
  • the cassette may include 5' and 3' regulatory sequences operably linked to a TM polynucleotide of the invention.
  • "Operably linked" is intended to mean a functional linkage between two or more elements .
  • an operable linkage between a polynucleotide of interest and a regulatory sequence (i.e., a promoter)
  • Operably linked elements may be contiguous or non-contiguous .
  • the cassette may additionally contain at least one additional gene to be cotransformed into the organism. Alternatively, any additional gene (s) can be provided on multiple expression cassettes .
  • Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the TM polynucleotide to be under the transcriptional regulation of the regulatory regions .
  • the expression cassette may additionally contain selectable marker genes .
  • the expression cassette may include in the 5 '-3 ' direction of transcription, a transcriptional and translational initiation region (i.e., a promoter), a TM polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the host cell (i.e. , the plant) .
  • the regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the TM polynucleotide of the invention may be native/analogous to the host cell or to each other. Alternatively, the regulatory regions and/or the TM polynucleotide of the invention may be heterologous to the host cell or to each other.
  • heterologous in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
  • a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide.
  • a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
  • a termination region may be native with the transcriptional initiation region, may be native with the operably linked TM polynucleotide of interest or with the TM promoter sequences, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the TM polynucleotide of interest, the plant host, or any combination thereof.
  • Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also, Guerineau, et al. (1991) Mol. Gen. Genet.
  • the polynucleotides may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell & Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, US 5,380,831; US 5,436,391; and Murray, et al. (1989) Nucleic Acids Res. 17:477-498.
  • Additional sequence modifications are known to enhance gene expression in a cellular host . These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression.
  • the G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures .
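As a rough illustration of the G-C adjustment described above, the following sketch compares the G-C content of a candidate sequence with the average over a set of genes known to be expressed in the host. The function names and the example sequences are hypothetical placeholders; no real host gene data are implied.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def gc_offset_from_host(candidate: str, host_genes: list[str]) -> float:
    """Difference (in percentage points) between the candidate's G-C content
    and the mean G-C content of the supplied host genes.

    A positive value means the candidate is more G-C rich than the host average.
    """
    if not host_genes:
        raise ValueError("at least one host gene is required")
    host_average = mean(gc_content(g) for g in host_genes)
    return gc_content(candidate) - host_average


if __name__ == "__main__":
    # Placeholder sequences only, chosen to keep the arithmetic easy to follow.
    print(gc_offset_from_host("ATGCGC", ["ATGC", "AATT"]))
```

A sequence designer could use the offset to decide whether synonymous codon changes should raise or lower G-C content; that decision logic is not shown here.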
  • the expression cassettes may additionally contain 5' leader sequences.
  • leader sequences can act to enhance translation.
  • Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein, et al. (1989) Proc. Natl. Acad. Sci. USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie, et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Johnson, et al.
  • the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame.
  • adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like.
  • in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
  • the expression cassette can also include a selectable marker gene for the selection of transformed cells .
  • Selectable marker genes are utilized for the selection of transformed cells or tissues.
  • Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D).
  • Additional selectable markers include phenotypic markers such as beta-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su, et al. (2004) Biotechnol Bioeng. 85:610-9 and Fetter, et al. (2004) Plant Cell
  • CYP cyan fluorescent protein
  • PhiYFP™ yellow fluorescent protein
  • the number, density, mass, clustering, size, length and/or branching of trichomes may be modulated over the entire plant or may be localized to one or more organs (e.g., stem, leaf, root, fruit or flower) using, e.g., tissue or organ-specific promoters.
  • trichome number, density, mass, clustering, size, length and/or branching may be modulated under certain conditions or at a certain time (e.g., at flowering) .
  • a number of promoters can be used in the practice of the invention, including the native promoter of the polynucleotide sequence of interest .
  • the promoters can be selected based on the desired outcome .
  • the nucleic acids can be combined with constitutive , inducible (e.g., stress- or chemical- induced) , tissue-preferred, or other promoters for expression in plants .
  • Constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and US 6,072,050; the core CaMV 35S promoter (Odell, et al. (1985) Nature 313:810-812); rice actin (McElroy, et al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen, et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen, et al. (1992) Plant Mol. Biol. 18:675-689); PEMU (Last, et al. (1991) Theor.
  • Tissue-preferred promoters can be utilized to target enhanced TM expression within a particular plant tissue .
  • Tissue-preferred promoters include those disclosed by Yamamoto, et al. (1997) Plant J. 12 (2) : 255-265; Kawamata, et al. (1997) Plant Cell Physiol. 38 (7) : 792-803; Hansen, et al. (1997) Mol . Gen. Genet. 254 (3) : 337-343; Russell, et al. (1997) Transgenic Res. 6 (2) : 157-168 ; Rinehart, et al. (1996) Plant Physiol. 112 (3) : 1331-1341; Van Camp, et al. (1996) Plant Physiol.
  • Leaf-preferred promoters are known in the art. See, for example, Yamamoto, et al. (1997) Plant J. 12 (2 ) : 255- 265; Kwon, et al. (1994) Plant Physiol. 105 : 357-67; Yamamoto, et al. (1994) Plant Cell Physiol. 35 (5) : 773-778; Gotor, et al. (1993) Plant J. 3:509-18; Orozco, et al. (1993) Plant Mol. Biol. 23(6) : 1129-1138; Baszczynski, et al. (1988) Nucl . Acid Res. 16:4732; Mitra, et al. (1994) Plant Mol. Biol.
  • Senescence regulated promoters are also of use, such as SAM22 (Crowell, et al. (1992) Plant Mol. Biol. 18:459-466).
  • Root-preferred or root-specific promoters are known and can be selected from the many available from the literature or isolated de novo from various compatible species . See, for example, Hire, et al. (1992) Plant Mol. Biol. 20(2) : 207-218 (soybean root-specific glutamine synthetase gene) ; Keller & Baumgartner (1991) Plant Cell 3(10) : 1051-1061 (root-specific control element in the GRP 1.8 gene of French bean) ; Sanger, et al. (1990) Plant Mol. Biol.
  • Plant Cell 2(7):633-641 where two root-specific promoters isolated from hemoglobin genes from the nitrogen-fixing nonlegume Parasponia andersonii and the related non-nitrogen-fixing nonlegume Trema tomentosa are described.
  • the promoters of these genes were linked to a beta-glucuronidase reporter gene and introduced into both the nonlegume Nicotiana tabacum and the legume Lotus corniculatus, and in both instances root-specific promoter activity was preserved.
  • Promoters of the highly expressed rolC and rolD root-inducing genes of Agrobacterium rhizogenes have also been described (see, Leach & Aoyagi (1991) Plant Science (Limerick) 79(1) : 69-76) . Additional root-preferred promoters include the VfENOD-GRP3 gene promoter (Kuster, et al. (1995) Plant Mol. Biol. 29(4): 759-
  • rolB promoter (Capana, et al. (1994) Plant Mol. Biol. 25(4):681-691)
  • CRWAQ81 root-preferred promoter with the ADH first intron. See also, US 5,837,876; US 5,750,386; US 5,633,363; US 5,459,252; US 5,401,836; US 5,110,732 and US 5,023,179.
  • Shoot-preferred promoters include shoot meristem-preferred promoters such as promoters disclosed in Weigel, et al. (1992) Cell 69:843-859; Accession Number AJ131822; Accession Number Z71981; Accession Number AF049870 and shoot-preferred promoters disclosed in McAvoy, et al. (2003) Acta Hort. (ISHS) 625:379-385.
  • a trichome-specific promoter may also be used.
  • trichome-specific promoters are known in the art and include, e.g., the promoter of the Arabidopsis thaliana OASA1 gene, which has activity in both glandular and non-glandular trichomes of tobacco (Gutierrez-Alcala, et al. (2005) J. Exp. Bot. 56:2487-94).
  • a trichome specific promoter from tobacco P450 gene, CYP71D16 which shows expression in tobacco glandular trichomes at all developmental stages has also been described (Wang, et al. (2002) J. Exp. Bot. 53:1891).
  • WO 2004/111183 describes trichome-specific promoters from tomato and tobacco leaves; WO 2009/082208 describes trichome-specific promoters from tomato; and US 9,856,486 describes the promoters from farnesyl diphosphate synthase (ShzFPS), zingiberene synthase (ShZIS) and germacrene synthase (ShTPS9) genes isolated from Solanum habrochaites as imparting trichome-specific expression. Additional glandular trichome-specific gene promoters have been reported in the literature for a variety of plants, including, but not restricted to, Antirrhinum majus (Jaffe, et al. (2007) J. Exp. Bot.
  • Dividing cell or meristematic tissue-preferred promoters have been disclosed in Ito, et al. (1994) Plant Mol. Biol. 24 : 863-878; Reyad, et al. (1995) Mol. Gen. Genet. 248:703-711; Shaul, et al. (1996) Proc. Natl. Acad. Sci. 93:4868-4872; Ito, et al. (1997) Plant J. 11:983-992.
  • Inflorescence-preferred promoters include the promoter of chalcone synthase (Van der Meer, et al. (1990) Plant Mol. Biol. 15:95-109) and LAT52 (Twell, et al. (1989) Mol. Gen. Genet. 217:240-245).
  • Stress inducible promoters include salt/water stress-inducible promoters such as P5CS (Zang, et al. (1997) Plant Sci. 129:81-89); cold-inducible promoters, such as cor15a (Hajela, et al. (1990) Plant Physiol. 93:1246-1252), cor15b (Wilhelm, et al. (1993) Plant Mol.
  • heat inducible promoters such as heat shock proteins (Barros, et al. (1992) Plant Mol. Biol. 19:665-75; Marrs, et al. (1993) Dev. Genet. 14:27-41), and smHSP (Waters, et al. (1996) J. Exper. Bot. 47:325-338).
  • Other stress-inducible promoters include rip2 (US 5,332,808 and US 2003/0217393) and rd29a (Yamaguchi-Shinozaki, et al. (1993) Mol. Gen. Genet. 236:331-334).
  • Expression can also be regulated via a chemically inducible promoter (see Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48 : 89-108) .
  • Chemically inducible promoters are particularly suitable when it is desired that gene expression should take place in a time-specific manner. Examples of such promoters are promoters inducible by tetracycline (Gatz, et al. (1992) Plant J. 2:397-404), salicylic acid (WO 95/19443) or Bion, a substance which can replace salicylic acid in some of its functions (Weigel, et al. (2001) Plant Mol. Biol. 46:143).
  • a TM polypeptide or polynucleotide encoding the same is introduced into a host cell, in particular a plant .
  • "Introducing” or “introduced” is intended to mean presenting to the plant the polynucleotide or polypeptide in such a manner that the sequence gains access to the interior of a cell .
  • This invention does not depend on a particular method for introducing a sequence into the host cell, only that the polynucleotide or polypeptide gains access to the interior of at least one cell of the host.
  • Methods for introducing polynucleotides or polypeptides into host cells are known in the art and include, but are not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
  • stable transformation is intended to mean that the nucleotide construct introduced into a host (i.e. , a plant) integrates into the genome of the plant and is capable of being inherited by the progeny thereof.
  • Transient transformation is intended to mean that a polynucleotide is introduced into the host (i.e., a plant) and expressed temporarily, or that a polypeptide is introduced into a host (i.e., a plant).
  • Transformation protocols as well as protocols for introducing polypeptides or polynucleotide sequences into plants may vary depending on the type of plant or plant cell targeted for transformation. Suitable methods of introducing polypeptides and polynucleotides into plant cells include microinjection (Crossway, et al. (1986)
  • Genetic transformation of H. lupulus is described in the art and includes particle bombardment (Gatica-Arias & Weber (2013) In Vitro Cell. Dev. Biol. 49:656-664; Batista, et al. (2008) Plant Cell Rep. 27:1185-96); Agrobacterium-mediated transformation (Horlemann, et al. (2003) Plant
  • the TM sequences of the invention can be provided to a plant using a variety of transient transformation methods .
  • transient transformation methods include, but are not limited to, the introduction of the TM protein or variants or fragments thereof directly into the plant, or the introduction of a TM transcript into the plant.
  • Such methods include, for example, microinjection or particle bombardment . See, for example, Crossway, et al. (1986) Mol. Gen. Genet. 202:179- 185; Nomura, et al. (1986) Plant Sci. 44:53-58; Hepler, et al. (1994) Proc. Natl. Acad. Sci. 91: 2176-2180 and Hush, et al.
  • the TM polynucleotide can be transiently transformed into the plant using techniques known in the art .
  • Such techniques include viral vector systems and the precipitation of the polynucleotide in a manner that precludes subsequent release of the DNA.
  • the transcription from the particle-bound DNA can occur, but the frequency with which it is released to become integrated into the genome is greatly reduced.
  • Such methods include the use of particles coated with polyethylenimine.
  • the polynucleotide of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids.
  • such methods involve incorporating an expression construct of the invention within a viral DNA or RNA molecule .
  • a TM sequence of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein.
  • promoters of the invention also encompass promoters used for transcription by viral RNA polymerases .
  • Methods for introducing polynucleotides into plants and expressing a protein encoded therein, involving viral DNA or RNA molecules are known in the art. See, for example, US 5,889,191, US 5,889,190, US 5,866,785, US 5,589,367, US 5,316,931 and Porta, et al. (1996) Mol. Biotech. 5:209-221.
  • Methods are known in the art for the targeted insertion of a polynucleotide at a specific location in the plant genome .
  • the insertion of the polynucleotide at a desired genomic location is achieved using a site-specific recombination system. See, for example, WO 99/25821, WO 99/25854, WO 99/25840, WO 99/25855, and WO 99/25853; also, US 6,552,248, US 6,624,297, US 6,573,425, US 6,455,315 and US 6,458,594.
  • the polynucleotide of the invention can be contained in a transfer cassette flanked by two nonidentical recombination sites.
  • the transfer cassette is introduced into a plant having stably incorporated into its genome a target site which is flanked by two non-identical recombination sites that correspond to the sites of the transfer cassette.
  • An appropriate recombinase is provided and the transfer cassette is integrated at the target site.
  • Any genome editing nucleases known in the art may be used, including but not limited to Zinc-finger nucleases (ZFNs) , transcription activator-like effector nucleases (TALENs) , and clustered regularly interspaced short palindromic repeat (CRISPR) /Cas-based RNA-guided DNA endonucleases .
  • the polynucleotide of interest is thereby integrated at a specific chromosomal position in the plant genome .
  • RNA-guided endonucleases are used for modifying the plant genome (e. g. , insertion or deletion of a polynucleotide of interest) .
  • RNA-guided endonuclease systems include a guide RNA, which interacts with an RNA-guided endonuclease to direct the endonuclease to a specific target site, wherein the 5' end of the guide RNA base pairs with a specific protospacer sequence.
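To make the guide RNA and protospacer relationship concrete, the sketch below scans a DNA string for candidate protospacers. It rests on assumptions that are not stated in this text: a 20-nucleotide protospacer followed immediately by an NGG motif, which is the convention commonly associated with Streptococcus pyogenes Cas9, and a search of the given strand only. The function name `find_protospacers` is hypothetical.

```python
import re


def find_protospacers(dna: str, length: int = 20) -> list[tuple[int, str]]:
    """Return (start_index, protospacer) pairs found on the given strand.

    Assumption (not from the source text): a valid site is `length` bases
    followed directly by an NGG motif. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    dna = dna.upper()
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % length
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, dna)]


if __name__ == "__main__":
    target = "ACGT" * 5 + "AGG" + "TT"
    for start, spacer in find_protospacers(target):
        print(start, spacer)
```

A fuller tool would also scan the reverse complement and score off-target matches; both are omitted to keep the illustration short.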
  • the RNA-guided endonuclease is derived from a CRISPR/CRISPR-associated (Cas) system.
  • the CRISPR/Cas system can be a type I, a type II, or a type III system.
  • Non-limiting examples of suitable CRISPR/Cas proteins include Cas3, Cas4, Cas5, Cas5e (or CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9, Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (or CasA), Cse2 (or CasB), Cse3 (or CasE), Cse4 (or CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, C
  • CRISPR/Cas proteins have at least one RNA recognition and/or RNA binding domain. RNA recognition and/or RNA binding domains interact with guide RNAs. CRISPR/Cas proteins can also include nuclease domains (i.e., DNase or RNase domains), DNA binding domains, helicase domains, RNase domains, protein-protein interaction domains, dimerization domains, as well as other domains.
  • the CRISPR/Cas-like protein can be a wild-type CRISPR/Cas protein, a modified CRISPR/Cas protein, or a fragment of a wild-type or modified CRISPR/Cas protein.
  • the CRISPR/Cas-like protein can be modified to increase nucleic acid binding affinity and/or specificity, alter an enzyme activity, and/or change another property of the protein.
  • the CRISPR/Cas-like protein can be truncated to remove domains that are not essential for the function of the fusion protein.
  • the CRISPR/Cas-like protein can also be truncated or modified to optimize the activity of the effector domain of the fusion protein.
  • a plant genome may also be modified by using the Cre-lox system (for example, as described in US 5,658,772).
  • a plant genome can be modified to include first and second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite orientation, the intervening sequence is inverted.
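The two Cre-lox outcomes described above can be modeled with a short string-based sketch. It is a simplified illustration: the lox sites themselves are not represented in the sequence strings, orientation is encoded as a hypothetical `+` or `-` flag, and inversion is modeled as the reverse complement of the intervening DNA.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cre_lox_outcome(upstream: str, intervening: str, downstream: str,
                    lox1: str, lox2: str) -> tuple[str, str]:
    """Return (product_sequence, event) after Cre acts on two lox sites.

    lox1 and lox2 are orientation flags, '+' or '-'.
    Same orientation      -> the intervening DNA is excised.
    Opposite orientation  -> the intervening DNA is inverted.
    """
    if lox1 not in {"+", "-"} or lox2 not in {"+", "-"}:
        raise ValueError("orientation must be '+' or '-'")
    if lox1 == lox2:
        return upstream + downstream, "excision"
    return upstream + reverse_complement(intervening) + downstream, "inversion"


if __name__ == "__main__":
    print(cre_lox_outcome("AAA", "ACG", "TTT", "+", "+"))
    print(cre_lox_outcome("AAA", "ACG", "TTT", "+", "-"))
```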
  • silencing approaches using antisense RNA, short hairpin RNA (shRNA) systems, complementary mature CRISPR RNA (crRNA) by CRISPR/Cas systems, or virus-induced gene silencing (VIGS) systems may be used to down-regulate or knock out expression of the target polynucleotide.
  • the generation of polynucleotides capable of reducing the expression of a TM polypeptide or TM polynucleotide described herein can be carried out using the polynucleotides described herein as templates.
  • the resulting transformed cells may be grown into plants in accordance with conventional methods. See, for example, McCormick, et al. (1986) Plant Cell Rep. 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting progeny having appropriate expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited, and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved . In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide of the invention, for example, an expression cassette of the invention, stably incorporated into its genome .
  • Pedigree breeding starts with the crossing of two genotypes, such as an elite line of interest and one other line having one or more desirable characteristics (e. g. , having stably incorporated a polynucleotide of the invention, having a modulated activity and/or level of the polypeptide of the invention, etc. ) which complements the elite line of interest. If the two original parents do not provide all the desired characteristics, other sources can be included in the breeding population.
  • superior plants are selfed and selected in successive filial generations . In the succeeding filial generations the heterozygous condition gives way to homogeneous lines as a result of self-pollination and selection.
  • the inbred line includes homozygous alleles at about 95% or more of its loci.
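The figure of about 95% homozygous loci can be related to the number of selfing generations with a simple expectation. The sketch below assumes an idealized Mendelian model that is not spelled out in the text: every locus at which the two parents differ is heterozygous in the F1, each generation of selfing halves the expected heterozygosity, and selection and linkage are ignored. Function names are hypothetical.

```python
def expected_homozygosity(selfing_generations: int) -> float:
    """Expected fraction of initially heterozygous loci that are homozygous
    after the given number of selfing generations (0 = the F1 itself)."""
    if selfing_generations < 0:
        raise ValueError("number of generations cannot be negative")
    return 1.0 - 0.5 ** selfing_generations


def generations_to_reach(target: float) -> int:
    """Smallest number of selfing generations whose expected homozygosity
    meets or exceeds `target` (a fraction in the range 0 <= target < 1)."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must satisfy 0 <= target < 1")
    n = 0
    while expected_homozygosity(n) < target:
        n += 1
    return n


if __name__ == "__main__":
    n = generations_to_reach(0.95)
    print(n, expected_homozygosity(n))
```

Under these assumptions four generations of selfing give 93.75% and five give 96.875%, so the 95% threshold is first crossed at the fifth selfing generation after the F1.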
  • Backcrossing can be used to transfer one or more specifically desirable traits from one line, the donor parent, to an inbred called the recurrent parent, which has overall good agronomic characteristics yet lacks that desirable trait or traits.
  • Backcrossing may be used in combination with pedigree breeding to modify an elite line of interest, and a hybrid is made using the modified elite line.
  • the same procedure can be used to move the progeny toward the genotype of the recurrent parent but at the same time retain many components of the non-recurrent parent, by stopping the backcrossing at an early stage and proceeding with selfing and selection.
  • an F1, such as a commercial hybrid, is created. This commercial hybrid may be backcrossed to one of its parent lines to create a BC1 or BC2.
  • Progeny are selfed and selected so that the newly developed inbred has many of the attributes of the recurrent parent and yet several of the desired attributes of the non-recurrent parent. This approach leverages the value and strengths of the recurrent parent for use in new hybrids and breeding.
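How quickly backcrossing moves progeny toward the recurrent parent can be estimated with the standard expectation that each backcross halves the remaining donor contribution. The sketch below assumes no selection and no linkage drag, which is an idealization added here for illustration; the function name is hypothetical.

```python
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected fraction of the genome derived from the recurrent parent.

    backcrosses = 0 is the F1 (one half from each parent),
    backcrosses = 1 is BC1, backcrosses = 2 is BC2, and so on.
    """
    if backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


if __name__ == "__main__":
    for label, n in (("F1", 0), ("BC1", 1), ("BC2", 2)):
        print(label, recurrent_parent_fraction(n))
```

Stopping at BC1 or BC2 therefore leaves an expected 25% or 12.5% of the non-recurrent parent, consistent with retaining many of its components before selfing and selection begin.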
  • an intermediate host cell will be used in the practice of this invention to increase the copy number of the cloning vector.
  • the vector containing the nucleic acid of interest can be isolated in significant quantities for introduction into the desired plant cells.
  • plant promoters that do not cause expression of the polypeptide in bacteria are employed.
  • prokaryotes most frequently are represented by various strains of E. coli; however, other microbial strains may also be used.
  • Commonly used prokaryotic control sequences which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding sequences, include such commonly used promoters as the beta lactamase (penicillinase) and lactose (lac) promoter systems (Chang, et al. (1977) Nature 198:1056), the tryptophan (trp) promoter system (Goeddel, et al. (1980) Nucleic Acids Res.
  • the vector is selected to allow introduction into the appropriate host cell.
  • Bacterial vectors are typically of plasmid or phage origin. Appropriate bacterial cells are infected with phage vector particles or transfected with naked phage vector DNA. If a plasmid vector is used, the bacterial cells are transfected with the plasmid vector DNA.
  • Expression systems for expressing a protein of the present invention are available using Bacillus sp. and Salmonella (Palva, et al. (1983) Gene 22:229-235; Mosbach, et al. (1983) Nature 302:543-545).
  • the polynucleotides of the present invention can be stacked with any combination of other polynucleotide sequences of interest in order to create a plant with a desired phenotype with respect to one or more traits .
  • the combinations generated may include multiple copies of any one or more of the polynucleotides of interest.
  • stacked combinations can be created by any method including, but not limited to, cross breeding plants by any conventional or TopCross methodology, or genetic transformation .
  • if the traits are stacked by genetically transforming the plants, the polynucleotide sequences of interest can be combined at any time and in any order.
  • a transgenic plant comprising one or more desired traits can be used as the target to introduce further traits by subsequent transformation .
  • the traits can be introduced simultaneously in a co-transformation protocol with the polynucleotides of interest provided by any combination of transformation cassettes . For example, if two sequences will be introduced, the two sequences can be contained in separate transformation cassettes (trans) or contained on the same transformation cassette (cis) .
  • Expression of the sequences can be driven by the same promoter or by different promoters. In certain cases, it may be desirable to introduce a transformation cassette that will suppress the expression of a polynucleotide of interest . This may be combined with any combination of other suppression cassettes or overexpression cassettes to generate the desired combination of traits in the plant .
  • a method for modulating the level and/or activity of a TM polypeptide in a plant is also provided.
  • level and/or activity is increased or decreased by at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% relative to a native control plant, plant part, or cell which did not have the sequence of the invention introduced. Modulation in the present invention may occur during and/or subsequent to growth of the plant to the desired stage of development .
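The percentage thresholds listed above can be checked against measured values with a small helper. This is only a sketch: the measurement units, the choice of a single control value, and the function names are assumptions made for illustration.

```python
THRESHOLDS = (1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90)  # percent, from the text


def percent_change(subject: float, control: float) -> float:
    """Signed percent change of the subject measurement relative to control."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (subject - control) / control


def highest_threshold_met(subject: float, control: float):
    """Largest listed threshold reached by the absolute percent change,
    or None if the change is below the smallest threshold."""
    change = abs(percent_change(subject, control))
    met = [t for t in THRESHOLDS if change >= t]
    return met[-1] if met else None


if __name__ == "__main__":
    # Hypothetical protein levels in arbitrary units
    print(percent_change(15.0, 10.0), highest_threshold_met(15.0, 10.0))
```

The absolute value is used so that the same helper covers both increases and decreases; the sign returned by `percent_change` indicates the direction.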
  • a variety of methods can be employed to assay for a modulation in the level and/or activity of a TM polypeptide.
  • the expression level of the TM polypeptide may be measured directly, for example, by assaying for the level of the TM polypeptide in the plant
  • methods for measuring TM activity are described elsewhere herein.
  • the level and/or activity of a TM polypeptide is modulated in vegetative tissue, in reproductive tissue, or in both vegetative and reproductive tissue .
  • plants with altered TM expression and/or activity are screened and selected for having an increase in trichome (and/or root hair) density, clustering, size, mass, length, number, and/or branching .
  • Methods are provided to modulate the level and/or activity of a TM polypeptide of the invention in a plant.
  • the level and/or activity of a TM polypeptide is increased.
  • Such an increase in the level and/or activity of a TM polypeptide of the invention can be achieved by providing to the plant a TM polypeptide, providing a TM polynucleotide, or by modifying a genomic locus encoding the TM polypeptide (e. g. , replacing the promoter with a constitutive promoter or promoter that provides elevated expression of TM polypeptide) .
  • the level and/or activity of a TM polypeptide is decreased.
  • Such a decrease in the level and/or activity of a TM polypeptide of the invention can be achieved by providing to the plant a dominant-negative or truncated TM polypeptide, providing an antisense TM polynucleotide (e.g. , ribozyme, antisense, or siRNA) , or by modifying a genomic locus encoding the TM polypeptide (e.g. , replacing all or a portion of the coding region or promoter to knock out or down-regulate expression of the TM polypeptide) .
  • a plant having the introduced sequence of the invention is selected using methods known to those of skill in the art such as, but not limited to, Southern blot analysis, DNA sequencing, PCR analysis, or phenotypic analysis.
  • a plant or plant part altered or modified by the foregoing is grown under plant forming conditions for a time sufficient to modulate the concentration and/or activity of the TM polypeptide in the plant. Plant forming conditions are well known in the art and discussed briefly elsewhere herein.
  • Modulation of the level and/or activity of the TM polypeptide will result in increases or decreases in trichome (and/or root hair) density, size, clustering, mass, length, number and/or branching. Therefore, this invention also provides a method for modulating trichomes (and/or root hairs) in a Cannabaceae plant by modulating the level and/or activity of the TM polypeptide in the Cannabaceae plant.
  • “increasing trichomes” refers to an increase in trichome density, size, mass, clustering, length, number, and/or branching as compared to a control plant.
  • “decreasing trichomes” refers to a decrease in trichome density, size, mass, clustering, length, number, and/or branching as compared to a control plant.
  • this invention also provides a method for modulating secondary metabolite levels in a plant by modulating the level or expression of a TM polypeptide in the Cannabaceae plant.
  • by “modulating secondary metabolite levels” is intended an increase or decrease in the amount or level of one or more secondary metabolites in the transgenic Cannabaceae plant when compared to a control plant.
  • Secondary metabolites that can be increased or decreased in accordance with the present method include, but are not limited to bitter acids (e.g., alpha acid and beta acid), essential oils, flavonoids, terpenophenolic compounds and/or terpenes.
  • Cinnamaldehyde (α-amyl-Cinnamaldehyde, α-hexyl-Cinnamaldehyde), Cinnamic Acid, Cinnamyl Alcohol, Citronellal, Citronellol, Cryptone, Curcumene (α-Curcumene, γ-Curcumene), Decanal, Dehydrovomifoliol, Diallyl Disulfide, Dihydroactinidiolide, Dimethyl Disulfide,
  • Eicosane/Icosane, Elemene (β-Elemene), Estragole, Ethyl acetate, Ethyl Cinnamate, Ethyl maltol, Eucalyptol/1,8-Cineole, Eudesmol (α-Eudesmol, β-Eudesmol, γ-Eudesmol), Eugenol, Euphol, Farnesene, Farnesol, Fenchol (β-Fenchol),
  • Germacrene B, Guaia-1(10),11-diene, Guaiacol, Guaiene (α-
  • Verdoxan, α-Ylangene, Umbelliferone, or Vanillin and Cannabinoids such as Δ9-tetrahydrocannabinol (THC), Δ9-tetrahydrocannabinolic acid (THCA), cannabidiol (CBD), cannabinol (CBN), cannabigerol (CBG), cannabichromene (CBC), cannabigerolic acid (CBGA), cannabichromenic acid (CBCA), cannabidiolic acid (CBDA) and the like.
  • Transgenic expression of a TM polypeptide can also be used to modify the tolerance of a plant to abiotic (drought, salt, heavy metals, etc.) and/or biotic (pathogen) stress. Accordingly, in one method of the invention, a Cannabaceae plant's tolerance to stress is increased or maintained, when compared to a control plant, by transgenic expression of a TM polypeptide in one or more parts of the plant.
  • a TM polynucleotide is provided by introducing into the plant an expression cassette harboring a TM polynucleotide and expressing the TM polynucleotide.
  • the TM expression construct introduced into the plant is stably incorporated into the genome of the plant.
  • HuTM1, CaTM1 and CaTM2 share a high degree of similarity with the TCL1 polypeptide from Arabidopsis thaliana (AtCL1; Accession No. AT2G30432.1), a single-repeat R3 MYB transcription factor (Wang, et al. (2008) BMC Plant Biology 8:81).
  • the HuTM1, CaTM1 and CaTM2 possess the single R3 MYB domain, the amino acid signature [D/E]Lx2[R/K]x3Lx6Lx3R that is required for interacting with R/B-like BHLH transcription factors (Zimmerman, et al. (2004) Plant J. 40:22-34) and the amino acids within the MYB domain that are involved in cell-to-cell movement of CPC (Kurata, et al. (2005) Development 132:5387-98).
  • nucleic acids encoding HuTM1, CaTM1 or CaTM2 are introduced into a plant expression vector under the control of a suitable promoter
  • the plant expression vector is introduced into Humulus or Cannabis, the HuTM1, CaTM1 or CaTM2 protein is overexpressed and trichome and/or root hair number, mass, size, clustering, length and/or density is modulated.
  • secondary metabolite production is concurrently modulated.
  • Exemplary gene (including 5' and 3' regulatory regions and introns) and cDNA sequences encoding HuTM1 are provided in FIG. 2 and FIG. 3, respectively.
  • an exemplary gene sequence encoding CaTM1 is provided in the sequences set forth in Table 2.
  • siRNAs are designed to target nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein in the Humulus or Cannabis genome.
  • exemplary siRNA target sequences in polynucleotides encoding HuTM1 and CaTM1 are provided in Table 3.
  • siRNA or shRNA using, e.g., loop sequence TCAAGAG (SEQ ID NO: 6)
  • these siRNA/shRNA molecules can be used to reduce the expression of a HuTM1 and CaTM1 polypeptide or polynucleotide.
  • sgRNA are designed to target nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein in the Humulus or Cannabis genome.
  • exemplary sgRNA target sequences (based on S. pyogenes (NGG PAM) and S. aureus (NNGRR PAM) CRISPR Cas9 enzyme families) in polynucleotides encoding HuTM1 and CaTM1 are provided in Table 4.
  • the sgRNA is introduced into Humulus or Cannabis along with a cognate CRISPR/Cas endonuclease, and all or a portion of the nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein are deleted, thereby resulting in a decrease in the expression of a HuTM1, CaTM1 or CaTM2 polypeptide or polynucleotide.
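As a rough illustration of how candidate sgRNA target sites next to an NGG (S. pyogenes) or NNGRR (S. aureus) PAM might be located in a coding sequence, the following minimal Python sketch scans one strand with a regular expression. The function name, the 20-nucleotide protospacer length used for both enzymes, and the example sequence are assumptions made for illustration only; they are not taken from Table 4 or from the sequences of the disclosure.

```python
import re

# IUPAC codes needed for the two PAMs: N = any base, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}

def find_targets(seq, pam, spacer_len=20):
    """Return (start, protospacer, pam) tuples found on the given strand.

    spacer_len=20 is an illustrative assumption, not a value from the source.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical 23-nt sequence: a 20-nt protospacer followed by the PAM "AGG".
example = "GATTACAGATTACAGATTAC" + "AGG"
print(find_targets(example, "NGG"))
print(find_targets(example, "NNGRR"))
```

Only the strand passed in is scanned; a fuller search would also scan the reverse complement and filter candidates for off-target matches elsewhere in the genome.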

Abstract

Compositions and methods for modulating trichomes, root hairs and/or levels of secondary metabolites in Cannabaceae plants are provided.

Description

COMPOSITIONS AND METHODS FOR MODULATING TRICHOMES, ROOT HAIRS AND SECONDARY METABOLITES IN CANNABACEAE
Background
[0001] Trichomes are various outgrowths of the epidermis in plants including branched and unbranched hairs, vesicles, hooks, spines and stinging hairs. Trichomes are stalked, multicellular protruding structures that are considered important in the protection of plants against abiotic and biotic stresses. Trichomes have been categorized into types I to VII, with types I, IV, VI and VII as glandular trichome types and II, III, and V as non-glandular. Glandular trichomes, also referred to as secretory or peltate trichomes, are lipophilic glands composed of a group of secretory cells and a cuticle-enclosed cavity that fills with secondary metabolites.
[0002] Hop (Humulus lupulus) is a perennial, dioecious plant that belongs to the Cannabaceae family. "Hops" is the common term for the female inflorescences of hop plants, well known for their use in beer flavoring. These inflorescences develop into cones upon maturation. The lower parts of the inner surface of the bracts of mature female hop cones are covered with glandular trichomes, termed lupulin glands, which contain a number of terpenoid-related compounds (mono- and sesquiterpenes), bitter acids (prenylated polyketides), and prenylflavonoids, which are mainly used as flavoring in the beer brewing process. In this respect, methods for genetically modifying hop for production of trichomes on non-flowering parts such as leaves have been suggested (WO 2018/191398).
[0003] Like hop, cannabis (Cannabis indica) and hemp (Cannabis sativa) trichomes are the source of secondary metabolites including cannabinoids such as tetrahydrocannabinol (THC) and cannabidiol (CBD), as well as terpenes such as myrcene, pinene, caryophyllene, limonene, humulene and linalool. Three types of trichomes are produced by cannabis: bulbous trichomes, capitate sessile trichomes and capitate-stalked trichomes, the latter of which are the most abundant and largest (50-100 μm).
Summary of the Invention
[0004] This invention is a recombinant vector including: (i) a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof; or (ii) a polynucleotide capable of reducing the expression of a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof. An isolated host cell, transgenic plant (e.g., of the genus Humulus or Cannabis) and transformed seed harboring said recombinant vector are also provided, as are methods for modulating trichomes, root hairs or levels of one or more secondary metabolites in a plant.
Brief Description of the Drawings
[0005] FIG. 1 provides an amino acid sequence comparison of trichome modulating polypeptides from Humulus sp. (HuTM1, SEQ ID NO: 1) and Cannabis sp. (CaTM1, SEQ ID NO: 2; CaTM2, SEQ ID NO: 3) and TCL1 polypeptide from Arabidopsis thaliana (AtCL1, SEQ ID NO: 4). Identical residues are indicated with "*" and similar amino acid residues are indicated with ":". The single R3 MYB domain is indicated with a thick line underneath. The amino acid signature [D/E]Lx2[R/K]x3Lx6Lx3R that is required for interacting with R/B-like BHLH transcription factors is indicated by arrowheads on the top of amino acids. The amino acids within the MYB domain that are involved in cell-to-cell movement of CPC are indicated by arrows on the top of amino acids.
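The signature [D/E]Lx2[R/K]x3Lx6Lx3R can be read as a pattern in which x is any amino acid and the subscript gives the number of such positions. A minimal Python sketch of searching a protein sequence for this signature follows; the function name and the synthetic test peptide are hypothetical and are not SEQ ID NO: 1-4.

```python
import re

# [D/E] L x2 [R/K] x3 L x6 L x3 R, with x = any single residue.
BHLH_SIGNATURE = re.compile(r"[DE]L.{2}[RK].{3}L.{6}L.{3}R")

def find_signature(protein):
    """Return (start, matched_text) for the first occurrence, or None."""
    match = BHLH_SIGNATURE.search(protein.upper())
    return (match.start(), match.group(0)) if match else None

# Synthetic peptide built to contain the signature (not a real TM sequence).
peptide = "MS" + "ELAAKAAALAAAAAALAAAR" + "GG"
print(find_signature(peptide))
```

A hit only shows that the motif spacing is present; whether a given polypeptide actually interacts with a BHLH partner would still need to be tested experimentally.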
[0006] FIG. 2 provides the cDNA and deduced amino acid sequences of Humulus HuTM1.
[0007] FIG. 3 provides the nucleic acid sequence of the gene encoding Humulus HuTM1 (SEQ ID NO: 27).
Detailed Description of the Invention
[0008] Trichomes in members of the Cannabaceae family are sources of many of the beneficial compounds associated with the plants in this family. Accordingly, to enhance or increase the amounts of one or more beneficial compounds produced by plants in this family, the present invention provides compositions and methods for increasing the number, density, clustering, size and/or branching of trichomes in plants from the Cannabaceae family. More specifically, this invention relates to compositions and methods for modulating the expression or activity of one or more proteins involved in regulating the number, density, size, clustering or branching of trichomes in plants from the Cannabaceae family.
[0009] As is known in the art, the Cannabaceae family includes members of the genera Cannabis, Humulus, Celtis, Pteroceltis, Aphananthe, Chaetachme, Gironniera, Lozanella, Trema and Parasponia. In certain embodiments of this invention, the Cannabaceae family member is a species from the genus Cannabis or Humulus. In particular embodiments, the Cannabaceae family member is Humulus lupulus (i.e., hop), Humulus japonicus (i.e., wild hop), Humulus yunnanensis (Chinese hop), or Cannabis sp. Insofar as it has been debated whether plants falling within the Cannabis genus are distinct species, the species C. sativa, C. ruderalis, C. afghanica and C. indica are referred to herein collectively as Cannabis sp.
[0010] As used herein, the term "trichome" encompasses different types of trichomes, both glandular trichomes and/or non-glandular trichomes . "Trichome cells" refer to the cells making up the trichome structure, such as the gland, or secretory cells, base cells and stalk, or stipe cells, extra-cellular cavity and cuticle cells. Trichomes can also be composed of one single cell.
[0011] With regard to protein expression or activity, the term "modulating" includes either a decrease or increase in expression or activity.
[0012] Compositions of the invention include Cannabaceae trichome modulating polypeptides and polynucleotides, as well as expression cassettes, host cells, and transgenic plants containing the same. A "trichome modulating polypeptide," "TM polypeptide" or "TM protein" is a polypeptide or protein that is involved in modulating trichome initiation and/or development. Trichome modulating polypeptides of particular interest in the context of this invention are single repeat myb-like factors, also referred to as MYB3R or 3R MYB, from members of the family Cannabaceae.
[0013] In certain embodiments, the invention encompasses isolated or substantially purified polynucleotide or protein compositions. An "isolated" or "purified" polynucleotide or protein, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment. Thus, an isolated or purified polynucleotide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Optimally, an "isolated" polynucleotide is free of sequences (optimally protein encoding sequences) that naturally flank the polynucleotide (i.e., sequences located at the 5' and 3' ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived. For example, in various embodiments, the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequence that naturally flanks the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived.
[0014] A protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, 5% or 1% (by dry weight) of contaminating protein. When the protein of the invention or biologically active portion thereof is recombinantly produced, optimally culture medium represents less than about 30%, 20%, 10%, 5% or 1% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.
[0015] Fragments and variants of the disclosed polynucleotides and proteins encoded thereby are also encompassed by the present invention. By "fragment" is intended a portion of the polynucleotide or a portion of the amino acid sequence and hence of the protein encoded thereby. Fragments of a polynucleotide may encode protein fragments that retain the biological activity of the native protein. Alternatively, fragments of a polynucleotide, which are useful as hybridization probes generally do not encode protein fragments retaining biological activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides and up to a full-length polynucleotide encoding a protein of the invention.
[0016] A fragment of a TM polynucleotide that encodes a biologically active portion of a TM protein of the invention will encode at least 15, 25, 30, 50, 100, 150, 200, 250, 300, 350, or 400 contiguous amino acids, or up to the total number of amino acids present in a full-length TM protein of the invention. Fragments of a TM polynucleotide that are useful as hybridization probes or PCR primers generally need not encode a biologically active portion of a TM protein. Thus, a fragment of a TM polynucleotide may encode a biologically active portion of a TM protein, or it may be a fragment that can be used as a hybridization probe or PCR primer using methods disclosed below .
[0017] A biologically active portion of a TM protein can be prepared by isolating a portion of one of the TM polynucleotides of the invention, expressing the encoded portion of the TM protein (e.g., by recombinant expression in vitro), and assessing the activity of the encoded portion of the TM protein. Polynucleotides that are fragments of a TM nucleotide sequence include at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1100, or 1200 nucleotides, or up to the number of nucleotides present in a full-length TM polynucleotide disclosed herein.
[0018] "Variants" is intended to mean substantially similar sequences . For polynucleotides , a variant includes a deletion and/or addition of one or more nucleotides at one or more sites within the native polynucleotide and/or a substitution of one or more nucleotides at one or more sites in the native polynucleotide . As used herein, a "native" polynucleotide or polypeptide includes a naturally occurring nucleotide sequence or amino acid sequence, respectively. For polynucleotides, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the TM polypeptides of the invention. Naturally occurring allelic variants such as these can be identified with the use of well-known molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques as outlined below. Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still encode a TM protein of the invention. Generally, variants of a particular polynucleotide of the invention will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters described elsewhere herein.
[0019] Variants of a particular polynucleotide of the invention (i.e., the reference polynucleotide) can also be evaluated by comparison of the percent sequence identity between the polypeptide encoded by a variant polynucleotide and the polypeptide encoded by the reference polynucleotide. Percent sequence identity between any two polypeptides can be calculated using sequence alignment programs and parameters described elsewhere herein. Where any given pair of polynucleotides of the invention is evaluated by comparison of the percent sequence identity shared by the two polypeptides they encode, the percent sequence identity between the two encoded polypeptides is at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity.
[0020] "Variant" protein is intended to mean a protein derived from the native protein by deletion or addition of one or more amino acids at one or more sites in the native protein and/or substitution of one or more amino acids at one or more sites in the native protein. Variant proteins encompassed by the present invention are biologically active, that is, they continue to possess the desired biological activity of the native protein, that is, TM activity as described herein. Such variants may result from, for example, genetic polymorphism or from human manipulation. Biologically active variants of a native TM protein of the invention will have at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the amino acid sequence for the native protein as determined by sequence alignment programs and parameters described elsewhere herein. A biologically active variant of a protein of the invention may differ from that protein by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2 or even 1 amino acid residue. The upper limit of variation for an amino acid sequence of the invention which retains biological activity can be determined empirically, i.e., by testing variants in an assay for TM activity as described elsewhere herein. A biologically active variant of a protein of the invention may differ from that protein by as much as 100 or 200 amino acids.
[0021] The proteins of the invention may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants and fragments of the TM proteins can be prepared by mutations in the DNA. Methods for mutagenesis and polynucleotide alterations are well known in the art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel, et al. (1987) Methods in Enzymol. 154:367-382; US 4,873,192; Walker & Gaastra, eds. (1983) Techniques in Molecular Biology (MacMillan Publishing Company, New York). Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff, et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.). Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be optimal.
[0022] Thus, the genes and polynucleotides of the invention include both the naturally occurring sequences as well as mutant forms . Likewise, the proteins of the invention encompass both naturally occurring proteins as well as variations and modified forms thereof. Such variants will continue to possess the desired TM activity. The mutations that will be made in the DNA encoding the variant must not place the sequence out of reading frame and optimally will not create complementary regions that could produce secondary mRNA structure . See, EP 0075444.
[0023] The deletions, insertions, and substitutions of the protein sequences encompassed herein are not expected to produce radical changes in the characteristics of the protein. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays. That is, the activity can be evaluated by assaying for TM activity. TM activity can be assayed in a variety of ways, for example, by measuring modulation of hormone responses or modulation of trichome density, clustering, size, length, branching, etc.
[0024] Variant polynucleotides and proteins also encompass sequences and proteins derived from a mutagenic and recombinogenic procedure such as DNA shuffling. With such a procedure, one or more different TM sequences can be manipulated to create a new TM polypeptide possessing the desired properties. In this manner, libraries of recombinant polynucleotides are generated from a population of related sequence polynucleotides comprising sequence regions that have substantial sequence identity and can be homologously recombined in vitro or in vivo. For example, using this approach, sequence motifs encoding a domain of interest may be shuffled between the TM gene of the invention and other known TM genes to obtain a new gene coding for a protein with an improved property of interest. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri, et al. (1997) Nature Biotech. 15:436-438; Moore, et al. (1997) J. Mol. Biol. 272:336-347; Zhang, et al. (1997) Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri, et al. (1998) Nature 391:288-291; US 5,605,793 and US 5,837,458.
[0025] The polynucleotides of the invention (i.e., the TM sequences) can be used to isolate corresponding sequences from other organisms, particularly other plants. In this manner, methods such as PCR, hybridization, and the like can be used to identify such sequences based on their sequence homology to the sequences set forth herein. Sequences isolated based on their sequence identity to the entire TM sequences or the TM promoter sequences set forth herein or to variants and fragments thereof are encompassed by the present invention. Such sequences include sequences that are orthologs of the disclosed sequences. "Orthologs" is intended to mean genes derived from a common ancestral gene and which are found in different species as a result of speciation. Genes found in different species are considered orthologs when their nucleotide sequences and/or their encoded protein sequences share at least 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater sequence identity. Functions of orthologs are often highly conserved among species. Thus, isolated polynucleotides that encode a TM protein and which hybridize under stringent conditions to the TM sequences disclosed herein, or to variants or fragments or complements thereof, are encompassed by the present invention. In particular, while exemplary TM polynucleotides and polypeptides from Humulus and Cannabis are provided herein, orthologs of said TM polynucleotides and polypeptides can be identified in Celtis, Pteroceltis, Aphananthe, Chaetachme, Gironniera, Lozanella, Trema and Parasponia using one or more of the approaches described below.
[0026] In a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, NY). See also, Innis, et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis & Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis & Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
[0027] In hybridization techniques, all or part of a known polynucleotide is used as a probe that selectively hybridizes to other corresponding polynucleotides present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as 32P, or any other detectable marker. Thus, for example, probes for hybridization can be made by labeling synthetic oligonucleotides based on the TM polynucleotides of the invention. Methods for preparation of probes for hybridization and for construction of cDNA and genomic libraries are generally known in the art and are disclosed in Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, NY).
[0028] For example, an entire TM polynucleotide sequence disclosed herein, or one or more portions thereof, may be used as a probe capable of specifically hybridizing to corresponding TM polynucleotides and messenger RNAs. To achieve specific hybridization under a variety of conditions, such probes include sequences that are unique among TM polynucleotide sequences and are optimally at least about 10 nucleotides in length, and most optimally at least about 20 nucleotides in length. Such probes may be used to amplify corresponding TM polynucleotides from a chosen plant by PCR. This technique may be used to isolate additional coding sequences from a desired plant or as a diagnostic assay to determine the presence of coding sequences in a plant. Hybridization techniques include hybridization screening of plated DNA libraries (either plaques or colonies; see, for example, Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, NY)).
[0029] Hybridization of such sequences may be carried out under stringent conditions. By "stringent conditions" or "stringent hybridization conditions" are intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optimally less than 500 nucleotides in length.
[0030] Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 M to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50°C to 55°C. Exemplary moderate stringency conditions include hybridization in 40% to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55°C to 60°C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60°C to 65°C. Optionally, wash buffers may comprise about 0.1% to about 1% SDS. Duration of hybridization is generally less than about 24 hours, usually about 4 hours to about 12 hours. The duration of the wash time will be at least a length of time sufficient to reach equilibrium.
[0031] Specificity is typically the function of posthybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth & Wahl ((1984) Anal. Biochem. 138:267-284): Tm = 81.5°C + 16.6 (log M) + 0.41 (% GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with 90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1°C, 2°C, 3°C or 4°C lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6°C, 7°C, 8°C, 9°C or 10°C lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11°C, 12°C, 13°C, 14°C, 15°C or 20°C lower than the thermal melting point (Tm). Using the equation, hybridization and wash compositions, and desired Tm, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a Tm of less than 45°C (aqueous solution) or 32°C (formamide solution), it is optimal to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques In Biochemistry and Molecular Biology, Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, NY); and Ausubel, et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See, Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, NY).
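As a non-limiting worked example, the Meinkoth & Wahl approximation and the rule of about 1°C per 1% of mismatching may be sketched as below. The function names and the sample inputs are hypothetical, and "log M" is taken to be the base-10 logarithm; the sketch is offered for orientation and is not a validated laboratory tool.

```python
import math


def melting_temperature(na_molar: float, percent_gc: float,
                        percent_formamide: float, length_bp: int) -> float:
    """Approximate Tm (degrees C) of a DNA-DNA hybrid.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L
    """
    if na_molar <= 0:
        raise ValueError("monovalent cation molarity must be positive")
    if length_bp <= 0:
        raise ValueError("hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def tm_for_identity(tm_perfect_match: float, percent_identity: float) -> float:
    """Lower the Tm by about 1 degree C for each 1% of mismatching."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    return tm_perfect_match - (100.0 - percent_identity)


if __name__ == "__main__":
    # Hypothetical inputs: 1 M Na+, 50% GC, 50% formamide, 500 bp hybrid.
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(f"Tm for a perfect match: about {tm:.1f} C")
    print(f"Tm when 90% identity is sought: about {tm_for_identity(tm, 90.0):.1f} C")
    # Stringent conditions are generally about 5 C below the Tm.
    print(f"Stringent wash temperature: about {tm - 5.0:.1f} C")
```

With these assumed inputs the perfect-match Tm evaluates to about 70.5°C, so sequences of 90% identity would be sought at a Tm about 10°C lower.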
[0032] The following terms are used to describe the sequence relationships between two or more polynucleotides or polypeptides: "reference sequence", "comparison window", "sequence identity", and "percentage of sequence identity." As used herein, "reference sequence" is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence. As used herein, "comparison window" makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two polynucleotides. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100 nucleotides or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.
[0033] Methods of alignment of sequences for comparison are well known in the art. Thus, the determination of percent sequence identity between any two sequences can be accomplished using a mathematical algorithm. Non-limiting examples of such mathematical algorithms are the algorithm of Myers & Miller (1988) CABIOS 4:11-17; the local alignment algorithm of Smith, et al. (1981) Adv. Appl. Math. 2:482; the global alignment algorithm of Needleman & Wunsch (1970) J. Mol. Biol. 48:443-453; the search-for-local alignment method of Pearson & Lipman (1988) Proc. Natl. Acad. Sci. 85:2444-2448; and the algorithm of Karlin & Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877.
[0034] Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, CA); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the GCG Wisconsin Genetics Software Package, Version 10 (available from Accelrys Inc., San Diego, CA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins, et al. (1988) Gene 73:237-244; Higgins, et al. (1989) CABIOS 5:151-153; Corpet, et al. (1988) Nucleic Acids Res. 16:10881-90; Huang, et al. (1992) CABIOS 8:155-65; and Pearson, et al. (1994) Meth. Mol. Biol. 24:307-331. The ALIGN program is based on the algorithm of Myers & Miller (1988) supra. A PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used with the ALIGN program when comparing amino acid sequences. The BLAST programs of Altschul, et al. (1990) J. Mol. Biol. 215:403 are based on the algorithm of Karlin & Altschul (1990) supra. BLAST nucleotide searches can be performed with the BLASTN program, score=100, wordlength=12, to obtain nucleotide sequences homologous to a nucleotide sequence encoding a protein of the invention. BLAST protein searches can be performed with the BLASTX program, score=50, wordlength=3, to obtain amino acid sequences homologous to a protein or polypeptide of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul, et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See, Altschul, et al. (1997) supra. When utilizing BLAST, Gapped BLAST, or PSI-BLAST, the default parameters of the respective programs (e.g., BLASTN for nucleotide sequences, BLASTX for proteins) can be used.
[0035] Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 using the following parameters: % identity and % similarity for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the BLOSUM62 scoring matrix; or any equivalent program thereof. By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by GAP Version 10.
[0036] As used herein, "sequence identity" or "identity" in the context of two polynucleotides or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity". Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, CA).
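The partial scoring of conservative substitutions described above may be illustrated, purely by way of example, as follows. The grouping of amino acids into classes of similar chemical properties and the partial score of 0.5 are assumptions made for this sketch; they are not the scoring scheme of PC/GENE or of any other named program.

```python
# Hypothetical classes of amino acids with similar chemical properties.
# This grouping is an illustrative assumption, not a published matrix.
CONSERVATIVE_GROUPS = [
    set("ILVM"),   # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # positively charged
    set("DE"),     # negatively charged
    set("STNQ"),   # polar, uncharged
]


def position_score(a: str, b: str, partial: float = 0.5, gap: str = "-") -> float:
    """Score one aligned position: 1 for identity, `partial` for a
    conservative substitution, 0 for a gap or non-conservative substitution."""
    if a == gap or b == gap:
        return 0.0
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return partial
    return 0.0


def percent_similarity(aligned_a: str, aligned_b: str, partial: float = 0.5) -> float:
    """Percent similarity over an already-aligned pair of equal length."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    total = sum(position_score(a, b, partial) for a, b in zip(aligned_a, aligned_b))
    return 100.0 * total / len(aligned_a)


if __name__ == "__main__":
    # K/R is conservative (0.5), D/D is identical (1), L/A is neither (0).
    print(percent_similarity("KDL", "RDA"))
```

In this sketch the partial score is a free parameter between zero and 1, mirroring the statement that a conservative substitution receives an intermediate score.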
[0037] As used herein, "percentage of sequence identity" means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
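The calculation just described (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched as below. The sketch assumes that the two sequences have already been optimally aligned by one of the programs named above and that gaps are written as "-"; the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Assumption: both strings are the same window of an existing optimal
    alignment, with gaps written as `gap`. A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matched = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # Nine aligned positions: seven identical, one gap, one mismatch.
    print(f"{percent_identity('ACGT-ACGT', 'ACGTTACGA'):.1f}% identity")
```

Because gapped positions remain in the denominator, introducing gaps cannot by itself inflate the reported percentage in this sketch.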
[0038] The invention further provides a plant having altered levels and/or activities of the TM polypeptides of the invention. In some embodiments, the plant of the invention has stably incorporated into its genome one or more of the TM polynucleotides described herein. In other embodiments, all or a portion of the TM polynucleotide in the plant has been deleted.
[0039] As used herein, the term plant includes plant cells, plant protoplasts, plant cell tissue cultures from which a plant can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, fruit, cones, stalks, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced nucleic acid sequences.
[0040] A "subject plant" or "subject plant cell" is one in which genetic alteration, such as transformation, has been effected as to a gene of interest, or is a plant or plant cell which is descended from a plant or plant cell so altered and which comprises the alteration. A "control" or "control plant" or "control plant cell" provides a reference point for measuring changes in the subject plant or plant cell.
[0041] A control plant or control plant cell may include, for example, (a) a wild-type plant or plant cell, i.e., of the same genotype as the starting material for the genetic alteration which resulted in the subject plant or subject plant cell; (b) a plant or plant cell of the same genotype as the starting material but which has been transformed with a null construct (i.e., with a construct which has no known effect on the trait of interest, such as a construct comprising a marker gene) ; (c) a plant or plant cell which is a non-transformed segregant among progeny of a subject plant or subject plant cell; (d) a plant or plant cell genetically identical to the subject plant or subject plant cell but which is not exposed to conditions or stimuli that would induce expression of the gene of interest; or (e) the subject plant or subject plant cell itself, under conditions in which the gene of interest is not expressed.
[0042] In the present case, for example, in various embodiments, changes in TM mRNA or protein levels and/or changes in one or more traits such as trichome (and/or root hair) density, size, mass, clustering, length, number, or branching could be measured by comparing a subject plant or subject plant cell to a control plant or control plant cell .
[0043] The polynucleotides of the present invention can be introduced and optionally expressed in a host cell such as bacteria, yeast, insect, mammalian, or preferably plant cells. It is expected that those of skill in the art are knowledgeable in the numerous systems available for the introduction of a polypeptide or a nucleotide sequence of the present invention into a host cell.
[0044] By "host cell" is meant a cell which harbors an exogenous nucleic acid sequence of the invention. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Preferably, the host cell is a monocotyledonous or dicotyledonous plant cell, more preferably a Cannabaceae plant cell, most preferably a Humulus lupulus or Cannabis plant cell.
[0045] The use of the term "polynucleotide" is not intended to limit the present invention to polynucleotides comprising DNA. Those of ordinary skill in the art will recognize that polynucleotides can include ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides . Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues . The polynucleotides of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
[0046] The TM polynucleotide sequences of the invention can be provided in expression cassettes for expression in the organism of interest. The cassette may include 5' and 3' regulatory sequences operably linked to a TM polynucleotide of the invention. "Operably linked" is intended to mean a functional linkage between two or more elements . For example, an operable linkage between a polynucleotide of interest and a regulatory sequence (i.e. , a promoter) is a functional link that allows for expression of the polynucleotide of interest. Operably linked elements may be contiguous or non-contiguous . When used to refer to the joining of two protein coding regions, by "operably linked" is intended that the coding regions are in the same reading frame . The cassette may additionally contain at least one additional gene to be cotransformed into the organism. Alternatively, any additional gene (s) can be provided on multiple expression cassettes . Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the TM polynucleotide to be under the transcriptional regulation of the regulatory regions . The expression cassette may additionally contain selectable marker genes .
[0047] The expression cassette may include in the 5'-3' direction of transcription, a transcriptional and translational initiation region (i.e., a promoter), a TM polynucleotide of the invention, and a transcriptional and translational termination region (i.e., termination region) functional in the host cell (i.e., the plant). The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the TM polynucleotide of the invention may be native/analogous to the host cell or to each other. Alternatively, the regulatory regions and/or the TM polynucleotide of the invention may be heterologous to the host cell or to each other. As used herein, "heterologous" in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide. As used herein, a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
[0048] A termination region may be native with the transcriptional initiation region, may be native with the operably linked TM polynucleotide of interest or with the TM promoter sequences, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the TM polynucleotide of interest, the plant host, or any combination thereof. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also, Guerineau, et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon, et al. (1991) Genes Dev. 5:141-149; Mogen, et al. (1990) Plant Cell 2:1261-1272; Munroe, et al. (1990) Gene 91:151-158; Ballas, et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi, et al. (1987) Nucleic Acids Res. 15:9627-9639.
[0049] Where appropriate, the polynucleotides may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell & Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, US 5,380,831; US 5,436,391; and Murray, et al. (1989) Nucleic Acids Res. 17:477-498.
[0050] Additional sequence modifications are known to enhance gene expression in a cellular host . These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures .
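For illustration only, the comparison of the G-C content of a candidate sequence against an average for the cellular host can be sketched as follows. The host average is supplied by the user (no value is asserted here), and the function names are hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the A, C, G, T/U bases of a sequence.

    Characters other than A, C, G, T and U (e.g. N or whitespace) are
    ignored rather than counted, which is an assumption of this sketch.
    """
    bases = [base for base in sequence.upper() if base in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


def gc_deviation(sequence: str, host_mean_gc: float) -> float:
    """Difference (in percentage points) between the G-C content of the
    sequence and a user-supplied average for genes expressed in the host."""
    return gc_content(sequence) - host_mean_gc


if __name__ == "__main__":
    candidate = "ATGGCGCTGAAAGCGTAA"  # hypothetical coding sequence
    print(f"G-C content: {gc_content(candidate):.1f}%")
    # The host average below is a placeholder supplied by the user.
    print(f"Deviation from a host average of 45%: {gc_deviation(candidate, 45.0):+.1f} points")
```

A positive deviation would suggest lowering the G-C content, and a negative deviation raising it, when adjusting toward the host average.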
[0051] The expression cassettes may additionally contain 5' leader sequences. Such leader sequences can act to enhance translation. Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein, et al. (1989) Proc. Natl. Acad. Sci. USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie, et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Johnson, et al. (1986) Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak, et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling, et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie, et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel, et al. (1991) Virology 81:382-385). See also, Della-Cioppa, et al. (1987) Plant Physiol. 84:965-968. Other methods known to enhance translation can also be utilized, for example, introns, and the like.
[0052] In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
[0053] The expression cassette can also include a selectable marker gene for the selection of transformed cells. Selectable marker genes are utilized for the selection of transformed cells or tissues. Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). Additional selectable markers include phenotypic markers such as beta-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su, et al. (2004) Biotechnol Bioeng. 85:610-9 and Fetter, et al. (2004) Plant Cell 16:215-28), cyan fluorescent protein (CFP) (Bolte, et al. (2004) J. Cell Science 117:943-54 and Kato, et al. (2002) Plant Physiol. 129:913-42), and yellow fluorescent protein (PHIYFP™, see, Bolte, et al. (2004) J. Cell Science 117:943-54). For additional selectable markers, see generally, Yarranton (1992) Curr. Opin. Biotech. 3:506-511; Christopherson, et al. (1992) Proc. Natl. Acad. Sci. USA 89:6314-6318; Yao, et al. (1992) Cell 71:63-72; Reznikoff (1992) Mol. Microbiol. 6:2419-2422; Barkley, et al. (1980) in The Operon, pp. 177-220; Hu, et al. (1987) Cell 48:555-566; Brown, et al. (1987) Cell 49:603-612; Figge, et al. (1988) Cell 52:713-722; Deuschle, et al. (1989) Proc. Natl. Acad. Sci. USA 86:5400-5404; Fuerst, et al. (1989) Proc. Natl. Acad. Sci. USA 86:2549-2553; Deuschle, et al. (1990) Science 248:480-483.
[0054] The number, density, mass, clustering, size, length and/or branching of trichomes may be modulated over the entire plant or may be localized to one or more organs (e.g., stem, leaf, root, fruit or flower) using, e.g., tissue or organ-specific promoters. In addition, trichome number, density, mass, clustering, size, length and/or branching may be modulated under certain conditions or at a certain time (e.g., at flowering). In this respect, a number of promoters can be used in the practice of the invention, including the native promoter of the polynucleotide sequence of interest. The promoters can be selected based on the desired outcome. The nucleic acids can be combined with constitutive, inducible (e.g., stress- or chemical-induced), tissue-preferred, or other promoters for expression in plants.
[0055] Constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and US 6,072,050; the core CaMV 35S promoter (Odell, et al. (1985) Nature 313:810-812); rice actin (McElroy, et al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen, et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen, et al. (1992) Plant Mol. Biol. 18:675-689); PEMU (Last, et al. (1991) Theor. Appl. Genet. 81:581-588); MAS (Velten, et al. (1984) EMBO J. 3:2723-2730); ALS promoter (US 5,659,026), and the like. Examples of other constitutive promoters are described in, for example, US 5,608,149; US 5,608,144; US 5,604,121; US 5,569,597; US 5,466,785; US 5,399,680; US 5,268,463; US 5,608,142 and US 6,177,611.
[0056] Tissue-preferred promoters can be utilized to target enhanced TM expression within a particular plant tissue . Tissue-preferred promoters include those disclosed by Yamamoto, et al. (1997) Plant J. 12 (2) : 255-265; Kawamata, et al. (1997) Plant Cell Physiol. 38 (7) : 792-803; Hansen, et al. (1997) Mol . Gen. Genet. 254 (3) : 337-343; Russell, et al. (1997) Transgenic Res. 6 (2) : 157-168 ; Rinehart, et al. (1996) Plant Physiol. 112 (3) : 1331-1341; Van Camp, et al. (1996) Plant Physiol. 112(2) : 525-535; Canevascini, et al. (1996) Plant Physiol. 112 (2 ): 513-524; Yamamoto, et al. (1994) Plant Cell Physiol. 35 (5) : 773-778; Lam, (1994) Results Probl. Cell Differ. 20:181-196; Orozco, et al. (1993) Plant Mol. Biol. 23(6): 1129-1138; Matsuoka, et al. (1993) Proc Natl. Acad. Sci. USA 90 (20) : 9586-9590; and Guevara-Garcia, et al. (1993) Plant J. 4 (3) : 495-505. Such promoters can be modified, if necessary, for weak expression. See, also, US 2003/0074698. Promoters active in maternal plant tissues, such as female florets, ovaries, either pre-pollination or upon pollination, may be of particular interest.
[0057] Leaf-preferred promoters are known in the art. See, for example, Yamamoto, et al. (1997) Plant J. 12 (2 ) : 255- 265; Kwon, et al. (1994) Plant Physiol. 105 : 357-67; Yamamoto, et al. (1994) Plant Cell Physiol. 35 (5) : 773-778; Gotor, et al. (1993) Plant J. 3:509-18; Orozco, et al. (1993) Plant Mol. Biol. 23(6) : 1129-1138; Baszczynski, et al. (1988) Nucl . Acid Res. 16:4732; Mitra, et al. (1994) Plant Mol. Biol. 26:35-93; Kayaya, et al. (1995) Mol. Gen. Genet. 248:668-674; and Matsuoka, et al. (1993) Proc. Natl. Acad. Sci. USA 90 (20) : 9586-9590. Senescence regulated promoters are also of use, such as SAM22 (Crowell, et al. (1992) Plant Mol. Biol. 18:459-466).
[0058] Root-preferred or root-specific promoters are known and can be selected from the many available from the literature or isolated de novo from various compatible species. See, for example, Hire, et al. (1992) Plant Mol. Biol. 20(2):207-218 (soybean root-specific glutamine synthetase gene); Keller & Baumgartner (1991) Plant Cell 3(10):1051-1061 (root-specific control element in the GRP 1.8 gene of French bean); Sanger, et al. (1990) Plant Mol. Biol. 14(3):433-443 (root-specific promoter of the mannopine synthase (MAS) gene of Agrobacterium tumefaciens); and Miao, et al. (1991) Plant Cell 3(1):11-22 (full-length cDNA clone encoding cytosolic glutamine synthetase (GS), which is expressed in roots and root nodules of soybean). See also, Bogusz, et al. (1990) Plant Cell 2(7):633-641, where two root-specific promoters isolated from hemoglobin genes from the nitrogen-fixing nonlegume Parasponia andersonii and the related non-nitrogen-fixing nonlegume Trema tomentosa are described. The promoters of these genes were linked to a beta-glucuronidase reporter gene and introduced into both the nonlegume Nicotiana tabacum and the legume Lotus corniculatus, and in both instances root-specific promoter activity was preserved. Promoters of the highly expressed rolC and rolD root-inducing genes of Agrobacterium rhizogenes have also been described (see, Leach & Aoyagi (1991) Plant Science (Limerick) 79(1):69-76). Additional root-preferred promoters include the VfENOD-GRP3 gene promoter (Kuster, et al. (1995) Plant Mol. Biol. 29(4):759-772); rolB promoter (Capana, et al. (1994) Plant Mol. Biol. 25(4):681-691); and the CRWAQ81 root-preferred promoter with the ADH first intron (US Ser. No. 10/961,629). See also, US 5,837,876; US 5,750,386; US 5,633,363; US 5,459,252; US 5,401,836; US 5,110,732 and US 5,023,179.
[0059] Shoot-preferred promoters include shoot meristem- preferred promoters such as promoters disclosed in Weigal, et al. (1992) Cell 69:843-859; Accession Number AJ131822; Accession Number Z71981; Accession Number AF049870 and shoot-preferred promoters disclosed in McAvoy, et al. (2003) Acta Hort. (ISHS) 625:379-385.
[0060] A trichome-specific promoter may also be used. Examples of trichome-specific promoters are known in the art and include, e.g., the promoter of the Arabidopsis thaliana OASA1 gene, which has activity in both glandular and non-glandular trichomes of tobacco (Gutierrez-Alcala, et al. (2005) J. Exp. Bot. 56:2487-94). A trichome-specific promoter from the tobacco P450 gene, CYP71D16, which shows expression in tobacco glandular trichomes at all developmental stages, has also been described (Wang, et al. (2002) J. Exp. Bot. 53:1891). Further, WO 2004/111183 describes trichome-specific promoters from tomato and tobacco leaves; WO 2009/082208 describes trichome-specific promoters from tomato; and US 9,856,486 describes the promoters from farnesyl diphosphate synthase (ShzFPS), zingiberene synthase (ShZIS) and germacrene synthase (ShTPS9) genes isolated from Solanum habrochaites as imparting trichome-specific expression. Additional glandular trichome-specific gene promoters have been reported in the literature for a variety of plants, including, but not restricted to, Antirrhinum majus (Jaffe, et al. (2007) J. Exp. Bot. 58:1515-24), Artemisia annua (Wang, et al. (2011) Am. J. Plant Sci. 2:619-28; Wang, et al. (2013) Plant Mol. Biol. 81:119-38), H. lupulus (Okada & Ito (2001) Biosci. Biotechnol. Biochem. 65:150-155), Nicotiana sp. (Wang, et al. (2002) J. Exp. Bot. 53:1891-7; Ennajdaoui, et al. (2010) Plant Mol. Biol. 73:673-685; Choi, et al. (2012) Plant J. 70:480-491; Sallaud, et al. (2012) Plant J. 72:1-17), S. lycopersicum (Liu, et al. (2006) Plant Cell Physiol. 47:1274-84; Spyropoulou, et al. (2014) Plant Mol. Biol. 84:345-357) and cotton (Shangguan, et al. (2008) J. Exp. Bot. 59:3533-3542). See also, Kortbeek, et al. (2016) Methods Enzymol. 576:305-331; Laterre, et al. (2017) Plant Physiol. 173:2110-2120.
[0061] Dividing cell or meristematic tissue-preferred promoters have been disclosed in Ito, et al. (1994) Plant Mol. Biol. 24 : 863-878; Reyad, et al. (1995) Mol. Gen. Genet. 248:703-711; Shaul, et al. (1996) Proc. Natl. Acad. Sci. 93:4868-4872; Ito, et al. (1997) Plant J. 11:983-992.
[0062] Inflorescence-preferred promoters include the promoter of chalcone synthase (Van der Meer, et al. (1990) Plant Mol. Biol. 15:95-109) and LAT52 (Twell, et al. (1989) Mol. Gen. Genet. 217:240-245).
[0063] Stress inducible promoters include salt/water stress-inducible promoters such as P5CS (Zang, et al. (1997) Plant Sci. 129:81-89); cold-inducible promoters, such as cor15a (Hajela, et al. (1990) Plant Physiol. 93:1246-1252), cor15b (Wilhelm, et al. (1993) Plant Mol. Biol. 23:1073-1077), wsc120 (Ouellet, et al. (1998) FEBS Lett. 423:324-328), ci7 (Kirch, et al. (1997) Plant Mol. Biol. 33:897-909), ci21A (Schneider, et al. (1997) Plant Physiol. 113:335-45); drought-inducible promoters, such as Trg-31 (Chaudhary, et al. (1996) Plant Mol. Biol. 30:1247-57), rd29 (Kasuga, et al. (1999) Nature Biotech. 18:287-291); osmotic inducible promoters, such as Rab17 (Vilardell, et al. (1991) Plant Mol. Biol. 17:985-93) and osmotin (Raghothama, et al. (1993) Plant Mol. Biol. 23:1117-28); and heat inducible promoters, such as heat shock proteins (Barros, et al. (1992) Plant Mol. 19:665-75; Marrs, et al. (1993) Dev. Genet. 14:27-41), and smHSP (Waters, et al. (1996) J. Exper. Bot. 47:325-338). Other stress-inducible promoters include rip2 (US 5,332,808 and US 2003/0217393) and rd29a (Yamaguchi-Shinozaki, et al. (1993) Mol. Gen. Genet. 236:331-334).
[0064] Expression can also be regulated via a chemically inducible promoter (see Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48 : 89-108) . Chemically inducible promoters are particularly suitable when it is desired that gene expression should take place in a time-specific manner. Examples of such promoters are promoters inducible by tetracycline (Gatz, et al. (1992) Plant J. 2:397-404), salicylic acid (WO 95/19443) or Bion, a substance which can replace salicylic acid in some of its functions (Weigel, et al. (2001) Plant Mol. Biol. 46:143).
[0065] In accordance with this invention, a TM polypeptide or polynucleotide encoding the same is introduced into a host cell, in particular a plant. "Introducing" or "introduced" is intended to mean presenting to the plant the polynucleotide or polypeptide in such a manner that the sequence gains access to the interior of a cell. This invention does not depend on a particular method for introducing a sequence into the host cell, only that the polynucleotide or polypeptide gains access to the interior of at least one cell of the host. Methods for introducing polynucleotides or polypeptides into host cells (i.e., plants) are known in the art and include, but are not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
[0066] "Stable transformation" is intended to mean that the nucleotide construct introduced into a host (i.e., a plant) integrates into the genome of the plant and is capable of being inherited by the progeny thereof. "Transient transformation" is intended to mean that a polynucleotide is introduced into the host (i.e., a plant) and expressed temporally or a polypeptide is introduced into a host (i.e., a plant).
[0067] Transformation protocols as well as protocols for introducing polypeptides or polynucleotide sequences into plants may vary depending on the type of plant or plant cell targeted for transformation. Suitable methods of introducing polypeptides and polynucleotides into plant cells include microinjection (Crossway, et al. (1986) Biotechniques 4:320-334), electroporation (Riggs, et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602-5606), Agrobacterium-mediated transformation (US 5,563,055; US 5,981,840), direct gene transfer (Paszkowski, et al. (1984) EMBO J. 3:2717-2722), and ballistic particle acceleration (see, for example, US 4,945,050; US 5,879,918; US 5,886,244; US 5,932,782; Tomes, et al. (1995) in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); McCabe, et al. (1988) Biotechnology 6:923-926); and Lec1 transformation (WO 00/28058).
[0068] Genetic transformation of H. lupulus is described in the art and includes particle bombardment (Gatica-Arias & Weber (2013) In Vitro Cell. Dev. Biol. 49:656-664; Batista, et al. (2008) Plant Cell Rep. 27:1185-96) and Agrobacterium-mediated transformation (Horlemann, et al. (2003) Plant Cell Rep. 22:210-7; Okada, et al. (2003) J. Plant Physiol. 160:1101-1108). Similarly, A. tumefaciens- and A. rhizogenes-mediated transformation of C. sativa has been described (Wahby, et al. (2013) J. Plant Interact. 8:312-20; Feeney & Punja (2003) In Vitro Cell. Dev. Biol. 39:578-585; Zheiri (2016) Pharma. Biomed. Res. 2:13-18).
[0069] In specific embodiments, the TM sequences of the invention can be provided to a plant using a variety of transient transformation methods. Such transient transformation methods include, but are not limited to, the introduction of the TM protein or variants or fragments thereof directly into the plant, or the introduction of a TM transcript into the plant. Such methods include, for example, microinjection or particle bombardment. See, for example, Crossway, et al. (1986) Mol. Gen. Genet. 202:179-185; Nomura, et al. (1986) Plant Sci. 44:53-58; Hepler, et al. (1994) Proc. Natl. Acad. Sci. 91:2176-2180 and Hush, et al. (1994) J. Cell Sci. 107:775-784. Alternatively, the TM polynucleotide can be transiently transformed into the plant using techniques known in the art. Such techniques include viral vector systems and the precipitation of the polynucleotide in a manner that precludes subsequent release of the DNA. Thus, transcription from the particle-bound DNA can occur, but the frequency with which it is released to become integrated into the genome is greatly reduced. Such methods include the use of particles coated with polyethylimine.
[0070] In certain embodiments, the polynucleotide of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating an expression construct of the invention within a viral DNA or RNA molecule . It is recognized that a TM sequence of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein. Further, it is recognized that promoters of the invention also encompass promoters used for transcription by viral RNA polymerases . Methods for introducing polynucleotides into plants and expressing a protein encoded therein, involving viral DNA or RNA molecules, are known in the art. See, for example, US 5,889,191, US 5,889,190, US 5,866,785, US 5,589,367, US 5,316,931 and Porta, et al. (1996) Mol. Biotech. 5:209-221.
[0071] Methods are known in the art for the targeted insertion of a polynucleotide at a specific location in the plant genome . In one embodiment, the insertion of the polynucleotide at a desired genomic location is achieved using a site-specific recombination system. See, for example, WO 99/25821, WO 99/25854, WO 99/25840, WO 99/25855, and WO 99/25853; also, US 6,552,248, US 6,624,297, US 6,573,425, US 6,455,315 and US 6,458,594. Briefly, the polynucleotide of the invention can be contained in a transfer cassette flanked by two nonidentical recombination sites. The transfer cassette is introduced into a plant having stably incorporated into its genome a target site which is flanked by two non-identical recombination sites that correspond to the sites of the transfer cassette. An appropriate recombinase is provided and the transfer cassette is integrated at the target site. Any genome editing nucleases known in the art may be used, including but not limited to Zinc-finger nucleases (ZFNs) , transcription activator-like effector nucleases (TALENs) , and clustered regularly interspaced short palindromic repeat (CRISPR) /Cas-based RNA-guided DNA endonucleases . The polynucleotide of interest is thereby integrated at a specific chromosomal position in the plant genome .
[0072] In certain embodiments, RNA-guided endonucleases are used for modifying the plant genome (e.g., insertion or deletion of a polynucleotide of interest). RNA-guided endonuclease systems include a guide RNA, which interacts with an RNA-guided endonuclease to direct the endonuclease to a specific target site, wherein the 5' end of the guide RNA base pairs with a specific protospacer sequence.
[0073] In some embodiments, the RNA-guided endonuclease is derived from a CRISPR/CRISPR-associated (Cas) system. The CRISPR/Cas system can be a type I, a type II, or a type III system. Non-limiting examples of suitable CRISPR/Cas proteins include Cas3, Cas4, Cas5, Cas5e (or CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9, Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (or CasA), Cse2 (or CasB), Cse3 (or CasE), Cse4 (or CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csz1, Csx15, Csf1, Csf2, Csf3, Csf4, and Cu1966.
[0074] In general, CRISPR/Cas proteins have at least one RNA recognition and/or RNA binding domain. RNA recognition and/or RNA binding domains interact with guide RNAs. CRISPR/Cas proteins can also include nuclease domains (i.e., DNase or RNase domains), DNA binding domains, helicase domains, RNase domains, protein-protein interaction domains, dimerization domains, as well as other domains.
[0075] The CRISPR/Cas-like protein can be a wild-type CRISPR/Cas protein, a modified CRISPR/Cas protein, or a fragment of a wild-type or modified CRISPR/Cas protein. The CRISPR/Cas-like protein can be modified to increase nucleic acid binding affinity and/or specificity, alter an enzyme activity, and/or change another property of the protein. For example, nuclease (i.e., DNase, RNase) domains of the CRISPR/Cas-like protein can be modified, deleted, or inactivated. Alternatively, the CRISPR/Cas-like protein can be truncated to remove domains that are not essential for the function of the fusion protein. The CRISPR/Cas-like protein can also be truncated or modified to optimize the activity of the effector domain of the fusion protein.
[0076] A plant genome may also be modified by using the Cre-lox system (for example, as described in US 5,658,772). A plant genome can be modified to include first and second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite orientation, the intervening sequence is inverted.
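The two Cre-lox outcomes described above can be illustrated with a toy string model. This is only a sketch under stated assumptions: the function name `cre_recombine`, the `[lox]` marker, and the '+'/'-' orientation flags are hypothetical conventions chosen for illustration, and no real lox sequence or recombination kinetics are modeled.

```python
# Toy model of Cre-lox recombination (illustrative only; names are hypothetical).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def cre_recombine(upstream: str, intervening: str, downstream: str,
                  lox1_orientation: str, lox2_orientation: str):
    """Return (product, excised_fragment) for two lox sites flanking `intervening`.

    Orientations are '+' or '-'. Same orientation: the intervening DNA is
    excised and a single lox site remains. Opposite orientation: the
    intervening DNA is inverted and both lox sites remain.
    """
    if lox1_orientation == lox2_orientation:
        return upstream + "[lox]" + downstream, intervening
    inverted = reverse_complement(intervening)
    return upstream + "[lox]" + inverted + "[lox]" + downstream, None


if __name__ == "__main__":
    print(cre_recombine("AAA", "ACGG", "TTT", "+", "+"))  # excision case
    print(cre_recombine("AAA", "ACGG", "TTT", "+", "-"))  # inversion case
```

In this sketch the excised fragment is returned separately so that the difference between excision and inversion is explicit in the return value.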
[0077] Further, when the goal is to decrease or reduce expression or activity of a target polynucleotide, silencing approaches using antisense RNA, short hairpin RNA (shRNA) systems, complementary mature CRISPR RNA (crRNA) by CRISPR/Cas systems, or virus-induced gene silencing (VIGS) systems may be used to down-regulate or knock out expression of the target polynucleotide. Dominant-negative approaches may also be used. The generation of polynucleotides capable of reducing the expression of a TM polypeptide or TM polynucleotide described herein can be carried out using the polynucleotides described herein as templates.
[0078] Whether a TM polynucleotide or polypeptide is introduced into a plant cell for overexpression, reduced expression or knockout, the resulting transformed cells may be grown into plants in accordance with conventional methods. See, for example, McCormick, et al. (1986) Plant Cell Rep. 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting progeny having appropriate expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited, and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved . In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide of the invention, for example, an expression cassette of the invention, stably incorporated into its genome .
[0079] Pedigree breeding starts with the crossing of two genotypes, such as an elite line of interest and one other line having one or more desirable characteristics (e.g., having stably incorporated a polynucleotide of the invention, having a modulated activity and/or level of the polypeptide of the invention, etc.) which complements the elite line of interest. If the two original parents do not provide all the desired characteristics, other sources can be included in the breeding population. In the pedigree method, superior plants are selfed and selected in successive filial generations. In the succeeding filial generations the heterozygous condition gives way to homogeneous lines as a result of self-pollination and selection. Typically in the pedigree method of breeding, five or more successive filial generations of selfing and selection are practiced: F1->F2; F2->F3; F3->F4; F4->F5, etc. After a sufficient amount of inbreeding, successive filial generations will serve to increase seed of the developed inbred. Preferably, the inbred line includes homozygous alleles at about 95% or more of its loci.
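As a rough numerical illustration of how selfing drives homozygosity, the following sketch computes the expected fraction of homozygous loci per filial generation. It assumes a textbook idealization that is not stated in the text: two fully homozygous parents, independent loci, strict selfing, and no selection, so that heterozygosity at the loci where the parents differ halves each generation. The function names are hypothetical.

```python
# Expected homozygosity under repeated selfing (idealized textbook model).
def expected_homozygosity(generation: int) -> float:
    """Expected fraction of homozygous loci in filial generation F<generation>.

    Considers only loci at which the two inbred parents differ: F1 is fully
    heterozygous (0.0), and each round of selfing halves the heterozygosity.
    """
    if generation < 1:
        raise ValueError("generation must be >= 1 (F1, F2, ...)")
    return 1.0 - 0.5 ** (generation - 1)


def first_generation_reaching(threshold: float) -> int:
    """Smallest filial generation whose expected homozygosity meets `threshold`."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    generation = 1
    while expected_homozygosity(generation) < threshold:
        generation += 1
    return generation


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"F{n}: {expected_homozygosity(n):.4f}")
    print("First generation at >= 95%:", first_generation_reaching(0.95))
```

Under these assumptions the expected value is 93.75% at F5 and 96.875% at F6, which is consistent with the "five or more" generations and the "about 95%" figure quoted above; selection and linkage in a real breeding program would change these numbers.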
[0080] Backcrossing can be used to transfer one or more specifically desirable traits from one line, the donor parent, to an inbred called the recurrent parent, which has overall good agronomic characteristics yet lacks that desirable trait or traits. Backcrossing may be used in combination with pedigree breeding to modify an elite line of interest, and a hybrid is made using the modified elite line. However, the same procedure can be used to move the progeny toward the genotype of the recurrent parent but at the same time retain many components of the non-recurrent parent, by stopping the backcrossing at an early stage and proceeding with selfing and selection. For example, an F1, such as a commercial hybrid, is created. This commercial hybrid may be backcrossed to one of its parent lines to create a BC1 or BC2. Progeny are selfed and selected so that the newly developed inbred has many of the attributes of the recurrent parent and yet several of the desired attributes of the non-recurrent parent. This approach leverages the value and strengths of the recurrent parent for use in new hybrids and breeding.
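The trade-off in stopping backcrossing early can be made concrete with the usual expectation for the recurrent-parent share of the genome. The sketch below assumes the standard idealization (no selection, no linkage drag, an F1 that carries half of its genome from each parent); the function name is hypothetical and the values are expectations, not measurements.

```python
# Expected recurrent-parent genome share after n backcrosses (idealized model).
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected genome fraction from the recurrent parent after `backcrosses`.

    backcrosses=0 is the F1 (0.5); each backcross to the recurrent parent
    halves the remaining non-recurrent (donor) share.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be >= 0")
    return 1.0 - 0.5 ** (backcrosses + 1)


if __name__ == "__main__":
    for n in range(0, 5):
        label = "F1" if n == 0 else f"BC{n}"
        print(f"{label}: {recurrent_parent_fraction(n):.4f}")
```

On this model a BC1 retains an expected 25% and a BC2 an expected 12.5% of the non-recurrent parent's genome, which is why stopping at BC1 or BC2 keeps "many components" of that parent.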
[0081] Typically, an intermediate host cell will be used in the practice of this invention to increase the copy number of the cloning vector. With an increased copy number, the vector containing the nucleic acid of interest can be isolated in significant quantities for introduction into the desired plant cells. In one embodiment, plant promoters that do not cause expression of the polypeptide in bacteria are employed.
[0082] Prokaryotes most frequently are represented by various strains of E. coli; however, other microbial strains may also be used. Commonly used prokaryotic control sequences, which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding sequences, include such commonly used promoters as the beta lactamase (penicillinase) and lactose (lac) promoter systems (Chang, et al. (1977) Nature 198:1056), the tryptophan (trp) promoter system (Goeddel, et al. (1980) Nucleic Acids Res. 8: 4057) and the lambda derived PL promoter and N-gene ribosome binding site (Shimatake, et al. (1981) Nature 292:128) . The inclusion of selection markers in DNA vectors transfected in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or chloramphenicol .
[0083] The vector is selected to allow introduction into the appropriate host cell. Bacterial vectors are typically of plasmid or phage origin. Appropriate bacterial cells are infected with phage vector particles or transfected with naked phage vector DNA. If a plasmid vector is used, the bacterial cells are transfected with the plasmid vector DNA. Expression systems for expressing a protein of the present invention are available using Bacillus sp. and Salmonella (Palva, et al. (1983) Gene 22:229-235; Mosbach, et al. (1983) Nature 302:543-545).
[0084] In certain embodiments, the polynucleotides of the present invention can be stacked with any combination of other polynucleotide sequences of interest in order to create a plant with a desired phenotype with respect to one or more traits . The combinations generated may include multiple copies of any one or more of the polynucleotides of interest.
[0085] These stacked combinations can be created by any method including, but not limited to, cross breeding plants by any conventional or TopCross methodology, or genetic transformation. If the traits are stacked by genetically transforming the plants, the polynucleotide sequences of interest can be combined at any time and in any order. For example, a transgenic plant comprising one or more desired traits can be used as the target to introduce further traits by subsequent transformation. The traits can be introduced simultaneously in a co-transformation protocol with the polynucleotides of interest provided by any combination of transformation cassettes. For example, if two sequences will be introduced, the two sequences can be contained in separate transformation cassettes (trans) or contained on the same transformation cassette (cis). Expression of the sequences can be driven by the same promoter or by different promoters. In certain cases, it may be desirable to introduce a transformation cassette that will suppress the expression of a polynucleotide of interest. This may be combined with any combination of other suppression cassettes or overexpression cassettes to generate the desired combination of traits in the plant.
[0086] A method for modulating the level and/or activity of a TM polypeptide in a plant is also provided. In general, level and/or activity is increased or decreased by at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% relative to a native control plant, plant part, or cell which did not have the sequence of the invention introduced. Modulation in the present invention may occur during and/or subsequent to growth of the plant to the desired stage of development .
[0087] A variety of methods can be employed to assay for a modulation in the level and/or activity of a TM polypeptide. For instance, the expression level of the TM polypeptide may be measured directly, for example, by assaying for the level of the TM polypeptide in the plant (i.e., western or northern blot analysis), or indirectly, for example, by measuring expression of downstream proteins in the case of transcription factors. Methods for measuring the TM activity are described elsewhere herein. In specific embodiments, the level and/or activity of a TM polypeptide is modulated in vegetative tissue, in reproductive tissue, or in both vegetative and reproductive tissue. In certain embodiments, plants with altered TM expression and/or activity are screened and selected for having an increase in trichome (and/or root hair) density, clustering, size, mass, length, number, and/or branching.
[0088] Methods are provided to modulate the level and/or activity of a TM polypeptide of the invention in a plant. In one embodiment, the level and/or activity of a TM polypeptide is increased. Such an increase in the level and/or activity of a TM polypeptide of the invention can be achieved by providing to the plant a TM polypeptide, providing a TM polynucleotide, or by modifying a genomic locus encoding the TM polypeptide (e. g. , replacing the promoter with a constitutive promoter or promoter that provides elevated expression of TM polypeptide) . In another embodiment, the level and/or activity of a TM polypeptide is decreased. Such a decrease in the level and/or activity of a TM polypeptide of the invention can be achieved by providing to the plant a dominant-negative or truncated TM polypeptide, providing an antisense TM polynucleotide (e.g. , ribozyme, antisense, or siRNA) , or by modifying a genomic locus encoding the TM polypeptide (e.g. , replacing all or a portion of the coding region or promoter to knock out or down-regulate expression of the TM polypeptide) . Subsequently, a plant having the introduced sequence of the invention is selected using methods known to those of skill in the art such as, but not limited to, Southern blot analysis, DNA sequencing, PCR analysis, or phenotypic analysis. A plant or plant part altered or modified by the foregoing is grown under plant forming conditions for a time sufficient to modulate the concentration and/or activity of the TM polypeptide in the plant . Plant forming conditions are well known in the art and discussed briefly elsewhere herein.
[0089] Many methods are known in the art for providing a polypeptide to a plant including, but not limited to, direct introduction of the polypeptide into the plant, and introducing into the plant (transiently or stably) a polynucleotide construct encoding a polypeptide having TM activity. It is also recognized that the methods of the invention may employ a polynucleotide that is not capable of directing, in the transformed plant, the expression of a protein or an RNA. Thus, the level and/or activity of a TM polypeptide may be increased by altering the gene encoding the TM polypeptide or by altering or affecting its promoter . See, US 5,565,350 and PCT/US93/03868. Therefore, mutagenized plants that carry mutations in TM genes , where the mutations increase expression of the TM gene, are provided .
[0090] Modulation of the level and/or activity of the TM polypeptide will result in increases or decreases in trichome (and/or root hair) density, size, clustering, mass, length, number and/or branching. Therefore, this invention also provides a method for modulating trichomes (and/or root hairs) in a Cannabaceae plant by modulating the level and/or activity of the TM polypeptide in the Cannabaceae plant. As used herein, "increasing trichomes" refers to an increase in trichome density, size, mass, clustering, length, number, and/or branching as compared to a control plant. Likewise, "decreasing trichomes" refers to a decrease in trichome density, size, mass, clustering, length, number, and/or branching as compared to a control plant.
[0091] An increase in trichomes will result in an increase in the pool of one or more secondary metabolites produced by the Cannabaceae plant. Conversely, a decrease in trichomes will result in a decrease in the pool of one or more secondary metabolites produced by the Cannabaceae plant. Therefore, this invention also provides a method for modulating secondary metabolite levels in a plant by modulating the level or expression of a TM polypeptide in the Cannabaceae plant. By "modulating secondary metabolite levels" is intended an increase or decrease in the amount or level of one or more secondary metabolites in the transgenic Cannabaceae plant when compared to a control plant.
[0092] Secondary metabolites that can be increased or decreased in accordance with the present method include, but are not limited to, bitter acids (e.g., alpha acid and beta acid), essential oils, flavonoids, terpenophenolic compounds and/or terpenes. Examples of secondary metabolites within the context of this disclosure include, e.g., 7,8-dihydroionone, Acetanisole, Acetyl Cedrene, Anethole, Anisole, Benzaldehyde, Bergamotene (α-cis-Bergamotene, α-trans-Bergamotene), Bisabolol (β-Bisabolol), Borneol, Bornyl Acetate, Butanoic/Butyric Acid, Cadinene (α-Cadinene, γ-Cadinene), Cafestol, Caffeic acid, Camphene, Camphor, Capsaicin, Carene (Δ-3-Carene), Carotene, Carvacrol, Carvone, Dextro-Carvone, Laevo-Carvone, Caryophyllene (β-Caryophyllene), Caryophyllene oxide, Cedrene (α-Cedrene, β-Cedrene), Cedrene Epoxide (α-Cedrene Epoxide), Cedrol, Cembrene, Chlorogenic Acid, Cinnamaldehyde (α-amyl-Cinnamaldehyde, α-hexyl-Cinnamaldehyde), Cinnamic Acid, Cinnamyl Alcohol, Citronellal, Citronellol, Cryptone, Curcumene (α-Curcumene, γ-Curcumene), Decanal, Dehydrovomifoliol, Diallyl Disulfide, Dihydroactinidiolide, Dimethyl Disulfide, Eicosane/Icosane, Elemene (β-Elemene), Estragole, Ethyl acetate, Ethyl Cinnamate, Ethyl maltol, Eucalyptol/1,8-Cineole, Eudesmol (α-Eudesmol, β-Eudesmol, γ-Eudesmol), Eugenol, Euphol, Farnesene, Farnesol, Fenchol (β-Fenchol), Fenchone, Geraniol, Geranyl acetate, Germacrenes, Germacrene B, Guaia-1(10),11-diene, Guaiacol, Guaiene (α-Guaiene), Gurjunene (α-Gurjunene), Herniarin, Hexanaldehyde, Hexanoic Acid, Humulene (α-Humulene, β-Humulene), Ionol (3-oxo-α-ionol, β-Ionol), Ionone (α-Ionone, β-Ionone), Ipsdienol, Isoamyl acetate, Isoamyl Alcohol, Isoamyl Formate, Isoborneol, Isomyrcenol, Isopulegol, Isovaleric Acid, Isoprene, Kahweol, Lavandulol, Limonene, γ-Linolenic Acid, Linalool, Longifolene, α-Longipinene, Lycopene, Menthol, Methyl butyrate, 3-Mercapto-2-Methylpentanal, Mercaptan/Thiols, Mercaptoacetic Acid, Allyl Mercaptan, Benzyl Mercaptan, Butyl Mercaptan, Ethyl Mercaptan, Methyl Mercaptan, Furfuryl Mercaptan, Ethylene Mercaptan, Propyl Mercaptan, Thenyl Mercaptan, Methyl Salicylate, Methylbutenol, Methyl-2-Methylvalerate, Methyl Thiobutyrate, Myrcene (β-Myrcene), γ-Muurolene, Nepetalactone, Nerol, Nerolidol, Neryl acetate, Nonanaldehyde, Nonanoic Acid, Ocimene, Octanal, Octanoic Acid, p-Cymene, Pentyl butyrate, Phellandrene, Phenylacetaldehyde, Phenylethanethiol, Phenylacetic Acid, Phytol, Pinene (α-Pinene, β-Pinene), Propanethiol, Pristimerin, Pulegone, Quercetin, Retinol, Rutin, Sabinene, Sabinene Hydrate, cis-Sabinene Hydrate, trans-Sabinene Hydrate, Safranal, α-Selinene, α-Sinensal, β-Sinensal, β-Sitosterol, Squalene, Taxadiene, Terpin hydrate, Terpinen-4-ol, α-Terpinene, Thiophenol, Thujone, Thymol, α-Tocopherol, Tonka Undecanone, Undecanal, Valeraldehyde/Pentanal, Verdoxan, α-Ylangene, Umbelliferone, or Vanillin, and cannabinoids such as Δ9-tetrahydrocannabinol (THC), Δ9-tetrahydrocannabinolic acid (THCA), cannabidiol (CBD), cannabinol (CBN), cannabigerol (CBG), cannabichromene (CBC), cannabigerolic acid (CBGA), cannabichromenic acid (CBCA), cannabidiolic acid (CBDA) and the like.
[0093] Transgenic expression of a TM polypeptide can also be used to modify the tolerance of a plant to abiotic (drought, salt, heavy metals, etc.) and/or biotic (pathogen) stress. Accordingly, in one method of the invention, a Cannabaceae plant's tolerance to stress is increased or maintained, when compared to a control plant, by transgenic expression of a TM polypeptide in one or more parts of the plant.
[0094] In accordance with the above-referenced methods, a TM polynucleotide is provided by introducing into the plant an expression cassette harboring a TM polynucleotide and expressing the TM polynucleotide. In one embodiment of these methods, the TM expression construct introduced into the plant is stably incorporated into the genome of the plant.
[0095] The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims .
Example 1: Modulation of TM Polypeptide Expression/Activity
[0096] The amino acid sequences of TM polypeptides from Humulus sp. (HuTM1) and Cannabis sp. (CaTM1 and CaTM2) are provided in Table 1.
[Table 1: image not reproduced]
[0097] As shown in FIG. 1, HuTM1, CaTM1 and CaTM2 share a high degree of similarity with the TCL1 polypeptide from Arabidopsis thaliana (AtCL1; Accession No. AT2G30432.1), a single-repeat R3 MYB transcription factor (Wang, et al. (2008) BMC Plant Biology 8:81). In particular, HuTM1, CaTM1 and CaTM2 possess the single R3 MYB domain, the amino acid signature [D/E]Lx₂[R/K]x₃Lx₆Lx₃R that is required for interacting with R/B-like bHLH transcription factors (Zimmermann, et al. (2004) Plant J. 40:22-34) and the amino acids within the MYB domain that are involved in cell-to-cell movement of CPC (Kurata, et al. (2005) Development 132:5387-98).
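The signature quoted above ([D/E], L, two residues, [R/K], three residues, L, six residues, L, three residues, R) translates directly into a regular expression, which may be a convenient way to screen candidate protein sequences for it. The following is a minimal sketch: the function name `find_bhlh_interaction_motif` is hypothetical, and the demo sequence is a made-up placeholder, not any of the sequences of Table 1.

```python
import re

# [D/E] L x2 [R/K] x3 L x6 L x3 R, where "x" is any amino acid.
BHLH_INTERACTION_MOTIF = re.compile(r"[DE]L.{2}[RK].{3}L.{6}L.{3}R")


def find_bhlh_interaction_motif(protein: str):
    """Return a list of (0-based start, matched text) for each motif hit."""
    return [(m.start(), m.group(0))
            for m in BHLH_INTERACTION_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Placeholder sequence built to contain the signature; not a real TM protein.
    demo = "MMELAAKAAALAAAAAALAAARGG"
    print(find_bhlh_interaction_motif(demo))
```

A match only indicates that the signature is present; it says nothing about whether the surrounding sequence forms a functional R3 MYB domain.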
[0098] As with AtCL1, increasing or decreasing the level or expression of HuTM1 in Humulus, or of CaTM1 or CaTM2 in Cannabis, can modulate trichomes and/or root hairs in the respective plants. Accordingly, in one embodiment, nucleic acids encoding HuTM1, CaTM1 or CaTM2 are introduced into a plant expression vector under the control of a suitable promoter (e.g., a constitutive, or root- or shoot-preferred promoter), the plant expression vector is introduced into Humulus or Cannabis, the HuTM1, CaTM1 or CaTM2 protein is overexpressed, and trichome and/or root hair number, mass, size, clustering, length and/or density is modulated. Likewise, secondary metabolite production is concurrently modulated. Exemplary gene (including 5' and 3' regulatory regions and introns) and cDNA sequences encoding HuTM1 are provided in FIG. 2 and FIG. 3, respectively. Similarly, an exemplary gene sequence encoding CaTM1 is provided in the sequences set forth in Table 2.
[Table 2: image not reproduced]
[0099] Alternatively, in another embodiment, siRNAs are designed to target nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein in the Humulus or Cannabis genome. By way of illustration, exemplary siRNA target sequences in polynucleotides encoding HuTM1 and CaTM1 are provided in Table 3. Using these target sequences, siRNA or shRNA (using, e.g., loop sequence TCAAGAG (SEQ ID NO: 6)) are produced. When expressed in a plant cell, these siRNA/shRNA molecules can be used to reduce the expression of a HuTM1 or CaTM1 polypeptide or polynucleotide.
[Table 3: image not reproduced]
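One common way to turn an siRNA target sequence into an shRNA-encoding DNA insert is to join the sense strand, a loop, and the reverse complement of the sense strand. The sketch below assumes that sense-loop-antisense layout, which the text does not specify; only the loop sequence TCAAGAG comes from the text. The function name and the demo target are hypothetical placeholders, not sequences from Table 3.

```python
# Sketch: build a sense-loop-antisense shRNA insert (layout is an assumption).
LOOP = "TCAAGAG"  # loop sequence named in the text (SEQ ID NO: 6)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_shrna_insert(target: str, loop: str = LOOP) -> str:
    """Return sense + loop + antisense for a DNA target sequence."""
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must be a non-empty A/C/G/T sequence")
    return target + loop + reverse_complement(target)


if __name__ == "__main__":
    # Placeholder 21-nt target for illustration only.
    print(build_shrna_insert("GATTACAGATTACAGATTACA"))
```

Promoter, terminator, and cloning overhangs are deliberately left out; they depend on the expression vector actually used.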
[00100] Similarly, sgRNA are designed to target nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein in the Humulus or Cannabis genome. By way of illustration, exemplary sgRNA target sequences (based on S. pyogenes (NGG PAM) and S. aureus (NNGRR PAM) CRISPR Cas9 enzyme families) in polynucleotides encoding HuTM1 and CaTM1 are provided in Table 4.
[Table 4: image not reproduced]
[00101] The sgRNA is introduced into Humulus or Cannabis along with a cognate CRISPR/Cas endonuclease, and all or a portion of the nucleic acids encoding HuTM1, CaTM1 or CaTM2 protein are deleted, thereby resulting in a decrease in the expression of a HuTM1, CaTM1 or CaTM2 polypeptide or polynucleotide.
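Candidate sgRNA target sites of the kind referred to above can be enumerated by scanning a sequence for a protospacer immediately followed by the relevant PAM (NGG for S. pyogenes Cas9, NNGRR for S. aureus Cas9). The following sketch makes several assumptions not found in the text: a default protospacer length of 20 nt, a forward-strand-only scan, and no scoring for off-targets or efficiency. The function names and test sequences are hypothetical.

```python
import re

# IUPAC codes needed for the two PAMs named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_to_regex(pam: str) -> str:
    """Translate an IUPAC PAM such as 'NGG' or 'NNGRR' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_target_sites(sequence: str, pam: str = "NGG", protospacer_length: int = 20):
    """Return (0-based start, protospacer, PAM) for forward-strand sites.

    A zero-width lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(
        "(?=([ACGT]{" + str(protospacer_length) + "})(" + pam_to_regex(pam) + "))"
    )
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "A" * 20 + "TGG"  # placeholder sequence, not from Table 4
    print(find_target_sites(demo, pam="NGG"))
```

To cover the opposite strand, the same function can be applied to the reverse complement of the input, with coordinates mapped back accordingly.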
[00102] Depending on whether the expression/activity of the TM protein is increased or decreased, trichome and secondary metabolite production is expected to be modulated as summarized in Table 5.
[Table 5: image not reproduced]

Claims

What is claimed is:
1. A recombinant vector comprising:
(i) a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof; or
(ii) a polynucleotide that reduces the expression of a polynucleotide encoding a polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3, or a variant or Cannabaceae ortholog thereof.
2. An isolated host cell comprising the recombinant vector of claim 1.
3. A transgenic plant comprising the recombinant vector of claim 1.
4. The transgenic plant of claim 3, wherein said plant is from the genus Humulus or Cannabis.
5. A transformed seed of the plant of claim 4.
6. A method for modulating trichomes or root hairs in a plant comprising introducing into a plant the recombinant vector of claim 1 and increasing or decreasing the expression of the polypeptide thereby modulating trichomes or root hairs in the plant .
7. A method for modulating the level of one or more secondary metabolites in a plant comprising introducing into a plant the recombinant vector of claim 1 and increasing or decreasing the expression of the polypeptide thereby modulating the level of one or more secondary metabolites in the plant .
8. The method of claim 7, wherein the polynucleotide is under control of a root-preferred or shoot-preferred promoter .
PCT/US2020/022053 2019-03-13 2020-03-11 Compositions and methods for modulating trichomes, root hairs and secondary metabolites in cannabaceae Ceased WO2020185865A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962817794P 2019-03-13 2019-03-13
US62/817,794 2019-03-13

Publications (1)

Publication Number Publication Date
WO2020185865A1 true WO2020185865A1 (en) 2020-09-17



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20771116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20/01/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20771116

Country of ref document: EP

Kind code of ref document: A1