WO2025137000A1 - Terpene genetic markers - Google Patents
- Publication number
- WO2025137000A1 (PCT/US2024/060604)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- genotype
- seq
- chromosome
- content
- analyzing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H6/00—Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
- A01H6/28—Cannabaceae, e.g. cannabis
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/04—Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection
- A01H1/045—Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection using molecular markers
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H5/00—Angiosperms, i.e. flowering plants, characterised by their plant parts; Angiosperms characterised otherwise than by their botanic taxonomy
- A01H5/02—Flowers
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H5/00—Angiosperms, i.e. flowering plants, characterised by their plant parts; Angiosperms characterised otherwise than by their botanic taxonomy
- A01H5/12—Leaves
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/6895—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/13—Plant traits
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- the present disclosure relates to genes associated with terpene production in Cannabis, and methods of producing Cannabis varieties having high terpene content.
- Sequence Listing is submitted as an XML file in the form of the file named “Sequence.xml” (95,829 bytes), created on December 2, 2024, which is incorporated by reference herein.
- the methods include (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) identifying and/or selecting the Cannabis plant having one or more genetic markers that indicate modified terpene content, thereby identifying/selecting Cannabis plants having modified terpene content. Also disclosed are methods of producing one or more Cannabis plants having modified terpene content (e.g., increased terpene content relative to a control).
- the modified terpene content is increased terpene content relative to a control.
- Modified terpene content can include, for example, modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content.
- the one or more genetic markers that are analyzed and/or detected can include, for example, a genetic marker disclosed herein (e.g., one or more genetic markers included in Table 15 or Table 16). In some aspects, 2 to 10 genetic markers disclosed herein are analyzed. In further aspects, at least one genetic marker indicating modified terpene content is detected. In some aspects, at least two genetic markers indicating modified terpene content are detected.
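The analyze, detect, and select steps described above can be sketched in a few lines. This is a minimal illustration only: the marker IDs, beneficial alleles, plant names, and the `min_markers` threshold are hypothetical placeholders, not values from Table 15 or Table 16, and a real workflow would take genotype calls from a sequencing or genotyping assay.

```python
from typing import Dict, List

# Hypothetical marker table: marker ID -> allele associated with modified
# terpene content. Placeholders only, not taken from the disclosure.
BENEFICIAL_ALLELES: Dict[str, str] = {
    "marker_1": "T",
    "marker_2": "G",
    "marker_3": "A",
}


def count_beneficial_markers(genotype: Dict[str, str]) -> int:
    """Count markers at which the plant carries the beneficial allele.

    `genotype` maps marker ID -> diploid call such as "CT" or "TT".
    Markers that were not analyzed are simply absent from the dict.
    """
    return sum(
        1
        for marker, allele in BENEFICIAL_ALLELES.items()
        if allele in genotype.get(marker, "")
    )


def select_plants(genotypes: Dict[str, Dict[str, str]], min_markers: int = 1) -> List[str]:
    """Return IDs of plants with at least `min_markers` beneficial markers."""
    return [
        plant
        for plant, calls in genotypes.items()
        if count_beneficial_markers(calls) >= min_markers
    ]


# Hypothetical genotype calls for three plants.
plants = {
    "plant_A": {"marker_1": "CT", "marker_2": "GG", "marker_3": "CC"},
    "plant_B": {"marker_1": "CC", "marker_2": "AA", "marker_3": "CC"},
    "plant_C": {"marker_1": "CC", "marker_2": "AG"},
}
print(select_plants(plants, min_markers=1))
print(select_plants(plants, min_markers=2))
```

Raising `min_markers` from 1 to 2 mirrors the "at least one" and "at least two" aspects described in the text.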
- products (e.g., a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture)
- methods of Cannabis breeding including crossing a Cannabis plant identified, selected, or produced by a method disclosed herein.
- SEQ ID NOs: 1-99 are sequences encompassing SNP markers. Each sequence includes 50 bp of 5’ and 3’ flanking sequence, with the SNP marker at position 51.
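As a small illustration of this layout, a 101-bp marker sequence can be split into its 5’ flank, SNP base, and 3’ flank. The function name and the example sequence are made up for this sketch; the sequence is not one of SEQ ID NOs: 1-99.

```python
FLANK = 50  # bp of flanking sequence on each side of the SNP


def split_marker_sequence(seq: str):
    """Split a marker sequence into (5' flank, SNP base, 3' flank)."""
    expected = 2 * FLANK + 1
    if len(seq) != expected:
        raise ValueError(f"expected {expected} bp, got {len(seq)}")
    # Position 51 in 1-based coordinates is index 50 in 0-based indexing.
    return seq[:FLANK], seq[FLANK], seq[FLANK + 1:]


# Made-up sequence for illustration only.
example = "A" * 50 + "G" + "C" * 50
five_prime, snp, three_prime = split_marker_sequence(example)
print(len(five_prime), snp, len(three_prime))
```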
- Cannabis has long been used for drug and industrial purposes, fiber (hemp), for seed and seed oils, for medicinal purposes, and for recreational purposes.
- Industrial hemp products are made from Cannabis plants selected to produce an abundance of fiber.
- Some Cannabis varieties have been bred to produce minimal levels of THC, the principal psychoactive constituent responsible for the psychoactivity associated with marijuana.
- Marijuana has historically consisted of the dried flowers of Cannabis plants selectively bred to produce high levels of THC and other psychoactive cannabinoids. As a drug it usually comes in the form of dried flower buds (marijuana), resin (hashish), or various extracts collectively known as hashish oil.
- Cannabis is an annual, dioecious, flowering herb. The leaves are palmately compound or digitate, with serrate leaflets. Cannabis normally has imperfect flowers, with staminate “male” and pistillate “female” flowers occurring on separate plants. It is not unusual, however, for individual plants to separately bear both male and female flowers (i.e., monoecious plants). Although monoecious plants are often referred to as “hermaphrodites,” true hermaphrodites (which are less common in Cannabis) bear staminate and pistillate structures on individual flowers, whereas monoecious plants bear male and female flowers at different locations on the same plant.
- The life cycle of Cannabis varies with each variety but can be generally summarized into germination, vegetative growth, and reproductive stages. Because of heavy breeding and selection by humans, most Cannabis seeds have lost dormancy mechanisms and do not require any pre-treatments or winterization to induce germination. Seeds placed in viable growth conditions are expected to germinate in about 3 to 7 days. The first true leaves of a Cannabis plant contain a single leaflet, with subsequent leaves developing in opposite formation with increasing number of leaflets. Leaflets can be narrow or broad depending on the morphology of the plant grown. Cannabis plants are normally allowed to grow vegetatively for the first 4 to 8 weeks. During this period, the plant responds to increasing light with faster and faster growth. Under ideal conditions, Cannabis plants can grow up to 2.5 inches a day and are capable of reaching heights of up to 20 feet. Indoor growing techniques tend to limit Cannabis size through careful pruning of apical or side shoots.
- the first genome sequence of Cannabis, which is estimated to be 820 Mb in size, was published in 2011 by a team of Canadian scientists (van Bakel et al., “The draft genome and transcriptome of Cannabis sativa” Genome Biology 12:R102).
- Cannabis plants produce a variety of secondary metabolites, including cannabinoids, terpenoids, and other compounds, which are often secreted by glandular trichomes that occur most abundantly on the floral calyxes and bracts of female plants.
- Cannabinoids are the most studied group of secondary metabolites in Cannabis. Most exist in two forms, as acids and in neutral (decarboxylated) forms. The acid form is designated by an “A” at the end of its acronym (e.g., THCA).
- the phytocannabinoids are synthesized in the plant as acid forms, and while some decarboxylation does occur in the plant, it increases significantly post-harvest and the kinetics increase at high temperatures (Sanchez and Verpoorte 2008).
- the biologically active forms for human consumption are the neutral forms. Decarboxylation is usually achieved by thorough drying of the plant material followed by heating it, often by either combustion, vaporization, or heating or baking in an oven.
- Cannabinoids found in Cannabis plants include, but are not limited to, Δ9-Tetrahydrocannabinol (Δ9-THC), Δ8-Tetrahydrocannabinol (Δ8-THC), Cannabichromene (CBC), Cannabicyclol (CBL), Cannabidiol (CBD), Cannabielsoin (CBE), Cannabigerol (CBG), Cannabinidiol (CBND), Cannabinol (CBN), Cannabitriol (CBT), and their propyl homologs, including, but not limited to, cannabidivarin (CBDV), Δ9-Tetrahydrocannabivarin (THCV), cannabichromevarin (CBCV), and cannabigerovarin (CBGV).
- Δ9-Tetrahydrocannabinol Δ9-THC
- Δ8-Tetrahydrocannabinol Δ8-THC
- Cannabichromene
- Non-THC cannabinoids can be collectively referred to as “CBs”, wherein CBs can be one of THCV, CBDV, CBGV, CBCV, CBD, CBC, CBE, CBG, CBN, CBND, and CBT cannabinoids.
- Terpenes are primarily produced in glandular trichomes of female inflorescences (Livingston et al., "Cannabis glandular trichomes alter morphology and metabolite content during flower maturation," The Plant Journal 101.1 (2020): 37-56). Besides affecting aroma and fragrance, terpenes may have a synergic effect with cannabinoids (Sommano et al., “The cannabis terpenes," Molecules 25.24 (2020): 5792), and have been attributed medicinal properties (Maggini et al., "An Optimized Terpene Profile for a New Medical Cannabis Oil,” Pharmaceutics 14.2 (2022): 298).
- Two main groups of terpenes in Cannabis are the monoterpenes and sesquiterpenes, which are produced in the methylerythritol phosphate pathway (MEP) and mevalonic acid pathway (MEV), respectively (Booth et al., “Terpene synthases from Cannabis sativa,” PLoS ONE 12.3 (2017): e0173911).
- Monoterpenes have a ten-carbon isoprenoid precursor, geranyl diphosphate (GPP).
- Sesquiterpenes have a fifteen-carbon isoprenoid precursor, farnesyl diphosphate (FPP).
- GPP and FPP are converted to different monoterpenes and sesquiterpenes, respectively, by terpene synthases (TPS; Booth et al., “Terpenes in Cannabis sativa-From plant genome to humans,” Plant Science 284 (2019): 67-72).
- TPS terpene synthases
- SNP single nucleotide polymorphism
- a plant includes singular or plural plants and can be considered equivalent to the phrase “at least one plant.”
- the term “comprises” means “includes.” It is further to be understood that any and all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for descriptive purposes, unless otherwise indicated. Although many methods and materials similar or equivalent to those described herein can be used, particular suitable methods and materials are described herein. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. To facilitate review of the various aspects, the following explanations of terms are provided:
- Abacus refers to the Cannabis sativa reference genome known as the Abacus reference genome version Csat_AbacusV2 (NCBI assembly accession GCA_025232715.1, incorporated by reference herein), which is also sometimes referred to as CsaAba2.
- alternative nucleotide call is a nucleotide polymorphism relative to a reference nucleotide for a SNP marker that is significantly associated with a desired phenotype (e.g., modified terpene content). Unless otherwise specified, the reference is the Abacus sequence.
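To make the idea of an alternative nucleotide call concrete, the sketch below labels a diploid genotype relative to a reference nucleotide. It assumes a simple two-letter genotype string; the function names and the example alleles are illustrative and do not come from the disclosure.

```python
def alternative_allele_dosage(genotype: str, alt: str) -> int:
    """Count copies (0, 1, or 2) of the alternative nucleotide in a diploid call."""
    if len(genotype) != 2:
        raise ValueError("expected a two-letter diploid genotype such as 'CT'")
    return genotype.upper().count(alt.upper())


def describe_call(genotype: str, ref: str, alt: str) -> str:
    """Label a diploid genotype relative to the reference nucleotide."""
    dosage = alternative_allele_dosage(genotype, alt)
    if dosage == 2:
        return "homozygous alternative"
    if dosage == 1:
        return "heterozygous"
    if genotype.upper() == ref.upper() * 2:
        return "homozygous reference"
    return "other allele(s)"


# Hypothetical SNP with reference nucleotide C and alternative nucleotide T.
for call in ("CC", "CT", "TT"):
    print(call, describe_call(call, ref="C", alt="T"))
```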
- “beneficial” as used herein refers to a genetic element (e.g., gene, allele, or polymorphism) conferring or associated with modified terpene content (e.g., increased terpene content).
- a “beneficial polymorphism” or “beneficial allele” refers to a polymorphism or allele associated with modified terpene content (e.g., increased terpene content).
- hybridizing specifically to refers to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions.
- stringent conditions refers to conditions under which a nucleic acid will hybridize preferentially to a target sequence, and to a lesser extent to, or not at all to, other off-target sequences.
- a “stringent hybridization” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization are sequence dependent, and are different under different environmental parameters.
- line is used broadly to include, but is not limited to, a group of plants vegetatively propagated from a single parent plant, via tissue culture techniques or a group of inbred plants which are genetically very similar due to descent from a common parent(s).
- a plant is said to “belong” to a particular line if it (a) is a primary transformant (T0) plant regenerated from material of that line; (b) has a pedigree comprised of a T0 plant of that line; or (c) is genetically very similar due to common ancestry (e.g., via inbreeding or selfing).
- the term “pedigree” denotes the lineage of a plant, e.g. in terms of the sexual crosses effected such that a gene or a combination of genes, in heterozygous (hemizygous) or homozygous condition, imparts a desired trait to the plant (e.g., modified terpene content).
- a marker probe refers to a probe of any type that is able to distinguish (i.e., genotype) the particular allele that is present at a marker locus.
- a “marker locus” is a locus that can be used to track the presence of a second linked locus, e.g., a linked locus that encodes or contributes to expression of a phenotypic trait.
- a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL, that are genetically or physically linked to the marker locus.
- a “marker allele,” alternatively an “allele of a marker locus,” is one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus.
- markers include restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), microsatellite markers (e.g. SSRs), sequence-characterized amplified region (SCAR) markers, cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers or combinations of the markers described herein which defines a specific genetic and chromosomal location.
- RFLP restriction fragment length polymorphism
- AFLP amplified fragment length polymorphism
- SNPs single nucleotide polymorphisms
- SCAR sequence-characterized amplified region
- CAPS cleaved amplified polymorphic sequence
- modified Cannabis plant or “modified plant” is not a naturally occurring plant.
- offspring or “progeny” refer to a plant resulting from vegetative or sexual reproduction from one or more parent plants.
- an offspring/progeny plant may be obtained by cloning or selfing of a parent plant or by crossing two parent plants.
- An F1 is a first-generation offspring produced from parents at least one of which is used for the first time as donor of a trait, while offspring of second generation (F2) or subsequent generations (F3, F4, etc.) are specimens produced from selfings of F1’s, F2’s etc.
- An F1 may thus be (and usually is) a hybrid resulting from a cross between two true breeding parents (true-breeding is homozygous for a trait), while an F2 may be (and usually is) an offspring resulting from self-pollination.
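The relationship between true-breeding parents, the F1, and a selfed F2 can be illustrated with a single-locus cross. This is a simplified sketch that assumes one locus with two alleles (written "A" and "a" here for illustration) and equally likely gametes; it is not a model of any marker in the disclosure.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Return offspring genotype counts from a single-locus cross.

    Each parent is a two-letter genotype such as "AA" or "Aa"; every gamete
    combination is treated as equally likely (a simplifying assumption).
    """
    offspring = Counter()
    for g1, g2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        offspring["".join(sorted(g1 + g2))] += 1
    return offspring


f1 = cross("AA", "aa")  # two true-breeding (homozygous) parents
f2 = cross("Aa", "Aa")  # selfing an F1 plant
print(dict(f1))
print(dict(f2))
```

Under these assumptions every F1 plant is heterozygous, while the selfed F2 segregates into the three genotypes in a 1:2:1 ratio.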
- Plant tissue refers to any tissue of a plant, including but not limited to, tissue from an embryo, shoot, root, stem, seed, stipule, leaf, trichome, petal, flower bud, flower, ovule, bract, branch, petiole, internode, bark, pubescence, tiller, rhizome, frond, blade, pollen, or stamen.
- a plant tissue or plant organ may be a seed, protoplast, callus, or any other group of plant cells that is organized into a structural or functional unit.
- a plant cell or tissue culture may be capable of regenerating a plant having the physiological and morphological characteristics of the plant from which the cell or tissue was obtained, and of regenerating a plant having substantially the same genotype as the plant.
- Regenerable cells in a plant cell or tissue culture may be embryos, protoplasts, meristematic cells, callus, pollen, leaves, anthers, roots, root tips, silk, flowers, kernels, ears, cobs, husks, or stalks.
- Plant parts include harvestable parts and parts useful for propagation of progeny plants. Plant parts useful for propagation include, for example and without limitation: seed; fruit; a cutting; a seedling; a tuber; and a rootstock.
- a harvestable part of a plant may be any useful part of a plant, including, for example and without limitation: flower; pollen; seedling; tuber; leaf; stem; fruit; seed; and root.
- a plant cell is the structural and physiological unit of the plant.
- a plant cell may be in the form of an isolated single cell, or an aggregate of cells (e.g., a friable callus and a cultured cell), and may be part of a higher organized unit (e.g., a plant tissue, plant organ, and plant).
- a plant cell may be a protoplast, a gamete producing cell, or a cell or collection of cells that can regenerate into a whole plant.
- a seed which comprises multiple plant cells and is capable of regenerating into a whole plant, is considered a “plant cell.” Described herein are plants in the genus of Cannabis and plants derived therefrom, which can be produced by asexual or sexual reproduction.
- polymorphism refers to a difference in the nucleotide or amino acid sequence of a given region as compared to a nucleotide or amino acid sequence in a homologous region of another individual, in particular, a difference in the nucleotide or amino acid sequence of a given region which differs between individuals of the same species.
- a polymorphism is generally defined in relation to a reference sequence. Unless indicated otherwise, the reference sequence is the Cannabis Abacus reference genome (version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1) or CDS produced from the Cannabis Abacus reference genome.
- Polymorphisms include single nucleotide differences, differences in sequence of more than one nucleotide, and single or multiple nucleotide insertions, inversions and deletions; as well as single amino acid differences, differences in sequence of more than one amino acid, and single or multiple amino acid insertions, inversions, and deletions.
- polynucleotide “polynucleotide sequence,” “nucleotide sequence,” “nucleic acid sequence,” and “nucleic acid fragment,” are used interchangeably. These terms encompass polymers composed of nucleotide units (ribonucleotides, deoxyribonucleotides, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof).
- oligonucleotide typically refers to short polynucleotides, generally no greater than 150 nucleotides, for example, no greater than 125 nucleotides, no greater than 100 nucleotides, no greater than 75 nucleotides, no greater than 50 nucleotides, or no greater than 25 nucleotides. It will be understood that when a nucleic acid sequence is represented as a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which “U” replaces “T.” Nucleic acids can be single- or double-stranded.
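The DNA-to-RNA convention and the length guideline above can be expressed directly. This is a minimal sketch; the function names are illustrative, and the 150-nucleotide default simply reflects the "generally no greater than 150 nucleotides" wording rather than a hard rule.

```python
def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence as the corresponding RNA sequence (U replaces T)."""
    return dna.upper().replace("T", "U")


def is_oligonucleotide(seq: str, max_len: int = 150) -> bool:
    """Check a sequence against a length guideline for oligonucleotides."""
    return 0 < len(seq) <= max_len


print(dna_to_rna("ATGCTT"))
print(is_oligonucleotide("ATGCTT"))
```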
- nucleic acids include cDNA, genomic DNA, synthetic DNA, RNA, or mixtures thereof.
- polypeptide or protein refers to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
- amino acid residue or “amino acid” includes reference to an amino acid that is incorporated into a protein, polypeptide, or peptide.
- the amino acid can be a naturally occurring amino acid and, unless otherwise limited, can encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.
- recombinant includes reference to a protein produced using cells that do not have, in their native state, an endogenous copy of the DNA able to express the protein.
- the cells produce the recombinant protein because they have been genetically altered by the introduction of the appropriate isolated nucleic acid sequence.
- the term also includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified.
- the primer may vary in length depending on the particular conditions and requirements of the application.
- the oligonucleotide primer is typically 15-25 or more nucleotides in length.
- the primer must be of sufficient complementarity to the desired template to prime the synthesis of the desired extension product, that is, to be able anneal with the desired template strand in a manner sufficient to provide the 3’ hydroxyl moiety of the primer in appropriate juxtaposition for use in the initiation of synthesis by a polymerase or similar enzyme. It is not required that the primer sequence represent an exact complement of the desired template.
- a non-complementary nucleotide sequence may be attached to the 5’ end of an otherwise complementary primer.
- non-complementary bases may be interspersed within the oligonucleotide primer sequence, provided that the primer sequence has sufficient complementarity with the sequence of the desired template strand to functionally provide a template-primer complex for the synthesis of the extension product.
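One simple way to reason about "sufficient complementarity" is to compute the fraction of primer bases that pair with the intended template site. The sketch below assumes both sequences are written 5’ to 3’ and have equal length, and it ignores thermodynamics, mismatch position, and any 5’ tail; the function names and the example template are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def complementarity(primer: str, template_site: str) -> float:
    """Fraction of primer bases that pair with the template site.

    Both sequences are written 5'->3'. `template_site` is the stretch of the
    template strand the primer is meant to anneal to (same length as primer).
    """
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have equal length")
    ideal = reverse_complement(template_site)  # perfectly matched primer
    matches = sum(p == q for p, q in zip(primer.upper(), ideal))
    return matches / len(primer)


# Made-up 20-nt template site for illustration.
site = "ATGCCGTAAGCTTGACCTGA"
perfect = reverse_complement(site)
# Introduce one non-complementary base at the 5' end of the primer.
mismatched = ("G" if perfect[0] != "G" else "A") + perfect[1:]
print(complementarity(perfect, site))
print(complementarity(mismatched, site))
```

A primer with one interspersed mismatch still pairs at 19 of 20 positions here, which matches the point that an exact complement is not required.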
- product as used in reference to a Cannabis product, is a composition including Cannabis (or an extract thereof).
- Products include, but are not limited to: a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, tincture, or other compositions including Cannabis (e.g., a Cannabis plant disclosed herein, or an extract thereof).
- promoter refers to a nucleic acid control sequence that directs transcription of a nucleic acid.
- a promoter includes necessary nucleic acid sequences near the start site of transcription, and may include distal enhancer or repressor elements.
- a “constitutive promoter” is a promoter that is continuously active and is not subject to regulation by external signals or molecules. In contrast, the activity of an “inducible promoter” is regulated by an external signal or molecule (for example, a transcription factor).
- Exemplary promoters include pol III promoters (e.g., U6), pol II promoter, ubiquitin promoter, Cauliflower Mosaic Virus (CaMV) 35S promoter, or RUBISCO promoter.
- the terms “initiate transcription,” “initiate expression,” “drive transcription,” and “drive expression” are used interchangeably herein and all refer to the primary function of a promoter.
- purified as used herein relates to the isolation of a molecule or compound in a form that is substantially free of contaminants normally associated with the molecule or compound in a native or natural environment, or substantially enriched in concentration relative to other compounds present when the compound is first formed, and means having been increased in purity as a result of being separated from other components of the original composition.
- purified nucleic acid is used herein to describe a nucleic acid sequence which has been separated, produced apart from, or purified away from other biological compounds including, but not limited to polypeptides, lipids and carbohydrates, while effecting a chemical or functional change in the component (e.g., a nucleic acid may be purified from a chromosome by removing protein contaminants and breaking chemical bonds connecting the nucleic acid to the remaining DNA in the chromosome).
- recombinant refers to a nucleic acid or protein that has a sequence made by an artificial combination of two otherwise separated segments of sequence (e.g., a “chimeric” sequence). This artificial combination can be accomplished by chemical synthesis or by manipulation of isolated segments of nucleic acids, for example, by standard molecular biology techniques (e.g., cloning).
- a “recombinant expression construct” refers to an expression vector into which a nucleic acid sequence or fragment can be moved. Preferably, it is a plasmid vector, or a fragment thereof, comprising a promoter. The choice of plasmid vector is dependent upon the method that will be used to transform host plants.
- the genetic elements that must be present on the plasmid vector to successfully transform, select and propagate host cells containing the chimeric gene depend on the specific transformation method. Different independent transformation events typically result in different levels and patterns of expression and thus multiple events must be screened to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by PCR and Southern analysis of DNA, RT-PCR and Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
- reference plant or “reference genome” refers to a reference sequence that genetic markers or sequences of a test sample can be compared to in order to detect a modification of the sequence in the test sample.
- the reference plant or genome is Abacus (Csat_AbacusV2, NCBI assembly accession GCA_025232715.1).
- sequence identity or “percent identity” are used interchangeably to refer to a sequence comparison based on identical matches between correspondingly identical positions in two or more amino acid or nucleotide sequences that are being compared.
- the percent identity refers to the extent to which two optimally aligned polynucleotide or peptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids.
- Hybridization experiments and mathematical algorithms known in the art may be used to determine percent identity.
- Many mathematical algorithms exist as sequence alignment computer programs known in the art that calculate percent identity. These programs may be categorized as either global sequence alignment programs or local sequence alignment programs.
- the NCBI Basic Local Alignment Search Tool (BLAST) is often used and is available from several sources, including the National Center for Biotechnology Information (blast.ncbi.nlm.nih.gov/Blast.cgi).
- BLAST Basic Local Alignment Search Tool
- Various types of BLAST are available, for example, blastp, blastn, blastx, tblastn and tblastx.
- a description of how to determine sequence identity using this program is available on the NCBI website and other resources.
- percent sequence identity is determined by using BLAST with default parameters.
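For intuition only, percent identity over a window of alignment can be computed by counting identical columns in two sequences that have already been aligned. This sketch is not BLAST and does not perform alignment; the function name and the choice to count gap columns as mismatches are assumptions (other conventions exist).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.

    Inputs must already be aligned to equal length, with '-' marking gaps.
    Gap columns are counted as mismatches (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


# Made-up aligned sequences: 9 of 10 columns are identical.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```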
- nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of nucleic acid fragments, such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment.
- a “substantially homologous sequence” refers to variants of the disclosed sequences such as those that result from site-directed mutagenesis, as well as synthetically derived sequences.
- a substantially homologous sequence also refers to fragments of a particular promoter nucleotide sequence disclosed herein that operate to promote the constitutive expression of an operably linked heterologous nucleic acid fragment.
- These promoter fragments will include at least about 20 contiguous nucleotides, for example, at least 50 contiguous nucleotides, at least 75 contiguous nucleotides, or at least 100 contiguous nucleotides of the particular promoter nucleotide sequence disclosed herein.
- the nucleotides of such fragments will usually comprise the TATA recognition sequence of the particular promoter sequence.
- Such fragments may be obtained by use of restriction enzymes to cleave the naturally occurring promoter nucleotide sequences disclosed herein; by synthesizing a nucleotide sequence from the naturally occurring promoter DNA sequence; or may be obtained through the use of PCR technology. Functional variants of these promoter fragments, such as those resulting from site-directed mutagenesis, are encompassed by the present disclosure.
- single nucleotide polymorphism refers to a change in which a single base in the DNA differs from the base at the corresponding position of a reference genome or sequence.
- target region refers to a nucleotide sequence that resides at a specific chromosomal location.
- the "target region” or “nucleic acid target” can be specifically recognized by a probe.
- terpene refers to a class of secondary metabolite typically found in plants. Terpenes are hydrocarbons with small isoprene units linked to one another to form chains. Two types of terpenes/terpenoids commonly found in Cannabis include monoterpenes (10C; two isoprenes) and sesquiterpenes (15C; three isoprenes).
- Exemplary monoterpenes include limonene (e.g., L-limonene or D-limonene), myrcene (also referred to as β-myrcene), pinene (e.g., α-pinene or β-pinene), camphene, linalool, terpinolene, terpinene (e.g., α-terpinene or γ-terpinene), and ocimene (also referred to as β-ocimene).
- limonene e.g., L-limonene or D-limonene
- myrcene also referred to as β-myrcene
- pinene e.g., α-pinene or β-pinene
- camphene linalool
- terpinolene terpinene
- terpinene e.g., α-terpinene or γ-terpinene
- Exemplary sesquiterpenes include nerolidol (e.g., cis-nerolidol and/or trans-nerolidol), humulene (also referred to as α-humulene), guaiol, and caryophyllene (also referred to as β-caryophyllene).
- Terpene biosynthesis starts with common isoprenoid diphosphate precursors (5 carbon) through two biosynthetic pathways, the plastidial methylerythritol phosphate (MEP) pathway and the cytosolic mevalonate (MEV) pathway. Both the MEP and MEV pathways provide isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), which are condensed into longer-chain isoprenoid diphosphates that include geranyl diphosphate (GPP) and farnesyl diphosphate (FPP).
- Linear isoprenoid diphosphates are substrates for monoterpene synthases (mono-TPS) and sesquiterpene synthases (sesqui-TPS), respectively, which diversify these compounds through enzymatic modifications, such as hydroxylation, dehydrogenation, acylation, and glycosylation, resulting in the production of diverse mono- and sesquiterpenes.
- GPP is also a building block of cannabinoid biosynthesis.
- transformant refers to a cell, tissue or organism that has undergone transformation.
- the original transformant is designated as “T0.”
- Selfing the T0 produces a first transformed generation designated as “T1.”
- transgenic refers to any cell, cell line, callus, tissue, plant part or plant, the genome of which has been altered by the presence of a heterologous nucleic acid, such as a recombinant DNA construct, including those initial transgenic events as well as those created by sexual crosses or asexual propagation from the initial transgenic event.
- the term “transgenic” as used herein does not encompass the alteration of the genome (chromosomal or extra-chromosomal) by conventional plant breeding methods or by naturally occurring events such as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation.
- a “transgene” is a gene that has been introduced into the genome by a transformation procedure.
- “variety” means a plant grouping within a single botanical taxon of the lowest known rank, which grouping, irrespective of whether the conditions for the grant of a breeder’s right are fully met, can be i) defined by the expression of the characteristics resulting from a given genotype or combination of genotypes, ii) distinguished from any other plant grouping by the expression of at least one of the said characteristics and iii) considered as a unit with regard to its suitability for being propagated unchanged.
- vector refers to a nucleic acid molecule that can be introduced into a host cell (for example, by transformation), thereby producing a transformed host cell.
- a vector can include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication.
- Recombinant DNA vectors are vectors containing recombinant DNA.
- a vector can also include one or more selectable marker genes and other genetic elements. Often vectors are DNA plasmids, however, they can also be viral vectors (DNA or RNA), cosmids, or artificial chromosomes.
- CBDVA cannabidivarinic acid
- Disclosed are methods of producing one or more Cannabis plants having modified terpene content comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm; (ii) detecting one or more markers that indicate modified terpene content; (iii) crossing the Cannabis plant comprising the one or more markers; and (iv) obtaining one or more progeny plants comprising the one or more markers; wherein the one or more progeny plants have modified terpene content relative to a control, thereby producing one or more Cannabis plants having modified terpene content.
- Also disclosed are methods of identifying or selecting a Cannabis plant having modified terpene content comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) identifying or selecting the Cannabis plant, thereby identifying or selecting the Cannabis plant having modified terpene content.
- the Cannabis plant having modified terpene content is selected for further analysis, propagation, crossing, or to make a product.
- the method further includes crossing the Cannabis plant having modified terpene content and producing one or more progeny plants having modified terpene content.
- Terpene content can be measured using standard analytical techniques, e.g., gas chromatography and/or HPLC with mass-spectrometry. Modified terpene content can be determined, for example, as a difference in terpene content relative to a control/reference, e.g., a Cannabis plant not having the one or more markers that indicate modified terpene content.
- the terpene content is modified in flowers or inflorescence, for example, in female flowers or female inflorescence.
- the terpene content is modified in trichomes (e.g., glandular trichomes).
- the terpene content is modified in leaf or other vegetative tissue.
- the modified terpene content is an increase in terpene content relative to a suitable control (e.g., a sample from a Cannabis plant not having the one or more markers that indicate modified terpene content).
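The comparison against a control described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example values are hypothetical, and the sketch assumes that both measurements are expressed in the same unit (here, % by weight) and come from the same tissue type.

```python
def terpene_difference(sample_pct: float, control_pct: float) -> float:
    """Difference in terpene content (sample minus control), in percentage points."""
    return sample_pct - control_pct


def is_increased(sample_pct: float, control_pct: float) -> bool:
    """True when the sample has more terpene content than the control."""
    return terpene_difference(sample_pct, control_pct) > 0


# Hypothetical measurements (% by weight) for a marker-carrying plant and a control.
marker_plant = 1.5
control_plant = 1.0
print(is_increased(marker_plant, control_plant))
```

In practice a statistical test over replicate measurements would likely be preferable to a single pairwise difference; the sketch only shows the direction of the comparison.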
- the Cannabis plant is Cannabis sativa, Cannabis indica, or Cannabis ruderalis.
- the Cannabis plant is Cannabis sativa.
- Modified terpene content as used herein can include, for example, modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol levels.
- modified terpene content includes or consists of modified total terpenes; modified total monoterpenes; modified total monoterpenes absent beta-myrcene; modified total sesquiterpenes; modified alpha-pinene; modified beta-pinene; modified alpha-terpinene, gamma-terpinene, and terpinolene; modified beta-myrcene to total monoterpene ratio; modified beta-ocimene; modified camphene and D-limonene; modified linalool and trans-nerolidol; modified alpha-humulene and beta-caryophyllene; and/or modified guaiol.
- the modified terpene content is modified total terpene content. In some aspects, the modified terpene content is modified total monoterpene content. In some aspects, the modified terpene content is modified total monoterpenes content absent beta-myrcene. In some aspects, the modified terpene content is modified total sesquiterpene content. In some aspects, the modified terpene content is modified alpha-pinene content. In some aspects, the modified terpene content is modified beta-pinene content. In some aspects, the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content.
- the modified terpene content is modified beta-myrcene to monoterpene content ratio ((beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1)).
- the modified terpene content is modified beta-ocimene content.
- the modified terpene content is modified camphene and D-limonene content.
- the modified terpene content is modified linalool and trans-nerolidol content.
- the modified terpene content is modified alpha-humulene and beta-caryophyllene content.
- the modified terpene content is modified guaiol content.
- a plant produced or selected by a method disclosed herein includes a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 0.1% by weight, for example, at least 0.2%, at least 0.3%, at least 0.4%, at least 0.5%, at least 0.6%, at least 0.7%, at least 0.8%, at least 0.9%, at least 1.0%, at least 1.2%, at least 1.4%, at least 1.5%, at least 1.75%, at least 2.0%, at least 2.5%, at least
- the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 0.2% by weight.
- the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 1.0% by weight.
- the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 3% by weight.
- the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 5% by weight.
- a plant produced or selected by a method disclosed herein includes a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of 0.1% to 10% by weight, for example, 0.1% to 9%, 0.1% to 8%, 0.1% to 7%, 0.1% to 6%, 0.1% to 5%, 0.1% to 4%, 0.1% to 3%, 0.1% to 2%, 0.1% to 1%, 0.2% to 10%, 0.2% to 9%, 0.2% to 8%, 0.2% to 7%, 0.
- the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 0.5% to 7% by weight.
- the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 0.5% to 3% by weight.
- the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 1% to 7% by weight.
- a plant produced or selected by the methods disclosed herein includes a total terpene, total monoterpene, and/or total sesquiterpene content of at least 0.1% by weight, for example, at least 0.2%, at least 0.5%, at least 0.75%, at least 1.0%, at least 1.5%, at least 2.0%, at least 2.5%, at least 3%, at least 3.5%, at least 4%, at least 4.5%, at least 5%, at least 5.5%, at least 6%, at least 6.5%, at least 7%, at least 7.5%, at least 8%, at least 8.5%, at least 9%, at least 9.5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 25%, or more, by weight in at least one plant part (e.g., leaves, flowers, or trichomes).
- the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 0.2% by weight. In another non-limiting example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 1% by weight. In a further example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 3% by weight. In another example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 5% by weight.
- a plant produced or selected by a method disclosed herein includes a total terpene, total monoterpene, and/or total sesquiterpene of 0.1% to 10% by weight, for example, 0.1% to 9%, 0.1% to 8%, 0.1% to 7%, 0.1% to 6%, 0.1% to 5%, 0.1% to 4%, 0.1% to 3%, 0.1% to 2%, 0.1% to 1%, 0.2% to 10%, 0.2% to 9%, 0.2% to 8%, 0.2% to 7%, 0.2% to 6%, 0.2% to 5%, 0.2% to 4%, 0.2% to 3%, 0.2% to 2%, 0.2% to 1%, 0.5% to 10%, 0.5% to 9%, 0.5% to 8%, 0.5% to 7%, 0.5% to 6%, 0.5% to 5%, 0.5% to 4%, 0.5% to 3%, 0.5% to 2%, 0.5% to 1%, 1% to 10%, 1% to 9%, 1% to 8%, 1% to 7%, 1% to 1% to
- the total terpene, total monoterpene, and/or total sesquiterpene is 0.5% to 7%. In some aspects, the total terpene, total monoterpene, and/or total sesquiterpene is 0.5% to 5%. In some aspects, the total terpene, total monoterpene, and/or total sesquiterpene is 1% to 7%.
- a measure of % by weight in a method disclosed herein can be % by dry weight or % by fresh weight. In several aspects, the % by weight is % by dry weight.
- the modified beta-myrcene to monoterpene content ratio ((beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1)) is at least 0.75, for example, at least 0.8, at least 0.9, at least 1.0, at least 1.1, at least 1.2, at least 1.3, at least 1.4, at least 1.5, at least 1.6, at least 1.7, at least 1.8, at least 1.9, at least 2.0, or more.
- the modified beta-myrcene to monoterpene content ratio is 0.75 to 5, for example, 0.75 to 4, 0.75 to 3, 0.75 to 2, 0.75 to 1.5, 0.75 to 1.4, 0.75 to 1.3, 0.75 to 1.2, 0.75 to 1.1, 0.75 to 1.0, 0.8 to 4, 0.8 to 3, 0.8 to 2, 0.8 to 1.5, 0.8 to 1.4, 0.8 to 1.3, 0.8 to 1.2, 0.8 to 1.1, 0.8 to 1.0, 1 to 4, 1 to 3, 1 to 2, 1 to 1.5, 1 to 1.4, 1 to 1.3, 1 to 1.2, 1 to 1.1, 1.2 to 4, 1.2 to 3, 1.2 to 2, 1.2 to 1.5, 1.2 to 1.4, or 1.2 to 1.3. In some aspects, the modified beta-myrcene to monoterpene content ratio is about 0.75 to 1.3. In some aspects, the modified beta-myrcene to monoterpene content ratio is about 1.0 to about 1.4.
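As a worked illustration of the ratio defined above, the following Python sketch evaluates (beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1) and compares it with the 0.75 lower bound mentioned in the text. The function names and the sample values are hypothetical, and the sketch assumes both inputs are non-negative and in the same unit; the source does not state which unit is used when computing the ratio.

```python
def myrcene_to_monoterpene_ratio(beta_myrcene: float, total_monoterpenes: float) -> float:
    """Return (beta-myrcene + 1) / ((total monoterpenes - beta-myrcene) + 1)."""
    if beta_myrcene < 0 or total_monoterpenes < beta_myrcene:
        # Assumption: beta-myrcene is part of total monoterpenes, so it cannot exceed the total.
        raise ValueError("expected 0 <= beta_myrcene <= total_monoterpenes")
    other_monoterpenes = total_monoterpenes - beta_myrcene
    return (beta_myrcene + 1.0) / (other_monoterpenes + 1.0)


def meets_minimum_ratio(ratio: float, minimum: float = 0.75) -> bool:
    """Check a ratio against a lower bound such as the 'at least 0.75' example."""
    return ratio >= minimum


# Hypothetical values: 0.8 beta-myrcene out of 1.5 total monoterpenes.
ratio = myrcene_to_monoterpene_ratio(0.8, 1.5)
print(round(ratio, 3), meets_minimum_ratio(ratio))
```

Adding 1 to both the numerator and the denominator keeps the ratio defined when the non-myrcene monoterpene content is zero; a plant with no monoterpenes at all yields a ratio of exactly 1.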
- the plant part can be any part of the plant selected or produced by the methods disclosed herein.
- the plant part is a flower (e.g., a female flower) or inflorescence tissue.
- the plant part is a trichome (e.g., glandular trichomes).
- the plant part is a leaf or other vegetative tissue.
- the one or more genetic markers that indicate modified terpene content include or consist of one or more genetic markers disclosed herein, for example, one or more genetic markers described in Table 15.
- the genetic marker is a polymorphism (e.g., SNP) found within one or more of the following haplotypes:
- the methods disclosed herein detect a haplotype associated with modified terpene content, or a haplotype that contains a terpene trait locus.
- the genetic marker is genetically linked to a terpene trait locus.
- analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least one SNP disclosed herein, for example, in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively.
- analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least two SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively.
- analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least three SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively. In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least five SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively.
- analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting all of the SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively.
- Combinations of SNP markers disclosed herein can be useful, for example, for screening increased levels of multiple specific terpenes of interest. While any combination of SNPs disclosed herein could be useful, an exemplary subset of SNPs is provided in Table 16.
- analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least one SNP from the list of SNPs disclosed in Table 16.
- analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least two SNPs from the list of SNPs disclosed in Table 16.
- analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least three SNPs from the list of SNPs disclosed in Table 16.
- analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least four SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least five SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least six SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least seven SNPs from the list of SNPs disclosed in Table 16.
- analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least eight SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least nine SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting all the SNPs from the list of SNPs disclosed in Table 16.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing one or more of nucleotide positions: 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; or 13,920,896 on chromosome 1;
- the one or more nucleic acid polymorphisms are beneficial polymorphisms associated with increased terpene content in Cannabis.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 2 genetic markers, for example, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more markers.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 3 genetic markers.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 5 genetic markers.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 7 genetic markers.
- analyzing one or more genetic markers in the nucleic acid sample includes analyzing 2 to 50 genetic markers, for example, 2 to 40, 2 to 30, 2 to 20, 2 to 10, 5 to 50, 5 to 40, 5 to 30, 5 to 20, 5 to 10, 10 to 50, 10 to 40, 10 to 30, or 10 to 20 genetic markers.
- the one or more genetic markers are genetically linked to a terpene trait locus.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome X; and/or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total terpene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total monoterpene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 3,081,773 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total monoterpenes absent beta-myrcene.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5, or 1,695,817; 1,727,397; 1,960,918; 5,175,087; 14,069,586; 14,329,191; and/or 14,866,064 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total sesquiterpene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 1,366,137 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified beta-pinene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,698,301 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 1,929,134 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified monoterpene to beta-myrcene content ratio.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,288,919 and/or 2,774,108 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified camphene and D-limonene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 10,633,191 on chromosome 1, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified linalool and trans-nerolidol content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 5,175,087 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified alpha-humulene and beta-caryophyllene content.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,220,207; 1,288,012; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 6,311,954; and/or 6,589,961 on chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 6,311,954 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified guaiol content.
- detecting one or more genetic markers that indicate modified terpene content includes detecting one or more of the following SNPs:
- Chromosome 2 (k) a A/A or T/A genotype at position 93291929;
- analyzing one or more genetic markers comprises analyzing one or more of nucleotide positions: (a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; and/or (k) 6,311,954 on chromosome 6.
- analyzing one or more genetic markers comprises analyzing all of nucleotide positions: (a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; and/or (k) 6,311,954 on chromosome 6.
- detecting one or more genetic markers that indicate modified terpene content includes detecting one or more of the following SNPs: (a) a T/T or C/T genotype at position 3,081,773 on chromosome 5; (b) a C/C or C/A genotype at position 10,633,191 on chromosome 1; (c) a A/A or G/A genotype at position 1,366,137 on chromosome 5; (d) a T/T or T/C genotype at position 1,929,134 on chromosome 5; (e) a C/C or T/C genotype at position 2,038,965 on chromosome 5; (f) a T/T or A/T genotype at position 2,288,919 on chromosome 5; (g) a A/A or A/T genotype at position 2,534,579 on chromosome 5; (h) a G/G or A/G genotype at position 2,698,301 on chromosome 5; (i
- detecting one or more genetic markers (e.g., polymorphisms) that indicate modified terpene content in the nucleic acid sample includes detecting at least 2 genetic markers, for example, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more markers.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 3 genetic markers.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 5 genetic markers.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 7 genetic markers.
- the one or more genetic markers that indicate modified terpene content are beneficial markers that are associated with increased terpene content in Cannabis.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting 2 to 50 genetic markers, for example, 2 to 40, 2 to 30, 2 to 20, 2 to 10, 5 to 50, 5 to 40, 5 to 30, 5 to 20, 5 to 10, 10 to 50, 10 to 40, 10 to 30, or 10 to 20 genetic markers.
- 2 to 10 genetic markers that indicate modified terpene content are detected.
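As a minimal sketch of how the genotype criteria above could be applied in software, the following Python snippet counts how many of the listed genotypes a sample carries and compares that count with a minimum number of markers. The table copies items (a) through (h) as they appear above. The function names, the sample dictionary, and the unphased treatment of genotypes (C/T is treated the same as T/C) are illustrative assumptions and are not part of the disclosure.

```python
def _gt(call: str) -> tuple:
    """Normalize a genotype call such as 'T/C' to an order-independent tuple."""
    return tuple(sorted(call.upper().split("/")))


# Genotypes for items (a)-(h) above, keyed by (chromosome, position).
GENOTYPES_OF_INTEREST = {
    ("5", 3_081_773): {_gt("T/T"), _gt("C/T")},
    ("1", 10_633_191): {_gt("C/C"), _gt("C/A")},
    ("5", 1_366_137): {_gt("A/A"), _gt("G/A")},
    ("5", 1_929_134): {_gt("T/T"), _gt("T/C")},
    ("5", 2_038_965): {_gt("C/C"), _gt("T/C")},
    ("5", 2_288_919): {_gt("T/T"), _gt("A/T")},
    ("5", 2_534_579): {_gt("A/A"), _gt("A/T")},
    ("5", 2_698_301): {_gt("G/G"), _gt("A/G")},
}


def detected_markers(sample: dict) -> list:
    """Return the loci at which the sample carries one of the listed genotypes."""
    hits = []
    for locus, wanted in GENOTYPES_OF_INTEREST.items():
        call = sample.get(locus)  # loci that were not genotyped are skipped
        if call is not None and _gt(call) in wanted:
            hits.append(locus)
    return hits


def meets_threshold(sample: dict, minimum: int = 3) -> bool:
    """True if at least `minimum` markers are detected (e.g., 3, 5, or 7)."""
    return len(detected_markers(sample)) >= minimum


# Hypothetical genotype calls for one plant (made-up data).
example = {
    ("5", 3_081_773): "T/C",
    ("1", 10_633_191): "A/A",
    ("5", 2_698_301): "G/G",
}
print(detected_markers(example), meets_threshold(example, minimum=2))
```

In this hypothetical sample, two of the three genotyped loci match the listed genotypes, so a threshold of two markers is met while a threshold of three is not.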
- the one or more genetic markers e.g., SNPs
- the one or more genetic markers that indicate modified terpene content are genetically linked to a terpene trait locus.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome x; and/or 58,545,628 on chromosome x, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total terpene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total monoterpene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 3,081,773 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total monoterpenes absent beta-myrcene.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5, or 1,695,817; 1,727,397; 1,960,918; 5,175,087; 14,069,586; 14,329,191; and/or 14,866,064 of chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified total sesquiterpene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 516,340; 518,238; 523,626; 1,109,162; 1,366,137; 2,346,000; 2,534,579; 3,247,341; 3,503,143; and/or 3,629,225 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,534,579 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified alpha-pinene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 516,340; 518,238; 523,626; 608,718; 1,109,162; 1,366,137; 1,386,965; 2,003,303; 3,247,341; and/or 3,704,632 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 1,366,137 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified beta-pinene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,698,301 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 330,918; 1,353,878; 1,745,101; 1,828,050; 1,929,134; 2,072,869; 2,339,956; 3,081,773; 3,564,387; and/or 3,585,965 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 1,929,134 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified beta-myrcene to total monoterpene content ratio.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,038,965 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified beta-ocimene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,288,919 and/or 2,774,108 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified camphene and D-limonene content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 10,633,191 on chromosome 1, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified linalool and trans-nerolidol content.
- detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 5,175,087 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified alpha-humulene and beta-caryophyllene content.
- detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 6,311,954 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- the modified terpene content is modified guaiol content.
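The single-position statements above can be collected into a small lookup table. The sketch below is illustrative only: it assumes that each "the modified terpene content is ..." statement refers to the position statement immediately preceding it, which this text does not state explicitly, and the names `LEAD_MARKERS` and `traits_for` are hypothetical.

```python
# (chromosome, position, trait), coordinates per Csat_AbacusV2 as listed above.
# ASSUMPTION: each trait is paired with the position statement just before it.
LEAD_MARKERS = [
    ("5", 3_081_773, "total monoterpenes absent beta-myrcene"),
    ("5", 2_534_579, "alpha-pinene"),
    ("5", 1_366_137, "beta-pinene"),
    ("5", 2_698_301, "alpha-terpinene, gamma-terpinene, and terpinolene"),
    ("5", 1_929_134, "beta-myrcene to total monoterpene ratio"),
    ("5", 2_038_965, "beta-ocimene"),
    ("5", 2_288_919, "camphene and D-limonene"),
    ("5", 2_774_108, "camphene and D-limonene"),
    ("1", 10_633_191, "linalool and trans-nerolidol"),
    ("6", 5_175_087, "alpha-humulene and beta-caryophyllene"),
    ("6", 6_311_954, "guaiol"),
]


def traits_for(chromosome: str, position: int) -> list:
    """Return the trait labels paired with a given marker position."""
    return [
        trait
        for chrom, pos, trait in LEAD_MARKERS
        if chrom == chromosome and pos == position
    ]


print(traits_for("6", 6_311_954))
```

A list of tuples is used rather than a dictionary so that a position could carry more than one trait label if the table were extended with the longer position lists.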
- the genetic markers described herein can also be identified based on corresponding SEQ ID NOs disclosed herein, rather than a particular chromosomal location relative to the Abacus Cannabis reference genome. Corresponding SEQ ID NOs are provided in Tables 1-13 and 15. Thus, in some aspects, the one or more genetic markers comprise a polymorphism at position 51 of one or more of SEQ ID NOs: 1-99. In some aspects, detecting one or more markers that indicate modified terpene content includes detecting one or more of the following: Chromosome 1:
- Chromosome 9 (cs) a C/C or G/C genotype at position 51 of SEQ ID NO: 97;
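Because the markers are described as a polymorphism at position 51 of a SEQ ID NO, a genotype can in principle be read directly from sequence. The following is a minimal sketch assuming 1-based positions and two haplotype sequences per plant. The sequences shown are made-up placeholders, not any SEQ ID NO from the disclosure, and the helper names are hypothetical.

```python
def allele_at(seq: str, position: int = 51) -> str:
    """Return the base at a 1-based position (position 51 is index 50)."""
    if not 1 <= position <= len(seq):
        raise ValueError("position lies outside the sequence")
    return seq[position - 1].upper()


def genotype_at(hap_a: str, hap_b: str, position: int = 51) -> str:
    """Combine the alleles of two haplotypes into an unphased genotype."""
    return "/".join(sorted([allele_at(hap_a, position), allele_at(hap_b, position)]))


def is_listed(genotype: str, listed: list) -> bool:
    """Compare genotypes without regard to allele order (G/C equals C/G)."""
    def norm(g):
        return tuple(sorted(g.upper().split("/")))
    return norm(genotype) in {norm(g) for g in listed}


# Placeholder 101-nt haplotypes that differ only at position 51.
hap_1 = "A" * 50 + "C" + "T" * 50
hap_2 = "A" * 50 + "G" + "T" * 50

call = genotype_at(hap_1, hap_2)
# Genotypes listed above for position 51 of SEQ ID NO: 97.
print(call, is_listed(call, ["C/C", "G/C"]))
```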
- Methods of analyzing/detecting genetic markers that are suitable for use in the methods disclosed herein have been described, and can include amplification of a target polynucleotide (e.g., by PCR).
- PCR uses a particular amplification primer pair that specifically hybridize to a target polynucleotide and produce an amplification product (the amplicon).
- Primers can be designed such that the amplicon can contain a nucleic acid polymorphism of interest.
- the primers can be radiolabeled, or labeled by any suitable means (e.g., using a non-radioactive fluorescent tag), to allow for rapid visualization of the different size amplicons following an amplification reaction without any additional labeling step or visualization step.
- nucleic acid amplification methods include, but are not limited to, reverse-transcription PCR (RT-PCR), quantitative real-time PCR (qPCR), quantitative real-time reverse transcriptase PCR (qRT-PCR) (see, e.g., Adams, A beginner’s guide to RT-PCR, qPCR and RT-qPCR, Biochemist (Lond) (2020) 42(3): 48-53), isothermal amplification methods (see, e.g., Zanoli et al., Biosensors (2013) 3(1): 18-43), nucleic acid sequence-based amplification (NASBA) (see, e.g., Deiman and Sillekens, Mol Biotechnol (2002) 20(2): 163-79), loop-mediated isothermal amplification (LAMP) (see, e.g., Notomi et al., (2000) Nucleic Acids Res.
- RT-PCR reverse-transcription PCR
- qPCR quantitative real-time PCR
- HDA helicase-dependent amplification
- RCA rolling circle amplification
- MDA multiple displacement amplification
- RPA recombinase polymerase amplification
- LCR ligase chain reaction
- transcription amplification see e.g., Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173
- self-sustained sequence replication see e.g., Guatelli et al. (1990) Proc. Natl. Acad. Sci.
- amplification produces an amplicon that is at least 20 nucleotides in length, for example, at least 50 nucleotides in length, at least 100 nucleotides in length, at least 200 nucleotides in length, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, at least 1500, at least 2000, or at least 2500 nucleotides in length.
- the amplicon is no longer than 10000 nucleotides in length, for example, no longer than 3000, no longer than 5000, no longer than 7000, or no longer than 9000 nucleotides.
- marker amplification produces an amplicon that is 20 to 10000 nucleotides in length, for example, 20 to 9000 nucleotides, 20 to 8000 nucleotides, 20 to 7000 nucleotides, 20 to 6000 nucleotides, 20 to 5000 nucleotides, 20 to 4000 nucleotides, 20 to 3000 nucleotides, 20 to 2000 nucleotides, 20 to 1500 nucleotides, 20 to 1000 nucleotides, 20 to 500 nucleotides, 20 to 400 nucleotides, 20 to 300 nucleotides, 20 to 200 nucleotides, 20 to 150 nucleotides, 20 to 100 nucleotides, 20 to 50 nucleotides, 50 to 9000 nucleotides, 50 to 8000 nucleotides, 50 to 7000 nucleotides, 50 to 6000 nucleotides, 50 to 5000 nucleotides, 50 to 4000 nucleotides, 50 to 3000 nucleot
- the amplicon is 100 to 4000 nucleotides. In some aspects, the amplicon is 200 to 3000 nucleotides. In some aspects, the amplicon is at least 51 nucleotides. In some aspects, the amplicon is at least
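The amplicon length ranges above translate into a simple design check. The sketch below assumes 1-based, inclusive genomic coordinates for the outer ends of a primer pair. The function names and the example primer coordinates are hypothetical and chosen only to bracket position 3,081,773 on chromosome 5.

```python
def amplicon_length(fwd_start: int, rev_end: int) -> int:
    """Length of the amplicon spanning fwd_start..rev_end (1-based, inclusive)."""
    return rev_end - fwd_start + 1


def amplicon_ok(fwd_start: int, rev_end: int, snp_pos: int,
                min_len: int = 20, max_len: int = 10_000) -> bool:
    """Check that the amplicon is within the length bounds and contains the SNP.

    This only tests that the SNP lies inside the amplicon; a real design would
    usually also keep the SNP out of the primer-binding sites themselves.
    """
    length = amplicon_length(fwd_start, rev_end)
    return min_len <= length <= max_len and fwd_start <= snp_pos <= rev_end


# Hypothetical primer pair producing a 200-nt amplicon around the SNP.
print(amplicon_length(3_081_673, 3_081_872))
print(amplicon_ok(3_081_673, 3_081_872, snp_pos=3_081_773, min_len=100, max_len=4000))
```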
- the presence of a nucleic acid polymorphism in an amplicon can be determined (detected), for example, by directly sequencing the amplicon, performing a restriction enzyme digest (e.g., restriction fragment length polymorphism (RFLP)), or by using a detection probe.
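To illustrate the restriction digest option, the sketch below shows how a single-base change can remove a restriction site and so change the fragment pattern. EcoRI (recognition site GAATTC, cutting after the G) is used purely as a familiar example; the disclosure does not name an enzyme, and both amplicon sequences are made up. Only the top strand is considered.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int) -> list:
    """Return fragment lengths after cutting `seq` at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Two made-up 46-nt amplicons that differ at one base inside the site.
allele_with_site = "ACGT" * 5 + "GAATTC" + "ACGT" * 5
allele_without_site = "ACGT" * 5 + "GAGTTC" + "ACGT" * 5

print(fragment_lengths(allele_with_site, "GAATTC", 1))     # two fragments
print(fragment_lengths(allele_without_site, "GAATTC", 1))  # one uncut fragment
```

Under these assumptions a heterozygous plant would show both the cut and the uncut fragment sizes on a gel.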
- detection includes using PCR, quantitative PCR (qPCR), reverse-transcription PCR (RT-PCR), quantitative real-time reverse transcriptase PCR (qRT-PCR), and/or sequencing methods.
- detection includes using PCR, quantitative PCR (qPCR), and/or sequencing based detection methods.
- PCR detection and quantification using dual-labeled fluorogenic oligonucleotide probes can also be performed according to the present disclosure.
- These probes are composed of short (e.g., 20-25 base) oligodeoxynucleotides that are labeled with two different fluorescent dyes. On the 5' terminus of each probe is a reporter dye, and on the 3' terminus of each probe a quenching dye is found.
- the oligonucleotide probe sequence is complementary to an internal target sequence present in a PCR amplicon. When the probe is intact, energy transfer occurs between the two fluorophores and emission from the reporter is quenched by the quencher by FRET.
- the probe is cleaved by 5' nuclease activity of the polymerase used in the reaction, thereby releasing the reporter from the oligonucleotide-quencher and producing an increase in reporter emission intensity.
- TaqMan™ probes are oligonucleotides that have a label and a quencher, where the label is released during amplification by the exonuclease action of the polymerase used in amplification, providing a real time measure of amplification during synthesis.
- a variety of TaqMan™ reagents are commercially available, e.g., from Applied Biosystems as well as from a variety of specialty vendors such as Biosearch Technologies.
- detecting a nucleic acid polymorphism includes use of an oligonucleotide primer or probe.
- synthetic methods for making oligonucleotides, including probes or primers are known.
- oligonucleotides can be synthesized chemically according to the solid phase phosphoramidite triester method.
- Oligonucleotides, including modified oligonucleotides can also be ordered from a variety of commercial sources.
- Nucleic acid probes to the marker loci can be cloned and/or synthesized. Any suitable label can be used with a probe.
- Detectable labels suitable for use with nucleic acid probes include, for example, any composition detectable by spectroscopic, radioisotopic, photochemical, biochemical, immunochemical, electrical, optical, or chemical means.
- Useful labels include biotin for staining with labeled streptavidin conjugate, magnetic beads, fluorescent dyes, radio labels, enzymes, and colorimetric labels.
- Other labels include ligands which bind to antibodies labeled with fluorophores, chemiluminescent agents, and enzymes.
- a probe can also constitute radio labeled PCR primers that are used to generate a radio labeled amplicon. It is not intended that the nucleic acid probes be limited to any particular size, however, nucleic acid probes are typically 20-100 base pairs.
- Amplification is not always required for detection of a nucleic acid polymorphism (e.g. Southern blotting or RFLP detection).
- Separate detection probes can also be omitted in amplification/detection methods, e.g., by performing a real time amplification reaction that detects product formation by modification of the relevant amplification primer upon incorporation into a product, incorporation of labeled nucleotides into an amplicon, or by monitoring changes in molecular rotation properties of amplicons as compared to unamplified precursors (e.g., by fluorescence polarization).
- the nucleic acid polymorphism is detected by sequencing a nucleic acid fragment comprising a target sequence of interest, or by whole genome sequencing (or whole transcriptome sequencing).
- suitable sequencing methods include capillary electrophoresis (e.g., Sanger sequencing) and high-throughput sequencing (e.g., Illumina® or 454 Sequencing®). High-throughput sequencing includes short read or long read techniques.
- sequencing includes whole genome sequencing (e.g., sequencing the genome of a Cannabis plant of interest).
- sequencing includes targeted sequencing (sequencing of a particular nucleic acid or amplicon of interest).
- sequencing includes sequencing a transcriptome (RNA-Seq) (e.g., sequencing the transcriptome of a Cannabis plant selected or produced by a method disclosed herein). In some implementations, sequencing does not include sequencing of RNA. In some implementations, the genome is sequenced.
- the methods disclosed herein include a step wherein a Cannabis plant including one or more markers that indicate modified terpene content as disclosed herein is identified and/or selected.
- the Cannabis plant including one or more markers that indicate modified terpene content is selected for further analysis, propagation, crossing, or to make a product (e.g., a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture).
- the product is not, or excludes, any naturally occurring products.
- the methods disclosed herein include a step wherein a Cannabis plant identified as including one or more markers that indicate modified terpene content is crossed (e.g., selfed, sibling crossed, outcrossed, or backcrossed). In some aspects, crossing includes marker-assisted selection (MAS) for at least two generations. In some aspects, progeny plants comprising the one or more markers are obtained from the cross.
- progeny plants including the one or more markers have modified terpene content (e.g., increased terpene content) relative to a control, for example (and without limitation), a sibling progeny plant that does not include the one or more markers indicating modified terpene content, or a parent plant that does not include the one or more markers indicating modified terpene content.
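The expected share of progeny carrying a marker allele after a cross can be worked out with a simple Punnett-square calculation. This is a minimal sketch assuming a single diploid locus with Mendelian segregation and no segregation distortion. The function names and the example genotypes are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Expected progeny genotype frequencies for a single-locus cross."""
    gametes_a = parent_a.split("/")
    gametes_b = parent_b.split("/")
    counts = Counter(
        "/".join(sorted(pair)) for pair in product(gametes_a, gametes_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def fraction_carrying(progeny: dict, allele: str) -> Fraction:
    """Expected fraction of progeny with at least one copy of `allele`."""
    return sum(
        (freq for genotype, freq in progeny.items() if allele in genotype.split("/")),
        Fraction(0),
    )


# Selfing (or sib-crossing) a C/T plant, where T is the allele of interest.
f2 = cross("C/T", "C/T")
print(f2, fraction_carrying(f2, "T"))
```

Selfing a heterozygote is expected to give 1/4 C/C, 1/2 C/T, and 1/4 T/T progeny, so three quarters of the progeny carry the T allele and genotyping identifies which ones.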
- Cannabis plants identified, selected, or produced by a method disclosed herein are encompassed by this disclosure, as well as material derived from such plants (e.g., a plant part), including seed, tissue, or cells (including protoplasts); and progeny of the plant (e.g., F1-F7, for example, F1 and/or F2 progeny).
- the Cannabis plant is Cannabis sativa, Cannabis indica, or Cannabis ruderalis.
- the Cannabis plant is Cannabis sativa.
- the plant includes one or more genetic markers indicating increased terpene content disclosed herein.
- plants disclosed herein including plants identified, selected, or produced by a method disclosed herein, can be used for plant breeding (e.g., crossing).
- a plant disclosed herein is used to develop a new, unique, and superior variety or hybrid with a desired phenotype (e.g., increased terpene production/content).
- Pedigree breeding and recurrent selection breeding methods may be used to develop cultivars from breeding populations. Breeding programs may combine desirable traits from two or more varieties or various broad-based sources into breeding pools from which cultivars are developed by selfing and selection of desired phenotypes. The new cultivars may be crossed with other varieties and the hybrids from these crosses are evaluated to determine which have commercial potential.
- a plant identified, selected, or produced by a method disclosed herein is crossed.
- Exemplary types of crosses include selfing, sibling crossing, outcrossing, and backcrossing. Suitable methods of crossing are disclosed herein.
- Pedigree selection, where both single plant selection and mass selection practices are employed, may be used for the generation of new varieties.
- Pedigree breeding is used commonly for the improvement of self-pollinating crops or inbred lines of cross-pollinating crops. Two parents which possess favorable, complementary traits are crossed to produce an F1. An F2 population is produced by selfing one or several F1's or by intercrossing two F1's (sib mating). Selection of the best individuals usually begins in the F2 population; then, beginning in the F3, the best individuals in the best families are usually selected. Replicated testing of families, or hybrid combinations involving individuals of these families, often follows in the F4 generation to improve the effectiveness of selection for traits with low heritability. At an advanced stage of inbreeding (e.g., F6 and F7), the best lines or mixtures of phenotypically similar lines are tested for potential release as new cultivars.
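The reason F6 and F7 count as an advanced stage of inbreeding can be shown with a short calculation. The sketch below assumes strict selfing from the F1 onward, no selection, and a locus at which the two parents were fixed for different alleles, so that every F1 plant is heterozygous. Under those assumptions the expected heterozygosity halves each generation. The function name is illustrative.

```python
from fractions import Fraction


def expected_heterozygosity(generation: int) -> Fraction:
    """Expected fraction of heterozygous plants at one locus in generation F_n.

    F1 is fully heterozygous (1); each round of selfing halves the value.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or greater")
    return Fraction(1, 2) ** (generation - 1)


for n in (1, 2, 3, 6, 7):
    print(f"F{n}: {expected_heterozygosity(n)}")
```

By the F6 and F7 the expected heterozygosity at such a locus is 1/32 and 1/64, so most lines breed nearly true and can be evaluated as candidate cultivars.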
- Choice of breeding or selection methods depends on the mode of plant reproduction, the heritability of the trait(s) being improved, and the type of cultivar used commercially (e.g., F1 hybrid cultivar, pureline cultivar, etc.). For highly heritable traits, a choice of superior individual plants evaluated at a single location will be effective, whereas for traits with low heritability, selection should be based on mean values obtained from replicated evaluations of families of related plants.
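The benefit of replicated evaluation for low-heritability traits can be illustrated with the standard quantitative-genetics expression for heritability on a family-mean basis, in which the error variance is divided by the number of replicates. This formula is not given in the disclosure; it is included here as a hedged illustration, and the variance values below are hypothetical.

```python
def heritability_of_mean(var_genetic: float, var_error: float, replicates: int) -> float:
    """Heritability of a family mean over `replicates` evaluations.

    h2_mean = Vg / (Vg + Ve / r), assuming independent errors with equal
    variance and a single environment (a simplifying assumption).
    """
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    return var_genetic / (var_genetic + var_error / replicates)


# Hypothetical low-heritability trait: Vg = 1, Ve = 4.
for r in (1, 2, 4, 8):
    print(r, round(heritability_of_mean(1.0, 4.0, r), 3))
```

With these made-up variances a single evaluation gives a heritability of 0.2, while the mean of four replicates gives 0.5, which is why selection on replicated family means is more effective for such traits.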
- Popular selection methods commonly include pedigree selection, modified pedigree selection, mass selection, and recurrent selection.
- Mass and recurrent selections can be used to improve populations of either self- or cross-pollinating crops.
- a genetically variable population of heterozygous individuals may be identified or created by intercrossing several different parents. The best plants may be selected based on individual superiority, outstanding progeny, or excellent combining ability. Preferably, the selected plants are intercrossed to produce a new population in which further cycles of selection are continued.
- Backcross breeding has been used to transfer genes for a simply inherited, highly heritable trait into a desirable homozygous cultivar or line that is the recurrent parent.
- the source of the trait to be transferred is called the donor parent.
- the resulting plant is expected to have the attributes of the recurrent parent (e.g., cultivar) and the desirable trait transferred from the donor parent.
- individuals possessing the phenotype of the donor parent may be selected and repeatedly crossed (backcrossed) to the recurrent parent.
- the resulting plant is expected to have the attributes of the recurrent parent (e.g., cultivar) and the desirable trait transferred from the donor parent.
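How quickly repeated backcrossing restores the attributes of the recurrent parent can be estimated with a short calculation. The sketch assumes no selection and no linkage drag, so that each backcross halves the expected donor-genome share. The function name is illustrative.

```python
from fractions import Fraction


def recurrent_parent_fraction(backcrosses: int) -> Fraction:
    """Expected recurrent-parent genome fraction after `backcrosses` backcrosses.

    The F1 (zero backcrosses) is 1/2 recurrent parent; each backcross halves
    the remaining donor share, giving 1 - (1/2)**(n + 1).
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1 - Fraction(1, 2) ** (backcrosses + 1)


for n in range(5):
    label = "F1" if n == 0 else f"BC{n}"
    print(label, recurrent_parent_fraction(n))
```

Under these assumptions the expected recurrent-parent share is 3/4 at BC1, 7/8 at BC2, and 15/16 at BC3. Markers linked to the trait allow the donor segment to be retained while the rest of the genome converges on the recurrent parent.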
- a single-seed descent procedure refers to planting a segregating population, harvesting a sample of one seed per plant, and using the one-seed sample to plant the next generation.
- the plants from which lines are derived will each trace to different F2 individuals.
- the number of plants in a population declines each generation due to failure of some seeds to germinate or some plants to produce at least one seed. As a result, not all of the F2 plants originally sampled in the population will be represented by a progeny when generation advance is completed.
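The attrition described above can be approximated with a simple expectation. The sketch assumes that every line independently survives each generation advance (germinates and sets at least one seed) with the same probability. The numbers are hypothetical and the function name is illustrative.

```python
def expected_lines(n_f2: int, survival: float, generations: int) -> float:
    """Expected number of F2-derived lines still represented after advancing.

    Each line is assumed to survive each generation advance independently
    with probability `survival`.
    """
    if not 0.0 <= survival <= 1.0:
        raise ValueError("survival must be between 0 and 1")
    return n_f2 * survival ** generations


# Hypothetical: 200 F2 plants, 90% survival per generation, F2 -> F6 (4 advances).
print(round(expected_lines(200, 0.9, 4), 2))
```

With these made-up values about 131 of the 200 original F2 plants would be expected to have a descendant in the F6, so the starting population is usually sized with such losses in mind.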
- Mutation breeding is another method of introducing new traits into Cannabis varieties. Mutations that occur spontaneously or are artificially induced can be useful sources of variability for a plant breeder. The goal of artificial mutagenesis is to increase the rate of mutation for a desired characteristic. Mutation rates can be increased by many different means including temperature, long-term seed storage, tissue culture conditions, radiation (such as X-rays, Gamma rays, neutrons, Beta radiation, or ultraviolet radiation), chemical mutagens (such as base analogs like 5-bromo-uracil), antibiotics, alkylating agents (such as sulfur mustards, nitrogen mustards, epoxides, ethyleneamines, sulfates, sulfonates, sulfones, or lactones), azide, hydroxylamine, nitrous acid or acridines. Once a desired trait is observed through mutagenesis the trait may then be incorporated into existing germplasm by traditional breeding techniques. Details of mutation breeding can be found, for example, in Principles of Cultivar
- breeding method may be used to transfer one or a few favorable genes for a highly heritable trait into a desirable cultivar. This approach has been used extensively for breeding disease-resistant cultivars.
- Various recurrent selection techniques are used to improve quantitatively inherited traits controlled by numerous genes. The use of recurrent selection in self-pollinating crops depends on the ease of pollination, the frequency of successful hybrids from each pollination, and the number of hybrid offspring from each successful cross.
- Double haploids are produced by the doubling of a set of chromosomes from a heterozygous plant to produce a completely homozygous individual (e.g., see Wan et al., Theor. Appl. Genet., 77:889-892, 1989).
- MAS marker assisted selection
- MAS is a powerful shortcut to selecting for desired phenotypes and for introgressing desired traits into cultivars (e.g., introgressing desired traits into elite lines).
- MAS is easily adapted to high throughput molecular analysis methods that can quickly screen large numbers of plant or germplasm genetic material for the markers of interest and is much more cost effective than raising and observing plants for visible traits.
- MAS can be used in the methods disclosed herein to produce plants with desired traits (e.g., increased terpene content).
- Cannabis plants that have modified terpene content (e.g., increased or decreased terpene levels relative to a control) made by any of the methods disclosed herein.
- Material derived from the Cannabis plants, including seed, tissue, or cells (including protoplasts); or progeny of the plant, such as F1 or F2 progeny, are encompassed by this disclosure.
- the Cannabis plant can be Cannabis sativa, Cannabis indica, or Cannabis ruderalis.
- the Cannabis plant is Cannabis sativa.
- the product may be any product known in the Cannabis arts, and can include, but is not limited to, extracts, a kief, hashish, bubble hash, an edible product, a flower, a seed, solvent reduced oil, sludge, e-juice, or tincture.
- Kief refers to a composition of concentrated Cannabis trichomes, which are accumulated by being sifted from Cannabis flowers or buds using a mesh screen or sieve.
- Hashish or hash refers to a compressed or purified preparation from Cannabis tissue containing trichomes (e.g., flowers).
- Bubble hash refers to a solid concentration of Cannabis trichomes made from a solventless extraction method.
- Cannabis sludges are solvent-free Cannabis extracts made via multigas extraction including the refrigerant 134A, butane, iso-butane and propane in a ratio that delivers a very complete and balanced extraction of cannabinoids and essential oils.
- E-juice vape juice
- a tincture refers to an alcohol-based extract, for example, an extract of Cannabis tissue dissolved in an alcohol.
- the product can be formulated for administration to a subject (e.g., a human), such as by an injection (e.g., intravenous, subcutaneous, intramuscular, parenteral), or by topical, oral, or pulmonary administration.
- the product is a recreational product.
- the product is a therapeutic product (e.g., medicament).
- the composition is for pulmonary administration.
- the compositions include, but are not limited to, dry powder compositions consisting of the powder of a Cannabis oil described herein, and the powder of a suitable carrier and/or lubricant.
- the compositions for pulmonary administration can be inhaled from any suitable dry powder inhaler device.
- compositions may be conveniently delivered in the form of an aerosol spray from pressurized packs or a nebulizer, with the use of a suitable propellant, for example, dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide, or other suitable gas.
- the dosage unit can be determined by providing a valve to deliver a metered amount.
- Capsules and cartridges of, for example, gelatin for use in an inhaler or insufflator can be formulated containing a powder mix of the compound(s) and a suitable powder base, for example, lactose or starch.
- a composition can take the form of, e.g., a tablet or a capsule prepared by conventional means with a pharmaceutically acceptable excipient.
- binders e.g., magnesium aluminum silicate, starch paste, gelatin, tragacanth, methylcellulose, sodium carboxymethylcellulose, polyvinylpyrrolidone and
- Tablets can be either uncoated or coated according to known methods.
- the excipients described herein can also be used for preparation of buccal dosage forms and sublingual dosage forms (e.g., films and lozenges) as described, for example, in U.S. Pat. Nos. 5,981,552 and 8,475,832.
- Formulation in chewing gums as described, for example, in U.S. Pat. No. 8,722,022, is also contemplated.
- Liquid preparations for oral administration can be prepared by conventional means with pharmaceutically acceptable additives, for example, suspending agents, for example, sorbitol syrup, cellulose derivatives, or hydrogenated edible fats; emulsifying agents, for example, lecithin, xanthan gum, or acacia; non-aqueous vehicles, for example, almond oil, sesame oil, hemp seed oil, fish oil, oily esters, ethyl alcohol, or fractionated vegetable oils; and preservatives, for example, methyl or propyl-p-hydroxybenzoates or sorbic acid.
- the preparations can also contain buffer salts, flavoring, coloring, and/or sweetening agents as appropriate.
- Typical formulations for topical administration include creams, ointments, sprays, lotions, hydrocolloid dressings, and patches, as well as eye drops, ear drops, and deodorants.
- Cannabis extracts/oils can be administered via transdermal patches as described, for example, in U.S. Pat. Appl. Pub. No. 2015/0126595 and
- Cannabis oils can be formulated, for example, as suppositories containing conventional suppository bases such as cocoa butter and other glycerides as described in U.S. Pat. Nos. 5,508,037 and 4,933,363.
- Compositions can contain other solidifying agents such as shea butter, beeswax, kokum butter, mango butter, illipe butter, tamanu butter, carnauba wax, emulsifying wax, soy wax, castor wax, rice bran wax, and candelilla wax.
- Compositions can further include clays (e.g., bentonite, French green clays, Fuller's earth, Rhassoul clay, white kaolin clay) and salts (e.g., sea salt, Himalayan pink salt, and magnesium salts such as Epsom salt).
- compositions disclosed herein can be formulated for administration by injection, for example, by bolus injection or continuous infusion.
- Formulations for injection can be presented in unit dosage form, for example, in ampoules or in multi-dose containers, optionally with an added preservative.
- Injectable compositions are preferably aqueous isotonic solutions or suspensions, and suppositories are preferably prepared from fatty emulsions or suspensions.
- the compositions may be sterilized and/or contain adjuvants, such as preserving, stabilizing, wetting or emulsifying agents, solution promoters, salts for regulating the osmotic pressure, buffers, and/or other ingredients.
- the compositions can be in powder form for reconstitution with a suitable vehicle, for example, a carrier oil, before use.
- the compositions may also contain other therapeutic agents or substances.
- compositions can be prepared according to conventional mixing, granulating, and/or coating methods, and contain from about 0.1 to about 75%, for example from about 1% to about 50%, of a Cannabis extract.
- subjects receiving a Cannabis composition orally are administered doses ranging from about 1 to about 2000 mg of Cannabis extract.
- a small dose ranging from about 1 to about 20 mg can typically be administered orally when treatment is initiated, and the dose can be increased (e.g., doubled) over a period of days or weeks until the optimal or maximum dose is reached.
- kits for use in research, breeding, or other application may include oligonucleotide probes and/or primers to detect a genetic marker disclosed herein (e.g., any probes or primers disclosed herein).
- the kit includes seed or germplasm of a Cannabis plant.
- the kit includes DNA from a Cannabis plant, for example, that is useful as a positive or negative control.
- the kits include enzymes (e.g., polymerase), dNTPs, enzymatic substrates, reagents for colorimetric or fluorescent detection, buffers, etc.
- the kit components are in separate containers. The kit can be used to practice any of the methods disclosed herein.
- the kit is for detecting a genetic marker (e.g., SNP) or set of genetic markers disclosed herein.
- the kits may include instructional materials containing directions (i.e., protocols) for the practice of the methods of this disclosure. While the instructional materials typically include written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), cloud-based media, and the like. Such media may include addresses to internet sites that provide such instructional materials.
- a method for producing one or more Cannabis plants having modified terpene content comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content, (iii) crossing the Cannabis plant comprising the one or more genetic markers indicating modified terpene content, and (iv) obtaining one or more progeny plants comprising the one or more genetic markers indicating modified terpene content, and wherein the one or more progeny plants have modified terpene content relative to a control.
- a method for selecting a Cannabis plant having modified terpene content comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) selecting the Cannabis plant, thereby selecting the Cannabis plant having modified terpene content.
- Clause 4 The method of any one of the prior clauses, further comprising crossing the Cannabis plant having modified terpene content and producing one or more progeny plants having modified terpene content.
- Clause 7 The method of any one of the prior clauses, wherein the selfing, sibling crossing, outcrossing, or backcrossing comprises marker-assisted selection for at least two generations.
- analyzing one or more genetic markers in the nucleic acid sample comprises analyzing one or more of nucleotide positions: 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; or 13,920,896 on chromosome 1; 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 330,918; 516,340; 518,238; 523,626; 608,718; 755,967; 1,100,981; 1,109,162; 1,331,433; 1,353,878; 1,366,137; 1,386,965; 1,487,633; 1,745,101; 1,828,050; 1,837,343; 1,840,325; 1,929,
- Clause 16 The method of any one of the prior clauses, wherein the genetic markers that indicate modified terpene content comprise a polymorphism at position 51 of one or more of: SEQ ID NOs: 1-99.
- SEQ ID NO: 48 SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 93, SEQ ID NO: 94, and/or SEQ ID NO: 95;
- SEQ ID NO: 14 SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 31, SEQ ID NO: 34, SEQ ID NO: 41, SEQ ID NO: 50, SEQ ID NO: 54, and/or SEQ ID NO: 55;
- SEQ ID NO: 28 SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 71, and/or SEQ ID NO: 72;
- SEQ ID NO: 1 SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, and/or SEQ ID NO: 10;
- reference genome is Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
- Clause 21 The method of any one of the prior clauses, wherein the one or more genetic markers are genetically linked to a terpene trait locus.
- the modified terpene content comprises modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol levels.
- Clause 23 The method of any one of the prior clauses, wherein the modified terpene content comprises increased terpene content relative to the control.
- Clause 24 The method of any one of the prior clauses, wherein the modified terpene content comprises increased total terpenes absent beta myrcene.
- Clause 25 The method of any one of the prior clauses, wherein the modified terpene content comprises increased alpha-pinene.
- Clause 26 The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-pinene.
- Clause 27 The method of any one of the prior clauses, wherein the modified terpene content comprises increased alpha-terpinene, gamma-terpinene, and/or terpinolene.
- Clause 28 The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-ocimene.
- Clause 30 The method of any one of the prior clauses, wherein the modified terpene content comprises increased linalool and/or trans-nerolidol.
- Clause 31 The method of any one of the prior clauses, wherein the modified terpene content comprises increased guaiol.
- Clause 32 The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-myrcene.
- Clause 33 The method of any one of the prior clauses, wherein the one or more genetic markers that indicate modified terpene content indicate increased terpene content relative to a control.
- Clause 34 The method of any one of the prior clauses, wherein the control is a Cannabis plant without the one or more markers that indicate modified terpene content.
- Clause 36 The method of any one of the prior clauses, wherein the modified terpene content is increased terpene content relative to the control.
- Clause 38 A seed, plant part, tissue culture, or protoplast of the plant of clause 37.
- Clause 39 A method of Cannabis breeding, comprising crossing the Cannabis plant of clause 37.
- GC gas chromatography
- terpenes were measured using dried flower tissue of up to three clonal replicates per accession; the average across clonal replicates was used for validation.
- data were collected for 22 terpenes: 15 monoterpenes (alpha-pinene, alpha-terpinene, beta-myrcene, beta-ocimene, beta-pinene, camphene, delta-3-carene, D-limonene, eucalyptol, isopulegol, linalool, p-cymene, and terpinolene) and seven sesquiterpenes (alpha-Bisabolol, alpha-humulene, beta-caryophyllene, caryophyllene oxide, guaiol, cis-nerolidol (also known as nerolidol 1), and trans-nerolidol
- mapping and validation was done for combinations of terpenes with highly correlated levels: 1. alpha-terpinene, gamma-terpinene, and terpinolene; 2. camphene and D-limonene; 3. alpha-humulene and beta-caryophyllene; 4.
- mapping set and both validation sets were genotyped with an Illumina bead array. After initial SNP quality control (QC), further filtering steps were performed to filter out known low quality SNPs, followed by filtering for missing data (<10%) and minor allele frequency (>1%) using vcftools (Danecek et al., "The variant call format and VCFtools," Bioinformatics 27.15 (2011): 2156-2158). Missing data were subsequently imputed in the mapping set (R package NAM “snpQC” option; Xavier et al., “NAM: association studies in multiple populations,” Bioinformatics 31.23 (2015): 3862-3864), resulting in 36,073 SNPs for the mapping set.
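The missing-data and minor-allele-frequency filters described above can be illustrated with a minimal Python sketch. It is not the vcftools workflow the source used. The genotype encoding (alternate-allele counts 0, 1, 2, with `None` for a missing call) and all function names are assumptions made for illustration; the default thresholds follow the values stated in the text.

```python
from typing import Optional, Sequence

# Assumed encoding: alternate-allele count per diploid sample; None = no call.
Genotype = Optional[int]


def missing_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of samples with no genotype call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency computed over called diploid genotypes only."""
    called = [c for c in calls if c is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(calls: Sequence[Genotype],
              max_missing: float = 0.10,
              min_maf: float = 0.01) -> bool:
    """Keep a SNP with less than 10% missing calls and MAF above 1%."""
    return (missing_rate(calls) < max_missing
            and minor_allele_frequency(calls) > min_maf)


if __name__ == "__main__":
    snp_calls = [0, 1, 2, None, 0, 0, 0, 0, 0, 0, 0]
    print(passes_qc(snp_calls))
```

In practice the same thresholds would be applied per SNP across the whole array before imputation; the sketch only shows the per-SNP decision.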
- nested association mapping was performed on terpene data collected for the mapping set of 900 diversity panel accessions with the R package NAM using seed lots as family structure and a kinship matrix to control for relatedness (GWAS2 function).
- the homozygous genotype with the highest average terpene trait value is referred to as the beneficial genotype.
- the homozygous genotype with the lowest average terpene trait value is referred to as the detrimental genotype.
- the heterozygous genotype is considered beneficial in addition to a homozygous genotype if the heterozygous genotype has either an average terpene trait value intermediate between the homozygous reference allele and homozygous alternate allele genotype average trait values or has an average terpene trait value similar to the beneficial homozygous genotype.
- beneficial genotypes for the mapping set and the first validation set were compared.
- a SNP marker was considered validated in the first validation set if the beneficial genotype for the mapping set matched the beneficial genotype for the first validation set.
- the average terpene trait value across all 397 accessions in the second validation set was compared with the average terpene trait value after selecting for the beneficial and the detrimental genotypes, respectively, of all SNP markers that were validated in the first validation set for a given terpene trait.
- the combination of beneficial genotypes is considered validated if the beneficial genotypes of the SNP markers in combination result in an increased average terpene trait value as compared to the average without SNP marker selection.
- the combination of detrimental genotypes is considered validated if the detrimental genotypes of the SNP markers in combination result in a decreased average terpene trait value as compared to the average without SNP marker selection.
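The definitions of beneficial and detrimental genotypes, and the first-validation-set criterion, can be sketched as follows. This is a simplified illustration rather than the applicants' implementation: the genotype labels `"ref"`, `"het"`, and `"alt"` and the function names are hypothetical, both homozygous classes are assumed to be present, and the heterozygous rule is omitted because the text does not quantify "similar".

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def genotype_means(genotypes: Sequence[str],
                   trait_values: Sequence[float]) -> Dict[str, float]:
    """Average trait value per genotype class ('ref', 'het', 'alt')."""
    groups: Dict[str, list] = {}
    for genotype, value in zip(genotypes, trait_values):
        groups.setdefault(genotype, []).append(value)
    return {genotype: mean(values) for genotype, values in groups.items()}


def beneficial_and_detrimental(means: Dict[str, float]) -> Tuple[str, str]:
    """Homozygous class with the highest / lowest average trait value."""
    homozygous = {g: m for g, m in means.items() if g in ("ref", "alt")}
    beneficial = max(homozygous, key=homozygous.get)
    detrimental = min(homozygous, key=homozygous.get)
    return beneficial, detrimental


def marker_validated(mapping_means: Dict[str, float],
                     validation_means: Dict[str, float]) -> bool:
    """A marker is validated when both sets agree on the beneficial genotype."""
    return (beneficial_and_detrimental(mapping_means)[0]
            == beneficial_and_detrimental(validation_means)[0])


if __name__ == "__main__":
    mapping = genotype_means(["ref", "ref", "het", "alt", "alt"],
                             [1.0, 1.2, 1.5, 2.0, 2.2])
    print(beneficial_and_detrimental(mapping))
```

The combination test in the second validation set would then compare the panel-wide average trait value with the average among accessions carrying the beneficial (or detrimental) genotype at every validated marker.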
- NAM of total Terpenes in the diversity panel resulted in the identification of five significant (p-value < Bonferroni threshold of 1.39E-06) SNP markers on chromosomes 2, 4, 9, and X.
- Three of these five SNPs were validated in the first validation set (Table 1; Table 15).
- the beneficial genotypes for these three SNP markers resulted in increased total Terpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
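The Bonferroni threshold quoted above for total terpenes can be reproduced with a short calculation. The significance level of 0.05 is an assumption (the source does not state it), but dividing 0.05 by the 36,073 SNPs in the mapping set gives the reported 1.39E-06. The function names and the marker names in the usage lines are hypothetical.

```python
from typing import Dict, List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significant_markers(p_values: Dict[str, float],
                        threshold: float) -> List[str]:
    """Names of markers whose p-value falls below the threshold, sorted."""
    return sorted(name for name, p in p_values.items() if p < threshold)


if __name__ == "__main__":
    # Assumed alpha of 0.05; 36,073 SNPs is the mapping-set size from the text.
    threshold = bonferroni_threshold(0.05, 36073)
    print(f"{threshold:.2E}")
    # Hypothetical marker names and p-values, for illustration only.
    print(significant_markers({"marker_A": 2.0e-9, "marker_B": 4.0e-4}, threshold))
```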
- NAM of total Monoterpenes in the diversity panel resulted in the identification of seven significant SNP markers on chromosomes 2, 3, 4, 5, and 8.
- Four of these seven SNPs were validated in the first validation set (Table 2; Table 15).
- the beneficial genotypes for these four SNP markers resulted in increased total Monoterpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of total Monoterpenes excluding beta-Myrcene (also referred to as total Monoterpenes - beta-Myrcene)
- 65 significant SNP markers located on chromosome 5. Nine of the top ten SNPs were validated in the first validation set (Table 3; Table 15).
- the beneficial genotypes for these nine SNP markers resulted in increased total Monoterpenes excluding beta-Myrcene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of total Sesquiterpenes in the diversity panel resulted in the identification of 33 significant SNP markers located on chromosomes 3, 4, 5, 6, and 8; the majority of these markers consisting of 11 SNP markers are located on chromosome 6. All of the top ten SNPs were validated in the first validation set (Table 4; Table 15). The combination of homozygous alternate beneficial genotypes for three of these ten SNP markers resulted in increased total Sesquiterpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of alpha-Pinene in the diversity panel resulted in the identification of 2157 significant SNP markers on all ten chromosomes; the majority of these markers consisting of 362 of these SNP markers are located on chromosome 5.
- Two of the top ten SNPs were validated in the first validation set (Table 5; Table 15).
- the beneficial genotypes of these two SNP markers resulted in increased alpha-Pinene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of beta-Pinene in the diversity panel resulted in the identification of 837 significant SNP markers on all ten chromosomes; 194 of these SNP markers are located on chromosome 5.
- Five of the top ten SNPs were validated in the first validation set (Table 6; Table 15).
- the beneficial genotypes of these five SNP markers resulted in increased beta-Pinene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of the combination of alpha-Terpinene, gamma-Terpinene, and Terpinolene (also referred to as alpha-Terpinene + gamma-Terpinene + Terpinolene) in the diversity panel resulted in the identification of 1395 significant SNP markers on all ten chromosomes; the majority of these markers consisting of 912 of these SNP markers are located on chromosome 5. Eight of the top ten SNPs were validated in the first validation set (Table 7; Table 15).
- the beneficial genotypes of these eight SNP markers resulted in increased alpha-Terpinene + gamma-Terpinene + Terpinolene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of beta-Myrcene to total Monoterpenes Ratio in the diversity panel resulted in the identification of 121 significant SNP markers on all ten chromosomes; the majority of these markers consisting of 90 of these SNP markers are located on chromosome 5.
- Six of the top ten SNPs were validated in the first validation set (Table 8; Table 15).
- the beneficial genotypes for these six SNP markers resulted in increased beta-Myrcene to total Monoterpenes ratio in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of beta-Ocimene in the diversity panel resulted in the identification of 864 significant SNP markers on all 10 chromosomes; the majority of these markers consisting of 259 of these SNP markers are located on chromosome 5.
- Six of the top ten SNPs were validated in the first validation set (Table 9; Table 15).
- the beneficial genotypes for these six SNP markers resulted in increased beta-Ocimene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of the combination of Camphene and D-Limonene (also referred to as Camphene + D-Limonene) in the diversity panel resulted in the identification of 209 significant SNP markers on all 10 chromosomes; the majority of these markers consisting of 103 of these SNP markers are located on chromosome 5.
- Nine of the top ten SNPs were validated in the first validation set (Table 10; Table 15).
- the beneficial genotypes for these nine SNP markers resulted in increased Camphene and D-Limonene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of the combination of linalool and trans-nerolidol (also referred to as linalool + trans-nerolidol) in the diversity panel resulted in the identification of 481 significant SNP markers on all ten chromosomes; the majority of these markers consisting of 258 are located at the of chromosome 1. All ten of the top ten SNPs were validated in the first validation set (Table 11; Table 15). In combination, the beneficial genotypes for these ten SNP markers resulted in increased linalool and trans-nerolidol in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- NAM of the combination of alpha-Humulene and beta-Caryophyllene (also referred to as alpha-Humulene + beta-Caryophyllene) in the diversity panel resulted in the identification of 46 significant SNP markers on chromosomes 3, 4, 5, 6, and 8; the majority of these markers consisting of 23 SNPs are located on chromosome 6.
- NAM of Guaiol in the diversity panel resulted in the identification of 526 significant SNP markers on chromosomes 3, 4, 5, 6, 7, 8, 9, and X; the majority of these markers consisting of 488 SNPs are located on chromosome 6.
- Nine of the top ten SNPs were validated in the first validation set (Table 13; Table 15).
- the beneficial genotypes for these nine SNP markers resulted in increased Guaiol in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
- Combinations of SNP markers disclosed herein can be useful, for example, for screening plants having increased levels of terpenes of interest. While any combination of SNPs disclosed herein could be useful, an exemplary subset of SNPs is provided in Table 16.
- at least one SNP selected from the list of SNPs in Table 16 is analyzed and/or detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant.
- at least two SNPs selected from the list of SNPs in Table 16 are analyzed, and at least one SNP indicating increased terpene levels is detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant.
- all of the SNPs from the list of SNPs provided in Table 16 are analyzed, and at least one SNP indicating increased terpene levels is detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant.
- haplotype surrounding SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
- First column SNP marker number
- Second column SNP marker name
- Third column NAM p-value
- a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
- Fifth column reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome;
- Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker.
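The haplotype definition used in these table legends (the region flanked by the nearest non-significant SNP on either side of a significant marker) can be sketched as below. The input layout, a position-sorted list of `(position, p_value)` pairs for one chromosome, and the function name are assumptions for illustration; a flank is reported as `None` when no non-significant SNP exists on that side.

```python
from typing import List, Optional, Tuple


def haplotype_flanks(snps: List[Tuple[int, float]],
                     focal_index: int,
                     threshold: float) -> Tuple[Optional[int], Optional[int]]:
    """Positions of the nearest non-significant SNPs left and right of a marker.

    `snps` must be sorted by position on a single chromosome; a SNP counts as
    non-significant when its p-value is not below `threshold`.
    """
    left = next((pos for pos, p in reversed(snps[:focal_index])
                 if p >= threshold), None)
    right = next((pos for pos, p in snps[focal_index + 1:]
                  if p >= threshold), None)
    return left, right


if __name__ == "__main__":
    # Hypothetical positions and p-values, for illustration only.
    chromosome = [(100, 0.5), (200, 1e-8), (300, 1e-9), (400, 1e-7), (500, 0.2)]
    print(haplotype_flanks(chromosome, focal_index=2, threshold=1.39e-6))
```

The two returned positions correspond to the left and right flanking positions listed in the eleventh and twelfth columns of the tables.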
- First column mapped trait
- Second column average value (% of dry weight) for mapped terpene trait in second validation set without using markers to make selections
- Third column number of SNP markers used in combination to make selections
- Fourth column average value (% of dry weight) for
- Table 15 50 bp flanking sequences with SNP marker at position 51 bp.
- First column SNP marker number;
- Second column SNP marker name;
- Third column 50 bp flanking sequences with SNP marker at position 51 bp.
Abstract
Provided are genetic markers associated with increased terpene production in Cannabis. The genetic markers are useful, for example, for identifying, selecting, and/or breeding Cannabis plants having modified terpene content.
Description
TERPENE GENETIC MARKERS
CROSS REFERENCE TO RELATED APPLICATION
This application claims priority to U.S. Provisional Patent Application No. 63/611,288, filed on December 18, 2023, which is incorporated herein by reference in its entirety.
FIELD
The present disclosure relates to genes associated with terpene production in Cannabis, and methods of producing Cannabis varieties having high terpene content.
INCORPORATION OF ELECTRONIC SEQUENCE LISTING
The Sequence Listing is submitted as an XML file in the form of the file named “Sequence.xml” (95,829 bytes), created on December 2, 2024, which is incorporated by reference herein.
BACKGROUND
Terpenes are a class of compounds produced predominately by plants. Terpenes and terpenoids are primary constituents of essential oils and are often responsible for aromas, flavors, and even colors associated with various types of vegetation. Terpenes are widely used as fragrances and flavors in consumer products, such as perfumes, cosmetics, and consumables (food and drink). Many terpenes have also been shown to have pharmacological effects. In the context of Cannabis, terpenes potentially work in synergy with cannabinoids, such as tetrahydrocannabinol (THC) or cannabidiol (CBD), to produce psychoactive effects.
It is desirable to produce Cannabis varieties having modified terpene profiles; however, the most common way to create Cannabis varieties having modified terpene content is the use of traditional breeding methods that select for segregated traits over multiple generations. Traditional breeding methods are laborious and time-consuming. Thus, it is desirable to identify genetic markers and genes associated with particular terpene attributes in Cannabis.
SUMMARY
Disclosed are methods of identifying and/or selecting Cannabis plants having modified terpene content (e.g., increased terpene content relative to a control). The methods include (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) identifying and/or selecting the Cannabis plant having one or more genetic markers that indicate modified terpene content, thereby identifying/selecting Cannabis plants having modified terpene content.
Also disclosed are methods of producing one or more Cannabis plants having modified terpene content (e.g., increased terpene content relative to a control). The methods include (i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; (iii) crossing the Cannabis plant comprising the one or more genetic markers indicating modified terpene content; and (iv) obtaining one or more progeny plants comprising the one or more genetic markers indicating modified terpene content, wherein the one or more progeny plants have modified terpene content relative to a control. Crossing includes, for example, selfing, sibling crossing, outcrossing, and backcrossing.
In several aspects, the modified terpene content is increased terpene content relative to a control. Modified terpene content can include, for example, modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content. Terpene content can be modified in any part of the plant; however, in non-limiting examples, terpene content is increased in flower or inflorescence tissue, or in trichomes (e.g., glandular trichomes). In some aspects, the control is a Cannabis plant without the one or more markers indicating modified terpene content.
The one or more genetic markers that are analyzed and/or detected can include, for example, a genetic marker disclosed herein (e.g., one or more genetic markers included in Table 15 or Table 16). In some aspects, 2 to 10 genetic markers disclosed herein are analyzed. In further aspects, at least one genetic marker indicating modified terpene content is detected. In some aspects, at least two genetic markers indicating modified terpene content are detected.
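The analyze, detect, and select steps summarized above can be sketched in Python. This is a minimal illustration only: the marker names, the alleles assumed to indicate modified terpene content, and the `min_detected` threshold are hypothetical placeholders and are not taken from Table 15 or Table 16.

```python
from typing import Dict, List

# Hypothetical marker panel: marker name -> allele assumed to indicate
# modified terpene content. Names and alleles are placeholders only.
BENEFICIAL_ALLELES: Dict[str, str] = {
    "marker_A": "T",
    "marker_B": "G",
    "marker_C": "A",
}


def detected_markers(genotypes: Dict[str, str]) -> List[str]:
    """Return panel markers whose genotype call carries the indicative allele.

    `genotypes` maps marker name -> diploid call such as "AT" or "GG".
    Markers missing from `genotypes` are treated as not detected.
    """
    return [
        name
        for name, allele in BENEFICIAL_ALLELES.items()
        if allele in genotypes.get(name, "")
    ]


def select_plants(samples: Dict[str, Dict[str, str]], min_detected: int = 1) -> List[str]:
    """Select plants carrying at least `min_detected` indicative markers."""
    return [
        plant
        for plant, calls in samples.items()
        if len(detected_markers(calls)) >= min_detected
    ]


# Synthetic genotype calls for three hypothetical plants.
samples = {
    "plant_1": {"marker_A": "AT", "marker_B": "GG", "marker_C": "CC"},
    "plant_2": {"marker_A": "AA", "marker_B": "CC", "marker_C": "CC"},
    "plant_3": {"marker_A": "AA", "marker_B": "CG"},
}
print(select_plants(samples, min_detected=1))  # ['plant_1', 'plant_3']
print(select_plants(samples, min_detected=2))  # ['plant_1']
```

Raising `min_detected` from 1 to 2 mirrors the difference between the aspects in which at least one marker is detected and those in which at least two are detected.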
Also provided are Cannabis plants identified, selected, or produced by a method disclosed herein, as well as seed, plant parts, tissue cultures thereof; and products (e.g., a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture) including a Cannabis plant identified, selected, or produced by a method disclosed herein. Further provided are methods of Cannabis breeding, including crossing a Cannabis plant identified, selected, or produced by a method disclosed herein.
The foregoing and other objects, features, and advantages of the disclosure will become more apparent from the following detailed description.
SEQUENCES
Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand.
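As a brief illustration of how the complementary strand follows from the displayed strand, the sketch below computes a reverse complement in Python. The function name and the example sequence are illustrative assumptions, and only the four unambiguous DNA bases are handled.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```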
SEQ ID NOs: 1-99 are sequences encompassing SNP markers. Each sequence includes 50 bp of 5' and 3' flanking sequence with the SNP marker at position 51 bp.
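The layout described above (50 bp of flanking sequence on each side, SNP at position 51) implies a 101-bp sequence per marker. A minimal Python sketch of splitting such a sequence follows; the function name is hypothetical and the example sequence is synthetic, not one of SEQ ID NOs: 1-99.

```python
FLANK = 50  # bp of flanking sequence on each side, per the description


def split_marker_sequence(seq: str):
    """Split a 101-bp marker sequence into (5' flank, SNP base, 3' flank)."""
    expected = 2 * FLANK + 1
    if len(seq) != expected:
        raise ValueError(f"expected {expected} bp, got {len(seq)}")
    # Position 51 (1-based) corresponds to index 50 (0-based).
    return seq[:FLANK], seq[FLANK], seq[FLANK + 1:]


# Synthetic placeholder sequence for illustration only.
example = "A" * 50 + "G" + "C" * 50
left, snp, right = split_marker_sequence(example)
print(len(left), snp, len(right))  # 50 G 50
```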
DETAILED DESCRIPTION
I. Introduction
Cannabis has long been used for drug and industrial purposes, fiber (hemp), for seed and seed oils, for medicinal purposes, and for recreational purposes. Industrial hemp products are made from Cannabis plants selected to produce an abundance of fiber. Some Cannabis varieties have been bred to produce minimal levels of THC, the principal psychoactive constituent responsible for the psychoactivity associated with marijuana. Marijuana has historically consisted of the dried flowers of Cannabis plants selectively bred to produce high levels of THC and other psychoactive cannabinoids. As a drug it usually comes in the form of dried flower buds (marijuana), resin (hashish), or various extracts collectively known as hashish oil.
Cannabis is an annual, dioecious, flowering herb. The leaves are palmately compound or digitate, with serrate leaflets. Cannabis normally has imperfect flowers, with staminate “male” and pistillate “female” flowers occurring on separate plants. It is not unusual, however, for individual plants to separately bear both male and female flowers (i.e., to be monoecious plants). Although monoecious plants are often referred to as “hermaphrodites,” true hermaphrodites (which are less common in Cannabis) bear staminate and pistillate structures on individual flowers, whereas monoecious plants bear male and female flowers at different locations on the same plant.
The life cycle of Cannabis varies with each variety but can be generally summarized into germination, vegetative growth, and reproductive stages. Because of heavy breeding and selection by humans, most Cannabis seeds have lost dormancy mechanisms and do not require any pre-treatments or winterization to induce germination. Seed placed in viable growth conditions are expected to germinate in about 3 to 7 days. The first true leaves of a Cannabis plant contain a single leaflet, with subsequent leaves developing in opposite formation with increasing number of leaflets. Leaflets can be narrow or broad depending on the morphology of the plant grown. Cannabis plants are normally allowed to grow vegetatively for the first 4 to 8 weeks. During this period, the plant responds to increasing light with faster and faster growth. Under ideal conditions, Cannabis plants can grow up to 2.5 inches a day and are capable of reaching heights of up to 20 feet. Indoor growth pruning techniques tend to limit Cannabis size through careful pruning of apical or side shoots.
Cannabis is diploid, having a chromosome complement of 2n=20, although polyploid individuals have been artificially produced. The first genome sequence of Cannabis, which is estimated to be 820 Mb in size, was published in 2011 by a team of Canadian scientists (Bakel et al., “The draft genome and transcriptome of Cannabis sativa” Genome Biology 12:R102).
All known varieties of Cannabis are wind-pollinated and the fruit is an achene. Most varieties of Cannabis are short day plants, with the possible exception of C. sativa subsp. sativa var. spontanea (= C. ruderalis), which is commonly described as “auto-flowering” and may be day-neutral.
The genus Cannabis was formerly placed in the Nettle (Urticaceae) or Mulberry (Moraceae) family, and later, along with the Humulus genus (hops), in a separate family, the Hemp family (Cannabaceae sensu stricto).
Recent phylogenetic studies based on cpDNA restriction site analysis and gene sequencing strongly suggest that the Cannabaceae sensu stricto arose from within the former Celtidaceae family, and that the two families should be merged to form a single monophyletic family, the Cannabaceae sensu lato.
Cannabis plants produce a variety of secondary metabolites, including cannabinoids, terpenoids, and other compounds, which are often secreted by glandular trichomes that occur most abundantly on the floral calyxes and bracts of female plants. Cannabinoids are the most studied group of secondary metabolites in Cannabis. Most exist in two forms, as acids and in neutral (decarboxylated) forms. The acid form is designated by an “A” at the end of its acronym (i.e. THCA). The phytocannabinoids are synthesized in the plant as acid forms, and while some decarboxylation does occur in the plant, it increases significantly post-harvest and the kinetics increase at high temperatures (Sanchez and Verpoorte 2008). The biologically active forms for human consumption are the neutral forms. Decarboxylation is usually achieved by thorough drying of the plant material followed by heating it, often by either combustion, vaporization, or heating or baking in an oven.
Cannabinoids found in Cannabis plants include, but are not limited to, Δ9-Tetrahydrocannabinol (Δ9-THC), Δ8-Tetrahydrocannabinol (Δ8-THC), Cannabichromene (CBC), Cannabicyclol (CBL), Cannabidiol (CBD), Cannabielsoin (CBE), Cannabigerol (CBG), Cannabinidiol (CBND), Cannabinol (CBN), Cannabitriol (CBT), and their propyl homologs, including, but not limited to, cannabidivarin (CBDV), Δ9-Tetrahydrocannabivarin (THCV), cannabichromevarin (CBCV), and cannabigerovarin (CBGV). See, Holley et al. (Constituents of Cannabis sativa L. XI Cannabidiol and cannabichromene in samples of known geographical origin, J. Pharm. Sci. 64:892-894, 1975) and De Zeeuw et al. (Cannabinoids with a propyl side chain in Cannabis, Occurrence and chromatographic behavior, Science 175:778-779). Non-THC cannabinoids can be collectively referred to as “CBs”, wherein CBs can be one of THCV, CBDV, CBGV, CBCV, CBD, CBC, CBE, CBG, CBN, CBND, and CBT cannabinoids.
Terpenes are primarily produced in glandular trichomes of female inflorescences (Livingston et al., "Cannabis glandular trichomes alter morphology and metabolite content during flower maturation," The Plant Journal 101.1 (2020): 37-56). Besides affecting aroma and fragrance, terpenes may have a synergic effect with cannabinoids (Sommano et al., "The cannabis terpenes," Molecules 25.24 (2020): 5792), and have been attributed medicinal properties (Maggini et al., "An Optimized Terpene Profile for a New Medical Cannabis Oil," Pharmaceutics 14.2 (2022): 298). Two main groups of terpenes in Cannabis are the monoterpenes and sesquiterpenes, which are produced in the methylerythritol phosphate pathway (MEP) and mevalonic acid pathway (MEV), respectively (Booth et al., "Terpene synthases from Cannabis sativa," PLoS ONE 12.3 (2017): e0173911). Monoterpenes have a ten-carbon isoprenoid precursor, geranyl diphosphate (GPP). Sesquiterpenes have a fifteen-carbon isoprenoid precursor, farnesyl diphosphate (FPP). GPP and FPP are converted to different monoterpenes and sesquiterpenes, respectively, by terpene synthases (TPS; Booth et al., "Terpenes in Cannabis sativa-From plant genome to humans," Plant Science 284 (2019): 67-72).
Here, single nucleotide polymorphism (SNP) markers associated with terpene biosynthesis are identified to allow selective breeding of Cannabis plants with increased terpenoid content.
II. Summary of Terms
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of many common terms in molecular biology may be found in Krebs et al. (eds.), Lewin's Genes XII, published by Jones & Bartlett Learning, 2017. As used herein, the singular forms “a,” “an,” and “the,” refer to both the singular as well as plural, unless the context clearly indicates otherwise. For example, the term “a plant” includes singular or plural plants and can be considered equivalent to the phrase “at least one plant.” As used herein, the term “comprises” means “includes.” It is further to be understood that any and all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for descriptive purposes, unless otherwise indicated. Although many methods and materials similar or equivalent to those described herein can be used, particular suitable methods and materials are described herein. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. To facilitate review of the various aspects, the following explanations of terms are provided:
The term “Abacus” as used herein refers to the Cannabis sativa reference genome known as the Abacus reference genome version Csat_AbacusV2 (NCBI assembly accession GCA_025232715.1, incorporated by reference herein), which is also sometimes referred to as CsaAba2.
The term “about” refers to a range of ±5% of the referenced value unless otherwise indicated. For example, about 100 refers to a range of 95 to 105.
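The arithmetic behind this definition can be sketched as follows. The function name `about` and the `tolerance` parameter are illustrative assumptions; the 5% default follows the example of 95 to 105 given for a referenced value of 100.

```python
def about(value: float, tolerance: float = 0.05):
    """Return the (low, high) range covered by "about value"."""
    delta = abs(value) * tolerance
    return value - delta, value + delta


low, high = about(100)
print(low, high)
```

For a referenced value of 100 this yields a range of 95 to 105, up to floating-point rounding.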
The term “alternative nucleotide call” is a nucleotide polymorphism relative to a reference nucleotide for a SNP marker that is significantly associated with a desired phenotype (e.g., modified terpene content). Unless otherwise specified, the reference is the Abacus sequence.
The term “backcrossing” or “to backcross” refers to a process in which a breeder repeatedly crosses hybrid progeny, for example a first generation hybrid (F1), back to one of the parents of the hybrid progeny. Backcrossing can be used to introduce one or more single locus conversions from one genetic background into another.
The term “beneficial” as used herein refers to a genetic element (e.g., gene, allele, or polymorphism) conferring or associated with modified terpene content (e.g., increased terpene content). In some aspects, a “beneficial polymorphism” or “beneficial allele” refers to a polymorphism or allele associated with modified terpene content (e.g., increased terpene content).
The term “Cannabis” refers to plants of the genus Cannabis, including Cannabis sativa, Cannabis indica, and Cannabis ruderalis.
The term “cell” refers to a prokaryotic or eukaryotic cell, including plant cells, capable of replicating DNA, transcribing RNA, translating polypeptides, and secreting proteins.
The term “coding sequence” refers to a DNA sequence which codes for a specific amino acid sequence. “Regulatory sequences” refer to nucleotide sequences located upstream (5’ non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
The term “control” refers to a reference standard. A control can be a negative or positive control. In some aspects, the control is a historical control or a known reference value (or range of values). A practitioner can select a suitable control based on the teachings provided herein. A non-limiting example of a suitable control includes a Cannabis plant not including a marker or gene associated with increased terpene content.
The term “cross” or “crossing” refers to the process by which the pollen of one flower on one plant is applied (artificially or naturally) to the ovule (stigma) of a flower on another plant (or the same plant when selfing). Exemplary types of crosses include selfing, backcrossing, outcrossing, and sibling crossing.
The term “cultivar” means a group of similar plants that by structural features and performance (e.g., morphological and physiological characteristics) can be identified from other varieties within the same species. Furthermore, the term “cultivar” variously refers to a variety, strain or race of plant that has been produced by horticultural or agronomic techniques and is not normally found in wild populations. The terms cultivar, variety, strain, plant and race are often used interchangeably by plant breeders, agronomists and farmers.
The term “detect” or “detecting” refers to any method for determining the presence of a nucleic acid. Methods of detecting nucleic acid polymorphisms, for example, have been described and can include amplification of a target polynucleotide (e.g., by PCR) and/or detection by a probe (e.g., hybridization assays). PCR uses a particular amplification primer pair that specifically hybridizes to a target polynucleotide and produces an amplification product (the amplicon). Primers can be designed such that the amplicon contains a nucleic acid polymorphism of interest. Methods for designing PCR primers and PCR conditions have been described, for example, in Sambrook et al. (2014) Molecular Cloning: A Laboratory Manual (Fourth Edition, Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
The term “expression” or “gene expression” relates to the process by which the coded information of a nucleic acid transcriptional unit (including, e.g., genomic DNA) is converted into an operational, non-operational, or structural part of a cell, often including the synthesis of a protein. Gene expression can be influenced by external signals; for example, exposure of a cell, tissue, or organism to an agent that increases or decreases gene expression. Expression of a gene can also be regulated anywhere in the pathway from DNA to RNA to protein. Regulation of gene expression occurs, for example, through controls acting on transcription, translation, RNA transport and processing, degradation of intermediary molecules such as mRNA, or through activation, inactivation, compartmentalization, or degradation of specific protein molecules after they have been made, or by combinations thereof. Gene expression can be measured at the RNA level or the protein level by any suitable, known method, including, without limitation, Northern blot, RT-PCR, Western blot, or in vitro, in situ, or in vivo protein activity assay(s). Elevated levels refer to higher than average levels of gene expression in comparison to a reference, e.g., a control plant.
The term "expression cassette" refers to a discrete nucleic acid fragment into which a nucleic acid sequence or fragment can be moved, typically for expression in a host cell. In some aspects, an expression cassette is included on a vector, such as an expression vector.
The term “functional” as used herein refers to DNA or amino acid sequences which are of sufficient size and sequence to have the desired function (i.e. the ability to cause expression of a gene resulting in gene activity expected of the gene found in a reference genome, e.g., the Abacus reference genome).
The term “gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5’ non-coding sequences) and following (3’ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” or “recombinant expression construct,” which are used interchangeably, are not naturally occurring molecules and refer to any artificial gene chimera, such as a cDNA sequence or gene operably linked to a non-native regulatory sequence. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “heterologous” gene or allele refers to a gene or allele that is not naturally found in the host, but is artificially introduced (e.g., by genetic engineering or selective plant breeding). Heterologous genes can comprise genes inserted into a non-native host, or chimeric genes.
The term "genetic modification” as used herein refers to a change from the wild-type or reference sequence of one or more nucleic acid molecules. Genetic modifications or alterations include without limitation, base pair substitutions, additions, or deletions of at least one nucleotide from a nucleic acid molecule of known sequence.
The term “genome” as it applies to plant cells encompasses not only chromosomal DNA found within the nucleus, but organelle DNA found within subcellular components (e.g., mitochondrial, plastid) of the cell.
The term “genotype” refers to the genetic makeup of an individual cell, cell culture, tissue, organism (e.g., a plant), or group of organisms. A “beneficial genotype” refers to a desired genotype, such as a genotype associated with modified terpene content (e.g., associated with increased terpene content/production). Conversely, a “detrimental genotype” is a genotype that is not associated with modified terpene content (e.g., not associated with increased terpene content/production, or associated with a decrease in terpene content/production). A genotype may refer to a particular genetic marker (e.g., a polymorphism), such as a marker associated with modified terpene content. In some aspects, a beneficial genotype or polymorphism in Cannabis results in increased terpene production relative to a Cannabis plant that does not contain the beneficial genotype or polymorphism, or relative to a Cannabis plant that contains a detrimental genotype or polymorphism.
The term “germplasm” refers to genetic material of or from an individual (e.g., a plant), a group of individuals (e.g., a plant line, variety, or family), or a clone derived from a line, variety, species, or culture. The germplasm can be part of an organism or cell, or can be separate from the organism or cell. In general, germplasm provides genetic material with a specific molecular makeup that provides a physical foundation for some or all of the hereditary qualities of an organism or cell culture. As used herein, germplasm includes cells, seed or tissues from which new plants can be grown, as well as plant parts, such as leaves, stems, pollen, or cells that can be cultured into a whole plant.
The term “haplotype” refers to the genotype of a plant at a plurality of genetic loci, e.g., a combination of alleles or markers. Haplotype can refer to sequence polymorphisms at a particular locus, such as a single marker locus, or sequence polymorphisms at multiple loci along a chromosomal segment in a given genome. As used herein, a haplotype can be a nucleic acid region spanning two markers.
A plant is "homozygous" if the individual has only one type of allele at a given locus (e.g., a diploid individual has a copy of the same allele at a locus for each of two homologous chromosomes). An individual is “heterozygous” if more than one allele type is present at a given locus (e.g., a diploid individual with one copy each of two different alleles). The term “homogeneity” indicates that members of a group have the same genotype at one or more specific loci. In contrast, the term “heterogeneity” is used to indicate that individuals within the group differ in genotype at one or more specific loci.
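These definitions can be illustrated with a short Python sketch for diploid genotype calls at a single locus. The two-letter string encoding (e.g., "AA", "AG") and the function names are assumptions made for illustration, not a format specified in this disclosure.

```python
from typing import Iterable


def zygosity(genotype: str) -> str:
    """Classify a diploid genotype call such as "AA" or "AG"."""
    if len(genotype) != 2:
        raise ValueError("expected a two-allele diploid call")
    alleles = set(genotype.upper())
    return "homozygous" if len(alleles) == 1 else "heterozygous"


def is_homogeneous(group_genotypes: Iterable[str]) -> bool:
    """True if every member of a group has the same genotype at the locus.

    Allele order within a call is ignored, so "AG" and "GA" are equivalent.
    """
    normalized = {"".join(sorted(g.upper())) for g in group_genotypes}
    return len(normalized) == 1


print(zygosity("AA"))                  # homozygous
print(zygosity("AG"))                  # heterozygous
print(is_homogeneous(["AG", "GA"]))    # True
print(is_homogeneous(["AA", "AG"]))    # False
```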
The term “hybrid” refers to a variety or cultivar that is the result of a cross of plants of two different varieties. A hybrid, as described here, can refer to plants that are genetically different at any particular loci. A hybrid can further include a plant that is a variety that has been bred to have at least one different characteristic from the parent. An “F1 hybrid” refers to the first generation hybrid, “F2 hybrid” the second generation hybrid, “F3 hybrid” the third generation, and so on. A hybrid refers to any progeny that is either produced, or developed using research and development to create a new line having at least one distinct characteristic.
The terms "hybridizing specifically to", "specific hybridization", or “selectively hybridize to,” as used herein refer to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions. The term “stringent conditions” refers to conditions under which a nucleic acid will hybridize preferentially to a target sequence, and to a lesser extent to, or not at all to, other off-target sequences. A “stringent hybridization” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization (e.g., as in array, Southern or Northern hybridizations) are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids can be found in, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes part I, Ch. 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, N.Y. (“Tijssen”). Generally, highly stringent hybridization and wash conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on an array or on a filter in a Southern or Northern blot is 42 °C using standard hybridization solutions (see, e.g., Sambrook et al. (2014) Molecular Cloning: A Laboratory Manual (Fourth Edition, Cold Spring Harbor Laboratory Press, Plainview, N.Y.)).
As used herein, the term “inbreeding” refers to the production of offspring via the mating between relatives. The plants resulting from the inbreeding process are referred to herein as “inbred plants” or “inbreds.”
The terms “increase” or “decrease” refer to a positive (increase) or negative (decrease) difference relative to a reference value, such as a control. The difference can be quantitative. In some aspects, the difference is statistically significant (e.g., P-value less than 0.05 or 0.01). In some aspects, the difference is an increase relative to a control of at least 5%, such as at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 250%, at least 300%, at least 350%, at least 400%, at least 500%, or greater than 500%. In some aspects, the difference is a decrease relative to a control of at least 5%, such as at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, or 100%.
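The percentage thresholds above are expressed relative to a control value. A minimal sketch of that calculation is shown below; the function names, the 5% default threshold argument, and the example measurements are hypothetical and are not data from this disclosure.

```python
def percent_change(value: float, control: float) -> float:
    """Percent difference of `value` relative to `control`.

    Positive results indicate an increase, negative results a decrease.
    """
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (value - control) / control * 100.0


def is_increase(value: float, control: float, min_percent: float = 5.0) -> bool:
    """True if `value` exceeds `control` by at least `min_percent` percent."""
    return percent_change(value, control) >= min_percent


# Hypothetical terpene measurements (arbitrary units).
print(percent_change(1.5, 1.0))    # 50.0
print(percent_change(0.75, 1.0))   # -25.0
print(is_increase(1.5, 1.0, min_percent=20.0))  # True
```

Statistical significance (e.g., a P-value threshold) is a separate criterion and is not addressed by this sketch.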
The term "introduced" refers to the incorporation of a particular nucleic acid sequence or protein into a cell, for example, via transformation. “Transformation” encompasses all techniques by which a nucleic acid molecule or protein might be introduced into such a cell, including chemical methods (e.g., calcium-phosphate transfection), physical methods (e.g., electroporation, microinjection, particle bombardment), fusion (e.g., liposomes), lipofection, nucleofection, receptor-mediated endocytosis (e.g., DNA-protein complexes, viral envelope/capsid-DNA complexes), agrobacterium-mediated transformation, biolistics (particle gun accelerator or gene gun), or other transduction and/or transfection methods. Transformation can include stable transformation, where a nucleic acid fragment is incorporated into the genome of a host cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), or transient transformation, e.g., transformation of an autonomous replicon or other transient molecule (e.g., transfected mRNA). In some aspects, a genetic modification (e.g., a substitution, insertion, or deletion) is introduced through a gene editing technique, such as an RNAi, CRISPR/Cas9, ZFN, or TALEN based technique. In some aspects, a gene (or vector carrying a gene) is introduced into a cell by transformation, transfection, or transduction.
The terms “isolated” or “purified” in reference to biological components (such as nucleic acids, proteins, or cells) are components that have been substantially separated from other biological components in the environment in which the component occurs, e.g., separated from other chromosomal and extra-chromosomal DNA and RNA, proteins and/or cells. Nucleic acids and proteins that have been “isolated” include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
Absolute purity or isolation is not required; it is intended as a relative term. Thus, for example, a purified/isolated protein, nucleic acid, or cell preparation is one in which the protein, nucleic acid, or cell is more enriched than the protein, nucleic acid, or cell is in its initial environment. In one aspect, a preparation is purified/isolated such that the protein, nucleic acid, or cell represents at least 50% of the total content of the preparation. A substantially purified protein or nucleic acid is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, or at least 99% pure. Thus, in one specific, non-limiting example, a substantially purified protein or nucleic acid is 90% free of other components.
The term “line” is used broadly to include, but is not limited to, a group of plants vegetatively propagated from a single parent plant, via tissue culture techniques or a group of inbred plants which are genetically very similar due to descent from a common parent(s). A plant is said to “belong” to a particular line if it (a) is a primary transformant (T0) plant regenerated from material of that line; (b) has a pedigree comprised of a T0 plant of that line; or (c) is genetically very similar due to common ancestry (e.g., via inbreeding or selfing). In this context, the term “pedigree” denotes the lineage of a plant, e.g. in terms of the sexual crosses effected such that a gene or a combination of genes, in heterozygous (hemizygous) or homozygous condition, imparts a desired trait to the plant (e.g., modified terpene content).
The term “locus” refers to a position on a genome that corresponds to a measurable property, e.g., a trait. Thus, a “modified terpene content trait locus” as used herein is a position on the genome of a subject plant having genetic differences, in comparison to a reference genome that results in modified terpene content in comparison to the reference plant.
The term “marker,” “genetic marker,” or “molecular marker,” refers to a nucleotide sequence or encoded product thereof (e.g., a protein) used as a point of reference for identifying a linked locus. A marker can be derived from genomic nucleotide sequence or from expressed nucleotide sequences (e.g., from a spliced RNA, a cDNA, etc.), or from an encoded polypeptide, and can be represented by one or more particular variant sequences, or by a consensus sequence. A “marker probe” is a nucleic acid molecule that can be used to identify the presence of a marker, e.g., a nucleic acid probe that is complementary to a marker locus sequence. Alternatively, in some aspects, a marker probe refers to a probe of any type that is able to distinguish (i.e., genotype) the particular allele that is present at a marker locus. A “marker locus” is a locus that can be used to track the presence of a second linked locus, e.g., a linked locus that encodes or contributes to expression of a phenotypic trait. For example, a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL, that are genetically or physically linked to the marker locus. Thus, a “marker allele,” alternatively an “allele of a marker locus,” is one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus. Examples of markers include restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), microsatellite markers (e.g., SSRs), sequence-characterized amplified region (SCAR) markers, cleaved amplified polymorphic sequence (CAPS) markers, isozyme markers, or combinations of the markers described herein which define a specific genetic and chromosomal location.
The term “marker assisted selection” refers to the process of identifying, optionally followed by selecting, a plant from a group of plants using the presence of a molecular marker as a selection criterion. The process usually involves detecting the presence of a certain nucleic acid sequence or polymorphism in the genome of a plant.
The term “modified Cannabis plant” or “modified plant” refers to a plant that is not naturally occurring.
The term “nucleotide” refers to an organic molecule that serves as a monomeric unit of DNA and RNA. The nucleotide position is the position along a reference sequence wherein any particular monomeric unit of DNA or RNA is positioned relative to the other monomeric units of DNA or RNA.
The term “offspring” or “progeny” refers to a plant resulting from vegetative or sexual reproduction from one or more parent plants. For instance, an offspring/progeny plant may be obtained by cloning or selfing of a parent plant or by crossing two parent plants. An F1 is a first-generation offspring produced from parents at least one of which is used for the first time as donor of a trait, while offspring of second generation (F2) or subsequent generations (F3, F4, etc.) are specimens produced from selfings of F1's, F2's etc. An F1 may thus be (and usually is) a hybrid resulting from a cross between two true breeding parents (true-breeding is homozygous for a trait), while an F2 may be (and usually is) an offspring resulting from self-pollination.
The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment such that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of inducing expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
The term "plant" refers to a whole plant, cell, tissue, or other plant parts. Plant parts include any part(s) of a plant, including, for example and without limitation: seed (including mature seed and immature seed); a plant cutting; a plant cell; a plant cell culture; a plant organ (e.g., pollen, embryos, flowers, trichomes (e.g., glandular trichomes), fruits, shoots, leaves, roots, stems, and explants). Plant tissue refers to any tissue of a plant, including but not limited to, tissue from an embryo, shoot, root, stem, seed, stipule, leaf, trichome, petal, flower bud, flower, ovule, bract, branch, petiole, internode, bark, pubescence, tiller, rhizome, frond, blade, pollen, and stamen. A plant tissue or plant organ may be a seed, protoplast, callus, or any other group of plant cells that is organized into a structural or functional unit. A plant cell or tissue culture may be capable of regenerating a plant having the physiological and morphological characteristics of the plant from which the cell or tissue was obtained, and of regenerating a plant having substantially the same genotype as the plant. In contrast, some plant cells are not capable of being regenerated to produce plants. Regenerable cells in a plant cell or tissue culture may be embryos, protoplasts, meristematic cells, callus, pollen, leaves, anthers, roots, root tips, silk, flowers, kernels, ears, cobs, husks, or stalks. Plant parts include harvestable parts and parts useful for propagation of progeny plants. Plant parts useful for propagation include, for example and without limitation: seed; fruit; a cutting; a seedling; a tuber; and a rootstock. A harvestable part of a plant may be any useful part of a plant, including, for example and without limitation: flower; pollen; seedling; tuber; leaf; stem; fruit; seed; and root.
A plant cell is the structural and physiological unit of the plant. A plant cell may be in the form of an isolated single cell, or an aggregate of cells (e.g., a friable callus and a cultured cell), and may be part of a higher organized unit (e.g., a plant tissue, plant organ, and plant). Thus, a plant cell may be a protoplast, a gamete producing cell, or a cell or collection of cells that can regenerate into a whole plant. As such, a seed, which comprises multiple plant cells and is capable of regenerating into a whole plant, is considered a “plant cell.” Described herein are plants in the genus of Cannabis and plants derived therefrom, which can be produced by asexual or sexual reproduction.
The term “polymorphism” refers to a difference in the nucleotide or amino acid sequence of a given region as compared to a nucleotide or amino acid sequence in a homologous region of another individual, in particular, a difference in the nucleotide or amino acid sequence of a given region which differs between individuals of the same species. A polymorphism is generally defined in relation to a reference sequence. Unless indicated otherwise, the reference sequence is the Cannabis Abacus reference genome (version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1) or CDS produced from the Cannabis Abacus reference genome. Polymorphisms include single nucleotide differences, differences in sequence of more than one nucleotide, and single or multiple nucleotide insertions, inversions and deletions; as well as single amino acid differences, differences in sequence of more than one amino acid, and single or multiple amino acid insertions, inversions, and deletions.
The terms “polynucleotide,” “polynucleotide sequence,” “nucleotide sequence,” “nucleic acid sequence,” and “nucleic acid fragment,” are used interchangeably. These terms encompass polymers composed of nucleotide units (ribonucleotides, deoxyribonucleotides, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof). The term “oligonucleotide” typically refers to short polynucleotides, generally no greater than 150 nucleotides, for example, no greater than 125 nucleotides, no greater than 100 nucleotides, no greater than 75 nucleotides, no greater than 50 nucleotides, or no greater than 25 nucleotides. It will be understood that when a nucleic acid sequence is represented as a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which “U” replaces “T.” Nucleic acids can be single- or double-stranded. Exemplary nucleic acids include cDNA, genomic DNA, synthetic DNA, RNA, or mixtures thereof.
The term “polypeptide” or “protein” refers to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The term “amino acid residue” or “amino acid” includes reference to an amino acid that is incorporated into a protein, polypeptide, or peptide. The amino acid can be a naturally occurring amino acid and, unless otherwise limited, can encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids. As used herein, “recombinant” includes reference to a protein produced using cells that do not have, in their native state, an endogenous copy of the DNA able to express the protein. The cells produce the recombinant protein because they have been genetically altered by the introduction of the appropriate isolated nucleic acid sequence. The term also includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified.
The term "primer" as used herein refers to an oligonucleotide, either RNA or DNA, either single-stranded or double-stranded, either derived from a biological system, generated by restriction enzyme digestion, or produced synthetically which, when placed in the proper environment, is able to functionally act as an initiator of template-dependent nucleic acid synthesis. When presented with an appropriate nucleic acid template, suitable nucleoside triphosphate precursors of nucleic acids, a polymerase enzyme, suitable cofactors and conditions such as a suitable temperature and pH, the primer may be extended at its 3' terminus by the addition of nucleotides by the action of a polymerase or similar activity to yield a primer extension product. The primer may vary in length depending on the particular conditions and requirements of the application. For example, in diagnostic applications, the oligonucleotide primer is typically 15-25 or more nucleotides in length. The primer must be of sufficient complementarity to the desired template to prime the synthesis of the desired extension product, that is, to be able to anneal with the desired template strand in a manner sufficient to provide the 3’ hydroxyl moiety of the primer in appropriate juxtaposition for use in the initiation of synthesis by a polymerase or similar enzyme. It is not required that the primer sequence represent an exact complement of the desired template. For example, a non-complementary nucleotide sequence may be attached to the 5’ end of an otherwise complementary primer. Alternatively, non-complementary bases may be interspersed within the oligonucleotide primer sequence, provided that the primer sequence has sufficient complementarity with the sequence of the desired template strand to functionally provide a template-primer complex for the synthesis of the extension product.
The term “probe,” “nucleic acid probe,” or “oligonucleotide probe” as used herein, is one or more synthetic nucleic acid molecules that are complementary to a nucleic acid sequence of interest (target sequence), and hybridize to a sequence of interest when under hybridization conditions. Probes can be used to detect, analyze, and/or visualize the nucleic acid sequence of interest on a molecular level. Specific hybridization of a probe to a nucleic acid sequence of interest can be detected, for example, through a label on the probe. Probes
have a length suitable to achieve a desired specificity to the target sequence, however, are generally at least 10 nucleotides long, for example, at least 15 nucleotides, at least 20 nucleotides, or at least 50 nucleotides long. Probes can be immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array. The precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are “substantially identical” to the disclosed probes, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets as the probe from which they were derived. Such modifications are specifically covered by reference to the individual probes described herein.
The term “product” as used in reference to a Cannabis product, is a composition including Cannabis (or an extract thereof). Products include, but are not limited to: a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, tincture, or other compositions including Cannabis (e.g., a Cannabis plant disclosed herein, or an extract thereof).
The term “promoter” refers to a nucleic acid control sequence that directs transcription of a nucleic acid. A promoter includes necessary nucleic acid sequences near the start site of transcription, and may include distal enhancer or repressor elements. A “constitutive promoter” is a promoter that is continuously active and is not subject to regulation by external signals or molecules. In contrast, the activity of an “inducible promoter” is regulated by an external signal or molecule (for example, a transcription factor). Exemplary promoters include pol III promoters (e.g., U6), pol II promoter, ubiquitin promoter, Cauliflower Mosaic Virus (CaMV) 35S promoter, or RUBISCO promoter. The terms “initiate transcription,” “initiate expression,” “drive transcription,” and “drive expression” are used interchangeably herein and all refer to the primary function of a promoter.
The term “purified” as used herein relates to the isolation of a molecule or compound in a form that is substantially free of contaminants normally associated with the molecule or compound in a native or natural environment, or substantially enriched in concentration relative to other compounds present when the compound is first formed, and means having been increased in purity as a result of being separated from other components of the original composition. The term “purified nucleic acid” is used herein to describe a nucleic acid sequence which has been separated, produced apart from, or purified away from other biological compounds including, but not limited to polypeptides, lipids and carbohydrates, while effecting a chemical or functional change in the component (e.g., a nucleic acid may be purified from a chromosome by removing protein contaminants and breaking chemical bonds connecting the nucleic acid to the remaining DNA in the chromosome).
The term "quantitative trait loci" or "QTL" refers to the genetic elements controlling a quantitative trait.
The term “recombinant” refers to a nucleic acid or protein that has a sequence made by an artificial combination of two otherwise separated segments of sequence (e.g., a “chimeric” sequence). This artificial combination can be accomplished by chemical synthesis or by manipulation of isolated segments of nucleic acids, for example, by standard molecular biology techniques (e.g., cloning). A “recombinant expression construct” refers to an expression vector into which a nucleic acid sequence or fragment can be moved.
Preferably, it is a plasmid vector, or a fragment thereof, comprising a promoter. The choice of plasmid vector is dependent upon the method that will be used to transform host plants. Similarly, the genetic elements that must be present on the plasmid vector to successfully transform, select and propagate host cells containing the chimeric gene are dependent on the specific transformation method. Different independent transformation events typically result in different levels and patterns of expression and thus multiple events must be screened to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by PCR and Southern analysis of DNA, RT-PCR and Northern analysis of mRNA expression, Western analysis of protein expression, or phenotypic analysis.
The term “reference plant” or “reference genome” refers to a reference sequence that genetic markers or sequences of a test sample can be compared to in order to detect a modification of the sequence in the test sample. In some aspects, the reference plant or genome is Abacus (Csat_AbacusV2, NCBI assembly accession GCA_025232715.1).
The terms “sequence identity” or “percent identity” are used interchangeably to refer to a sequence comparison based on identical matches between correspondingly identical positions in two or more amino acid or nucleotide sequences that are being compared. The percent identity refers to the extent to which two optimally aligned polynucleotide or peptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids. Hybridization experiments and mathematical algorithms known in the art may be used to determine percent identity. Many mathematical algorithms exist as sequence alignment computer programs known in the art that calculate percent identity. These programs may be categorized as either global sequence alignment programs or local sequence alignment programs.
The NCBI Basic Local Alignment Search Tool (BLAST) tool is often used and is available from several sources, including the National Center for Biotechnology Information (blast.ncbi.nlm.nih.gov/Blast.cgi). Various types of BLAST are available, for example, blastp, blastn, blastx, tblastn and tblastx. A description of how to determine sequence identity using this program is available on the NCBI website and other resources. In some aspects, percent sequence identity is determined by using BLAST with default parameters.
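As a non-limiting sketch of the identity calculation described above, the following computes percent identity over two sequences that have already been aligned. It is not BLAST and does not reproduce BLAST scoring or default parameters. The function name `percent_identity` and the gap convention (a gap opposite a residue counts as a mismatch) are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues.

    Both inputs must already be aligned to the same length, with "-"
    marking gaps. Columns that are gaps in both sequences are skipped;
    a gap opposite a residue counts as a mismatch (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


# Ten aligned nucleotides with one mismatch at the fifth position.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")

# A gap opposite a residue lowers the identity.
gapped = percent_identity("AC-T", "ACGT")
```

Alignment programs may report identity over different windows (for example, over the aligned region only), so values from this sketch need not match those of any particular tool.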
The term “substantially similar” as used herein refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of nucleic acid fragments, such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. A “substantially homologous sequence” refers to variants of the disclosed sequences such as those that result from site-directed mutagenesis, as well as synthetically derived sequences. A substantially homologous sequence also refers to fragments of a particular promoter nucleotide sequence disclosed herein that operate to promote the constitutive expression of an operably linked heterologous nucleic acid fragment. These promoter fragments will include at least about 20 contiguous nucleotides, for example, at least 50 contiguous nucleotides, at least 75 contiguous nucleotides, or at
least 100 contiguous nucleotides of the particular promoter nucleotide sequence disclosed herein. The nucleotides of such fragments will usually comprise the TATA recognition sequence of the particular promoter sequence. Such fragments may be obtained by use of restriction enzymes to cleave the naturally occurring promoter nucleotide sequences disclosed herein; by synthesizing a nucleotide sequence from the naturally occurring promoter DNA sequence; or may be obtained through the use of PCR technology. Functional variants of these promoter fragments, such as those resulting from site-directed mutagenesis, are encompassed by the present disclosure.
The term “single nucleotide polymorphism (SNP)” refers to a change in which a single base in the DNA differs from the base at the corresponding position of a reference genome or sequence.
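For illustration only, single-base differences between a reference sequence and a sample sequence can be listed as follows. The sketch assumes the two sequences are the same length and in register (no insertions or deletions); the function name `find_snps` and the example sequences and coordinates are hypothetical.

```python
def find_snps(reference: str, sample: str, start: int = 1):
    """List (position, reference_base, sample_base) for each differing base.

    `start` is the 1-based coordinate of the first base of `reference`.
    Sequences must be equal in length and already in register.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (start + offset, ref_base, sample_base)
        for offset, (ref_base, sample_base) in enumerate(pairs)
        if ref_base != sample_base
    ]


# Hypothetical seven-base window beginning at coordinate 100.
snps = find_snps("GATTACA", "GACTACA", start=100)
```

Here the sample differs from the reference at one position, where the reference base T is replaced by C.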
The term "target region" or "nucleic acid target" refers to a nucleotide sequence that resides at a specific chromosomal location. The "target region" or "nucleic acid target" can be specifically recognized by a probe.
The term “terpene” refers to a class of secondary metabolite typically found in plants. Terpenes are hydrocarbons with small isoprene units linked to one another to form chains. Two types of terpenes/terpenoids commonly found in Cannabis include monoterpenes (10C; two isoprenes) and sesquiterpenes (15C; three isoprenes). Exemplary monoterpenes include limonene (e.g., L-limonene or D-limonene), myrcene (also referred to as β-myrcene), pinene (e.g., α-pinene or β-pinene), camphene, linalool, terpinolene, terpinene (e.g., α-terpinene or γ-terpinene), and ocimene (also referred to as β-ocimene). Exemplary sesquiterpenes include nerolidol (e.g., cis-nerolidol and/or trans-nerolidol), humulene (also referred to as α-humulene), guaiol, and caryophyllene (also referred to as β-caryophyllene). “Terpenoids” are oxygen-containing terpenes.
Terpene biosynthesis starts with common isoprenoid diphosphate precursors (5 carbon) through two biosynthetic pathways, the plastidial methylerythritol phosphate (MEP) pathway and the cytosolic mevalonate (MEV) pathway. Both the MEP and MEV pathways provide isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), which are condensed into longer-chain isoprenoid diphosphates that include geranyl diphosphate (GPP) and farnesyl diphosphate (FPP). Linear isoprenoid diphosphates are substrates for monoterpene synthases (mono-TPS) and sesquiterpene synthases (sesqui-TPS), respectively, which diversify these compounds through enzymatic modifications, such as hydroxylation, dehydrogenation, acylation, and glycosylation, resulting in the production of diverse mono- and sesquiterpenes. GPP is also a building block of cannabinoid biosynthesis.
The term “transformant” refers to a cell, tissue or organism that has undergone transformation. The original transformant is designated as “T0” or “T₀.” Selfing the T0 produces a first transformed generation designated as “T1” or “T₁.”
The term “transgenic” refers to any cell, cell line, callus, tissue, plant part or plant, the genome of which has been altered by the presence of a heterologous nucleic acid, such as a recombinant DNA construct, including those initial transgenic events as well as those created by sexual crosses or asexual propagation from the initial transgenic event. The term “transgenic” as used herein does not encompass the alteration of the genome
(chromosomal or extra-chromosomal) by conventional plant breeding methods or by naturally occurring events such as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.
The term “variety” as used herein has identical meaning to the corresponding definition in the International Convention for the Protection of New Varieties of Plants (UPOV treaty), of Dec. 2, 1961, as Revised at Geneva on Nov. 10, 1972, on Oct. 23, 1978, and on Mar. 19, 1991. Thus, “variety” means a plant grouping within a single botanical taxon of the lowest known rank, which grouping, irrespective of whether the conditions for the grant of a breeder’s right are fully met, can be i) defined by the expression of the characteristics resulting from a given genotype or combination of genotypes, ii) distinguished from any other plant grouping by the expression of at least one of the said characteristics and iii) considered as a unit with regard to its suitability for being propagated unchanged.
The term “vector” refers to a nucleic acid molecule that can be introduced into a host cell (for example, by transformation), thereby producing a transformed host cell. A vector can include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. Recombinant DNA vectors are vectors containing recombinant DNA. A vector can also include one or more selectable marker genes and other genetic elements. Often vectors are DNA plasmids, however, they can also be viral vectors (DNA or RNA), cosmids, or artificial chromosomes.
III. Abbreviations
FPP farnesyl diphosphate
GC gas chromatography
GPP geranyl diphosphate
MEP methylerythritol phosphate pathway
MEV mevalonic acid pathway
NAM nested association mapping
SNP single nucleotide polymorphism
TPS terpene synthases
QC quality control
CBCV cannabichromevarin
CBCVA cannabichromevarinic acid
CBDV cannabidivarin
CBDVA cannabidivarinic acid
CBGV cannabigerivarin
CBGVA cannabigerivarinic acid
THCV tetrahydrocannabivarin
THCVA tetrahydrocannabivarinic acid
IV. Methods of Selecting and/or Producing Cannabis Plants
Disclosed are methods of producing one or more Cannabis plants having modified terpene content, comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm; (ii) detecting one or more markers that indicate modified terpene content; (iii) crossing the Cannabis plant comprising the one or more markers; and (iv) obtaining one or more progeny plants comprising the one or more markers; wherein the one or more progeny plants have modified terpene content relative to a control, thereby producing one or more Cannabis plants having modified terpene content.
Also disclosed are methods of identifying or selecting a Cannabis plant having modified terpene content, comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) identifying or selecting the Cannabis plant, thereby identifying or selecting the Cannabis plant having modified terpene content. In some aspects, the Cannabis plant having modified terpene content is selected for further analysis, propagation, crossing, or to make a product. In some aspects, the method further includes crossing the Cannabis plant having modified terpene content and producing one or more progeny plants having modified terpene content.
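The analyzing, detecting, and selecting steps above can be sketched, in a non-limiting way, as a filter over genotype calls. Everything named here is hypothetical and chosen only for illustration: the `Plant` record, the `select_plants` function, the marker identifiers, the genotype calls, and the `min_hits` threshold. None of them is taken from the disclosed markers.

```python
from dataclasses import dataclass


@dataclass
class Plant:
    name: str
    genotypes: dict  # marker ID -> genotype call, e.g. {"marker_1": "A/A"}


def select_plants(plants, favorable, min_hits=1):
    """Return names of plants with at least `min_hits` favorable genotypes.

    `favorable` maps each marker ID to the set of genotype calls taken,
    for this sketch, to indicate modified terpene content.
    """
    selected = []
    for plant in plants:
        hits = sum(
            1
            for marker, wanted in favorable.items()
            if plant.genotypes.get(marker) in wanted
        )
        if hits >= min_hits:
            selected.append(plant.name)
    return selected


# Hypothetical marker IDs and genotype calls, for illustration only.
favorable = {"marker_1": {"A/A", "A/G"}, "marker_2": {"T/T"}}
plants = [
    Plant("plant_a", {"marker_1": "A/A", "marker_2": "C/T"}),
    Plant("plant_b", {"marker_1": "G/G", "marker_2": "C/C"}),
    Plant("plant_c", {"marker_1": "A/G", "marker_2": "T/T"}),
]

chosen = select_plants(plants, favorable)               # any one marker
strict = select_plants(plants, favorable, min_hits=2)   # both markers
```

A selected plant could then be carried forward for further analysis, propagation, or crossing, as described above.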
Terpene content can be measured using standard analytical techniques, e.g., gas chromatography and/or HPLC with mass-spectrometry. Modified terpene content can be determined, for example, as a difference in terpene content relative to a control/reference, e.g., a Cannabis plant not having the one or more markers that indicate modified terpene content. In some aspects, the terpene content is modified in flowers or inflorescence, for example, in female flowers or female inflorescence. In some aspects, the terpene content is modified in trichomes (e.g., glandular trichomes). In further aspects, the terpene content is modified in leaf or other vegetative tissue. In several aspects, the modified terpene content is an increase in terpene content relative to a suitable control (e.g., a sample from a Cannabis plant not having the one or more markers that indicate modified terpene content). In some implementations, the Cannabis plant is Cannabis sativa, Cannabis indica, or Cannabis ruderalis. In a non-limiting example, the Cannabis plant is Cannabis sativa.
Modified terpene content as used herein can include, for example, modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol levels. In some aspects, modified terpene content includes or consists of modified total terpenes; modified total monoterpenes; modified total monoterpenes absent beta-myrcene; modified total sesquiterpenes; modified alpha-pinene; modified beta-pinene; modified alpha-terpinene, gamma-terpinene, and terpinolene; modified beta-myrcene to total monoterpene ratio; modified beta-ocimene; modified camphene and
D-limonene; modified linalool and trans-nerolidol; modified alpha-humulene and beta-caryophyllene; and/or modified guaiol. In some aspects, the modified terpene content is modified total terpene content. In some aspects, the modified terpene content is modified total monoterpene content. In some aspects, the modified terpene content is modified total monoterpene content absent beta-myrcene. In some aspects, the modified terpene content is modified total sesquiterpene content. In some aspects, the modified terpene content is modified alpha-pinene content. In some aspects, the modified terpene content is modified beta-pinene content. In some aspects, the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content. In some aspects, the modified terpene content is modified beta-myrcene to monoterpene content ratio ((beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1)). In some aspects, the modified terpene content is modified beta-ocimene content. In some aspects, the modified terpene content is modified camphene and D-limonene content. In some aspects, the modified terpene content is modified linalool and trans-nerolidol content. In some aspects, the modified terpene content is modified alpha-humulene and beta-caryophyllene content. In some aspects, the modified terpene content is modified guaiol content.
In some implementations, a plant produced or selected by a method disclosed herein includes a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 0.1% by weight, for example, at least 0.2%, at least 0.3%, at least 0.4%, at least 0.5%, at least 0.6%, at least 0.7%, at least 0.8%, at least 0.9%, at least 1.0%, at least 1.2%, at least 1.4%, at least 1.5%, at least 1.75%, at least 2.0%, at least 2.5%, at least 3%, at least 3.5%, at least 4%, at least 4.5%, at least 5%, at least 5.5%, at least 6%, at least 6.5%, at least 7%, at least 7.5%, at least 8%, at least 8.5%, at least 9%, at least 9.5%, at least 10%, or more, by weight in at least one plant part (e.g., leaves, flowers, or trichomes). In a non-limiting example, the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 0.2% by weight. In another non-limiting example, the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 1.0% by weight.
In a further non-limiting example, the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 3% by weight. In another non-limiting example, the plant has a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of at least 5% by weight.
In some implementations, a plant produced or selected by a method disclosed herein includes a terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) of 0.1% to 10% by weight, for example, 0.1% to 9%, 0.1% to 8%, 0.1% to 7%, 0.1% to 6%, 0.1% to 5%, 0.1% to 4%, 0.1% to 3%, 0.1% to 2%, 0.1% to 1%, 0.2% to 10%, 0.2% to 9%, 0.2% to 8%, 0.2% to 7%, 0.2% to 6%, 0.2% to 5%, 0.2% to 4%, 0.2% to 3%, 0.2% to 2%, 0.2% to 1%, 0.5% to 10%, 0.5% to 9%, 0.5% to 8%, 0.5% to 7%, 0.5% to 6%, 0.5% to 5%, 0.5% to 4%, 0.5% to 3%, 0.5% to 2%, 0.5% to 1%, 1% to 10%, 1% to 9%, 1% to 8%, 1% to 7%, 1% to 6%, 1% to 5%, 1% to 4%, 1% to 3%, 1% to 2%, 2% to 10%, 2% to 9%, 2% to 8%, 2% to 7%, 2% to 6%, 2% to 5%, 2% to 4%, 2% to 3%, 3% to 10%, 3% to 9%, 3% to 8%, 3% to 7%, 3% to 6%, 3% to 5%, 3% to 4%, 4% to 10%, 4% to 9%, 4% to 8%, 4% to 7%, 4% to 6%, 4% to 5%, 5% to 10%, 5% to 9%, 5% to 8%, 5% to 7%, 5% to 6%, 6% to 10%, 6% to 9%, 6% to 8%, 6% to 7%, 7% to 10%, 7% to 9%, 7% to 8%, 8% to 10%, 8% to 9%, or 9% to 10% terpene content by weight in at least one plant part (e.g., leaves, flowers, or trichomes). In some aspects, the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 0.2% to 7% by weight. In some aspects, the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol,
alpha-humulene, beta-caryophyllene, and/or guaiol content) is 0.5% to 7% by weight. In some aspects, the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 0.5% to 3% by weight. In further aspects, the terpene content (e.g., total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol content) is 1% to 7% by weight.
In some implementations, a plant produced or selected by the methods disclosed herein includes a total terpene, total monoterpene, and/or total sesquiterpene content of at least 0.1% by weight, for example, at least 0.2%, at least 0.5%, at least 0.75%, at least 1.0%, at least 1.5%, at least 2.0%, at least 2.5%, at least 3%, at least 3.5%, at least 4%, at least 4.5%, at least 5%, at least 5.5%, at least 6%, at least 6.5%, at least 7%, at least 7.5%, at least 8%, at least 8.5%, at least 9%, at least 9.5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 25%, or more, by weight in at least one plant part (e.g., leaves, flowers, or trichomes). In a non-limiting example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 0.2% by weight. In another non-limiting example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content
of at least 1% by weight. In a further example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 3% by weight. In another example, the plant has a total terpene, total monoterpene, and/or total sesquiterpene content of at least 5% by weight.
In some implementations, a plant produced or selected by a method disclosed herein includes a total terpene, total monoterpene, and/or total sesquiterpene content of 0.1% to 10% by weight, for example, 0.1% to 9%, 0.1% to 8%, 0.1% to 7%, 0.1% to 6%, 0.1% to 5%, 0.1% to 4%, 0.1% to 3%, 0.1% to 2%, 0.1% to 1%, 0.2% to 10%, 0.2% to 9%, 0.2% to 8%, 0.2% to 7%, 0.2% to 6%, 0.2% to 5%, 0.2% to 4%, 0.2% to 3%, 0.2% to 2%, 0.2% to 1%, 0.5% to 10%, 0.5% to 9%, 0.5% to 8%, 0.5% to 7%, 0.5% to 6%, 0.5% to 5%, 0.5% to 4%, 0.5% to 3%, 0.5% to 2%, 0.5% to 1%, 1% to 10%, 1% to 9%, 1% to 8%, 1% to 7%, 1% to 6%, 1% to 5%, 1% to 4%, 1% to 3%, 1% to 2%, 2% to 10%, 2% to 9%, 2% to 8%, 2% to 7%, 2% to 6%, 2% to 5%, 2% to 4%, 2% to 3%, 3% to 10%, 3% to 9%, 3% to 8%, 3% to 7%, 3% to 6%, 3% to 5%, 3% to 4%, 4% to 10%, 4% to 9%, 4% to 8%, 4% to 7%, 4% to 6%, 4% to 5%, 5% to 10%, 5% to 9%, 5% to 8%, 5% to 7%, 5% to 6%, 6% to 10%, 6% to 9%, 6% to 8%, 6% to 7%, 7% to 10%, 7% to 9%, 7% to 8%, 8% to 10%, 8% to 9%, or 9% to 10% content by weight in at least one plant part (e.g., leaves, flowers, or trichomes). In some aspects, the total terpene, total monoterpene, and/or total sesquiterpene content is 0.5% to 7%. In some aspects, the total terpene, total monoterpene, and/or total sesquiterpene content is 0.5% to 5%. In some aspects, the total terpene, total monoterpene, and/or total sesquiterpene content is 1% to 7%.
A measure of % by weight in a method disclosed herein can be % by dry weight or % by fresh weight. In several aspects, the % by weight is % by dry weight.
In some aspects, the modified beta-myrcene to monoterpene content ratio ((beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1)) is at least 0.75, for example, at least 0.8, at least 0.9, at least 1.0, at least 1.1, at least 1.2, at least 1.3, at least 1.4, at least 1.5, at least 1.6, at least 1.7, at least 1.8, at least 1.9, at least 2.0, or more. In some aspects, the modified beta-myrcene to monoterpene content ratio is 0.75 to 5, for example, 0.75 to 4, 0.75 to 3, 0.75 to 2, 0.75 to 1.5, 0.75 to 1.4, 0.75 to 1.3, 0.75 to 1.2, 0.75 to 1.1, 0.75 to 1.0, 0.8 to 4, 0.8 to 3, 0.8 to 2, 0.8 to 1.5, 0.8 to 1.4, 0.8 to 1.3, 0.8 to 1.2, 0.8 to 1.1, 0.8 to 1.0, 1 to 4, 1 to 3, 1 to 2, 1 to 1.5, 1 to 1.4, 1 to 1.3, 1 to 1.2, 1 to 1.1, 1.2 to 4, 1.2 to 3, 1.2 to 2, 1.2 to 1.5, 1.2 to 1.4, or 1.2 to 1.3. In some aspects, the modified beta-myrcene to monoterpene content ratio is about 0.75 to 1.3. In some aspects, the modified beta-myrcene to monoterpene content ratio is about 1.0 to about 1.4.
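As a worked illustration of the ratio defined above, the following minimal Python sketch evaluates (beta-myrcene + 1)/((total monoterpenes - beta-myrcene) + 1). The function name and the example values are hypothetical and are not taken from the disclosure; both inputs are assumed to be expressed in the same unit (for example, % by dry weight).

```python
def modified_myrcene_ratio(beta_myrcene: float, total_monoterpenes: float) -> float:
    """Return (beta-myrcene + 1) / ((total monoterpenes - beta-myrcene) + 1).

    Both arguments are assumed to share one unit (e.g., % by dry weight).
    """
    if beta_myrcene < 0 or total_monoterpenes < beta_myrcene:
        raise ValueError("expected 0 <= beta_myrcene <= total_monoterpenes")
    other_monoterpenes = total_monoterpenes - beta_myrcene
    return (beta_myrcene + 1.0) / (other_monoterpenes + 1.0)


# Hypothetical sample: 0.5% beta-myrcene out of 1.0% total monoterpenes.
example_ratio = modified_myrcene_ratio(0.5, 1.0)
```

Because 1 is added to both the numerator and the denominator, the ratio remains defined even when a sample contains no monoterpenes other than beta-myrcene.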
The plant part can be any part of the plant selected or produced by the methods disclosed herein. In some aspects, the plant part is a flower (e.g., a female flower) or inflorescence tissue. In some aspects, the plant part is a trichome (e.g., glandular trichomes). In further aspects, the plant part is a leaf or other vegetative tissue.
In some aspects, the one or more genetic markers that indicate modified terpene content include or consist of one or more genetic markers disclosed herein, for example, one or more genetic markers described in Table 15.
In some aspects, the genetic marker is a polymorphism (e.g., SNP) found within one or more of the following haplotypes:
(a) the region on chromosome 1 :
(1) between positions 8842738 and 8882848; (2) between positions 8882848 and 8903959; (3) between positions 9095076 and 9416399; (4) between positions 10443667 and 10451533; (5) between positions 10528781 and 10557440; (6) between positions 10559732 and 10564751; (7) between positions 10624505 and 10758069; (8) between positions 10903728 and 10992149; (9) between positions 11166388 and 11189474; or (10) between positions 13919727 and 13925024; or
(b) the region on chromosome 2:
(1) between positions 93289429 and 93293528; or
(c) the region on chromosome 3:
(1) between positions 47131547 and 47202043; or
(d) the region on chromosome 4:
(1) between positions 72692194 and 72730265; or
(e) the region on chromosome 5:
(1) between positions 306727 and 345147; (2) between positions 510491 and 556304; (3) between positions 510491 and 556304; (4) between positions 510491 and 556304; (5) between positions 602915 and 611014; (6) between positions 753271 and 759520; (7) between positions 1093370 and 1105891; (8) between positions 1105891 and 1130028; (9) between positions 1226772 and 1338718; (10) between positions 1338718 and 1366137; (11) between positions 1338718 and 1374231; (12) between positions 1374231 and 1391376; (13) between positions 1471890 and 1492255; (14) between positions 1737866 and 1796267; (15) between positions 1806325 and 1831524; (16) between positions 1831524 and 1840325; (17) between positions 1837343 and 1879732; (18) between positions 1909913 and 1965657; (19) between positions 2000572 and 2011766; (20) between positions 2011766 and 2065182; (21) between positions 2038965 and 2080946; (22) between positions 2102870 and 2132683; (23) between positions 2177531 and 2288919; (24) between positions 2208629 and 2296380; (25) between positions 2208629 and 2296380; (26) between positions 2291467 and 2312553; (27) between positions 2312553 and 2341119; (28) between positions 2334487 and 2341119; (29) between positions 2341119 and 2360380; (30) between positions 2350756 and 2364964; (31) between positions 2360380 and 2366529; (32) between positions 2364964 and 2534579; (33) between positions 2366529 and 2698301; (34) between positions 3454995 and 3493107; (35) between positions 2766141 and 2788084; (36) between positions 2766141 and 2812259; (37) between positions 3074649 and 3086874; (38) between positions 3240926 and 3287072; (39) between positions 3454995 and 3493107; (40) between positions 3495386 and 3513857; (41) between positions 3548727 and 3573470; (42) between positions 3580179 and 3599637; (43) between positions 3580179 and 3604863; (44) between positions 3610795 and 3673686; (45) between positions 3699003 and 3745325; (46) between positions 3757096 and 3812523; (47) between positions 3837020 and 3864286; (48) between positions 4376633 and 4407878; (49) between positions 4384123 and 4407878; (50) between positions 5847962 and 5902323; (51) between positions 7299267 and 7334907; (52) between positions 11977355 and 11995991; (53) between positions 12356226 and 12443117; (54) between positions 12444383 and 12472642; (55) between positions 32322620 and 32387532; (56) between positions 32390660 and 32431549; (57) between positions 55471148 and 55480894; (58) between positions 62899638 and 62917837; (59) between positions 66462287 and 66482542; or (60) between positions 74391487 and 74399760; or
(f) the region on chromosome 6:
(1) between positions 1203540 and 1221496; (2) between positions 1244898 and 1477638; (3) between positions 1692286 and 1701837; (4) between positions 1714540 and 1740247; (5) between positions 1952568 and 1999618; (6) between positions 1960918 and 2060506; (7) between positions 1960918 and 2060506; (8) between positions 2918399 and 2945707; (9) between positions 3071457 and 3083137; (10) between positions 3083137 and 3092130; (11) between positions 3154063 and 3188084; (12) between positions 5135593 and 5193113; (13) between positions 5462269 and 5488226; (14) between positions 5859221 and 5875355; (15) between positions 6051132 and 6104696; (16) between positions 6104696 and 6124684; (17) between positions 6296168 and 6319308; (18) between positions 6585819 and 6608294; or (19) between positions 64927273 and 64988035; or
(g) the region on chromosome 8:
(1) between positions 14058913 and 14074065; (2) between positions 14314523 and 14339300;
(3) between positions 14851865 and 14891255; or (4) between positions 33582608 and 33596624; or
(h) the region on chromosome 9:
(1) between positions 55096943 and 55124006; or
(i) the region on chromosome X:
(1) between positions 57886112 and 57917472; or (2) between positions 58464133 and 58548870; wherein the reference genome is Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, the methods disclosed herein detect a haplotype associated with modified terpene content, or a haplotype that contains a terpene trait locus. In some aspects, the genetic marker is genetically linked to a terpene trait locus.
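The interval test implied by the haplotype list above can be sketched as follows. This is a minimal Python illustration under stated assumptions: only a few of the listed intervals are reproduced, "between positions" is assumed to include both endpoints, and the names `HAPLOTYPES` and `haplotypes_containing` are hypothetical.

```python
# A small subset of the haplotype intervals listed above, keyed by chromosome
# (coordinates per Csat_AbacusV2, GCA_025232715.1). Not the full list.
HAPLOTYPES = {
    "2": [(93289429, 93293528)],
    "3": [(47131547, 47202043)],
    "4": [(72692194, 72730265)],
    "9": [(55096943, 55124006)],
}


def haplotypes_containing(chrom, pos):
    """Return the listed intervals on `chrom` that contain `pos`.

    Assumption: interval endpoints are treated as inclusive.
    """
    return [(start, end) for start, end in HAPLOTYPES.get(chrom, [])
            if start <= pos <= end]


# Position 93,291,929 on chromosome 2 falls inside the chromosome 2 interval.
hits = haplotypes_containing("2", 93291929)
```

A polymorphism whose position returns a non-empty list lies within one of the listed regions; an empty list means it lies outside the subset encoded here, not necessarily outside every region in the disclosure.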
In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least one SNP disclosed herein, for example, in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively. In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least two SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively. In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least three SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively. In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting at least five SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively. In some aspects, analyzing or detecting one or more genetic markers that indicate modified terpene content includes analyzing or detecting all of the SNPs described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 15, or Table 16, respectively.
Combinations of SNP markers disclosed herein can be useful, for example, for screening increased levels of multiple specific terpenes of interest. While any combination of SNPs disclosed herein could be useful, an exemplary subset of SNPs is provided in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least one SNP from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least two SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least three SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least four SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least five SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least six SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least seven SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least eight SNPs from the list of SNPs disclosed in Table 16. 
In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting at least nine SNPs from the list of SNPs disclosed in Table 16. In some aspects, analyzing or detecting one or more genetic markers in any of the methods disclosed here includes analyzing or detecting all the SNPs from the list of SNPs disclosed in Table 16.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes analyzing one or more of nucleotide positions:
8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; or 13,920,896 on chromosome 1;
93,291,929 on chromosome 2;
47,140,085 on chromosome 3;
72,717,623 on chromosome 4;
330,918; 516,340; 518,238; 523,626; 608,718; 755,967; 1,100,981; 1,109,162; 1,331,433; 1,353,878; 1,366,137; 1,386,965; 1,487,633; 1,745,101; 1,828,050; 1,837,343; 1,840,325; 1,929,134; 2,003,303; 2,038,965;
2,072,869; 2,120,881; 2,208,629; 2,288,919; 2,291,467; 2,302,063; 2,318,276; 2,339,956; 2,346,000; 2,360,380;
2,364,964; 2,366,529; 2,534,579; 2,698,301; 2,774,108; 2,780,345; 3,081,773; 3,247,341; 3,485,895; 3,503,143;
3,564,387; 3,585,965; 3,599,637; 3,629,225; 3,704,632; 3,807,710; 3,842,906; 4,384,123; 4,391,586; 5,854,661;
7,307,552; 11,993,646; 12,418,741; 12,446,524; 32,342,917; 32,395,736; 55,475,322; 62,911,168; 66,477,802; or 74,391,606 on chromosome 5;
1,220,207; 1,288,012; 1,695,817; 1,727,397; 1,960,918; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; 6,311,954; 6,589,961; or 64,943,914 on chromosome 6;
14,069,586; 14,329,191; 14,866,064; or 33,592,849 on chromosome 8;
55,114,152 on chromosome 9; or
57,912,635 or 58,545,628 on chromosome X, wherein the positions are in reference to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In several implementations, the one or more nucleic acid polymorphisms are beneficial polymorphisms associated with increased terpene content in Cannabis.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 2 genetic markers, for example, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more markers. In a non-limiting example, analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 3 genetic markers. In another non-limiting example, analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 5 genetic markers. In another non-limiting example, analyzing one or more genetic markers in the nucleic acid sample includes analyzing at least 7 genetic markers. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes analyzing 2 to 50 genetic markers, for example, 2 to 40, 2 to 30, 2 to 20, 2 to 10, 5 to 50, 5 to 40, 5 to 30, 5 to 20, 5 to 10, 10 to 50, 10 to 40, 10 to 30, or 10 to 20 genetic markers. In a non-limiting example, 2 to 10 genetic markers (e.g., SNPs) are analyzed. In some aspects, the one or more genetic markers are genetically linked to a terpene trait locus.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome X; and/or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total terpene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total monoterpene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 3,081,773 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total monoterpenes absent beta-myrcene.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5; 1,695,817; 1,727,397; 1,960,918; or 5,175,087 on chromosome 6; or 14,069,586; 14,329,191; and/or 14,866,064 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total sesquiterpene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 1,109,162; 1,366,137; 2,346,000; 2,534,579; 3,247,341; 3,503,143; and/or 3,629,225 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,534,579 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-pinene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 608,718; 1,109,162; 1,366,137; 1,386,965; 2,003,303; 3,247,341; and/or 3,704,632 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 1,366,137 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified beta-pinene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,698,301 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 330,918; 1,353,878; 1,745,101; 1,828,050; 1,929,134; 2,072,869; 2,339,956; 3,081,773; 3,564,387; and/or 3,585,965 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 1,929,134 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified monoterpene to beta-myrcene content ratio.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,038,965 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified beta-ocimene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 2,288,919 and/or 2,774,108 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified camphene and D-limonene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 10,633,191 on chromosome 1, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified linalool and trans-nerolidol content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 5,175,087 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-humulene and beta-caryophyllene content.
In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing one or more of nucleotide positions 1,220,207; 1,288,012; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 6,311,954; and/or 6,589,961 on chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, analyzing one or more genetic markers in the nucleic acid sample includes or consists of analyzing nucleotide position 6,311,954 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified guaiol content.
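For orientation, the single positions singled out for individual traits in the preceding paragraphs can be collected into a lookup table. The Python sketch below is illustrative only: the dictionary name, the function names, and the use of plain strings as trait keys are assumptions, and only the single-position examples are encoded, not the longer "one or more of" lists.

```python
# Trait -> single nucleotide position(s) named for that trait in the
# preceding paragraphs, as (chromosome, position) in Csat_AbacusV2 coordinates.
SINGLE_POSITION_EXAMPLES = {
    "total monoterpenes absent beta-myrcene": [("5", 3081773)],
    "alpha-pinene": [("5", 2534579)],
    "beta-pinene": [("5", 1366137)],
    "alpha-terpinene, gamma-terpinene, and terpinolene": [("5", 2698301)],
    "monoterpene to beta-myrcene content ratio": [("5", 1929134)],
    "beta-ocimene": [("5", 2038965)],
    "camphene and D-limonene": [("5", 2288919), ("5", 2774108)],
    "linalool and trans-nerolidol": [("1", 10633191)],
    "alpha-humulene and beta-caryophyllene": [("6", 5175087)],
    "guaiol": [("6", 6311954)],
}


def positions_for(trait):
    """Return the (chromosome, position) pairs encoded for a trait."""
    return list(SINGLE_POSITION_EXAMPLES.get(trait, []))


def traits_for(chrom, pos):
    """Return every encoded trait whose example positions include (chrom, pos)."""
    return [trait for trait, sites in SINGLE_POSITION_EXAMPLES.items()
            if (chrom, pos) in sites]
```

A screening script could call `positions_for("guaiol")` to decide which site to genotype, or `traits_for("5", 2774108)` to look up the trait an observed site relates to in this table.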
In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting one or more of the following SNPs:
Chromosome 1:
(a) a T/T or C/T genotype at position 8871401; (b) a G/G or T/G genotype at position 8886933;
(c) a T/T or G/T genotype at position 9101934; (d) a G/G or C/G genotype at position 10446475; (e) a A/A or C/A genotype at position 10543062; (f) a G/G or T/G genotype at position 10561778; (g) a C/C or C/A genotype at position 10633191; (h) a A/A or T/A genotype at position 10934458; (i) a A/A or G/A genotype at position 11169492; (j) a C/C or T/C genotype at position 13920896;
Chromosome 2:
(k) a A/A or T/A genotype at position 93291929;
Chromosome 3:
(l) a C/C or A/C genotype at position 47140085;
Chromosome 4:
(m) a A/A or C/A genotype at position 72717623;
Chromosome 5:
(n) a A/A or G/A genotype at position 330918; (o) a C/C or T/C genotype at position 516340;
(p) a G/G or G/A genotype at position 518238; (q) a T/T or T/C genotype at position 523626; (r) a C/C or T/C genotype at position 608718; (s) a T/T or G/T genotype at position 755967; (t) a T/T or C/T genotype at position 1100981; (u) a T/T or G/T genotype at position 1109162; (v) a T/T or C/T genotype at position 1331433; (w) a G/G or C/G genotype at position 1353878; (x) a A/A or G/A genotype at position 1366137; (y) a C/C or T/C genotype at position 1386965; (z) a C/C or C/T genotype at position 1487633; (aa) a T/T or T/A genotype at position 1745101; (ab) a G/G or A/G genotype at position 1828050; (ac) a G/G or A/G genotype at position 1837343; (ad) a T/T or C/T genotype at position 1840325; (ae) a T/T or T/C genotype at position 1929134; (af) a A/A or G/A genotype at position 2003303; (ag) a C/C or T/C genotype at position 2038965; (ah) a T/T or C/T genotype at position 2072869; (ai) a C/C or C/T genotype at position 2120881; (aj) a G/G or G/C genotype at position 2208629; (ak) a T/T or A/T genotype at position 2288919; (al) a T/T or T/A genotype at position 2291467; (am) a A/A or C/A genotype at position 2302063; (an) a A/A or G/A genotype at position 2318276; (ao) a T/T or T/C genotype at position 2339956; (ap) a A/A or A/C genotype at position 2346000; (aq) a G/G or G/T genotype at position 2360380; (ar) a T/T or T/C genotype at position 2364964; (as) a T/T or C/T genotype at position 2366529; (at) a A/A or A/T genotype at position 2534579; (au) a G/G or A/G genotype at position 2698301; (av) a A/A or G/A genotype at position 2774108; (aw) a G/G or A/G genotype at position 2780345; (ax) a T/T or C/T genotype at position 3081773; (ay) a C/C or T/C genotype at position 3247341; (az) a C/C or T/C genotype at position 3485895; (ba) a C/C or T/C genotype at position 3503143; (bb) a A/A or A/T genotype at position 3564387; (bc) a A/A or A/T genotype at position 3585965; (bd) a A/A or G/A genotype at position 3599637; (be) a A/A or G/A genotype at position 3629225; (bf) a C/C or T/C genotype at position 3704632; (bg) a T/T or T/C genotype at position 3807710; (bh) a T/T or T/A genotype at position 3842906; (bi) a T/T or A/T genotype at position 4384123; (bj) a G/G or G/A genotype at position 4391586; (bk) a A/A or G/A genotype at position 5854661; (bl) a T/T or C/T genotype at position 7307552; (bm) a T/T or T/A genotype at position 11993646; (bn) a T/T or C/T genotype at position 12418741; (bo) a T/T or A/T genotype at position 12446524; (bp) a T/T or A/T genotype at position 32342917; (bq) a T/T or C/T genotype at position 32395736; (br) a G/G or A/G genotype at position 55475322; (bs) a A/A or T/A genotype at position 62911168; (bt) a G/G or A/G genotype at position 66477802; (bu) a C/C or C/T genotype at position 74391606;
Chromosome 6:
(bv) a C/C or A/C genotype at position 1220207; (bw) a T/T or C/T genotype at position 1288012; (bx) a C/C or G/C genotype at position 1695817; (by) a T/T or C/T genotype at position 1727397; (bz) a G/G or A/G genotype at position 1960918; (ca) a A/A or G/A genotype at position 1999618; (cb) a C/C or T/C genotype at position 2012149; (cc) a A/A or T/A genotype at position 2931923; (cd) a C/C or T/C genotype at position 3073845; (ce) a G/G or A/G genotype at position 3091941; (cf) a C/C or T/C genotype at position 3185660; (cg) a T/T or A/T genotype at position 5175087; (ch) a A/A or A/G genotype at position 5468920; (ci) a A/A or G/A genotype at position 5868053; (cj) a C/C or C/A genotype at position 6061359; (ck) a T/T or C/T genotype at position 6120135; (cl) a C/C or G/C genotype at position 6311954; (cm) a G/G or T/G genotype at position 6589961; (cn) a T/T or G/T genotype at position 64943914;
Chromosome 8:
(co) a A/A or A/T genotype at position 14069586; (cp) a G/G or A/G genotype at position 14329191; (cq) a T/T or C/T genotype at position 14866064; (cr) a C/C or T/C genotype at position 33592849;
Chromosome 9:
(cs) a C/C or G/C genotype at position 55114152;
Chromosome X:
(ct) a C/C or T/C genotype at position 57912635; or (cu) a T/T or T/C genotype at position 58545628.
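The genotype comparison implied by the list above can be sketched as follows. This minimal Python illustration encodes only six of the listed SNPs; the names `LISTED_GENOTYPES` and `has_listed_genotype` are hypothetical, and genotypes are assumed to be unphased, so that, for example, T/A and A/T are treated as the same call.

```python
# Subset of the SNP genotypes listed above: (chromosome, position) -> genotypes.
LISTED_GENOTYPES = {
    ("2", 93291929): {"A/A", "T/A"},
    ("3", 47140085): {"C/C", "A/C"},
    ("4", 72717623): {"A/A", "C/A"},
    ("9", 55114152): {"C/C", "G/C"},
    ("X", 57912635): {"C/C", "T/C"},
    ("X", 58545628): {"T/T", "T/C"},
}


def _normalize(genotype):
    """Return an order-independent form of a diploid call such as 'T/A' or 'a|t'."""
    alleles = genotype.upper().replace("|", "/").split("/")
    return tuple(sorted(alleles))


def has_listed_genotype(chrom, pos, genotype):
    """True if the sample genotype matches a listed genotype at (chrom, pos).

    Assumption: allele order is not meaningful (unphased calls).
    Returns False for positions not encoded in LISTED_GENOTYPES.
    """
    listed = LISTED_GENOTYPES.get((chrom, pos))
    if listed is None:
        return False
    return _normalize(genotype) in {_normalize(g) for g in listed}
```

With this table, a sample called A/T at position 93291929 on chromosome 2 matches the listed T/A genotype, while a T/T call at the same position does not.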
In some aspects, analyzing one or more genetic markers comprises analyzing one or more of nucleotide positions: (a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; and/or (k) 6,311,954 on chromosome 6. In some aspects, analyzing one or more genetic markers comprises analyzing all of nucleotide positions: (a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; and/or (k) 6,311,954 on chromosome 6.
In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting one or more of the following SNPs: (a) a T/T or C/T genotype at position 3,081,773 on chromosome 5; (b) a C/C or C/A genotype at position 10,633,191 on chromosome 1; (c) a A/A or G/A genotype at position 1,366,137 on chromosome 5; (d) a T/T or T/C genotype at position 1,929,134 on chromosome 5; (e) a C/C or T/C genotype at position 2,038,965 on chromosome 5; (f) a T/T or A/T genotype at position 2,288,919 on chromosome 5; (g) a A/A or A/T genotype at position 2,534,579 on chromosome 5; (h) a G/G or A/G genotype at position 2,698,301 on chromosome 5; (i) a A/A or G/A genotype at position 2,774,108 on chromosome 5; (j) a T/T or A/T genotype at position 5,175,087 on chromosome 6; and/or (k) a C/C or G/C genotype at position 6,311,954 on chromosome 6.
In some aspects, detecting one or more genetic markers (e.g., polymorphisms) that indicate modified terpene content in the nucleic acid sample includes detecting at least 2 genetic markers, for example, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more markers. In a non-limiting example, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 3 genetic markers. In another non-limiting example, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 5 genetic markers. In another non-limiting example, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting at least 7 genetic markers. In several implementations, the one or more genetic markers that indicate modified terpene content are beneficial markers that are associated with increased terpene content in Cannabis. In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting 2 to 50 genetic markers, for example, 2 to 40, 2 to 30, 2 to 20, 2 to 10, 5 to 50, 5 to 40, 5 to 30, 5 to 20, 5 to 10, 10 to 50, 10 to 40, 10 to 30, or 10 to 20 genetic markers. In a non-limiting example, 2 to 10 genetic markers (e.g., SNPs) that indicate modified terpene content are detected. In some aspects, the one or more genetic markers (e.g., SNPs) that indicate modified terpene content are genetically linked to a terpene trait locus.
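The marker-count conditions described above ("at least N" markers, or a range such as 2 to 10) amount to a simple tally. The Python sketch below is a hypothetical illustration: the function name, the marker labels, and the detection results are made up for the example and do not come from the disclosure.

```python
from typing import Mapping, Optional


def marker_count_ok(detected: Mapping[str, bool],
                    minimum: int = 2,
                    maximum: Optional[int] = None) -> bool:
    """Check whether the number of detected markers is within the given bounds.

    `detected` maps a marker label to True if the marker was detected.
    `maximum=None` means there is no upper bound ("at least `minimum`").
    """
    n_detected = sum(1 for hit in detected.values() if hit)
    if n_detected < minimum:
        return False
    return maximum is None or n_detected <= maximum


# Hypothetical detection results for four marker positions.
calls = {
    "chr5:3081773": True,
    "chr1:10633191": True,
    "chr5:1366137": False,
    "chr6:5175087": True,
}
at_least_three = marker_count_ok(calls, minimum=3)
two_to_ten = marker_count_ok(calls, minimum=2, maximum=10)
```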
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome X; and/or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total terpene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total monoterpene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 3,081,773 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total monoterpenes absent beta-myrcene.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5, or 1,695,817; 1,727,397; 1,960,918; 5,175,087; 14,069,586; 14,329,191; and/or 14,866,064 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified total sesquiterpene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 516,340; 518,238; 523,626; 1,109,162; 1,366,137; 2,346,000; 2,534,579; 3,247,341; 3,503,143; and/or 3,629,225 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,534,579 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-pinene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 516,340; 518,238; 523,626; 608,718; 1,109,162; 1,366,137; 1,386,965; 2,003,303; 3,247,341; and/or 3,704,632 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 1,366,137 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified beta-pinene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,698,301 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 330,918; 1,353,878; 1,745,101; 1,828,050; 1,929,134; 2,072,869; 2,339,956; 3,081,773; 3,564,387; and/or 3,585,965 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 1,929,134 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified beta-myrcene to total monoterpene content ratio.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,038,965 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified beta-ocimene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 2,288,919 and/or 2,774,108 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified camphene and D-limonene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 10,633,191 on chromosome 1, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified linalool and trans-nerolidol content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 5,175,087 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified alpha-humulene and beta-caryophyllene content.
In some aspects, detecting one or more genetic markers that indicate modified terpene content in the nucleic acid sample includes detecting a SNP that indicates modified terpene content at one or more of nucleotide positions 1,220,207; 1,288,012; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 6,311,954; and/or 6,589,961 on chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some aspects, detecting one or more genetic markers that indicate modified terpene content includes detecting a SNP that indicates modified terpene content at nucleotide position 6,311,954 on chromosome 6, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1. In some such aspects, the modified terpene content is modified guaiol content.
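By way of non-limiting illustration, the single positions singled out in the preceding paragraphs for individual terpene traits can be held in a simple lookup table. The following Python sketch is one possible arrangement; the dictionary layout, the trait labels used as keys, and the function name `traits_for_position` are assumptions made for illustration and are not part of the disclosure. Only the individually highlighted positions are included, not the full position lists.

```python
# Individually highlighted SNP positions from the text, keyed by trait.
# Each entry is (chromosome, 1-based position on Csat_AbacusV2).
# The structure and names below are illustrative assumptions.
LEAD_MARKERS = {
    "total monoterpenes absent beta-myrcene": [("5", 3_081_773)],
    "alpha-pinene": [("5", 2_534_579)],
    "beta-pinene": [("5", 1_366_137)],
    "alpha-terpinene, gamma-terpinene, and terpinolene": [("5", 2_698_301)],
    "beta-myrcene to total monoterpene ratio": [("5", 1_929_134)],
    "beta-ocimene": [("5", 2_038_965)],
    "camphene and D-limonene": [("5", 2_288_919), ("5", 2_774_108)],
    "linalool and trans-nerolidol": [("1", 10_633_191)],
    "alpha-humulene and beta-caryophyllene": [("6", 5_175_087)],
    "guaiol": [("6", 6_311_954)],
}


def traits_for_position(chrom, pos):
    """Return the traits for which (chrom, pos) is a highlighted position."""
    return sorted(
        trait for trait, sites in LEAD_MARKERS.items() if (chrom, pos) in sites
    )


if __name__ == "__main__":
    print(traits_for_position("5", 2_774_108))
```

A variant call set could be screened by passing each called site to `traits_for_position` and keeping the non-empty results.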
The genetic markers described herein can also be identified based on corresponding SEQ ID NOs disclosed herein, rather than a particular chromosomal location relative to the Abacus Cannabis reference genome. Corresponding SEQ ID NOs are provided in Tables 1-13 and 15. Thus, in some aspects, the one or more genetic markers comprise a polymorphism at position 51 of one or more of SEQ ID NOs: 1-99. In some aspects, detecting one or more markers that indicate modified terpene content includes detecting one or more of the following:
Chromosome 1:
(a) a T/T or C/T genotype at position 51 of SEQ ID NO: 1; (b) a G/G or T/G genotype at position 51 of SEQ ID NO: 2; (c) a T/T or G/T genotype at position 51 of SEQ ID NO: 3; (d) a G/G or C/G genotype at position 51 of SEQ ID NO: 4; (e) a A/A or C/A genotype at position 51 of SEQ ID NO: 5; (f) a G/G or T/G genotype at position 51 of SEQ ID NO: 6; (g) a C/C or C/A genotype at position 51 of SEQ ID NO: 7; (h) a A/A or T/A genotype at position 51 of SEQ ID NO: 8; (i) a A/A or G/A genotype at position 51 of SEQ ID NO: 9; (j) a C/C or T/C genotype at position 51 of SEQ ID NO: 10;
Chromosome 2:
(k) a A/A or T/A genotype at position 51 of SEQ ID NO: 11;
Chromosome 3:
(l) a C/C or A/C genotype at position 51 of SEQ ID NO: 12;
Chromosome 4:
(m) a A/A or C/A genotype at position 51 of SEQ ID NO: 13;
Chromosome 5:
(n) a A/A or G/A genotype at position 51 of SEQ ID NO: 14; (o) a C/C or T/C genotype at position 51 of SEQ ID NO: 15; (p) a G/G or G/A genotype at position 51 of SEQ ID NO: 16; (q) a T/T or T/C genotype at position 51 of SEQ ID NO: 17; (r) a C/C or T/C genotype at position 51 of SEQ ID NO: 18; (s) a T/T or G/T genotype at position 51 of SEQ ID NO: 19; (t) a T/T or C/T genotype at position 51 of SEQ ID NO: 20; (u) a T/T or G/T genotype at position 51 of SEQ ID NO: 21; (v) a T/T or C/T genotype at position 51 of SEQ ID NO: 22; (w) a G/G or C/G genotype at position 51 of SEQ ID NO: 23; (x) a A/A or G/A genotype at position 51 of SEQ ID NO: 24; (y) a C/C or T/C genotype at position 51 of SEQ ID NO: 25; (z) a C/C or C/T genotype at position 51 of SEQ ID NO: 26; (aa) a T/T or T/A genotype at position 51 of SEQ ID NO: 27; (ab) a G/G or A/G genotype at position 51 of SEQ ID NO: 28; (ac) a G/G or A/G genotype at position 51 of SEQ ID NO: 29; (ad) a T/T or C/T genotype at position 51 of SEQ ID NO: 30; (ae) a T/T or T/C genotype at position 51 of SEQ ID NO: 31; (af) a A/A or G/A genotype at position 51 of SEQ ID NO: 32; (ag) a C/C or T/C genotype at position 51 of SEQ ID NO: 33; (ah) a T/T or C/T genotype at position 51 of SEQ ID NO: 34; (ai) a C/C or C/T genotype at position 51 of SEQ ID NO: 35; (aj) a G/G or G/C genotype at position 51 of SEQ ID NO: 36; (ak) a T/T or A/T genotype at position 51 of SEQ ID NO: 37; (al) a T/T or T/A genotype at position 51 of SEQ ID NO: 38; (am) a A/A or C/A genotype at position 51 of SEQ ID NO: 39; (an) a A/A or G/A genotype at position 51 of SEQ ID NO: 40; (ao) a T/T or T/C genotype at position 51 of SEQ ID NO: 41; (ap) a A/A or A/C genotype at position 51 of SEQ ID NO: 42; (aq) a G/G or G/T genotype at position 51 of SEQ ID NO: 43; (ar) a T/T or T/C genotype at position 51 of SEQ ID NO: 44; (as) a T/T or C/T genotype at position 51 of SEQ ID NO: 45; (at) a A/A or A/T genotype at position 51 of SEQ ID NO: 46; (au) a G/G or A/G genotype at position 51 of SEQ ID NO: 47; (av) a A/A or G/A genotype at position 51 of SEQ ID NO: 48; (aw) a G/G or A/G genotype at position 51 of SEQ ID NO: 49; (ax) a T/T or C/T genotype at position 51 of SEQ ID NO: 50; (ay) a C/C or T/C genotype at position 51 of SEQ ID NO: 51; (az) a C/C or T/C genotype at position 51 of SEQ ID NO: 52; (ba) a C/C or T/C genotype at position 51 of SEQ ID NO: 53; (bb) a A/A or A/T genotype at position 51 of SEQ ID NO: 54; (bc) a A/A or A/T genotype at position 51 of SEQ ID NO: 55; (bd) a A/A or G/A genotype at position 51 of SEQ ID NO: 56; (be) a A/A or G/A genotype at position 51 of SEQ ID NO: 57; (bf) a C/C or T/C genotype at position 51 of SEQ ID NO: 58; (bg) a T/T or T/C genotype at position 51 of SEQ ID NO: 59; (bh) a T/T or T/A genotype at position 51 of SEQ ID NO: 60; (bi) a T/T or A/T genotype at position 51 of SEQ ID NO: 61; (bj) a G/G or G/A genotype at position 51 of SEQ ID NO: 62; (bk) a A/A or G/A genotype at position 51 of SEQ ID NO: 63; (bl) a T/T or C/T genotype at position 51 of SEQ ID NO: 64; (bm) a T/T or T/A genotype at position 51 of SEQ ID NO: 65; (bn) a T/T or C/T genotype at position 51 of SEQ ID NO: 66; (bo) a T/T or A/T genotype at position 51 of SEQ ID NO: 67; (bp) a T/T or A/T genotype at position 51 of SEQ ID NO: 68; (bq) a T/T or C/T genotype at position 51 of SEQ ID NO: 69; (br) a G/G or A/G genotype at position 51 of SEQ ID NO: 70; (bs) a A/A or T/A genotype at position 51 of SEQ ID NO: 71; (bt) a G/G or A/G genotype at position 51 of SEQ ID NO: 72; (bu) a C/C or C/T genotype at position 51 of SEQ ID NO: 73;
Chromosome 6:
(bv) a C/C or A/C genotype at position 51 of SEQ ID NO: 74; (bw) a T/T or C/T genotype at position 51 of SEQ ID NO: 75; (bx) a C/C or G/C genotype at position 51 of SEQ ID NO: 76; (by) a T/T or C/T genotype at position 51 of SEQ ID NO: 77; (bz) a G/G or A/G genotype at position 51 of SEQ ID NO: 78; (ca) a A/A or G/A genotype at position 51 of SEQ ID NO: 79; (cb) a C/C or T/C genotype at position 51 of SEQ ID NO: 80; (cc) a A/A or T/A genotype at position 51 of SEQ ID NO: 81; (cd) a C/C or T/C genotype at position 51 of SEQ ID NO: 82; (ce) a G/G or A/G genotype at position 51 of SEQ ID NO: 83; (cf) a C/C or T/C genotype at position 51 of SEQ ID NO: 84; (cg) a T/T or A/T genotype at position 51 of SEQ ID NO: 85; (ch) a A/A or A/G genotype at position 51 of SEQ ID NO: 86; (ci) a A/A or G/A genotype at position 51 of SEQ ID NO: 87; (cj) a C/C or C/A genotype at position 51 of SEQ ID NO: 88; (ck) a T/T or C/T genotype at position 51 of SEQ ID NO: 89; (cl) a C/C or G/C genotype at position 51 of SEQ ID NO: 90; (cm) a G/G or T/G genotype at position 51 of SEQ ID NO: 91; (cn) a T/T or G/T genotype at position 51 of SEQ ID NO: 92;
Chromosome 8:
(co) a A/A or A/T genotype at position 51 of SEQ ID NO: 93; (cp) a G/G or A/G genotype at position 51 of SEQ ID NO: 94; (cq) a T/T or C/T genotype at position 51 of SEQ ID NO: 95; (cr) a C/C or T/C genotype at position 51 of SEQ ID NO: 96;
Chromosome 9:
(cs) a C/C or G/C genotype at position 51 of SEQ ID NO: 97;
Chromosome X:
(ct) a C/C or T/C genotype at position 51 of SEQ ID NO: 98; (cu) a T/T or T/C genotype at position 51 of SEQ ID NO: 99.
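By way of non-limiting illustration, the genotype list above can be applied to a set of genotype calls by checking each observed genotype at position 51 against the genotypes listed for the corresponding SEQ ID NO and counting the matches. The Python sketch below carries only a handful of the listed entries. It treats calls as unphased, so that C/T and T/C are considered the same genotype; that equivalence, together with the names `BENEFICIAL`, `normalize`, and `count_listed_genotypes`, is an assumption made for illustration.

```python
# Listed genotypes at position 51 for a small subset of the markers above,
# keyed by SEQ ID NO. Values are copied from the list; the subset is arbitrary.
BENEFICIAL = {
    1: {"T/T", "C/T"},
    2: {"G/G", "T/G"},
    11: {"A/A", "T/A"},
    97: {"C/C", "G/C"},
    99: {"T/T", "T/C"},
}


def normalize(genotype):
    """Return an order-independent form of a diploid call, e.g. 'T/C' -> ('C', 'T').

    Assumes unphased calls written as two alleles separated by '/'.
    """
    first, second = genotype.upper().replace(" ", "").split("/")
    return tuple(sorted((first, second)))


def count_listed_genotypes(calls):
    """Count markers whose observed genotype matches a listed genotype.

    `calls` maps SEQ ID NO (int) to an observed genotype string.
    Markers absent from BENEFICIAL never count as a match.
    """
    hits = 0
    for seq_id, observed in calls.items():
        allowed = {normalize(g) for g in BENEFICIAL.get(seq_id, ())}
        if normalize(observed) in allowed:
            hits += 1
    return hits


if __name__ == "__main__":
    sample = {1: "T/C", 2: "T/T", 11: "A/A", 99: "C/C"}
    n = count_listed_genotypes(sample)
    print(n, "listed genotypes detected; at least 3?", n >= 3)
```

The count returned can then be compared with whichever marker threshold is of interest, for example at least 3 markers.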
Methods of analyzing/detecting genetic markers (e.g., SNPs) that are suitable for use in the methods disclosed herein have been described, and can include amplification of a target polynucleotide (e.g., by PCR). PCR uses a particular amplification primer pair that specifically hybridizes to a target polynucleotide and produces an amplification product (the amplicon). Primers can be designed such that the amplicon contains a nucleic acid polymorphism of interest. Methods for designing PCR primers and PCR conditions have been described, for example, in Sambrook et al. (2014) Molecular Cloning: A Laboratory Manual (Fourth Edition, Cold Spring Harbor Laboratory Press, Plainview, N.Y.). It is understood that a number of parameters in a specific PCR protocol may need to be adjusted to specific laboratory conditions and may be slightly modified and yet allow for the collection of similar results. The primers can be radiolabeled, or labeled by any suitable means (e.g., using a non-radioactive fluorescent tag), to allow for rapid visualization of the different size amplicons following an amplification reaction without any additional labeling step or visualization step.
Other examples of suitable nucleic acid amplification methods include, but are not limited to, reverse-transcription PCR (RT-PCR), quantitative real-time PCR (qPCR), quantitative real-time reverse transcriptase PCR (qRT-PCR) (see, e.g., Adams, A beginner’s guide to RT-PCR, qPCR and RT-qPCR, Biochemist (Lond) (2020) 42(3): 48-53), isothermal amplification methods (see, e.g., Zanoli et al., Biosensors (2013) 3(1): 18-43), nucleic acid sequence-based amplification (NASBA) (see, e.g., Deiman and Sillekens, Mol Biotechnol (2002) 20(2): 163-79), loop-mediated isothermal amplification (LAMP) (see, e.g., Notomi et al., (2000) Nucleic Acids Res. 28(12): e63), helicase-dependent amplification (HDA) (see, e.g., Cao et al., Helicase-dependent amplification of nucleic acids, Curr Protoc Mol Biol, 104:15.11.1-15.11.12, 2013), rolling circle amplification (RCA) (see, e.g., Yao et al. Nature Protocols (2021) 16, 5460-5483), multiple displacement amplification (MDA) (see, e.g., Spits et al. Nature Protocols (2006) 1: 1965-1970), recombinase polymerase amplification (RPA) (see, e.g., Lobato et al., Trends Analyt Chem (2018) 98: 19-35), ligase chain reaction (LCR) (see, e.g., Gibriel and Adel, Mutat Res Rev Mutat Res. (2017) 773: 66-90), transcription amplification (see, e.g., Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), self-sustained sequence replication (see, e.g., Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874), dot PCR, and linker adapter PCR. Further suitable amplification methods are also described, for example, in Sambrook et al. (2014) Molecular Cloning: A Laboratory Manual (Fourth Edition, Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
In some aspects, amplification produces an amplicon that is at least 20 nucleotides in length, for example, at least 50 nucleotides in length, at least 100 nucleotides in length, at least 200 nucleotides in length, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, at least 1500, at least 2000, or at least 2500 nucleotides in length. In some aspects, the amplicon is no longer than 10000 nucleotides in length, for example, no longer than 3000, no longer than 5000, no longer than 7000, or no longer than 9000 nucleotides. In some aspects, marker amplification produces an amplicon that is 20 to 10000 nucleotides in length, for example, 20 to 9000 nucleotides, 20 to 8000 nucleotides, 20 to 7000 nucleotides, 20 to 6000 nucleotides, 20 to 5000 nucleotides, 20 to 4000 nucleotides, 20 to 3000 nucleotides, 20 to 2000 nucleotides, 20 to 1500 nucleotides, 20 to 1000 nucleotides, 20 to 500 nucleotides, 20 to 400 nucleotides, 20 to 300 nucleotides, 20 to 200 nucleotides, 20 to 150 nucleotides, 20 to 100 nucleotides, 20 to 50 nucleotides, 50 to 9000 nucleotides, 50 to 8000 nucleotides, 50 to 7000 nucleotides, 50 to 6000 nucleotides, 50 to 5000 nucleotides, 50 to 4000 nucleotides, 50 to 3000 nucleotides, 50 to 2000 nucleotides, 50 to 1000 nucleotides, 50 to 500 nucleotides, 50 to 400 nucleotides, 50 to 300 nucleotides, 50 to 200 nucleotides, 50 to 150 nucleotides, 50 to 100 nucleotides, 100 to 9000 nucleotides, 100 to 8000 nucleotides, 100 to 7000 nucleotides, 100 to 6000 nucleotides, 100 to 5000 nucleotides, 100 to 4000 nucleotides, 100 to 3000 nucleotides, 100 to 2000 nucleotides, 100 to 1000 nucleotides, 100 to 500 nucleotides, 100 to 400 nucleotides, 100 to 300 nucleotides, 100 to 200 nucleotides, 100 to 150 nucleotides, 250 to 9000 nucleotides, 250 to 8000 nucleotides, 250 to 7000 nucleotides, 250 to 6000 nucleotides, 250 to 5000 nucleotides, 250 to 4000 nucleotides, 250 to 3000 nucleotides, 250 to 2000 nucleotides, 250 to 1000 nucleotides, 250 to 500 nucleotides, 250 to 400 nucleotides, or 250 to 300 nucleotides. In some aspects, the amplicon is 100 to 4000 nucleotides. In some aspects, the amplicon is 200 to 3000 nucleotides. In some aspects, the amplicon is at least 51 nucleotides. In some aspects, the amplicon is at least 101 nucleotides.
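By way of non-limiting illustration, the relationship between a chromosomal SNP position and a marker sequence carrying the polymorphism at position 51 can be sketched as a window extraction from a reference sequence. The Python sketch below assumes that a marker sequence spans 50 nucleotides on either side of the SNP (101 nucleotides in total); that flank size is an inference from the position-51 convention and the 101-nucleotide amplicon length mentioned above, not something the text states. The function names and the toy reference string are likewise hypothetical.

```python
def marker_window(reference, snp_pos, flank=50):
    """Return (window, index) for a 1-based SNP position on `reference`.

    `window` spans up to `flank` bases either side of the SNP and `index`
    is the SNP's 1-based position inside the window. With flank=50 and a
    SNP far from either end, the window is 101 bases and index is 51.
    """
    if not 1 <= snp_pos <= len(reference):
        raise ValueError("SNP position lies outside the reference sequence")
    start = max(1, snp_pos - flank)              # 1-based, inclusive
    end = min(len(reference), snp_pos + flank)   # 1-based, inclusive
    return reference[start - 1:end], snp_pos - start + 1


def amplicon_length_ok(length, minimum=20, maximum=10000):
    """Check a length against the broadest amplicon range given in the text."""
    return minimum <= length <= maximum


if __name__ == "__main__":
    toy_reference = "A" * 100 + "G" + "C" * 100   # hypothetical sequence
    window, index = marker_window(toy_reference, 101)
    print(len(window), index, window[index - 1], amplicon_length_ok(len(window)))
```

Near the end of a chromosome the window is truncated, so the SNP index inside the window should be taken from the returned value instead of being assumed to be 51.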
The presence of a nucleic acid polymorphism in an amplicon can be determined (detected), for example, by directly sequencing the amplicon, performing a restriction enzyme digest (e.g., restriction fragment length polymorphism (RFLP)), or by using a detection probe. In some implementations, detection includes using PCR, quantitative PCR (qPCR), reverse-transcription PCR (RT-PCR), quantitative real-time reverse transcriptase PCR (qRT-PCR), and/or sequencing methods. In some aspects, detection includes using PCR, quantitative PCR (qPCR), and/or sequencing based detection methods.
PCR detection and quantification using dual-labeled fluorogenic oligonucleotide probes, commonly referred to as “TaqMan™” probes, can also be performed according to the present disclosure. These probes are composed of short (e.g., 20-25 base) oligodeoxynucleotides that are labeled with two different fluorescent dyes. On the 5' terminus of each probe is a reporter dye, and on the 3' terminus of each probe a quenching dye is found. The oligonucleotide probe sequence is complementary to an internal target sequence present in a PCR amplicon. When the probe is intact, energy transfer occurs between the two fluorophores and emission from the reporter is quenched by the quencher by FRET. During the extension phase of PCR, the probe is cleaved by 5' nuclease activity of the polymerase used in the reaction, thereby releasing the reporter from the oligonucleotide-quencher and producing an increase in reporter emission intensity. TaqMan™ probes are oligonucleotides that have a label and a quencher, where the label is released during amplification by the exonuclease action of the polymerase used in amplification, providing a real time measure of amplification during synthesis. A variety of TaqMan™ reagents are commercially available, e.g., from Applied Biosystems as well as from a variety of specialty vendors such as Biosearch Technologies.
In some implementations, detecting a nucleic acid polymorphism includes use of an oligonucleotide primer or probe. In general, synthetic methods for making oligonucleotides, including probes or primers, are known. For example, oligonucleotides can be synthesized chemically according to the solid phase phosphoramidite triester method. Oligonucleotides, including modified oligonucleotides, can also be ordered from a variety of commercial sources. Nucleic acid probes to the marker loci can be cloned and/or synthesized. Any suitable label can be used with a probe. Detectable labels suitable for use with nucleic acid probes include, for example, any composition detectable by spectroscopic, radioisotopic, photochemical, biochemical, immunochemical, electrical, optical, or chemical means. Useful labels include biotin for staining with labeled streptavidin conjugate, magnetic beads, fluorescent dyes, radiolabels, enzymes, and colorimetric labels. Other labels include ligands which bind to antibodies labeled with fluorophores, chemiluminescent agents, and enzymes. A probe can also constitute radiolabeled PCR primers that are used to generate a radiolabeled amplicon. It is not intended that the nucleic acid probes be limited to any particular size; however, nucleic acid probes are typically 20-100 base pairs.
Amplification is not always required for detection of a nucleic acid polymorphism (e.g., when using Southern blotting or RFLP detection). Separate detection probes can also be omitted in amplification/detection methods, e.g., by performing a real time amplification reaction that detects product formation by modification of the relevant amplification primer upon incorporation into a product, incorporation of labeled nucleotides into an amplicon, or by monitoring changes in molecular rotation properties of amplicons as compared to unamplified precursors (e.g., by fluorescence polarization).
In some implementations, the nucleic acid polymorphism is detected by sequencing a nucleic acid fragment comprising a target sequence of interest, or by whole genome sequencing (or whole transcriptome sequencing). Non-limiting examples of suitable sequencing methods include capillary electrophoresis (e.g., Sanger sequencing) and high-throughput sequencing (e.g., Illumina® or 454 Sequencing®). High-throughput sequencing includes short read or long read techniques. In some implementations, sequencing includes whole genome sequencing (e.g., sequencing the genome of a Cannabis plant of interest). In some aspects, sequencing includes targeted sequencing (sequencing of a particular nucleic acid or amplicon of interest). In some aspects, sequencing includes sequencing a transcriptome (RNA-Seq) (e.g., sequencing the transcriptome of a Cannabis plant selected or produced by a method disclosed herein). In some implementations, sequencing does not include sequencing of RNA. In some implementations, the genome is sequenced.
In some aspects, the methods disclosed herein include a step wherein a Cannabis plant including one or more markers that indicate modified terpene content as disclosed herein is identified and/or selected. In some aspects, the Cannabis plant including one or more markers that indicate modified terpene content is selected for further analysis, propagation, crossing, or to make a product (e.g., a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture). The product is not, or excludes, any naturally occurring products.
In some aspects, the methods disclosed herein include a step wherein a Cannabis plant identified as including one or more markers that indicate modified terpene content is crossed (e.g., selfed, sibling crossed, outcrossed, or backcrossed). In some aspects, crossing includes marker-assisted selection (MAS) for at least two generations. In some aspects, progeny plants comprising the one or more markers are obtained from the cross. In some aspects, progeny plants including the one or more markers have modified terpene content (e.g., increased terpene content) relative to a control, for example (and without limitation), a sibling progeny plant that does not include the one or more markers indicating modified terpene content, or a parent plant that does not include the one or more markers indicating modified terpene content. Suitable methods of crossing are further described herein, for example, in section V, “Plant Breeding.”
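By way of non-limiting illustration, the expected genotypes of progeny at a single marker can be enumerated before a cross is made, which helps in judging how many progeny plants would need to be screened during marker-assisted selection. The Python sketch below assumes simple Mendelian segregation at one biallelic SNP in diploid parents, with no segregation distortion; the function name `cross` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected progeny genotype frequencies at one SNP.

    Parents are diploid genotypes written as 'X/Y'. Each parent contributes
    one of its two alleles with equal probability (Mendelian assumption).
    Genotypes are reported with alleles in alphabetical order.
    """
    gametes_a = parent_a.split("/")
    gametes_b = parent_b.split("/")
    counts = Counter(
        "/".join(sorted(pair)) for pair in product(gametes_a, gametes_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    # Selfing a heterozygous plant, then backcrossing it to a T/T parent.
    print(cross("C/T", "C/T"))
    print(cross("C/T", "T/T"))
```

Selfing is modeled by passing the same genotype for both parents.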
Cannabis plants identified, selected, or produced by a method disclosed herein are encompassed by this disclosure, as well as material derived from such plants (e.g., a plant part), including seed, tissue, or cells (including protoplasts); and progeny of the plant (e.g., F1-F7, for example, F1 and/or F2 progeny). In some aspects, the Cannabis plant is Cannabis sativa, Cannabis indica, or Cannabis ruderalis. In a non-limiting example, the Cannabis plant is Cannabis sativa. In some implementations, the plant includes one or more genetic markers indicating increased terpene content disclosed herein.
V. Plant Breeding
The plants disclosed herein, including plants identified, selected, or produced by a method disclosed herein, can be used for plant breeding (e.g., crossing). In some aspects, a plant disclosed herein is used to develop a new, unique, and superior variety or hybrid with a desired phenotype (e.g., increased terpene production/content).
Details of existing Cannabis plant varieties and breeding methods are described in Potter et al. (2011, World Wide Weed: Global Trends in Cannabis Cultivation and Its Control), Holland (2010, The Pot Book: A Complete Guide to Cannabis, Inner Traditions/Bear & Co, ISBN 1594778981, 9781594778988), Green I (2009, The Cannabis Grow Bible: The Definitive Guide to Growing Marijuana for Recreational and Medical Use, Green Candy Press, 2009, ISBN 1931160589, 9781931160582), Green II (2005, The Cannabis Breeder's Bible: The Definitive Guide to Marijuana Genetics, Cannabis Botany and Creating Strains for the Seed Market, Green Candy Press, 1931160279, 9781931160278), Starks (1990, Marijuana Chemistry: Genetics, Processing & Potency, ISBN 0914171399, 9780914171393), Clarke (1981, Marijuana Botany, an Advanced Study: The Propagation and Breeding of Distinctive Cannabis, Ronin Publishing, ISBN 091417178X, 9780914171782), Short (2004, Cultivating Exceptional Cannabis: An Expert Breeder Shares His Secrets, ISBN 1936807122, 9781936807123), Cervantes (2004, Marijuana Horticulture: The Indoor/Outdoor Medical Grower's Bible, Van Patten Publishing, ISBN 187882323X, 9781878823236), Franck et al. (1990, Marijuana Grower's Guide, Red Eye Press, ISBN 0929349016, 9780929349015), Grotenhermen and Russo (2002, Cannabis and Cannabinoids: Pharmacology, Toxicology, and Therapeutic Potential, Psychology Press, ISBN 0789015080, 9780789015082), Rosenthal (2007, The Big Book of Buds: More Marijuana Varieties from the World's Great Seed Breeders, ISBN 1936807068, 9781936807062), Clarke, RC (Cannabis: Evolution and Ethnobotany 2013 (In press)), King, J (Cannabible Vols 1-3, 2001-2006), and four volumes of Rosenthal's Big Book of Buds series (2001, 2004, 2007, and 2011).
The development of commercial Cannabis cultivars requires the development of Cannabis varieties, the crossing of these varieties, and the evaluation of the crosses. Pedigree breeding and recurrent selection breeding methods may be used to develop cultivars from breeding populations. Breeding programs may combine desirable traits from two or more varieties or various broad-based sources into breeding pools from which cultivars are developed by selfing and selection of desired phenotypes. The new cultivars may be crossed with other varieties and the hybrids from these crosses are evaluated to determine which have commercial potential.
In some implementations, a plant identified, selected, or produced by a method disclosed herein is crossed. Exemplary types of crosses include selfing, sibling crossing, outcrossing, and backcrossing. Suitable methods of crossing are disclosed herein.
Pedigree selection, where both single plant selection and mass selection practices are employed, may be used for the generation of new varieties. Pedigree breeding is used commonly for the improvement of self-pollinating crops or inbred lines of cross-pollinating crops. Two parents which possess favorable, complementary traits are crossed to produce an F1. An F2 population is produced by selfing one or several F1's or by intercrossing two F1's (sib mating). Selection of the best individuals usually begins in the F2 population; then, beginning in the F3, the best individuals in the best families are usually selected. Replicated testing of families, or hybrid combinations involving individuals of these families, often follows in the F4 generation to improve the effectiveness of selection for traits with low heritability. At an advanced stage of inbreeding (e.g., F6 and F7), the best lines or mixtures of phenotypically similar lines are tested for potential release as new cultivars.
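By way of non-limiting illustration, the phrase "advanced stage of inbreeding" can be made concrete with the standard expectation that each generation of selfing halves the proportion of heterozygous loci. The Python sketch below assumes that the F1 is heterozygous at every locus under consideration and that every subsequent generation is produced strictly by selfing, without selection; the function name is hypothetical.

```python
from fractions import Fraction


def expected_heterozygosity(generation):
    """Expected fraction of loci still heterozygous in generation F_n.

    Assumes an F1 heterozygous at all loci considered and strict selfing
    thereafter, so the fraction is (1/2) ** (n - 1).
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or greater")
    return Fraction(1, 2) ** (generation - 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 6, 7):
        h = expected_heterozygosity(n)
        print(f"F{n}: {float(h):.4f} heterozygous, {float(1 - h):.4f} homozygous")
```

Under these assumptions roughly 97% of the loci are expected to be fixed by the F6 and over 98% by the F7.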
Choice of breeding or selection methods depends on the mode of plant reproduction, the heritability of the trait(s) being improved, and the type of cultivar used commercially (e.g., F1 hybrid cultivar, pureline cultivar, etc.). For highly heritable traits, a choice of superior individual plants evaluated at a single location will be effective, whereas for traits with low heritability, selection should be based on mean values obtained from replicated evaluations of families of related plants. Popular selection methods commonly include pedigree selection, modified pedigree selection, mass selection, and recurrent selection.
Mass and recurrent selections can be used to improve populations of either self- or cross-pollinating crops. A genetically variable population of heterozygous individuals may be identified or created by intercrossing several different parents. The best plants may be selected based on individual superiority, outstanding progeny, or excellent combining ability. Preferably, the selected plants are intercrossed to produce a new population in which further cycles of selection are continued.
Backcross breeding has been used to transfer genes for a simply inherited, highly heritable trait into a desirable homozygous cultivar or line that is the recurrent parent. The source of the trait to be transferred is called the donor parent. After the initial cross, individuals possessing the phenotype of the donor parent may be selected and repeatedly crossed (backcrossed) to the recurrent parent. The resulting plant is expected to have the attributes of the recurrent parent (e.g., cultivar) and the desirable trait transferred from the donor parent.
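The expectation that backcross progeny recover the attributes of the recurrent parent can be quantified. The following is a minimal sketch, assuming no selection on background markers and ignoring linkage drag around the transferred gene; under those assumptions the expected recurrent-parent share of the genome after n backcrosses is 1 - (1/2)^(n+1). The function name is illustrative only.

```python
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected fraction of the genome derived from the recurrent parent.

    backcrosses = 0 is the initial F1 cross (50% from each parent).
    Assumes no background selection and no linkage drag (idealized).
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Expected recovery for the F1 and the first five backcross generations.
for n in range(0, 6):
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: {recurrent_parent_fraction(n):.2%} recurrent parent")
```

In practice, marker-assisted selection on background markers can reach a given recovery in fewer generations than this unselected expectation.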
A single-seed descent procedure refers to planting a segregating population, harvesting a sample of one seed per plant, and using the one-seed sample to plant the next generation. When the population has advanced from the F2 to the desired level of inbreeding, the plants from which lines are derived will each trace to different F2 individuals. The number of plants in a population declines each generation due to failure of some seeds to germinate or some plants to produce at least one seed. As a result, not all of the F2 plants originally sampled in the population will be represented by a progeny when generation advance is completed.
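The attrition described above can be sketched numerically. This is a minimal illustration, assuming each plant independently has the same probability per generation of yielding a seed that germinates and itself sets seed; the function name and the value 0.9 are hypothetical and are not taken from this disclosure.

```python
def expected_lines(initial_plants: int, survival: float, generations: int) -> float:
    """Expected number of original F2 plants still represented by a progeny.

    survival: assumed per-generation probability that a line is carried forward
    (its seed germinates and the resulting plant sets at least one seed).
    generations: number of generation advances (e.g., F2 to F6 is 4).
    """
    if not 0.0 <= survival <= 1.0:
        raise ValueError("survival must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return initial_plants * survival ** generations


# Hypothetical example: 200 F2 plants advanced to the F6 with 90% carry-over.
print(round(expected_lines(200, 0.9, 4), 2))
```

A breeder can use this kind of estimate to decide how many F2 plants to sample so that enough lines remain at the target generation.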
Mutation breeding is another method of introducing new traits into Cannabis varieties. Mutations that occur spontaneously or are artificially induced can be useful sources of variability for a plant breeder. The goal of artificial mutagenesis is to increase the rate of mutation for a desired characteristic. Mutation rates can be increased by many different means including temperature, long-term seed storage, tissue culture conditions, radiation (such as X-rays, Gamma rays, neutrons, Beta radiation, or ultraviolet radiation), chemical mutagens (such as base analogs like 5-bromo-uracil), antibiotics, alkylating agents (such as sulfur mustards, nitrogen mustards, epoxides, ethyleneamines, sulfates, sulfonates, sulfones, or lactones), azide, hydroxylamine, nitrous acid or acridines. Once a desired trait is observed through mutagenesis the trait may then be incorporated into existing germplasm by traditional breeding techniques. Details of mutation breeding can be found, for example, in Principles of Cultivar Development by Fehr, Macmillan Publishing Company, 1993.
The complexity of inheritance also influences the choice of the breeding method. Backcross breeding may be used to transfer one or a few favorable genes for a highly heritable trait into a desirable cultivar. This approach has been used extensively for breeding disease-resistant cultivars. Various recurrent selection techniques are used to improve quantitatively inherited traits controlled by numerous genes. The use of recurrent selection in self-pollinating crops depends on the ease of pollination, the frequency of successful hybrids from each pollination, and the number of hybrid offspring from each successful cross.
Additional breeding methods are available, e.g., methods discussed in Chahal and Gosal (Principles and procedures of plant breeding: biotechnological and conventional approaches, CRC Press, 2002, ISBN 084931321X, 9780849313219), Taji et al. (In vitro plant breeding, Routledge, 2002, ISBN 156022908X, 9781560229087), Richards (Plant breeding systems, Taylor & Francis US, 1997, ISBN 0412574500, 9780412574504), and Hayes (Methods of Plant Breeding, Publisher: READ BOOKS, 2007, ISBN 1406737062, 9781406737066).
The production of double haploids can also be used for the development of homozygous varieties in a breeding program. Double haploids are produced by the doubling of a set of chromosomes from a heterozygous plant to produce a completely homozygous individual (e.g., see Wan et al., Theor. Appl. Genet., 77:889-892, 1989).
Some implementations of the methods disclosed herein include marker assisted selection (MAS). MAS is a powerful shortcut to selecting for desired phenotypes and for introgressing desired traits into cultivars (e.g., introgressing desired traits into elite lines). MAS is easily adapted to high throughput molecular analysis methods that can quickly screen large numbers of plant or germplasm genetic material for the markers of interest and is much more cost effective than raising and observing plants for visible traits. Thus, MAS can be used in the methods disclosed herein to produce plants with desired traits (e.g., increased terpene content).
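A screening step of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming genotype calls are already available as strings such as "C/T". It uses the marker at position 3,081,773 on chromosome 5, for which the clauses of this disclosure list T/T or C/T genotypes; reading those two genotypes as "carries at least one T allele" is an interpretation made for this sketch, and the seedling identifiers and calls are hypothetical.

```python
FAVORABLE_ALLELE = "T"  # assumption: T/T or C/T read as "at least one T allele"


def carries_allele(genotype: str, allele: str) -> bool:
    """Return True if a diploid call such as 'C/T' contains the given allele."""
    return allele.upper() in genotype.upper().split("/")


# Hypothetical genotype calls for four seedlings at chromosome 5:3,081,773.
seedlings = {"S01": "C/C", "S02": "C/T", "S03": "T/T", "S04": "T/C"}

# Keep only seedlings carrying the allele; the rest can be discarded
# before any plant has to be raised to maturity and phenotyped.
selected = sorted(
    plant_id
    for plant_id, call in seedlings.items()
    if carries_allele(call, FAVORABLE_ALLELE)
)
print(selected)  # ['S02', 'S03', 'S04']
```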
VI. Plants and Products
Also disclosed are Cannabis plants that have modified terpene content (e.g., increased or decreased terpene levels relative to a control) made by any of the methods disclosed herein. Material derived from the Cannabis plants, including seed, tissue, or cells (including protoplasts); or progeny of the plant, such as F1 or F2 progeny, are encompassed by this disclosure. The Cannabis plant can be Cannabis sativa, Cannabis indica, or Cannabis ruderalis. In a non-limiting example, the Cannabis plant is Cannabis sativa.
Further disclosed are products made or derived from a Cannabis plant identified, selected, or produced by a method disclosed herein. The product may be any product known in the Cannabis arts, and can include, but is not limited to, extracts, a kief, hashish, bubble hash, an edible product, a flower, a seed, solvent reduced oil, sludge, e-juice, or tincture. Kief refers to a composition of concentrated Cannabis trichomes, which are accumulated by being sifted from Cannabis flowers or buds using a mesh screen or sieve. Hashish (or hash) refers to a compressed or purified preparation from Cannabis tissue containing trichomes (e.g., flowers).
Bubble hash refers to a solid concentration of Cannabis trichomes made from a solventless extraction method. As used herein, Cannabis sludges are solvent-free Cannabis extracts made via multigas extraction including the refrigerant 134A, butane, iso-butane and propane in a ratio that delivers a very complete and balanced extraction of cannabinoids and essential oils. E-juice (vape juice) refers to a liquid composition for use in an e-cigarette. A tincture refers to an alcohol-based extract, for example, an extract of Cannabis tissue dissolved in an alcohol. A practitioner can readily determine suitable known methods for producing any products disclosed herein.
The product can be formulated for administration to a subject (e.g., a human), such as by an injection (e.g., intravenous, subcutaneous, intramuscular, parenteral), or by topical, oral, or pulmonary administration. In some aspects, the product is a recreational product. In some aspects, the product is a therapeutic product (e.g., medicament).
In some examples, the composition is for pulmonary administration. The compositions include, but are not limited to, dry powder compositions consisting of the powder of a Cannabis oil described herein, and the powder of a suitable carrier and/or lubricant. The compositions for pulmonary administration can be inhaled from any suitable dry powder inhaler device. In certain instances, the compositions may be conveniently delivered in the form of an aerosol spray from pressurized packs or a nebulizer, with the use of a suitable propellant, for example, dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide, or other suitable gas. In the case of a pressurized aerosol, the dosage unit can be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, for example, gelatin for use in an inhaler or insufflator can be formulated containing a powder mix of the compound(s) and a suitable powder base, for example, lactose or starch.
For oral administration, a composition can take the form of, e.g., a tablet or a capsule prepared by conventional means with a pharmaceutically acceptable excipient. Preferred are tablets and gelatin capsules comprising the active ingredient(s), together with (a) diluents or fillers, e.g., lactose, dextrose, sucrose, mannitol, maltodextrin, lecithin, agarose, xanthan gum, guar gum, sorbitol, cellulose (e.g., ethyl cellulose, microcrystalline cellulose), glycine, pectin, polyacrylates and/or calcium hydrogen phosphate, calcium sulfate, (b) lubricants, e.g., silica, anhydrous colloidal silica, talcum, stearic acid, its magnesium or calcium salt (e.g., magnesium stearate or calcium stearate), metallic stearates, colloidal silicon dioxide, hydrogenated vegetable oil, corn starch, sodium benzoate, sodium acetate and/or polyethyleneglycol; for tablets also (c) binders, e.g., magnesium aluminum silicate, starch paste, gelatin, tragacanth, methylcellulose, sodium carboxymethylcellulose, polyvinylpyrrolidone and/or hydroxypropyl methylcellulose; if desired (d) disintegrants, e.g., starches (e.g., potato starch or sodium starch glycolate), agar, alginic acid or its sodium or potassium salt, or effervescent mixtures; (e) wetting agents, e.g., sodium lauryl sulfate, and/or (f) absorbents, colorants, flavors and sweeteners. Tablets can be either uncoated or coated according to known methods. The excipients described herein can also be used for preparation of buccal dosage forms and sublingual dosage forms (e.g., films and lozenges) as described, for example, in U.S. Pat. Nos. 5,981,552 and 8,475,832. Formulation in chewing gums as described, for example, in U.S. Pat. No. 8,722,022, is also contemplated.
Further preparations for oral administration can take the form of, for example, solutions, syrups, suspensions, or toothpastes. Liquid preparations for oral administration can be prepared by conventional means with pharmaceutically acceptable additives, for example, suspending agents, for example, sorbitol syrup, cellulose derivatives, or hydrogenated edible fats; emulsifying agents, for example, lecithin, xanthan gum, or acacia; non-aqueous vehicles, for example, almond oil, sesame oil, hemp seed oil, fish oil, oily esters, ethyl alcohol, or fractionated vegetable oils; and preservatives, for example, methyl or propyl-p-hydroxybenzoates or sorbic acid. The preparations can also contain buffer salts, flavoring, coloring, and/or sweetening agents as appropriate.
Typical formulations for topical administration include creams, ointments, sprays, lotions, hydrocolloid dressings, and patches, as well as eye drops, ear drops, and deodorants. Cannabis extracts/oils can be administered via transdermal patches as described, for example, in U.S. Pat. Appl. Pub. No. 2015/0126595 and U.S. Pat. No. 8,449,908. Formulation for rectal or vaginal administration is also contemplated. The Cannabis oils can be formulated, for example, as suppositories containing conventional suppository bases such as cocoa butter and other glycerides as described in U.S. Pat. Nos. 5,508,037 and 4,933,363. Compositions can contain other solidifying agents such as shea butter, beeswax, kokum butter, mango butter, illipe butter, tamanu butter, carnauba wax, emulsifying wax, soy wax, castor wax, rice bran wax, and candelilla wax. Compositions can further include clays (e.g., bentonite, French green clays, Fuller's earth, Rhassoul clay, white kaolin clay) and salts (e.g., sea salt, Himalayan pink salt, and magnesium salts such as Epsom salt).
The compositions disclosed herein can be formulated for administration by injection, for example, by bolus injection or continuous infusion. Formulations for injection can be presented in unit dosage form, for example, in ampoules or in multi-dose containers, optionally with an added preservative. Injectable compositions are preferably aqueous isotonic solutions or suspensions, and suppositories are preferably prepared from fatty emulsions or suspensions. The compositions may be sterilized and/or contain adjuvants, such as preserving, stabilizing, wetting or emulsifying agents, solution promoters, salts for regulating the osmotic pressure, buffers, and/or other ingredients. Alternatively, the compositions can be in powder form for reconstitution with a suitable vehicle, for example, a carrier oil, before use. In addition, the compositions may also contain other therapeutic agents or substances.
The compositions can be prepared according to conventional mixing, granulating, and/or coating methods, and contain from about 0.1 to about 75%, for example from about 1% to about 50%, of a Cannabis extract. In general, subjects receiving a Cannabis composition orally are administered doses ranging from about 1 to about 2000 mg of Cannabis extract. A small dose ranging from about 1 to about 20 mg can typically be administered orally when treatment is initiated, and the dose can be increased (e.g., doubled) over a period of days or weeks until the optimal or maximum dose is reached.
V. Nucleic Acids and Kits
Kits for use in research, breeding, or other application are also provided. Such kits may include oligonucleotide probes and/or primers to detect a genetic marker disclosed herein (e.g., any probes or primers disclosed herein). In some aspects, the kit includes seed or germplasm of a Cannabis plant. In some aspects, the kit includes DNA from a Cannabis plant, for example, that is useful as a positive or negative control. In some aspects, the kits include enzymes (e.g., polymerase), dNTPs, enzymatic substrates, reagents for colorimetric or fluorescent detection, buffers, etc. In some implementations, the kit components are in separate containers. The kit can be used to practice any of the methods disclosed herein. In some implementations, the kit is for detecting a genetic marker (e.g., SNP) or set of genetic markers disclosed herein. The kits may include instructional materials containing directions (i.e., protocols) for the practice of the methods of this disclosure. While the instructional materials typically include written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), cloud-based media, and the like. Such media may include addresses to internet sites that provide such instructional materials.
Clauses
Clause 1. A method for producing one or more Cannabis plants having modified terpene content, comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content, (iii) crossing the Cannabis plant comprising the one or more genetic markers indicating modified terpene content, and (iv) obtaining one or more progeny plants comprising the one or more genetic markers indicating modified terpene content, and wherein the one or more progeny plants have modified terpene content relative to a control.
Clause 2. A method for selecting a Cannabis plant having modified terpene content, comprising: (i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm; (ii) detecting one or more genetic markers that indicate modified terpene content; and (iii) selecting the Cannabis plant, thereby selecting the Cannabis plant having modified terpene content.
Clause 3. The method of any one of the prior clauses, wherein the Cannabis plant having modified terpene content is selected for further analysis, propagation, crossing, or to make a product.
Clause 4. The method of any one of the prior clauses, further comprising crossing the Cannabis plant having modified terpene content and producing one or more progeny plants having modified terpene content.
Clause 5. The method of any one of the prior clauses, wherein analyzing comprises using PCR, quantitative PCR (qPCR), and/or sequencing; and/or detecting comprises using an oligonucleotide primer set or probe.
Clause 6. The method of any one of the prior clauses, wherein crossing comprises selfing, sibling crossing, outcrossing, or backcrossing.
Clause 7. The method of any one of the prior clauses, wherein the selfing, sibling crossing, outcrossing, or backcrossing comprises marker-assisted selection for at least two generations.
Clause 8. The method of any one of the prior clauses, wherein analyzing one or more genetic markers in the nucleic acid sample comprises analyzing one or more of nucleotide positions: 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; or 13,920,896 on chromosome 1; 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 330,918; 516,340; 518,238; 523,626; 608,718; 755,967; 1,100,981; 1,109,162; 1,331,433; 1,353,878; 1,366,137; 1,386,965; 1,487,633; 1,745,101; 1,828,050; 1,837,343; 1,840,325; 1,929,134; 2,003,303; 2,038,965; 2,072,869; 2,120,881; 2,208,629; 2,288,919; 2,291,467; 2,302,063; 2,318,276; 2,339,956; 2,346,000; 2,360,380; 2,364,964; 2,366,529; 2,534,579; 2,698,301; 2,774,108; 2,780,345; 3,081,773; 3,247,341; 3,485,895; 3,503,143; 3,564,387; 3,585,965; 3,599,637; 3,629,225; 3,704,632; 3,807,710; 3,842,906; 4,384,123; 4,391,586; 5,854,661; 7,307,552; 11,993,646; 12,418,741; 12,446,524; 32,342,917; 32,395,736; 55,475,322; 62,911,168; 66,477,802; or 74,391,606 on chromosome 5; 1,220,207; 1,288,012; 1,695,817; 1,727,397; 1,960,918; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; 6,311,954; 6,589,961; or 64,943,914 on chromosome 6; 14,069,586; 14,329,191; 14,866,064; or 33,592,849 on chromosome 8; 55,114,152 on chromosome 9; or 57,912,635 or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
Clause 9. The method of any one of the prior clauses, wherein the genetic markers that indicate modified terpene content comprise one or more of:
Chromosome 1:
(a) a T/T or C/T genotype at position 8871401; (b) a G/G or T/G genotype at position 8886933;
(c) a T/T or G/T genotype at position 9101934; (d) a G/G or C/G genotype at position 10446475; (e) a A/A or C/A genotype at position 10543062; (f) a G/G or T/G genotype at position 10561778; (g) a C/C or C/A genotype at position 10633191; (h) a A/A or T/A genotype at position 10934458; (i) a A/A or G/A genotype at position 11169492; (j) a C/C or T/C genotype at position 13920896;
Chromosome 2:
(k) a A/A or T/A genotype at position 93291929;
Chromosome 3:
(l) a C/C or A/C genotype at position 47140085;
Chromosome 4:
(m) a A/A or C/A genotype at position 72717623;
Chromosome 5:
(n) a A/A or G/A genotype at position 330918; (o) a C/C or T/C genotype at position 516340;
(p) a G/G or G/A genotype at position 518238; (q) a T/T or T/C genotype at position 523626; (r) a C/C or T/C genotype at position 608718; (s) a T/T or G/T genotype at position 755967; (t) a T/T or C/T genotype at position 1100981; (u) a T/T or G/T genotype at position 1109162; (v) a T/T or C/T genotype at position 1331433; (w) a G/G or C/G genotype at position 1353878; (x) a A/A or G/A genotype at position 1366137; (y) a C/C or T/C genotype at position 1386965; (z) a C/C or C/T genotype at position 1487633; (aa) a T/T or T/A genotype at position 1745101; (ab) a G/G or A/G genotype at position 1828050; (ac) a G/G or A/G genotype at position 1837343; (ad) a T/T or C/T genotype at position 1840325; (ae) a T/T or T/C genotype at position 1929134; (af) a A/A or G/A genotype at position 2003303; (ag) a C/C or T/C genotype at position 2038965; (ah) a T/T or C/T genotype at position 2072869; (ai) a C/C or C/T genotype at position 2120881; (aj) a G/G or G/C genotype at position 2208629; (ak) a T/T or A/T genotype at position 2288919; (al) a T/T or T/A
genotype at position 2291467; (am) a A/A or C/A genotype at position 2302063; (an) a A/A or G/A genotype at position 2318276; (ao) a T/T or T/C genotype at position 2339956; (ap) a A/A or A/C genotype at position 2346000; (aq) a G/G or G/T genotype at position 2360380; (ar) a T/T or T/C genotype at position 2364964; (as) a T/T or C/T genotype at position 2366529; (at) a A/A or A/T genotype at position 2534579; (au) a G/G or A/G genotype at position 2698301; (av) a A/A or G/A genotype at position 2774108; (aw) a G/G or A/G genotype at position 2780345; (ax) a T/T or C/T genotype at position 3081773; (ay) a C/C or T/C genotype at position 3247341; (az) a C/C or T/C genotype at position 3485895; (ba) a C/C or T/C genotype at position 3503143; (bb) a A/A or A/T genotype at position 3564387; (bc) a A/A or A/T genotype at position 3585965; (bd) a A/A or G/A genotype at position 3599637; (be) a A/A or G/A genotype at position 3629225; (bf) a C/C or T/C genotype at position 3704632; (bg) a T/T or T/C genotype at position 3807710; (bh) a T/T or T/A genotype at position 3842906; (bi) a T/T or A/T genotype at position 4384123; (bj) a G/G or G/A genotype at position 4391586; (bk) a A/A or G/A genotype at position 5854661; (bl) a T/T or C/T genotype at position 7307552; (bm) a T/T or T/A genotype at position 11993646; (bn) a T/T or C/T genotype at position 12418741; (bo) a T/T or A/T genotype at position 12446524; (bp) a T/T or A/T genotype at position 32342917; (bq) a T/T or C/T genotype at position 32395736; (br) a G/G or A/G genotype at position 55475322; (bs) a A/A or T/A genotype at position 62911168; (bt) a G/G or A/G genotype at position 66477802; (bu) a C/C or C/T genotype at position 74391606;
Chromosome 6:
(bv) a C/C or A/C genotype at position 1220207; (bw) a T/T or C/T genotype at position 1288012; (bx) a C/C or G/C genotype at position 1695817; (by) a T/T or C/T genotype at position 1727397; (bz) a G/G or A/G genotype at position 1960918; (ca) a A/A or G/A genotype at position 1999618; (cb) a C/C or T/C genotype at position 2012149; (cc) a A/A or T/A genotype at position 2931923; (cd) a C/C or T/C genotype at position 3073845; (ce) a G/G or A/G genotype at position 3091941; (cf) a C/C or T/C genotype at position 3185660; (cg) a T/T or A/T genotype at position 5175087; (ch) a A/A or A/G genotype at position 5468920; (ci) a A/A or G/A genotype at position 5868053; (cj) a C/C or C/A genotype at position 6061359; (ck) a T/T or C/T genotype at position 6120135; (cl) a C/C or G/C genotype at position 6311954; (cm) a G/G or T/G genotype at position 6589961; (cn) a T/T or G/T genotype at position 64943914;
Chromosome 8:
(co) a A/A or A/T genotype at position 14069586; (cp) a G/G or A/G genotype at position 14329191; (cq) a T/T or C/T genotype at position 14866064; (cr) a C/C or T/C genotype at position 33592849;
Chromosome 9:
(cs) a C/C or G/C genotype at position 55114152;
Chromosome X:
(ct) a C/C or T/C genotype at position 57912635; or (cu) a T/T or T/C genotype at position 58545628.
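For use in software, a genotype list in the form given in Clause 9 can be converted into a lookup table. The following is a minimal sketch that parses the chromosome 8 entries (co) through (cr) as written above. It assumes that the order of the two alleles in a call is not significant, so "T/C" and "C/T" are treated as the same genotype; that assumption, the regular expression, and all function names are illustrative and are not part of the clauses.

```python
import re

# Matches entries such as "(co) a A/A or A/T genotype at position 14069586".
ENTRY = re.compile(
    r"\((?P<label>[a-z]{1,2})\) an? (?P<first>[ACGT]/[ACGT]) or "
    r"(?P<second>[ACGT]/[ACGT]) genotype at position (?P<position>\d+)"
)


def canonical(genotype: str) -> str:
    """Sort the two alleles so that 'T/C' and 'C/T' compare equal (assumption)."""
    return "/".join(sorted(genotype.upper().split("/")))


def parse_entries(chromosome: str, text: str) -> dict:
    """Map (chromosome, position) to the set of listed genotypes."""
    table = {}
    for match in ENTRY.finditer(text):
        key = (chromosome, int(match["position"]))
        table[key] = {canonical(match["first"]), canonical(match["second"])}
    return table


def is_listed(table: dict, chromosome: str, position: int, call: str) -> bool:
    """Return True if the call matches a genotype listed for that position."""
    listed = table.get((chromosome, position))
    return listed is not None and canonical(call) in listed


CHR8_TEXT = (
    "(co) a A/A or A/T genotype at position 14069586; "
    "(cp) a G/G or A/G genotype at position 14329191; "
    "(cq) a T/T or C/T genotype at position 14866064; "
    "(cr) a C/C or T/C genotype at position 33592849;"
)
markers = parse_entries("8", CHR8_TEXT)
print(is_listed(markers, "8", 14866064, "T/C"))  # True
```

The same parser can be run once per chromosome heading to build a table covering every entry of the clause.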
Clause 10. The method of any one of the prior clauses, wherein analyzing one or more genetic markers in the nucleic acid sample comprises:
(a) analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome X; and/or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(b) analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(c) analyzing one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(d) analyzing one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5, or 1,695,817; 1,727,397; 1,960,918; 5,175,087; 14,069,586; 14,329,191; and/or 14,866,064 of chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(e) analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 1,109,162; 1,366,137; 2,346,000; 2,534,579; 3,247,341; 3,503,143; and/or 3,629,225 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(f) analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 608,718; 1,109,162; 1,366,137; 1,386,965; 2,003,303; 3,247,341; and/or 3,704,632 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(g) analyzing one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(h) analyzing one or more of nucleotide positions 330,918; 1,353,878; 1,745,101; 1,828,050; 1,929,134; 2,072,869; 2,339,956; 3,081,773; 3,564,387; and/or 3,585,965 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(i) analyzing one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(j) analyzing one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(k) analyzing one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(l) analyzing one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1; or
(m) analyzing one or more of nucleotide positions 1,220,207; 1,288,012; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 6,311,954; and/or 6,589,961 on chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
Clause 11. The method of any one of the prior clauses, comprising:
(i) analyzing the genetic markers of (a), wherein the modified terpene content is modified total terpene content;
(ii) analyzing the genetic markers of (b), wherein the modified terpene content is modified total monoterpene content;
(iii) analyzing the genetic markers of (c), wherein the modified terpene content is modified total monoterpenes content absent beta-myrcene;
(iv) analyzing the genetic markers of (d), wherein the modified terpene content is modified total sesquiterpene content;
(v) analyzing the genetic markers of (e), wherein the modified terpene content is modified alpha-pinene content;
(vi) analyzing the genetic markers of (f), wherein the modified terpene content is modified beta-pinene content;
(vii) analyzing the genetic markers of (g), wherein the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content;
(viii) analyzing the genetic markers of (h), wherein the modified terpene content is modified monoterpene to beta-myrcene content ratio;
(ix) analyzing the genetic markers of (i), wherein the modified terpene content is modified beta-ocimene content;
(x) analyzing the genetic markers of (j), wherein the modified terpene content is modified camphene and D-limonene content;
(xi) analyzing the genetic markers of (k), wherein the modified terpene content is modified linalool and trans-nerolidol content;
(xii) analyzing the genetic markers of (l), wherein the modified terpene content is modified alpha-humulene and beta-caryophyllene content; or
(xiii) analyzing the genetic markers of (m), wherein the modified terpene content is modified guaiol content.
Clause 12. The method of any one of the prior clauses, wherein analyzing one or more genetic markers in the nucleic acid sample comprises analyzing one or more of nucleotide positions:
(a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; or (k) 6,311,954 on chromosome 6.
Clause 13. The method of any one of the prior clauses, wherein analyzing one or more genetic markers in the nucleic acid sample comprises analyzing nucleotide positions:
(a) 3,081,773 on chromosome 5; (b) 10,633,191 on chromosome 1; (c) 1,366,137 on chromosome 5; (d) 1,929,134 on chromosome 5; (e) 2,038,965 on chromosome 5; (f) 2,288,919 on chromosome 5; (g) 2,534,579 on chromosome 5; (h) 2,698,301 on chromosome 5; (i) 2,774,108 on chromosome 5; (j) 5,175,087 on chromosome 6; and (k) 6,311,954 on chromosome 6.
Clause 14. The method of any one of the prior clauses, wherein the genetic markers that indicate modified terpene content indicate increased terpene content and comprise:
(a) a T/T or C/T genotype at position 3,081,773 on chromosome 5; (b) a C/C or C/A genotype at position 10,633,191 on chromosome 1; (c) a A/A or G/A genotype at position 1,366,137 on chromosome 5; (d) a T/T or T/C genotype at position 1,929,134 on chromosome 5; (e) a C/C or T/C genotype at position 2,038,965 on chromosome 5; (f) a T/T or A/T genotype at position 2,288,919 on chromosome 5; (g) a A/A or A/T genotype at position 2,534,579 on chromosome 5; (h) a G/G or A/G genotype at position 2,698,301 on chromosome 5; (i) a A/A or G/A genotype at position 2,774,108 on chromosome 5; (j) a T/T or A/T genotype at position 5,175,087 on chromosome 6; or (k) a C/C or G/C genotype at position 6,311,954 on chromosome 6.
Clause 15. The method of clause 14, comprising:
(i) analyzing the genetic marker of (a), wherein the increased terpene content is increased total monoterpene content absent beta-myrcene;
(ii) analyzing the genetic marker of (b), wherein the increased terpene content is increased linalool and/or nerolidol 2 content;
(iii) analyzing the genetic marker of (c), wherein the increased terpene content is increased beta-pinene content;
(iv) analyzing the genetic marker of (d), wherein the increased terpene content is increased beta-myrcene content;
(v) analyzing the genetic marker of (e), wherein the increased terpene content is increased beta-ocimene content;
(vi) analyzing the genetic marker of (f), wherein the increased terpene content is increased camphene and/or d-limonene content;
(vii) analyzing the genetic marker of (g), wherein the increased terpene content is increased alpha-pinene content;
(viii) analyzing the genetic marker of (h), wherein the increased terpene content is increased alpha-terpinene, gamma-terpinene, and/or terpinolene content;
(ix) analyzing the genetic marker of (i), wherein the increased terpene content is increased camphene and/or d-limonene content;
(x) analyzing the genetic marker of (j), wherein the increased terpene content is increased alpha-humulene and/or beta-caryophyllene content; or
(xi) analyzing the genetic marker of (k), wherein the increased terpene content is increased guaiol content.
Clause 16. The method of any one of the prior clauses, wherein the genetic markers that indicate modified terpene content comprise a polymorphism at position 51 of one or more of: SEQ ID NOs: 1-99.
Clause 17. The method of any one of the prior clauses, wherein the polymorphism at position 51 comprises:
Chromosome 1:
(a) a T/T or C/T genotype at position 51 of SEQ ID NO: 1; (b) a G/G or T/G genotype at position 51 of SEQ ID NO: 2; (c) a T/T or G/T genotype at position 51 of SEQ ID NO: 3; (d) a G/G or C/G genotype at position 51 of SEQ ID NO: 4; (e) a A/A or C/A genotype at position 51 of SEQ ID NO: 5; (f) a G/G or T/G genotype at position 51 of SEQ ID NO: 6; (g) a C/C or C/A genotype at position 51 of SEQ ID NO: 7; (h) a A/A or T/A genotype at position 51 of SEQ ID NO: 8; (i) a A/A or G/A genotype at position 51 of SEQ ID NO: 9; (j) a C/C or T/C genotype at position 51 of SEQ ID NO: 10;
Chromosome 2:
(k) a A/A or T/A genotype at position 51 of SEQ ID NO: 11;
Chromosome 3:
(l) a C/C or A/C genotype at position 51 of SEQ ID NO: 12;
Chromosome 4:
(m) a A/A or C/A genotype at position 51 of SEQ ID NO: 13;
Chromosome 5:
(n) a A/A or G/A genotype at position 51 of SEQ ID NO: 14; (o) a C/C or T/C genotype at position 51 of SEQ ID NO: 15; (p) a G/G or G/A genotype at position 51 of SEQ ID NO: 16; (q) a T/T or T/C genotype at position 51 of SEQ ID NO: 17; (r) a C/C or T/C genotype at position 51 of SEQ ID NO: 18; (s) a T/T or G/T genotype at position 51 of SEQ ID NO: 19; (t) a T/T or C/T genotype at position 51 of SEQ ID NO: 20; (u) a T/T or G/T genotype at position 51 of SEQ ID NO: 21; (v) a T/T or C/T genotype at position 51 of SEQ ID NO: 22; (w) a G/G or C/G genotype at position 51 of SEQ ID NO: 23; (x) a A/A or G/A genotype at position 51 of SEQ ID NO: 24; (y) a C/C or T/C genotype at position 51 of SEQ ID NO: 25; (z) a C/C or C/T genotype at position 51 of SEQ ID NO: 26; (aa) a T/T or T/A genotype at position 51 of SEQ ID NO: 27; (ab) a G/G or A/G genotype at position 51 of SEQ ID NO: 28; (ac) a G/G or A/G genotype at position 51 of SEQ ID NO: 29; (ad) a T/T or C/T genotype at position 51 of SEQ ID NO: 30; (ae) a T/T or T/C genotype at position 51 of SEQ ID NO: 31; (af) a A/A or G/A genotype at position 51 of SEQ ID NO: 32; (ag) a C/C or T/C genotype at position 51 of SEQ ID NO: 33; (ah) a T/T or C/T genotype at position 51 of SEQ ID NO: 34; (ai) a C/C or C/T genotype at position 51 of SEQ ID NO: 35; (aj) a G/G or G/C genotype at position 51 of SEQ ID NO: 36; (ak) a T/T or A/T genotype at position 51 of SEQ ID NO: 37; (al) a T/T or T/A genotype at position 51 of SEQ ID NO: 38; (am) a A/A or C/A genotype at position 51 of SEQ ID NO: 39; (an) a A/A or G/A genotype at position 51 of SEQ ID NO: 40; (ao) a T/T or T/C genotype at position 51 of SEQ ID NO: 41; (ap) a A/A or A/C genotype at position 51 of SEQ ID NO: 42; (aq) a G/G or G/T genotype at position 51 of SEQ ID NO: 43; (ar) a T/T or T/C genotype at position 51 of SEQ ID NO: 44; (as) a T/T or C/T genotype at position 51 of SEQ ID NO: 45; (at) a A/A or A/T genotype at position 51 of SEQ ID NO: 46; (au) a G/G or A/G genotype at position 51 of SEQ ID NO: 47; (av) a A/A or G/A genotype at position 51 of SEQ ID NO: 48; (aw) a G/G or A/G genotype at position 51 of SEQ ID NO: 49; (ax) a T/T or C/T genotype at position 51 of SEQ ID NO: 50; (ay) a C/C or T/C genotype at position 51 of SEQ ID NO: 51; (az) a C/C or T/C genotype at position 51 of SEQ ID NO: 52; (ba) a C/C or T/C genotype at position 51 of SEQ ID NO: 53; (bb) a A/A or A/T genotype at position 51 of SEQ ID NO: 54; (bc) a A/A or A/T genotype at position 51 of SEQ ID NO: 55; (bd) a A/A or G/A genotype at position 51 of SEQ ID NO: 56; (be) a A/A or G/A genotype at position 51 of SEQ ID NO: 57; (bf) a C/C or T/C genotype at position 51 of SEQ ID NO: 58; (bg) a T/T or T/C genotype at position 51 of SEQ ID NO: 59; (bh) a T/T or T/A genotype at position 51 of SEQ ID NO: 60; (bi) a T/T or A/T genotype at position 51 of SEQ ID NO: 61; (bj) a G/G or G/A genotype at position 51 of SEQ ID NO: 62; (bk) a A/A or G/A genotype at position 51 of SEQ ID NO: 63; (bl) a T/T or C/T genotype at position 51 of SEQ ID NO: 64; (bm) a T/T or T/A genotype at position 51 of SEQ ID NO: 65; (bn) a T/T or C/T genotype at position 51 of SEQ ID NO: 66; (bo) a T/T or A/T genotype at position 51 of SEQ ID NO: 67; (bp) a T/T or A/T genotype at position 51 of SEQ ID NO: 68; (bq) a T/T or C/T genotype at position 51 of SEQ ID NO: 69; (br) a G/G or A/G genotype at position 51 of SEQ ID NO: 70; (bs) a A/A or T/A genotype at position 51 of SEQ ID NO: 71; (bt) a G/G or A/G genotype at position 51 of SEQ ID NO: 72; (bu) a C/C or C/T genotype at position 51 of SEQ ID NO: 73;
Chromosome 6:
(bv) a C/C or A/C genotype at position 51 of SEQ ID NO: 74; (bw) a T/T or C/T genotype at position 51 of SEQ ID NO: 75; (bx) a C/C or G/C genotype at position 51 of SEQ ID NO: 76; (by) a T/T or C/T genotype at position 51 of SEQ ID NO: 77; (bz) a G/G or A/G genotype at position 51 of SEQ ID NO: 78; (ca) a A/A or G/A genotype at position 51 of SEQ ID NO: 79; (cb) a C/C or T/C genotype at position 51 of SEQ ID NO: 80; (cc) a A/A or T/A genotype at position 51 of SEQ ID NO: 81; (cd) a C/C or T/C genotype at position 51 of SEQ ID NO: 82; (ce) a G/G or A/G genotype at position 51 of SEQ ID NO: 83; (cf) a C/C or T/C genotype at position 51 of SEQ ID NO: 84; (cg) a T/T or A/T genotype at position 51 of SEQ ID NO: 85; (ch) a A/A or A/G genotype at position 51 of SEQ ID NO: 86; (ci) a A/A or G/A genotype at position 51 of SEQ ID NO: 87; (cj) a C/C or C/A genotype at position 51 of SEQ ID NO: 88; (ck) a T/T or C/T genotype at position 51 of SEQ ID NO: 89; (cl) a C/C or G/C genotype at position 51 of SEQ ID NO: 90; (cm) a G/G or T/G genotype at position 51 of SEQ ID NO: 91; (cn) a T/T or G/T genotype at position 51 of SEQ ID NO: 92;
Chromosome 8:
(co) a A/A or A/T genotype at position 51 of SEQ ID NO: 93; (cp) a G/G or A/G genotype at position 51 of SEQ ID NO: 94; (cq) a T/T or C/T genotype at position 51 of SEQ ID NO: 95; (cr) a C/C or T/C genotype at position 51 of SEQ ID NO: 96;
Chromosome 9:
(cs) a C/C or G/C genotype at position 51 of SEQ ID NO: 97;
Chromosome X:
(ct) a C/C or T/C genotype at position 51 of SEQ ID NO: 98; (cu) a T/T or T/C genotype at position 51 of SEQ ID NO: 99.
Clause 18. The method of any one of the prior clauses, wherein the one or more genetic markers comprise:
(a) SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 97, SEQ ID NO: 98, and/or SEQ ID NO: 99;
(b) SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 70, and/or SEQ ID NO: 96;
(c) SEQ ID NO: 20, SEQ ID NO: 30, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, and/or SEQ ID NO: 61;
(d) SEQ ID NO: 48, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 93, SEQ ID NO: 94, and/or SEQ ID NO: 95;
(e) SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 21, SEQ ID NO: 24, SEQ ID NO: 42, SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID NO: 53, and/or SEQ ID NO: 57;
(f) SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 21, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 32, SEQ ID NO: 51, and/or SEQ ID NO: 58;
(g) SEQ ID NO: 19, SEQ ID NO: 30, SEQ ID NO: 39, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 52, SEQ ID NO: 61, SEQ ID NO: 65, SEQ ID NO: 66, and/or SEQ ID NO: 67;
(h) SEQ ID NO: 14, SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 31, SEQ ID NO: 34, SEQ ID NO: 41, SEQ ID NO: 50, SEQ ID NO: 54, and/or SEQ ID NO: 55;
(i) SEQ ID NO: 28, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 71, and/or SEQ ID NO: 72;
(j) SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 62, and/or SEQ ID NO: 73;
(k) SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, and/or SEQ ID NO: 10;
(l) SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 89, SEQ ID NO: 92, and/or SEQ ID NO: 93; or
(m) SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 90, and/or SEQ ID NO: 91.
Clause 19. The method of any one of the prior clauses, wherein analyzing one or more genetic markers comprises analyzing 2 to 10 genetic markers.
Clause 20. The method of any one of the prior clauses, wherein the one or more genetic markers comprises a polymorphism relative to a reference genome within any one or more haplotypes, wherein the haplotypes comprise:
(a) the region on chromosome 1:
(1) between positions 8842738 and 8882848; (2) between positions 8882848 and 8903959; (3) between positions 9095076 and 9416399; (4) between positions 10443667 and 10451533; (5) between positions 10528781 and 10557440; (6) between positions 10559732 and 10564751; (7) between positions 10624505 and 10758069; (8) between positions 10903728 and 10992149; (9) between positions 11166388 and 11189474; or (10) between positions 13919727 and 13925024; or
(b) the region on chromosome 2:
(1) between positions 93289429 and 93293528; or
(c) the region on chromosome 3:
(1) between positions 47131547 and 47202043; or
(d) the region on chromosome 4:
(1) between positions 72692194 and 72730265; or
(e) the region on chromosome 5:
(1) between positions 306727 and 345147; (2) between positions 510491 and 556304; (3) between positions 510491 and 556304; (4) between positions 510491 and 556304; (5) between positions 602915 and 611014; (6) between positions 753271 and 759520; (7) between positions 1093370 and 1105891; (8) between positions 1105891 and 1130028; (9) between positions 1226772 and 1338718; (10) between positions 1338718 and 1366137; (11) between positions 1338718 and 1374231; (12) between positions 1374231 and 1391376; (13) between positions 1471890 and 1492255; (14) between positions 1737866 and 1796267; (15) between positions 1806325 and 1831524; (16) between positions 1831524 and 1840325; (17) between positions 1837343 and 1879732; (18) between positions 1909913 and 1965657; (19) between positions 2000572 and 2011766; (20) between positions 2011766 and 2065182; (21) between positions 2038965 and 2080946; (22) between positions 2102870 and 2132683; (23) between positions 2177531 and 2288919; (24) between positions 2208629 and 2296380; (25) between positions 2208629 and 2296380; (26) between positions 2291467 and 2312553; (27) between positions 2312553 and 2341119; (28) between positions 2334487 and 2341119; (29) between positions 2341119 and 2360380; (30) between positions 2350756 and 2364964; (31) between positions 2360380 and 2366529; (32) between positions 2364964 and 2534579; (33) between positions 2366529 and 2698301; (34) between positions 3454995 and 3493107; (35) between positions 2766141 and 2788084; (36) between positions 2766141 and 2812259; (37) between positions 3074649 and 3086874; (38) between positions 3240926 and 3287072; (39) between positions 3454995 and 3493107; (40) between positions 3495386 and 3513857; (41) between positions 3548727 and 3573470; (42) between positions 3580179 and 3599637; (43) between positions 3580179 and 3604863; (44) between positions 3610795 and 3673686; (45) between positions 3699003 and 3745325; (46) between positions 3757096 and 
3812523; (47) between positions 3837020 and 3864286; (48) between positions 4376633 and 4407878; (49) between positions 4384123 and 4407878; (50) between positions 5847962 and 5902323; (51) between positions 7299267 and 7334907; (52) between positions 11977355 and 11995991; (53) between positions 12356226 and 12443117; (54) between positions 12444383 and 12472642; (55) between positions 32322620 and 32387532; (56) between positions 32390660 and 32431549; (57) between positions 55471148 and 55480894; (58) between positions 62899638 and 62917837; (59) between positions 66462287 and 66482542; or (60) between positions 74391487 and 74399760; or
(f) the region on chromosome 6:
(1) between positions 1203540 and 1221496; (2) between positions 1244898 and 1477638; (3) between positions 1692286 and 1701837; (4) between positions 1714540 and 1740247; (5) between positions 1952568 and 1999618; (6) between positions 1960918 and 2060506; (7) between positions 1960918 and 2060506; (8) between positions 2918399 and 2945707; (9) between positions 3071457 and 3083137; (10) between positions 3083137 and 3092130; (11) between positions 3154063 and 3188084; (12) between positions 5135593 and 5193113; (13) between positions 5462269 and 5488226; (14) between positions 5859221 and 5875355; (15) between positions 6051132 and 6104696; (16) between positions 6104696 and 6124684; (17) between positions 6296168 and 6319308; (18) between positions 6585819 and 6608294; or (19) between positions 64927273 and 64988035; or
(g) the region on chromosome 8:
(1) between positions 14058913 and 14074065; (2) between positions 14314523 and 14339300;
(3) between positions 14851865 and 14891255; or (4) between positions 33582608 and 33596624; or
(h) the region on chromosome 9:
(1) between positions 55096943 and 55124006; or
(i) the region on chromosome X:
(1) between positions 57886112 and 57917472; or (2) between positions 58464133 and 58548870; wherein the reference genome is Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
Clause 21. The method of any one of the prior clauses, wherein the one or more genetic markers are genetically linked to a terpene trait locus.
Clause 22. The method of any one of the prior clauses, wherein the modified terpene content comprises modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol levels.
Clause 23. The method of any one of the prior clauses, wherein the modified terpene content comprises increased terpene content relative to the control.
Clause 24. The method of any one of the prior clauses, wherein the modified terpene content comprises increased total terpenes absent beta-myrcene.
Clause 25. The method of any one of the prior clauses, wherein the modified terpene content comprises increased alpha-pinene.
Clause 26. The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-pinene.
Clause 27. The method of any one of the prior clauses, wherein the modified terpene content comprises increased alpha-terpinene, gamma-terpinene, and/or terpinolene.
Clause 28. The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-ocimene.
Clause 29. The method of any one of the prior clauses, wherein the modified terpene content comprises increased camphene and/or D-limonene.
Clause 30. The method of any one of the prior clauses, wherein the modified terpene content comprises increased linalool and/or trans-nerolidol.
Clause 31. The method of any one of the prior clauses, wherein the modified terpene content comprises increased guaiol.
Clause 32. The method of any one of the prior clauses, wherein the modified terpene content comprises increased beta-myrcene.
Clause 33. The method of any one of the prior clauses, wherein the one or more genetic markers that indicate modified terpene content indicate increased terpene content relative to a control.
Clause 34. The method of any one of the prior clauses, wherein the control is a Cannabis plant without the one or more markers that indicate modified terpene content.
Clause 35. The method of any one of the prior clauses, wherein the terpene content is modified in flower and/or leaf tissue, optionally wherein the flower tissue is floral trichomes.
Clause 36. The method of any one of the prior clauses, wherein the modified terpene content is increased terpene content relative to the control.
Clause 37. A Cannabis plant produced by the method of any one of the prior clauses.
Clause 38. A seed, plant part, tissue culture, or protoplast of the plant of clause 37.
Clause 39. A method of Cannabis breeding, comprising crossing the Cannabis plant of clause 37.
Clause 40. The method of clause 39, wherein crossing comprises selfing, sibling crossing, outcrossing, or backcrossing.
Clause 41. A Cannabis product produced from the plant of clause 37.
Clause 42. The Cannabis product of clause 41, wherein the product is a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture.
EXAMPLES
The following examples are provided to illustrate particular features of certain aspects of the disclosure, but the scope of the claims should not be limited to those features exemplified.
Example 1 Materials and Methods
Plant material and terpene chemotyping
A diversity panel consisting of 145 cannabis seed lots with at least three accessions each (two seed lots had two accessions each; n=900; mapping set), a diversity panel consisting of 33 cannabis seed lots with at least four accessions each (n=463; first validation set), and a diversity panel consisting of individual varieties and small seed lots that were excluded from the mapping set and first validation set (n=397; second validation set) were used for mapping and validating genetic markers of terpene abundance. Terpenes were measured as % of dry weight by gas chromatography (GC), using dried flower tissue of one plant per accession. For one F2 population (n=131) that was part of the first validation set, terpenes were measured using dried flower tissue of up to three clonal replicates per accession; the average across clonal replicates was used for validation. In total, data were collected for 22 terpenes: 15 monoterpenes (alpha-pinene, alpha-terpinene, beta-myrcene, beta-ocimene, beta-pinene, camphene, delta-3-carene, D-limonene, eucalyptol, isopulegol, linalool, p-cymene, and terpinolene) and seven sesquiterpenes (alpha-Bisabolol, alpha-humulene, beta-caryophyllene, caryophyllene oxide, guaiol, cis-nerolidol (also known as nerolidol 1), and trans-nerolidol (also known as nerolidol 2)). All 22 terpenes combined were used to map and validate total Terpenes, all 15 monoterpenes combined were used to map and validate total Monoterpenes, and all seven sesquiterpenes combined were used to map and validate total Sesquiterpenes. In addition to individual terpenes, mapping and validation was done for combinations of terpenes with highly correlated levels: 1. alpha-terpinene, gamma-terpinene, and terpinolene; 2. camphene and D-limonene; 3. alpha-humulene and beta-caryophyllene; 4. linalool and trans-nerolidol (Allen et al., "Genomic characterization of the complete terpene synthase gene family from Cannabis sativa," PloS one 14.9 (2019): e0222363; Zager et al., "Gene networks underlying cannabinoid and terpenoid accumulation in cannabis," Plant physiology 180.4 (2019): 1877-1897). Beta-myrcene was mapped and validated as the ratio of beta-myrcene to total monoterpenes (= (beta-myrcene + 1)/(total monoterpenes - beta-myrcene + 1)). In addition, total monoterpenes was mapped and validated after exclusion of beta-myrcene.
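The beta-myrcene ratio described above can be illustrated with a minimal sketch. The function name and the input checks are illustrative assumptions and are not part of the disclosure; inputs are taken to be % of dry weight.

```python
def beta_myrcene_ratio(beta_myrcene: float, total_monoterpenes: float) -> float:
    """Ratio of beta-myrcene to the remaining monoterpenes, with +1 offsets.

    Implements (beta-myrcene + 1) / (total monoterpenes - beta-myrcene + 1).
    The +1 offsets keep the ratio defined when either term is zero.
    """
    if beta_myrcene < 0 or total_monoterpenes < beta_myrcene:
        # Assumed sanity check: beta-myrcene is one component of the total.
        raise ValueError("expected 0 <= beta_myrcene <= total_monoterpenes")
    return (beta_myrcene + 1.0) / (total_monoterpenes - beta_myrcene + 1.0)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(beta_myrcene_ratio(0.5, 1.5))
```

A plant with no measurable monoterpenes gives a ratio of 1, and values above 1 indicate that beta-myrcene exceeds the sum of the other monoterpenes.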
Association mapping
The mapping set and both validation sets were genotyped with an Illumina bead array. After initial SNP quality control (QC), further filtering steps were performed to filter out known low quality SNPs, followed by filtering for missing data (<10%) and minor allele frequency (>1%) using vcftools (Danecek et al., "The variant call format and VCFtools," Bioinformatics 27.15 (2011): 2156-2158). Missing data were subsequently imputed in the mapping set (R package NAM “snpQC” option; Xavier et al., "NAM: association studies in multiple populations," Bioinformatics 31.23 (2015): 3862-3864), resulting in 36,073 SNPs for the mapping set. Subsequently, nested association mapping (NAM) was performed on terpene data collected for the mapping set of 900 diversity panel accessions with the R package NAM using seed lots as family structure and a kinship matrix to control for relatedness (GWAS2 function).
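The missing-data and minor-allele-frequency filters described above can be sketched per SNP as follows. This is an illustration only and is not the vcftools implementation used in the disclosure; the genotype coding (alternate-allele counts 0, 1, 2, with None for a missing call), the function name, and the use of strict inequalities are assumptions.

```python
from typing import Optional, Sequence


def passes_qc(calls: Sequence[Optional[int]],
              max_missing: float = 0.10,
              min_maf: float = 0.01) -> bool:
    """Return True if a SNP has <10% missing calls and minor allele frequency >1%.

    calls: one entry per plant; alternate-allele count (0, 1, 2) or None if missing.
    """
    n = len(calls)
    observed = [c for c in calls if c is not None]
    if n == 0 or not observed:
        return False
    missing_fraction = 1.0 - len(observed) / n
    # Diploid calls: each observed plant contributes two alleles.
    alt_freq = sum(observed) / (2 * len(observed))
    maf = min(alt_freq, 1.0 - alt_freq)
    return missing_fraction < max_missing and maf > min_maf


if __name__ == "__main__":
    # Hypothetical genotype calls for one SNP across ten plants.
    print(passes_qc([0, 1, 2, 0, 1, 2, 0, 1, 2, 0]))
```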
SNP marker validation
Significant SNP markers (or the top ten significant SNP markers, if more than ten markers were significantly associated with a terpene trait) were subsequently validated in a two-step process using two validation sets. In the first step, average terpene trait values were calculated in the mapping set (n=900) and the first validation set (n=463) for the three genotypic states observed for each SNP marker: homozygous reference allele, heterozygous, and homozygous alternate allele. The homozygous genotype with the highest average terpene trait value is referred to as the beneficial genotype. The homozygous genotype with the lowest average terpene trait value is referred to as the detrimental genotype. The heterozygous genotype is considered beneficial in addition to a homozygous genotype if the heterozygous genotype has either an average terpene trait value intermediate between the homozygous reference allele and homozygous alternate allele genotype average trait values or an average terpene trait value similar to that of the beneficial homozygous genotype. Subsequently, beneficial genotypes for the mapping set and the first validation set were compared. A SNP marker was considered validated in the first validation set if the beneficial genotype for the mapping set matched the beneficial genotype for the first validation set.
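The first validation step can be sketched as follows, using the genotype codes A (homozygous reference), B (homozygous alternate), and X (heterozygous). The function names are illustrative, and the `similar_tol` parameter is an assumption: the text does not define how close a heterozygous average must be to count as "similar" to the beneficial homozygous average.

```python
from statistics import mean
from typing import Dict, FrozenSet, Iterable, List, Tuple


def beneficial_genotypes(records: Iterable[Tuple[str, float]],
                         similar_tol: float = 0.05) -> FrozenSet[str]:
    """Return the genotype codes treated as beneficial for one SNP marker.

    records: (genotype code, terpene trait value) pairs for one data set.
    similar_tol: assumed relative tolerance for "similar" averages.
    """
    by_genotype: Dict[str, List[float]] = {"A": [], "X": [], "B": []}
    for genotype, value in records:
        by_genotype[genotype].append(value)
    if not by_genotype["A"] or not by_genotype["B"]:
        raise ValueError("both homozygous classes must be observed")

    mean_a, mean_b = mean(by_genotype["A"]), mean(by_genotype["B"])
    high, low = max(mean_a, mean_b), min(mean_a, mean_b)
    result = {"A" if mean_a >= mean_b else "B"}

    if by_genotype["X"]:
        mean_x = mean(by_genotype["X"])
        intermediate = low < mean_x < high
        similar = abs(mean_x - high) <= similar_tol * abs(high)
        if intermediate or similar:
            result.add("X")
    return frozenset(result)


def validated_in_first_set(mapping_records, validation_records,
                           similar_tol: float = 0.05) -> bool:
    """A marker is validated if both data sets give the same beneficial genotypes."""
    return (beneficial_genotypes(mapping_records, similar_tol)
            == beneficial_genotypes(validation_records, similar_tol))
```

Comparing the full set of beneficial codes, rather than only the homozygous one, is one possible reading of "matched"; a comparison limited to the homozygous genotype would be equally consistent with the text.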
In the second validation step the average terpene trait value across all 397 accessions in the second validation set was compared with the average terpene trait value after selecting for the beneficial and the detrimental genotypes, respectively, of all SNP markers that were validated in the first validation set for a given terpene trait. The combination of beneficial genotypes is considered validated if the beneficial genotypes of the SNP markers in combination result in an increased average terpene trait value as compared to the average without SNP marker selection. The combination of detrimental genotypes is considered validated if the detrimental genotypes of the SNP markers in combination result in a decreased average terpene trait value as compared to the average without SNP marker selection.
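The second validation step can be sketched as a comparison of two averages. The sketch assumes that "in combination" means a plant carries a target genotype at every validated marker; the function name, marker names, and data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Set


def combination_validated(traits: List[float],
                          genotypes: List[Dict[str, str]],
                          target: Dict[str, Set[str]],
                          expect_increase: bool = True) -> bool:
    """Compare the average trait of selected plants with the overall average.

    traits: one terpene trait value per accession.
    genotypes: per accession, a mapping of marker name to genotype code.
    target: per marker, the genotype codes to select for (beneficial or detrimental).
    expect_increase: True for beneficial genotypes, False for detrimental ones.
    """
    overall = mean(traits)
    selected = [
        trait for trait, calls in zip(traits, genotypes)
        if all(calls.get(marker) in allowed for marker, allowed in target.items())
    ]
    if not selected:
        return False  # no accession carries the full combination
    if expect_increase:
        return mean(selected) > overall
    return mean(selected) < overall
```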
Example 2
SNP Markers
SNP markers for Total Terpenes, Total Monoterpenes, and Total Sesquiterpenes
NAM of total Terpenes in the diversity panel resulted in the identification of five significant (p-value <Bonferroni threshold of 1.39E-06) SNP markers on chromosomes 2, 4, 9, and X. Three of these five SNPs were validated in the first validation set (Table 1; Table 15). In combination, the beneficial genotypes for these three SNP markers resulted in increased total Terpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
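The Bonferroni threshold of 1.39E-06 is consistent with a family-wise significance level of 0.05 divided by the 36,073 SNPs retained for the mapping set. The 0.05 level is an assumption, since the text states only the resulting threshold. A short check:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # alpha = 0.05 is assumed; 36,073 is the SNP count reported for the mapping set.
    print(f"{bonferroni_threshold(0.05, 36073):.2E}")
```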
NAM of total Monoterpenes in the diversity panel resulted in the identification of seven significant SNP markers on chromosomes 2, 3, 4, 5, and 8. Four of these seven SNPs were validated in the first validation set (Table 2; Table 15). In combination, the beneficial genotypes for these four SNP markers resulted in increased total Monoterpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
NAM of total Monoterpenes excluding beta-Myrcene (also referred to as total Monoterpenes - beta-Myrcene) in the diversity panel resulted in the identification of 65 significant SNP markers located on chromosome 5. Nine of the top ten SNPs were validated in the first validation set (Table 3; Table 15). In combination, the beneficial genotypes for these nine SNP markers resulted in increased total Monoterpenes excluding beta-Myrcene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
NAM of total Sesquiterpenes in the diversity panel resulted in the identification of 33 significant SNP markers located on chromosomes 3, 4, 5, 6, and 8; 11 of these SNP markers are located on chromosome 6. All of the top ten SNPs were validated in the first validation set (Table 4; Table 15). The combination of homozygous alternate beneficial genotypes for three of these ten SNP markers resulted in increased total Sesquiterpenes in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for alpha-Pinene and beta-Pinene
NAM of alpha-Pinene in the diversity panel resulted in the identification of 2157 significant SNP markers on all ten chromosomes; 362 of these SNP markers are located on chromosome 5. Two of the top ten SNPs were validated in the first validation set (Table 5; Table 15). In combination, the beneficial genotypes of these two SNP markers resulted in increased alpha-Pinene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
NAM of beta-Pinene in the diversity panel resulted in the identification of 837 significant SNP markers on all ten chromosomes; 194 of these SNP markers are located on chromosome 5. Five of the top ten SNPs were validated in the first validation set (Table 6; Table 15). In combination, the beneficial genotypes of these five SNP markers resulted in increased beta-Pinene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for alpha-Terpinene, gamma-Terpinene, and Terpinolene
NAM of the combination of alpha-Terpinene, gamma-Terpinene, and Terpinolene (also referred to as alpha-Terpinene + gamma-Terpinene + Terpinolene) in the diversity panel resulted in the identification of 1395 significant SNP markers on all ten chromosomes; the majority of these markers (912 SNP markers) are located on chromosome 5. Eight of the top ten SNPs were validated in the first validation set (Table 7; Table 15). In combination, the beneficial genotypes of these eight SNP markers resulted in increased alpha-Terpinene + gamma-Terpinene + Terpinolene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for beta-Myrcene to Total Monoterpenes Ratio
NAM of beta-Myrcene to total Monoterpenes Ratio in the diversity panel resulted in the identification of 121 significant SNP markers on all ten chromosomes; the majority of these markers (90 SNP markers) are located on chromosome 5. Six of the top ten SNPs were validated in the first validation set (Table 8; Table 15). In combination, the beneficial genotypes for these six SNP markers resulted in increased beta-Myrcene to total Monoterpenes ratio in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for beta-Ocimene
NAM of beta-Ocimene in the diversity panel resulted in the identification of 864 significant SNP markers on all ten chromosomes; 259 of these SNP markers are located on chromosome 5. Six of the top ten SNPs were validated in the first validation set (Table 9; Table 15). In combination, the beneficial genotypes for these six SNP markers resulted in increased beta-Ocimene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for Camphene and D-Limonene
NAM of the combination of Camphene and D-Limonene (also referred to as Camphene + D-Limonene) in the diversity panel resulted in the identification of 209 significant SNP markers on all ten chromosomes; 103 of these SNP markers are located on chromosome 5. Nine of the top ten SNPs were validated in the first validation set (Table 10; Table 15). In combination, the beneficial genotypes for these nine SNP markers resulted in increased Camphene and D-Limonene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for Linalool and trans-Nerolidol
NAM of the combination of linalool and trans-nerolidol (also referred to as linalool + trans-nerolidol) in the diversity panel resulted in the identification of 481 significant SNP markers on all ten chromosomes; the majority of these markers (258 SNP markers) are located on chromosome 1. All ten of the top ten SNPs were validated in the first validation set (Table 11; Table 15). In combination, the beneficial genotypes for these ten SNP markers resulted in increased linalool and trans-nerolidol in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for alpha-Humulene and beta-Caryophyllene
NAM of the combination of alpha-Humulene and beta-Caryophyllene (also referred to as alpha-Humulene + beta-Caryophyllene) in the diversity panel resulted in the identification of 46 significant SNP markers on chromosomes 3, 4, 5, 6, and 8; 23 of these SNP markers are located on chromosome 6. Nine of the top ten SNPs were validated in the first validation set (Table 12; Table 15). In combination, the beneficial genotypes for these nine SNP markers resulted in increased alpha-Humulene + beta-Caryophyllene in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
SNP markers for Guaiol
NAM of Guaiol in the diversity panel resulted in the identification of 526 significant SNP markers on chromosomes 3, 4, 5, 6, 7, 8, 9, and X; the majority of these markers (488 SNP markers) are located on chromosome 6. Nine of the top ten SNPs were validated in the first validation set (Table 13; Table 15). In combination, the beneficial genotypes for these nine SNP markers resulted in increased Guaiol in the second validation set, therefore validating the combination of beneficial genotypes of these SNP markers (Table 14).
Example 3 Combinations of SNP Markers
Combinations of SNP markers disclosed herein can be useful, for example, for screening plants having increased levels of terpenes of interest. While any combination of SNPs disclosed herein could be useful, an exemplary subset of SNPs is provided in Table 16. In an exemplary method, at least one SNP selected from the list of SNPs in Table 16 is analyzed and/or detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant. In another exemplary method, at least two SNPs selected from the list of SNPs in Table 16 are analyzed, and at least one SNP indicating increased terpene levels is detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant. In a further exemplary method, all of the SNPs from the list of SNPs provided in Table 16 are analyzed, and at least one SNP indicating increased terpene levels is detected in a nucleic acid sample from a Cannabis plant, indicating increased terpene levels in that plant.
Table 1. Significantly associated SNP markers with total Terpenes identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high total Terpenes: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
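The haplotype definition used in these table legends (the region flanked by the nearest non-significant SNP on either side of a marker) can be illustrated with a short Python sketch. The function name `haplotype_bounds`, the input layout, and the significance cutoff `alpha` are assumptions made for illustration; the source does not state how the flanking SNPs were computed.

```python
from typing import List, Optional, Tuple


def haplotype_bounds(snps: List[Tuple[int, float]],
                     marker_pos: int,
                     alpha: float) -> Tuple[Optional[int], Optional[int]]:
    """Return (left, right) positions of the nearest non-significant SNPs.

    snps       -- (position, p_value) pairs for one chromosome
    marker_pos -- position of the significantly associated SNP marker
    alpha      -- assumed cutoff: p_value >= alpha counts as non-significant
    A bound is None when no non-significant SNP exists on that side.
    """
    left: Optional[int] = None
    right: Optional[int] = None
    for pos, p_value in sorted(snps):
        if p_value < alpha:
            continue  # significant SNPs stay inside the haplotype
        if pos < marker_pos:
            left = pos  # later iterations overwrite with a nearer SNP
        elif pos > marker_pos:
            right = pos
            break  # first non-significant SNP to the right is the nearest
    return left, right


chromosome_snps = [(100, 0.5), (200, 0.2), (300, 1e-9),
                   (400, 1e-8), (500, 0.3), (600, 0.9)]
print(haplotype_bounds(chromosome_snps, marker_pos=300, alpha=1e-5))
```

Under this reading, adjacent significant SNPs (positions 300 and 400 in the toy data) share the same flanking SNPs and therefore the same haplotype.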
Table 2. Significantly associated SNP markers with total Monoterpenes identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high Monoterpenes: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
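The A/B/X genotype coding used in the fourth column of these tables can be derived from a diploid call and the reference and alternative alleles given in the fifth and sixth columns. The sketch below is illustrative only; the function name `genotype_code` and the `"C/T"` call format are assumptions.

```python
def genotype_code(call: str, ref: str, alt: str) -> str:
    """Map a diploid call such as 'C/T' to the table coding.

    A = homozygous for the reference allele
    B = homozygous for the alternative allele
    X = heterozygous
    """
    alleles = sorted(call.upper().split("/"))
    ref, alt = ref.upper(), alt.upper()
    if alleles == [ref, ref]:
        return "A"
    if alleles == [alt, alt]:
        return "B"
    if alleles == sorted([ref, alt]):
        return "X"
    raise ValueError(f"call {call!r} does not match alleles {ref}/{alt}")


for call in ("C/C", "T/C", "T/T"):
    print(call, genotype_code(call, ref="C", alt="T"))
```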
Table 3. Significantly associated SNP markers with total Monoterpenes excluding beta-Myrcene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high total Monoterpenes excluding beta-Myrcene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463), A=B inferred based on segregation patterns in both the mapping and first validation data sets; Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 4. Top ten significantly associated SNP markers with total Sesquiterpenes identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high total Sesquiterpenes: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 5. Top ten significantly associated SNP markers with alpha-Pinene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high alpha-Pinene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 6. Top ten significantly associated SNP markers with beta-Pinene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high beta-Pinene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 7. Top ten significantly associated SNP markers with the combination of alpha-Terpinene, gamma-Terpinene, and Terpinolene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high alpha-Terpinene, gamma-Terpinene, and Terpinolene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 8. Top ten significantly associated SNP markers with beta-Myrcene to total Monoterpenes ratio identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high beta-Myrcene to total Monoterpenes ratio: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 9. Top ten significantly associated SNP markers with beta-Ocimene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high beta-Ocimene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 10. Top ten significantly associated SNP markers with the combination of Camphene and D-Limonene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high Camphene and D-Limonene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 11. Top ten significantly associated SNP markers with the combination of linalool and trans-nerolidol identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high linalool and trans-nerolidol: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 12. Top ten significantly associated SNP markers with alpha-Humulene and beta-Caryophyllene identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high alpha-Humulene and beta-Caryophyllene: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 13. Top ten significantly associated SNP markers with Guaiol identified after NAM of a diversity panel (n=900). First column: SNP marker number; Second column: SNP marker name; Third column, NAM p-value; Fourth column, genotype associated with high Guaiol: A=homozygous for reference allele, B=homozygous for alternative allele, X=heterozygous, *=beneficial genotype validated based on averages for genotype calls in first validation set (n=463); Fifth column, reference allele call; Sixth column, alternative allele call; Seventh column, Abacus reference genome chromosome; Eighth column, Abacus reference genome position (Csat_AbacusV2; NCBI assembly accession GCA_025232715.1); Ninth column, left flanking SNP of haplotype surrounding SNP marker; Tenth column, right flanking SNP of haplotype surrounding SNP marker; Eleventh column, Abacus reference genome position left flanking SNP of haplotype surrounding SNP marker; Twelfth column, Abacus reference genome position right flanking SNP of haplotype surrounding SNP marker. In this context a haplotype surrounding a significantly associated SNP marker consists of the genomic region flanked by the nearest non-significant SNP on either side of the SNP marker.
Table 14. Validation based on the second validation set (n=397) of combinations of validated SNP markers, which were validated in the first validation set. First column: mapped trait; Second column: average value (% of dry weight) for mapped terpene trait in second validation set without using markers to make selections; Third column: number of SNP markers used in combination to make selections; Fourth column: average value (% of dry weight) for mapped terpene trait after selecting for the beneficial genotype of the combination of markers that were validated for the mapped trait in the first validation set; Fifth column: number of accessions containing the beneficial genotype for the combination of markers to make the selection; Sixth column: average value (% of dry weight) for mapped terpene trait after selecting for the detrimental genotype of the combination of markers that were validated for the mapped trait in the first validation set; Seventh column: number of accessions containing the detrimental genotype for the combination of markers to make the selection; ^selection made based on homozygous beneficial genotypes for the most significant validated SNP marker per chromosome, three SNP markers in total, NA=there were no accessions with the combination of SNP marker genotypes in the second validation set.
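The comparison summarized in Table 14 (trait average without selection, versus averages after selecting accessions carrying the beneficial or the detrimental genotype at every marker in a combination) can be sketched as below. The record layout, the labels `"beneficial"` and `"detrimental"`, and the function names are assumptions for illustration; the toy values are not data from the disclosure.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def summarize(traits: List[float]) -> Tuple[Optional[float], int]:
    """Average and count; the average is None ('NA') when no accession qualifies."""
    return (mean(traits) if traits else None, len(traits))


def validation_row(accessions: List[dict], markers: List[str]) -> Dict[str, object]:
    """Build one Table 14-style row for a marker combination.

    Each accession is a dict with 'trait' (% of dry weight) and 'calls',
    a mapping marker name -> 'beneficial' | 'detrimental' | other.
    """
    def carries(acc: dict, label: str) -> bool:
        return all(acc["calls"].get(m) == label for m in markers)

    return {
        "unselected_mean": mean(a["trait"] for a in accessions),
        "n_markers": len(markers),
        "beneficial": summarize([a["trait"] for a in accessions if carries(a, "beneficial")]),
        "detrimental": summarize([a["trait"] for a in accessions if carries(a, "detrimental")]),
    }


# Toy accessions with made-up trait values and two hypothetical markers.
toy = [
    {"trait": 2.0, "calls": {"m1": "beneficial", "m2": "beneficial"}},
    {"trait": 1.0, "calls": {"m1": "beneficial", "m2": "detrimental"}},
    {"trait": 0.5, "calls": {"m1": "detrimental", "m2": "detrimental"}},
    {"trait": 1.5, "calls": {"m1": "beneficial", "m2": "beneficial"}},
]
print(validation_row(toy, ["m1", "m2"]))
```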
Table 15. 50 bp flanking sequences with SNP marker at position 51 bp. First column: SNP marker number; Second column: SNP marker name; Third column: 50 bp flanking sequences with SNP marker at position 51 bp.
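A 101 bp context of the kind described for Table 15 (50 bp on each side, with the SNP at position 51) can be cut from a reference sequence as in the following sketch. The function name `flanking_context` and the assumption of 1-based marker coordinates are illustrative; the synthetic sequence is not taken from the Abacus reference genome.

```python
from typing import Tuple


def flanking_context(reference: str, pos: int, flank: int = 50) -> Tuple[str, str, str]:
    """Return (left flank, base at the SNP, right flank).

    reference -- chromosome sequence as a string
    pos       -- 1-based SNP position (assumed coordinate convention)
    flank     -- number of bases kept on each side
    """
    start = pos - 1 - flank   # 0-based index of the first flanking base
    end = pos + flank         # exclusive end of the window
    if start < 0 or end > len(reference):
        raise ValueError("not enough sequence on one side of the SNP")
    window = reference[start:end]
    return window[:flank], window[flank], window[flank + 1:]


# Synthetic example: the SNP base 'G' sits at 1-based position 51.
synthetic = "A" * 50 + "G" + "C" * 50 + "TT"
left, base, right = flanking_context(synthetic, pos=51)
print(len(left), base, len(right))
```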
Table 16. Exemplary markers used for marker assisted selection for single terpenes as well as terpene combinations. First column: marker name. Second column: beneficial genotypes (A=homozygous reference allele, X=heterozygous, B=homozygous alternate allele) for high (=H) or detrimental genotypes for low (=L) terpenes. All markers have an additive effect on terpene levels. Third column: chromosome on which marker resides. Fourth column: position of marker on chromosome in bp. Fifth column: information about the target terpene.
It will be apparent that the precise details of the methods or compositions described may be varied or modified without departing from the spirit of the described aspects of the disclosure. We claim all such modifications and variations that fall within the scope and spirit of the claims below.
Claims
1. A method for producing one or more Cannabis plants having modified terpene content, comprising:
(i) analyzing one or more genetic markers in a nucleic acid sample from a Cannabis plant or its germplasm;
(ii) detecting one or more genetic markers that indicate modified terpene content,
(iii) crossing the Cannabis plant comprising the one or more genetic markers indicating modified terpene content, and
(iv) obtaining one or more progeny plants comprising the one or more genetic markers indicating modified terpene content, and wherein the one or more progeny plants have modified terpene content relative to a control.
2. A method for selecting a Cannabis plant having modified terpene content, comprising:
(i) analyzing one or more genetic markers in a nucleic acid sample from the Cannabis plant or its germplasm;
(ii) detecting one or more genetic markers that indicate modified terpene content; and
(iii) selecting the Cannabis plant, thereby selecting the Cannabis plant having modified terpene content.
3. The method of claim 2, wherein the Cannabis plant having modified terpene content is selected for further analysis, propagation, crossing, or to make a product.
4. The method of claim 2, further comprising crossing the Cannabis plant having modified terpene content and producing one or more progeny plants having modified terpene content.
5. The method of claim 1 or claim 2, wherein: analyzing comprises using PCR, quantitative PCR (qPCR), and/or sequencing; and/or detecting comprises using an oligonucleotide primer set or probe.
6. The method of claim 1 or claim 4, wherein crossing comprises selfing, sibling crossing, outcrossing, or backcrossing.
7. The method of claim 6, wherein the selfing, sibling crossing, outcrossing, or backcrossing comprises marker-assisted selection for at least two generations.
8. The method of claim 1 or claim 2, wherein analyzing one or more genetic markers in the nucleic acid sample comprises analyzing one or more of nucleotide positions:
(a) 3,081,773 on chromosome 5;
(b) 10,633,191 on chromosome 1;
(c) 1,366,137 on chromosome 5;
(d) 1,929,134 on chromosome 5;
(e) 2,038,965 on chromosome 5;
(f) 2,288,919 on chromosome 5;
(g) 2,534,579 on chromosome 5;
(h) 2,698,301 on chromosome 5;
(i) 2,774,108 on chromosome 5;
(j) 5,175,087 on chromosome 6; or
(k) 6,311,954 on chromosome 6.
9. The method of claim 8, wherein the genetic markers that indicate modified terpene content indicate increased terpene content, and comprise:
(a) a T/T or C/T genotype at position 3,081,773 on chromosome 5;
(b) a C/C or C/A genotype at position 10,633,191 on chromosome 1;
(c) a A/A or G/A genotype at position 1,366,137 on chromosome 5;
(d) a T/T or T/C genotype at position 1,929,134 on chromosome 5;
(e) a C/C or T/C genotype at position 2,038,965 on chromosome 5;
(f) a T/T or A/T genotype at position 2,288,919 on chromosome 5;
(g) a A/A or A/T genotype at position 2,534,579 on chromosome 5;
(h) a G/G or A/G genotype at position 2,698,301 on chromosome 5;
(i) a A/A or G/A genotype at position 2,774,108 on chromosome 5;
(j) a T/T or A/T genotype at position 5,175,087 on chromosome 6; or
(k) a C/C or G/C genotype at position 6,311,954 on chromosome 6.
10. The method of claim 9, comprising:
(i) analyzing the genetic marker of (a), wherein the increased terpene content is increased total monoterpene content absent beta-myrcene;
(ii) analyzing the genetic marker of (b), wherein the increased terpene content is increased linalool and/or nerolidol 2 content;
(iii) analyzing the genetic marker of (c), wherein the increased terpene content is increased beta-pinene content;
(iv) analyzing the genetic marker of (d), wherein the increased terpene content is increased beta-myrcene content;
(v) analyzing the genetic marker of (e), wherein the increased terpene content is increased beta-ocimene content;
(vi) analyzing the genetic marker of (f), wherein the increased terpene content is increased camphene and/or d-limonene content;
(vii) analyzing the genetic marker of (g), wherein the increased terpene content is increased alpha-pinene content;
(viii) analyzing the genetic marker of (h), wherein the increased terpene content is increased alpha-terpinene, gamma-terpinene, and/or terpinolene content;
(ix) analyzing the genetic marker of (i), wherein the increased terpene content is increased camphene and/or d-limonene content;
(x) analyzing the genetic marker of (j), wherein the increased terpene content is increased alpha-humulene and/or beta-caryophyllene content; or
(xi) analyzing the genetic marker of (k), wherein the increased terpene content is increased guaiol content.
11. The method of claim 1 or claim 2, wherein analyzing the one or more genetic markers in the nucleic acid sample comprises analyzing one or more of nucleotide positions:
8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458;
11,169,492; or 13,920,896 on chromosome 1;
93,291,929 on chromosome 2;
47,140,085 on chromosome 3;
72,717,623 on chromosome 4;
330,918; 516,340; 518,238; 523,626; 608,718; 755,967; 1,100,981; 1,109,162; 1,331,433; 1,353,878; 1,366,137; 1,386,965; 1,487,633; 1,745,101; 1,828,050; 1,837,343; 1,840,325; 1,929,134; 2,003,303; 2,038,965;
2,072,869; 2,120,881; 2,208,629; 2,288,919; 2,291,467; 2,302,063; 2,318,276; 2,339,956; 2,346,000; 2,360,380;
2,364,964; 2,366,529; 2,534,579; 2,698,301; 2,774,108; 2,780,345; 3,081,773; 3,247,341; 3,485,895; 3,503,143;
3,564,387; 3,585,965; 3,599,637; 3,629,225; 3,704,632; 3,807,710; 3,842,906; 4,384,123; 4,391,586; 5,854,661;
7,307,552; 11,993,646; 12,418,741; 12,446,524; 32,342,917; 32,395,736; 55,475,322; 62,911,168; 66,477,802; or 74,391,606 on chromosome 5;
1,220,207; 1,288,012; 1,695,817; 1,727,397; 1,960,918; 1,999,618; 2,012,149; 2,931,923; 3,073,845;
3,091,941; 3,185,660; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; 6,311,954; 6,589,961; or 64,943,914 on chromosome 6;
14,069,586; 14,329,191; 14,866,064; or 33,592,849 on chromosome 8;
55,114,152 on chromosome 9; or
57,912,635 or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
12. The method of claim 11, wherein the genetic markers that indicate modified terpene content comprise one or more of:
Chromosome 1:
(a) a T/T or C/T genotype at position 8871401;
(b) a G/G or T/G genotype at position 8886933;
(c) a T/T or G/T genotype at position 9101934;
(d) a G/G or C/G genotype at position 10446475;
(e) a A/A or C/A genotype at position 10543062;
(f) a G/G or T/G genotype at position 10561778;
(g) a C/C or C/A genotype at position 10633191;
(h) a A/A or T/A genotype at position 10934458;
(i) a A/A or G/A genotype at position 11169492;
(j) a C/C or T/C genotype at position 13920896;
Chromosome 2:
(k) a A/A or T/A genotype at position 93291929;
Chromosome 3:
(l) a C/C or A/C genotype at position 47140085;
Chromosome 4:
(m) a A/A or C/A genotype at position 72717623;
Chromosome 5:
(n) a A/A or G/A genotype at position 330918;
(o) a C/C or T/C genotype at position 516340;
(p) a G/G or G/A genotype at position 518238;
(q) a T/T or T/C genotype at position 523626;
(r) a C/C or T/C genotype at position 608718;
(s) a T/T or G/T genotype at position 755967;
(t) a T/T or C/T genotype at position 1100981;
(u) a T/T or G/T genotype at position 1109162;
(v) a T/T or C/T genotype at position 1331433;
(w) a G/G or C/G genotype at position 1353878;
(x) a A/A or G/A genotype at position 1366137;
(y) a C/C or T/C genotype at position 1386965;
(z) a C/C or C/T genotype at position 1487633;
(aa) a T/T or T/A genotype at position 1745101;
(ab) a G/G or A/G genotype at position 1828050;
(ac) a G/G or A/G genotype at position 1837343;
(ad) a T/T or C/T genotype at position 1840325;
(ae) a T/T or T/C genotype at position 1929134;
(af) a A/A or G/A genotype at position 2003303;
(ag) a C/C or T/C genotype at position 2038965;
(ah) a T/T or C/T genotype at position 2072869;
(ai) a C/C or C/T genotype at position 2120881;
(aj) a G/G or G/C genotype at position 2208629;
(ak) a T/T or A/T genotype at position 2288919;
(al) a T/T or T/A genotype at position 2291467;
(am) a A/A or C/A genotype at position 2302063;
(an) a A/A or G/A genotype at position 2318276;
(ao) a T/T or T/C genotype at position 2339956;
(ap) a A/A or A/C genotype at position 2346000;
(aq) a G/G or G/T genotype at position 2360380;
(ar) a T/T or T/C genotype at position 2364964;
(as) a T/T or C/T genotype at position 2366529;
(at) a A/A or A/T genotype at position 2534579;
(au) a G/G or A/G genotype at position 2698301;
(av) a A/A or G/A genotype at position 2774108;
(aw) a G/G or A/G genotype at position 2780345;
(ax) a T/T or C/T genotype at position 3081773;
(ay) a C/C or T/C genotype at position 3247341;
(az) a C/C or T/C genotype at position 3485895;
(ba) a C/C or T/C genotype at position 3503143;
(bb) a A/A or A/T genotype at position 3564387;
(bc) a A/A or A/T genotype at position 3585965;
(bd) a A/A or G/A genotype at position 3599637;
(be) a A/A or G/A genotype at position 3629225;
(bf) a C/C or T/C genotype at position 3704632;
(bg) a T/T or T/C genotype at position 3807710;
(bh) a T/T or T/A genotype at position 3842906;
(bi) a T/T or A/T genotype at position 4384123;
(bj) a G/G or G/A genotype at position 4391586;
(bk) a A/A or G/A genotype at position 5854661;
(bl) a T/T or C/T genotype at position 7307552;
(bm) a T/T or T/A genotype at position 11993646;
(bn) a T/T or C/T genotype at position 12418741;
(bo) a T/T or A/T genotype at position 12446524;
(bp) a T/T or A/T genotype at position 32342917;
(bq) a T/T or C/T genotype at position 32395736;
(br) a G/G or A/G genotype at position 55475322;
(bs) a A/A or T/A genotype at position 62911168;
(bt) a G/G or A/G genotype at position 66477802;
(bu) a C/C or C/T genotype at position 74391606;
Chromosome 6:
(bv) a C/C or A/C genotype at position 1220207;
(bw) a T/T or C/T genotype at position 1288012;
(bx) a C/C or G/C genotype at position 1695817;
(by) a T/T or C/T genotype at position 1727397;
(bz) a G/G or A/G genotype at position 1960918;
(ca) a A/A or G/A genotype at position 1999618;
(cb) a C/C or T/C genotype at position 2012149;
(cc) a A/A or T/A genotype at position 2931923;
(cd) a C/C or T/C genotype at position 3073845;
(ce) a G/G or A/G genotype at position 3091941;
(cf) a C/C or T/C genotype at position 3185660;
(cg) a T/T or A/T genotype at position 5175087;
(ch) a A/A or A/G genotype at position 5468920;
(ci) a A/A or G/A genotype at position 5868053;
(cj) a C/C or C/A genotype at position 6061359;
(ck) a T/T or C/T genotype at position 6120135;
(cl) a C/C or G/C genotype at position 6311954;
(cm) a G/G or T/G genotype at position 6589961;
(cn) a T/T or G/T genotype at position 64943914;
Chromosome 8:
(co) a A/A or A/T genotype at position 14069586;
(cp) a G/G or A/G genotype at position 14329191;
(cq) a T/T or C/T genotype at position 14866064;
(cr) a C/C or T/C genotype at position 33592849;
Chromosome 9:
(cs) a C/C or G/C genotype at position 55114152;
Chromosome X:
(ct) a C/C or T/C genotype at position 57912635; or
(cu) a T/T or T/C genotype at position 58545628.
13. The method of claim 11, wherein analyzing one or more genetic markers in the nucleic acid sample comprises:
(a) analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 72,717,623 on chromosome 4; 55,114,152 on chromosome 9; 57,912,635 on chromosome X; and/or 58,545,628 on chromosome X, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(b) analyzing one or more of nucleotide positions: 93,291,929 on chromosome 2; 47,140,085 on chromosome 3; 72,717,623 on chromosome 4; 3,807,710 on chromosome 5; 3,842,906 on chromosome 5; 55,475,322 on chromosome 5; and/or 33,592,849 on chromosome 8, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(c) analyzing one or more of nucleotide positions 1,100,981; 1,840,325; 2,366,529; 2,698,301; 3,081,773; 3,485,895; 3,585,965; 3,599,637; 3,629,225; and/or 4,384,123 on chromosome 5, according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(d) analyzing one or more of nucleotide positions 2,774,108; 5,854,661; or 7,307,552 on chromosome 5, or 1,695,817; 1,727,397; 1,960,918; 5,175,087; 14,069,586; 14,329,191; and/or 14,866,064 of chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(e) analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 1,109,162; 1,366,137; 2,346,000; 2,534,579; 3,247,341; 3,503,143; and/or 3,629,225 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(f) analyzing one or more of nucleotide positions 516,340; 518,238; 523,626; 608,718; 1,109,162; 1,366,137; 1,386,965; 2,003,303; 3,247,341; and/or 3,704,632 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(g) analyzing one or more of nucleotide positions 755,967; 1,840,325; 2,302,063; 2,366,529; 2,698,301; 3,485,895; 4,384,123; 11,993,646; 12,418,741; and/or 12,446,524 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(h) analyzing one or more of nucleotide positions 330,918; 1,353,878; 1,745,101; 1,828,050; 1,929,134; 2,072,869; 2,339,956; 3,081,773; 3,564,387; and/or 3,585,965 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(i) analyzing one or more of nucleotide positions 1,828,050; 2,038,965; 2,120,881; 2,208,629; 2,360,380; 2,364,964; 32,342,917; 32,395,736; 62,911,168; and/or 66,477,802 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(j) analyzing one or more of nucleotide positions 1,331,433; 1,487,633; 1,837,343; 2,288,919; 2,291,467; 2,318,276; 2,774,108; 2,780,345; 4,391,586; and/or 74,391,606 on chromosome 5 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(k) analyzing one or more of nucleotide positions 8,871,401; 8,886,933; 9,101,934; 10,446,475; 10,543,062; 10,561,778; 10,633,191; 10,934,458; 11,169,492; and/or 13,920,896 on chromosome 1 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1;
(l) analyzing one or more of nucleotide positions 1,695,817; 1,727,397; 1,960,918; 5,175,087; 5,468,920; 5,868,053; 6,061,359; 6,120,135; and/or 64,943,914 on chromosome 6; or 14,069,586 on chromosome 8; according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1; or
(m) analyzing one or more of nucleotide positions 1,220,207; 1,288,012; 1,999,618; 2,012,149; 2,931,923; 3,073,845; 3,091,941; 3,185,660; 6,311,954; and/or 6,589,961 on chromosome 6 according to the Abacus Cannabis reference genome version Csat_AbacusV2, NCBI assembly accession GCA_025232715.1.
14. The method of claim 13, comprising:
(i) analyzing the genetic markers of (a), wherein the modified terpene content is modified total terpene content;
(ii) analyzing the genetic markers of (b), wherein the modified terpene content is modified total monoterpene content;
(iii) analyzing the genetic markers of (c), wherein the modified terpene content is modified total monoterpenes content absent beta-myrcene;
(iv) analyzing the genetic markers of (d), wherein the modified terpene content is modified total sesquiterpene content;
(v) analyzing the genetic markers of (e), wherein the modified terpene content is modified alpha-pinene content;
(vi) analyzing the genetic markers of (f), wherein the modified terpene content is modified beta-pinene content;
(vii) analyzing the genetic markers of (g), wherein the modified terpene content is modified alpha-terpinene, gamma-terpinene, and terpinolene content;
(viii) analyzing the genetic markers of (h), wherein the modified terpene content is modified monoterpene to beta-myrcene content ratio;
(ix) analyzing the genetic markers of (i), wherein the modified terpene content is modified beta-ocimene content;
(x) analyzing the genetic markers of (j), wherein the modified terpene content is modified camphene and D-limonene content;
(xi) analyzing the genetic markers of (k), wherein the modified terpene content is modified linalool and trans-nerolidol content;
(xii) analyzing the genetic markers of (l), wherein the modified terpene content is modified alpha-humulene and beta-caryophyllene content; or
(xiii) analyzing the genetic markers of (m), wherein the modified terpene content is modified guaiol content.
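The pairing of marker groups with terpene traits in claims 13 and 14 can be read as a lookup table. The following is a minimal illustrative sketch, not part of the claims, that encodes only groups (a) and (b) as recited above; the names `MARKER_GROUPS` and `traits_for_position` are hypothetical, and positions are assumed to be coordinates in the Csat_AbacusV2 assembly.

```python
# Illustrative subset of claims 13-14: groups (a) and (b) only.
# Keys are the traits named in claim 14; values are (chromosome, position)
# pairs from claim 13. All names here are hypothetical helpers.
MARKER_GROUPS = {
    "total terpene": {  # group (a)
        ("2", 93_291_929), ("4", 72_717_623), ("9", 55_114_152),
        ("X", 57_912_635), ("X", 58_545_628),
    },
    "total monoterpene": {  # group (b)
        ("2", 93_291_929), ("3", 47_140_085), ("4", 72_717_623),
        ("5", 3_807_710), ("5", 3_842_906), ("5", 55_475_322),
        ("8", 33_592_849),
    },
}


def traits_for_position(chrom, pos):
    """Return the traits whose marker group lists this chromosome/position."""
    return [
        trait
        for trait, markers in MARKER_GROUPS.items()
        if (chrom, pos) in markers
    ]


if __name__ == "__main__":
    # Position 93,291,929 on chromosome 2 is recited in both (a) and (b).
    print(traits_for_position("2", 93_291_929))
    # Position 47,140,085 on chromosome 3 is recited only in (b).
    print(traits_for_position("3", 47_140_085))
```

A position that appears in more than one group, such as 93,291,929 on chromosome 2, is reported for each trait that recites it.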
15. The method of claim 1 or claim 2, wherein the genetic markers that indicate modified terpene content comprise a polymorphism at position 51 of one or more of: SEQ ID NOs: 1-99.
16. The method of claim 15, wherein the polymorphism at position 51 comprises:
Chromosome 1:
(a) a T/T or C/T genotype at position 51 of SEQ ID NO: 1;
(b) a G/G or T/G genotype at position 51 of SEQ ID NO: 2;
(c) a T/T or G/T genotype at position 51 of SEQ ID NO: 3;
(d) a G/G or C/G genotype at position 51 of SEQ ID NO: 4;
(e) a A/A or C/A genotype at position 51 of SEQ ID NO: 5;
(f) a G/G or T/G genotype at position 51 of SEQ ID NO: 6;
(g) a C/C or C/A genotype at position 51 of SEQ ID NO: 7;
(h) a A/A or T/A genotype at position 51 of SEQ ID NO: 8;
(i) a A/A or G/A genotype at position 51 of SEQ ID NO: 9;
(j) a C/C or T/C genotype at position 51 of SEQ ID NO: 10;
Chromosome 2:
(k) a A/A or T/A genotype at position 51 of SEQ ID NO: 11;
Chromosome 3:
(l) a C/C or A/C genotype at position 51 of SEQ ID NO: 12;
Chromosome 4:
(m) a A/A or C/A genotype at position 51 of SEQ ID NO: 13;
Chromosome 5:
(n) a A/A or G/A genotype at position 51 of SEQ ID NO: 14;
(o) a C/C or T/C genotype at position 51 of SEQ ID NO: 15;
(p) a G/G or G/A genotype at position 51 of SEQ ID NO: 16;
(q) a T/T or T/C genotype at position 51 of SEQ ID NO: 17;
(r) a C/C or T/C genotype at position 51 of SEQ ID NO: 18;
(s) a T/T or G/T genotype at position 51 of SEQ ID NO: 19;
(t) a T/T or C/T genotype at position 51 of SEQ ID NO: 20;
(u) a T/T or G/T genotype at position 51 of SEQ ID NO: 21;
(v) a T/T or C/T genotype at position 51 of SEQ ID NO: 22;
(w) a G/G or C/G genotype at position 51 of SEQ ID NO: 23;
(x) a A/A or G/A genotype at position 51 of SEQ ID NO: 24;
(y) a C/C or T/C genotype at position 51 of SEQ ID NO: 25;
(z) a C/C or C/T genotype at position 51 of SEQ ID NO: 26;
(aa) a T/T or T/A genotype at position 51 of SEQ ID NO: 27;
(ab) a G/G or A/G genotype at position 51 of SEQ ID NO: 28;
(ac) a G/G or A/G genotype at position 51 of SEQ ID NO: 29;
(ad) a T/T or C/T genotype at position 51 of SEQ ID NO: 30;
(ae) a T/T or T/C genotype at position 51 of SEQ ID NO: 31;
(af) a A/A or G/A genotype at position 51 of SEQ ID NO: 32;
(ag) a C/C or T/C genotype at position 51 of SEQ ID NO: 33;
(ah) a T/T or C/T genotype at position 51 of SEQ ID NO: 34;
(ai) a C/C or C/T genotype at position 51 of SEQ ID NO: 35;
(aj) a G/G or G/C genotype at position 51 of SEQ ID NO: 36;
(ak) a T/T or A/T genotype at position 51 of SEQ ID NO: 37;
(al) a T/T or T/A genotype at position 51 of SEQ ID NO: 38;
(am) a A/A or C/A genotype at position 51 of SEQ ID NO: 39;
(an) a A/A or G/A genotype at position 51 of SEQ ID NO: 40;
(ao) a T/T or T/C genotype at position 51 of SEQ ID NO: 41;
(ap) a A/A or A/C genotype at position 51 of SEQ ID NO: 42;
(aq) a G/G or G/T genotype at position 51 of SEQ ID NO: 43;
(ar) a T/T or T/C genotype at position 51 of SEQ ID NO: 44;
(as) a T/T or C/T genotype at position 51 of SEQ ID NO: 45;
(at) a A/A or A/T genotype at position 51 of SEQ ID NO: 46;
(au) a G/G or A/G genotype at position 51 of SEQ ID NO: 47;
(av) a A/A or G/A genotype at position 51 of SEQ ID NO: 48;
(aw) a G/G or A/G genotype at position 51 of SEQ ID NO: 49;
(ax) a T/T or C/T genotype at position 51 of SEQ ID NO: 50;
(ay) a C/C or T/C genotype at position 51 of SEQ ID NO: 51;
(az) a C/C or T/C genotype at position 51 of SEQ ID NO: 52;
(ba) a C/C or T/C genotype at position 51 of SEQ ID NO: 53;
(bb) a A/A or A/T genotype at position 51 of SEQ ID NO: 54;
(bc) a A/A or A/T genotype at position 51 of SEQ ID NO: 55;
(bd) a A/A or G/A genotype at position 51 of SEQ ID NO: 56;
(be) a A/A or G/A genotype at position 51 of SEQ ID NO: 57;
(bf) a C/C or T/C genotype at position 51 of SEQ ID NO: 58;
(bg) a T/T or T/C genotype at position 51 of SEQ ID NO: 59;
(bh) a T/T or T/A genotype at position 51 of SEQ ID NO: 60;
(bi) a T/T or A/T genotype at position 51 of SEQ ID NO: 61;
(bj) a G/G or G/A genotype at position 51 of SEQ ID NO: 62;
(bk) a A/A or G/A genotype at position 51 of SEQ ID NO: 63;
(bl) a T/T or C/T genotype at position 51 of SEQ ID NO: 64;
(bm) a T/T or T/A genotype at position 51 of SEQ ID NO: 65;
(bn) a T/T or C/T genotype at position 51 of SEQ ID NO: 66;
(bo) a T/T or A/T genotype at position 51 of SEQ ID NO: 67;
(bp) a T/T or A/T genotype at position 51 of SEQ ID NO: 68;
(bq) a T/T or C/T genotype at position 51 of SEQ ID NO: 69;
(br) a G/G or A/G genotype at position 51 of SEQ ID NO: 70;
(bs) a A/A or T/A genotype at position 51 of SEQ ID NO: 71;
(bt) a G/G or A/G genotype at position 51 of SEQ ID NO: 72;
(bu) a C/C or C/T genotype at position 51 of SEQ ID NO: 73;
Chromosome 6:
(bv) a C/C or A/C genotype at position 51 of SEQ ID NO: 74;
(bw) a T/T or C/T genotype at position 51 of SEQ ID NO: 75;
(bx) a C/C or G/C genotype at position 51 of SEQ ID NO: 76;
(by) a T/T or C/T genotype at position 51 of SEQ ID NO: 77;
(bz) a G/G or A/G genotype at position 51 of SEQ ID NO: 78;
(ca) a A/A or G/A genotype at position 51 of SEQ ID NO: 79;
(cb) a C/C or T/C genotype at position 51 of SEQ ID NO: 80;
(cc) a A/A or T/A genotype at position 51 of SEQ ID NO: 81;
(cd) a C/C or T/C genotype at position 51 of SEQ ID NO: 82;
(ce) a G/G or A/G genotype at position 51 of SEQ ID NO: 83;
(cf) a C/C or T/C genotype at position 51 of SEQ ID NO: 84;
(cg) a T/T or A/T genotype at position 51 of SEQ ID NO: 85;
(ch) a A/A or A/G genotype at position 51 of SEQ ID NO: 86;
(ci) a A/A or G/A genotype at position 51 of SEQ ID NO: 87;
(cj) a C/C or C/A genotype at position 51 of SEQ ID NO: 88;
(ck) a T/T or C/T genotype at position 51 of SEQ ID NO: 89;
(cl) a C/C or G/C genotype at position 51 of SEQ ID NO: 90;
(cm) a G/G or T/G genotype at position 51 of SEQ ID NO: 91;
(cn) a T/T or G/T genotype at position 51 of SEQ ID NO: 92;
Chromosome 8:
(co) a A/A or A/T genotype at position 51 of SEQ ID NO: 93;
(cp) a G/G or A/G genotype at position 51 of SEQ ID NO: 94;
(cq) a T/T or C/T genotype at position 51 of SEQ ID NO: 95;
(cr) a C/C or T/C genotype at position 51 of SEQ ID NO: 96;
Chromosome 9:
(cs) a C/C or G/C genotype at position 51 of SEQ ID NO: 97;
Chromosome X:
(ct) a C/C or T/C genotype at position 51 of SEQ ID NO: 98;
(cu) a T/T or T/C genotype at position 51 of SEQ ID NO: 99.
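Each item of claim 16 names a homozygous and a heterozygous genotype at position 51 of one sequence. As an illustrative sketch only, not part of the claims, a genotype call can be checked against those pairs as shown below. The table covers four entries copied from the list above (SEQ ID NOs: 1, 11, 98, and 99); the names `INDICATIVE` and `has_indicative_genotype` are hypothetical, and treating allele order as irrelevant (C/T equal to T/C) is an assumption of this sketch.

```python
# Illustrative subset of claim 16: indicative genotypes at position 51.
# Only four SEQ ID NOs are included; the names are hypothetical helpers.
INDICATIVE = {
    1: ("T/T", "C/T"),
    11: ("A/A", "T/A"),
    98: ("C/C", "T/C"),
    99: ("T/T", "T/C"),
}


def _normalize(genotype):
    """Turn 'c/T' or 'T / C' into an order-independent tuple ('C', 'T')."""
    alleles = genotype.upper().replace(" ", "").split("/")
    return tuple(sorted(alleles))


def has_indicative_genotype(seq_id, call):
    """True if the call matches either listed genotype for this SEQ ID NO.

    Assumption: allele order within a call carries no meaning.
    """
    listed = {_normalize(g) for g in INDICATIVE[seq_id]}
    return _normalize(call) in listed


if __name__ == "__main__":
    print(has_indicative_genotype(1, "T/C"))   # heterozygous form for SEQ ID NO: 1
    print(has_indicative_genotype(1, "C/C"))   # not listed for SEQ ID NO: 1
```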
17. The method of claim 15, wherein the one or more genetic markers comprise:
(a) SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 97, SEQ ID NO: 98, and/or SEQ ID NO: 99;
(b) SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 70, and/or SEQ ID NO: 96;
(c) SEQ ID NO: 20, SEQ ID NO: 30, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, and/or SEQ ID NO: 61;
(d) SEQ ID NO: 48, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 93, SEQ ID NO: 94, and/or SEQ ID NO: 95;
(e) SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 21, SEQ ID NO: 24, SEQ ID NO: 42, SEQ ID NO: 46, SEQ ID NO: 51, SEQ ID NO: 53, and/or SEQ ID NO: 57;
(f) SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 21, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 32, SEQ ID NO: 51, and/or SEQ ID NO: 58;
(g) SEQ ID NO: 19, SEQ ID NO: 30, SEQ ID NO: 39, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 52, SEQ ID NO: 61, SEQ ID NO: 65, SEQ ID NO: 66, and/or SEQ ID NO: 67;
(h) SEQ ID NO: 14, SEQ ID NO: 23, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 31, SEQ ID NO: 34, SEQ ID NO: 41, SEQ ID NO: 50, SEQ ID NO: 54, and/or SEQ ID NO: 55;
(i) SEQ ID NO: 28, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 71, and/or SEQ ID NO: 72;
(j) SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 62, and/or SEQ ID NO: 73;
(k) SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, and/or SEQ ID NO: 10;
(l) SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 89, SEQ ID NO: 92, and/or SEQ ID NO: 93; or
(m) SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 90, and/or SEQ ID NO: 91.
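The groups of claim 17 overlap, so a single genotyped marker can count toward more than one group. The sketch below is illustrative only and not part of the claims: it tallies, for a set of genotyped SEQ ID NOs, how many fall in each of groups (a), (b), and (k) as recited above. The names `SEQ_GROUPS` and `coverage` are hypothetical.

```python
# Illustrative subset of claim 17: groups (a), (b), and (k) by SEQ ID NO.
SEQ_GROUPS = {
    "a": {11, 13, 97, 98, 99},
    "b": {11, 12, 13, 59, 60, 70, 96},
    "k": set(range(1, 11)),  # SEQ ID NOs: 1-10
}


def coverage(genotyped):
    """Count how many of the genotyped SEQ ID NOs belong to each group."""
    genotyped = set(genotyped)
    return {name: len(members & genotyped) for name, members in SEQ_GROUPS.items()}


if __name__ == "__main__":
    # SEQ ID NOs: 11 and 13 are shared by (a) and (b); 97 is only in (a).
    print(coverage({11, 13, 97}))
```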
18. The method of claim 1 or claim 2, wherein analyzing one or more genetic markers comprises analyzing 2 to 10 genetic markers.
19. The method of claim 1 or claim 2, wherein the one or more genetic markers are genetically linked to a terpene trait locus.
20. The method of claim 1 or claim 2, wherein the modified terpene content comprises modified total terpenes, total monoterpenes, beta-myrcene, total sesquiterpenes, alpha-pinene, beta-pinene, alpha-terpinene, gamma-terpinene, terpinolene, beta-ocimene, camphene, D-limonene, linalool, trans-nerolidol, alpha-humulene, beta-caryophyllene, and/or guaiol levels.
21. The method of claim 1 or claim 2, wherein the modified terpene content is increased terpene content relative to the control.
22. The method of claim 1, wherein the control is a Cannabis plant without the one or more markers that indicate modified terpene content.
23. The method of claim 1 or claim 2, wherein the terpene content is modified in flower and/or leaf tissue.
24. A Cannabis plant produced by the method of claim 8.
25. A seed, plant part, tissue culture, or protoplast of the plant of claim 24.
26. A method of Cannabis breeding, comprising crossing the Cannabis plant of claim 24.
27. The method of claim 26, wherein crossing comprises selfing, sibling crossing, outcrossing, or backcrossing.
28. A Cannabis product produced from the plant of claim 24.
29. The Cannabis product of claim 28, wherein the product is a kief, hashish, bubble hash, an edible product, solvent reduced oil, sludge, e-juice, or tincture.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363611288P | 2023-12-18 | 2023-12-18 | |
| US63/611,288 | 2023-12-18 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025137000A1 (en) | 2025-06-26 |
Family
ID=96138788
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2024/060604 (Pending; WO2025137000A1 (en)) | 2023-12-18 | 2024-12-17 | Terpene genetic markers |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025137000A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190230882A1 (en) * | 2016-05-27 | 2019-08-01 | New West Genetics | Industrial hemp cannabis cultivars and seeds with stable cannabinoid profiles |
| US20200316015A1 (en) * | 2017-02-07 | 2020-10-08 | Elevate Technologies Llc | Terpene-based compositions, methods of preparations and uses thereof |
| WO2022165507A1 (en) * | 2021-01-28 | 2022-08-04 | Central Coast Agriculture, Inc. | Marker-assisted breeding in cannabis plants |
| US20230002779A1 (en) * | 2020-06-29 | 2023-01-05 | Front Range Biosciences, Inc. | Characterization of plant cultivars based on terpene synthase gene profiles |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24908776; Country of ref document: EP; Kind code of ref document: A1 |