WO2025030093A1 - Zygote-preferred expression - Google Patents
- Publication number
- WO2025030093A1 (PCT/US2024/040716)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- promoter
- plant
- length
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
Definitions
- ZYGOTE-PREFERRED EXPRESSION FIELD This disclosure is related to the field of plant biotechnology, specifically agricultural biotechnology and gene editing, as well as plant breeding.
- the presently disclosed subject matter relates to transforming a haploid inducer line so that it contains DNA coding for cellular machinery capable of editing genes, which is preferentially expressed in zygote cells.
- CLAIM FOR PRIORITY This application claims priority to application serial no. PCT/CN2023/110941, filed August 3, 2023, which is incorporated by reference in its entirety.
- zygote-preferred promoters, such as a Vacuolar Sorting Protein (VSP) promoter, a Ring Zinc-Finger Domain Protein (RZDP) promoter, or a SUMO-conjugating enzyme 1 (SCE1) promoter, that preferentially drive expression of a transcript from a nucleic acid operably linked to the promoter in early zygotes and male gametes.
- zygote-preferred promoters are employed in nucleic acid constructs to drive expression of gene editing system components that are employed in gene editing procedures, such as HI-Edit, that exploit haploid induction to provide a plant with a desired edit.
- the disclosure provides a synthetic DNA construct comprising a zygote-preferred promoter operably linked to a first nucleotide sequence of interest (“NSOI”).
- the zygote-preferred promoter is a vacuolar sorting protein promoter (“VSP promoter”).
- VSP promoter comprises a sequence: a) selected from the group consisting of SEQ ID NOs: 1–27 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 1.
- the VSP promoter comprises SEQ ID NO:27.
- the zygote-preferred promoter is a ring zinc-finger domain protein promoter (“RZDP promoter”).
- the RZDP promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 28–43 or a functional fragment thereof; or (b) an orthologous promoter to SEQ ID NO: 28.
- the RZDP promoter comprises SEQ ID NO: 28.
- the zygote-preferred promoter is a SUMO-conjugating enzyme 1 promoter (“SCE1 promoter”).
- the SCE1 promoter comprises a sequence a) selected from the group consisting of SEQ ID NOs: 83–85 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 83.
- the SCE1 promoter comprises SEQ ID NO: 84. In some instances, the SCE1 promoter comprises SEQ ID NO: 85.
- the synthetic DNA construct further comprises a U3 promoter operably linked to a second NSOI. In some instances, the U3 promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 44–49 or a functional fragment thereof; or (b) an orthologous promoter to SEQ ID NO: 44. In some embodiments, the U3 promoter comprises SEQ ID NO: 45. In some embodiments, the synthetic DNA construct further comprises a terminator operably linked to the NSOI. In some instances, the terminator is a ubiquitin terminator.
- the ubiquitin terminator comprises SEQ ID NO: 88, 89, 90, 91, 92, or 93.
- the first NSOI comprises a sequence encoding a nuclease.
- the nuclease is a zinc finger nuclease (“ZFN”), a meganuclease (“MN”), a transcription activator-like effector nuclease (TALEN), or a CRISPR nuclease.
- the CRISPR nuclease is selected from the group consisting of Cas9, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12i, Cas12j, Cas12l, Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas10, Cas11, Cas13a, Cas13b, Cas13c, Cas13d, Cas14, any other CRISPR-Cas nuclease, any mutant thereof, a nickase variant, and a deactivated variant.
- the CRISPR nuclease further comprises a fusion domain, e.g., selected from the group consisting of a deaminase, a uracil DNA glycosylase, a reverse transcriptase, and an exonuclease.
- the second NSOI of the synthetic DNA construct comprises a sequence encoding for at least one guide RNA.
- the at least one guide RNA is encoded by a sequence selected from the group consisting of SEQ ID NOs: 50–61.
- the synthetic DNA construct comprises a sequence selected from the group consisting of SEQ ID NOs: 62–72.
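The construct layout described above (a zygote-preferred promoter operably linked to a first NSOI, optionally with a U3 promoter operably linked to a second NSOI and a terminator) can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and the particular pairing of elements in the example are hypothetical and are not taken from the disclosure, which identifies constructs by SEQ ID NO rather than by any software representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """One promoter operably linked to one nucleotide sequence of interest (NSOI)."""
    promoter: str                      # label only, e.g. "VSP promoter (SEQ ID NO: 27)"
    nsoi: str                          # what the cassette expresses
    terminator: Optional[str] = None   # optional terminator label


@dataclass(frozen=True)
class SyntheticConstruct:
    """Hypothetical container: a first cassette and an optional second cassette."""
    first: ExpressionCassette
    second: Optional[ExpressionCassette] = None

    def describe(self) -> str:
        # Render each cassette as "promoter -> NSOI [-> terminator]".
        parts = []
        for cassette in (self.first, self.second):
            if cassette is None:
                continue
            elements = [cassette.promoter, cassette.nsoi]
            if cassette.terminator:
                elements.append(cassette.terminator)
            parts.append(" -> ".join(elements))
        return " | ".join(parts)


# Illustrative combination only; the pairing below is an assumption.
construct = SyntheticConstruct(
    first=ExpressionCassette(
        promoter="VSP promoter (SEQ ID NO: 27)",
        nsoi="CRISPR nuclease coding sequence",
        terminator="ubiquitin terminator",
    ),
    second=ExpressionCassette(
        promoter="U3 promoter (SEQ ID NO: 45)",
        nsoi="guide RNA coding sequence",
    ),
)
print(construct.describe())
```

Keeping the two cassettes separate mirrors the text's distinction between the first NSOI (for example, a nuclease) and the second NSOI (for example, a guide RNA).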
- the disclosure provides a plant cell comprising a synthetic DNA construct as described herein, e.g., in the preceding paragraph.
- the plant cell is a pollen cell or an egg cell.
- the disclosure provides a plant, e.g., a maize plant, comprising the plant cell.
- the disclosure provides a method of obtaining an edited progeny plant, comprising: (a) providing a first plant, wherein the first plant is transformed to comprise the synthetic construct of claims 1-18; (b) pollinating a second plant; and (c) selecting at least one progeny produced by the pollination of step (b), wherein the progeny possesses an edit; thereby obtaining an edited progeny plant.
- the first plant is a haploid inducer line of the plant.
- the haploid inducer line is a paternal haploid inducer, e.g., a paternal haploid inducer line comprising a mutation in a CENH3 gene.
- the haploid inducer line is a maternal haploid inducer, e.g., a maternal haploid inducer line comprising a mutation in a MATL gene.
- the second plant comprises plant genomic DNA to be edited.
- the edited progeny plant is a haploid progeny plant.
- the haploid progeny plant comprises the genome of the second plant but not the first plant.
- the haploid progeny plant is a maize haploid progeny plant.
- the phrase “A, B, C, and/or D” includes A, B, C, and D individually, but also includes any and all combinations and subcombinations of A, B, C, and D (e.g., AB, AC, AD, BC, BD, CD, ABC, ABD, and BCD).
- one or more of the elements to which the “and/or” refers can also individually be present in single or multiple occurrences in the combination(s) and/or subcombination(s).
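The reading of "and/or" given above can be made concrete by enumerating every non-empty combination of the listed elements. This is a minimal sketch with a hypothetical function name; it treats each element as present at most once and so does not model the "multiple occurrences" case also mentioned in the text.

```python
from itertools import combinations


def and_or_combinations(items):
    """Return every non-empty combination of items, shortest first."""
    result = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            result.append("".join(combo))
    return result


combos = and_or_combinations(["A", "B", "C", "D"])
print(combos)
print(len(combos))  # 15 non-empty combinations of four elements
```

The examples listed in the text (AB, AC, AD, BC, BD, CD, ABC, ABD, BCD) all appear in this enumeration, alongside the four individual elements, ACD, and ABCD.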
- plant as used herein can refer to a whole plant, or any part or component of a plant at any stage of development; and includes reference to a cell or tissue culture derived from a plant.
- plant can refer to components or organs, e.g., leaves, stems, roots, plant tissues, seeds and/or plant cells.
- plant cell refers to a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in form of an isolated single cell or a cultured cell, or as a part of higher organized unit such as, for example, plant tissue, a plant organ, or a whole plant.
- the plant cell may be derived from or part of an angiosperm or gymnosperm.
- the plant cell may be a monocotyledonous plant cell (e.g., a maize cell, a rice cell, a sorghum cell, a sugarcane cell, a barley cell, a wheat cell, an oat cell, a turf grass cell, or an ornamental grass cell) or a dicotyledonous plant cell (e.g., a tobacco cell, a pepper cell, an eggplant cell, a sunflower cell, a crucifer cell, a flax cell, a potato cell, a cotton cell, a soybean cell, a sugar beet cell, or an oilseed rape cell).
- plant cell culture refers to cultures of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development.
- plant tissue refers to a group of plant cells organized into a structural and functional unit. Any tissue of a plant in planta or in culture is included. This term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture and any group of plant cells organized into structural and/or functional units. The use of this term in conjunction with, or in the absence of, any specific type of plant tissue as listed above or otherwise embraced by this definition is not intended to be exclusive of any other type of plant tissue.
- plant part refers to a part of a plant, including single cells and cell tissues such as plant cells that are intact in plants, cell clumps and tissue cultures from which plants can be regenerated.
- plant parts include, but are not limited to, single cells and tissues from pollen, ovules, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, and seeds; as well as pollen, ovules, egg cells, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, scions, rootstocks, seeds, protoplasts, calli, and the like.
- “progeny” and “progeny plant” refer to a plant generated from vegetative or sexual reproduction from one or more parent plants.
- the term “progeny” can refer to any descendant of a particular cross or parent plant.
- progeny plants result from the breeding of two individuals, although some species (particularly some plants and hermaphroditic animals) can be selfed (i.e., the same plant acts as the donor of both male and female gametes).
- the descendant(s) can be, for example, of the F1, the F2, or any subsequent generation.
- progeny plants result from a HI-Edit method.
- Haploid induction is a class of plant phenomena characterized by loss of one parent's set of chromosomes (the chromosomes from the haploid inducer parent) from the embryo at some time during or after fertilization, often during early embryo development. Haploid induction is also known as gynogenesis if the inducer line is used as the male in the cross, or androgenesis if the inducer line is used as the female in the cross. Haploid induction has been observed in numerous plant species, such as sorghum, barley, wheat, maize, Arabidopsis, and many other species.
- the parent lines used in the induction cross are both diploids, so their gametes (egg cells and sperm cells) are haploids.
- Haploid induction is frequently a medium to low penetrance trait of the inducer line, so the resulting progeny, depending on the species or situation, may be either diploid (if no genome loss takes place) or haploids (if genome loss does indeed take place).
- the "haploid" progeny produced will have a gametic chromosome number, e.g., diploids (if the parent is tetraploid) or triploids (if the parent is hexaploid).
- haploids possess half the number of chromosomes of either parent; thus haploids of diploid organisms (e.g., maize) exhibit monoploidy; haploids of tetraploid organisms (e.g., ryegrasses) exhibit diploidy; haploids of hexaploid organisms (e.g., wheat) exhibit triploidy.
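The ploidy arithmetic in the preceding passages reduces to halving the parental chromosome-set count. A minimal sketch, with a hypothetical function name, that reproduces the three examples given in the text:

```python
def haploid_ploidy(parent_ploidy: int) -> int:
    """Return the number of chromosome sets in a 'haploid' (gametic) progeny.

    parent_ploidy is the number of chromosome sets in the parent,
    e.g. 2 for a diploid, 4 for a tetraploid, 6 for a hexaploid.
    """
    if parent_ploidy <= 0 or parent_ploidy % 2 != 0:
        raise ValueError("parent ploidy must be a positive even number")
    return parent_ploidy // 2


for name, ploidy in [("maize (diploid)", 2),
                     ("ryegrass (tetraploid)", 4),
                     ("wheat (hexaploid)", 6)]:
    print(name, "->", haploid_ploidy(ploidy), "chromosome set(s)")
```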
- HI-Edit refers to “haploid-induction editing” of a genome, which employs a haploid-inducer plant modified to express gene editing machinery to deliver the editing machinery to the genome to be edited of a recipient plant.
- the first plant is crossed with the second plant to obtain haploid progeny in which the chromosomes of the haploid inducer line are eliminated and the haploid chromosomes of the recipient plant have the desired edit.
- HI-Edit methods are detailed in PCT publication WO2018/102816. See also, Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292.
- a plant referred to here as a "doubled haploid" is generated by doubling the haploid set of chromosomes.
- a plant or seed that is obtained from a doubled haploid plant that is selfed to any number of generations may still be identified as a doubled haploid plant.
- a doubled haploid plant is considered a homozygous plant.
- a plant is considered to be doubled haploid if it is fertile, even if the entire vegetative part of the plant does not consist of the cells with the doubled set of chromosomes; that is, a plant will be considered doubled haploid if it contains viable gametes, even if it is chimeric in vegetative tissues.
- a “promoter” refers to a polynucleotide sequence comprising sequences upstream from a 5’ transcription start site that controls transcription of a gene or sequence to which it is operably linked.
- a promoter includes signals for RNA polymerase binding and transcription initiation.
- a promoter may also contain regulatory elements such as enhancers, repressor binding sites, and the like.
- a nucleotide sequence for a promoter may comprise a region encoding 5’ untranslated sequence of a transcript of the native gene from which the promoter is derived.
- a VSP promoter sequence of any one of SEQ ID NOS:1-27, or a functional fragment thereof, may comprise a region encoding 5’ untranslated sequences of a VSP transcript; or an RZDP promoter sequence of any one of SEQ ID NOS:28-43, or a functional fragment thereof, may comprise a region encoding 5’ untranslated sequences of an RZDP transcript.
- the promoter does not comprise a region encoding 5’ untranslated sequences.
- a promoter may be a U3 promoter sequence of any one of SEQ ID NOS:44-49, or a functional fragment thereof. In some embodiments, such a promoter does not comprise untranscribed sequences.
- a “nucleotide sequence of interest” refers to a polynucleotide to be expressed by a synthetic construct comprising the polynucleotide operably linked to a promoter to control transcription of the polynucleotide.
- a transcript produced by the synthetic construct may be a protein-encoding RNA, e.g., that encodes a nuclease, or a non-protein-coding transcript, such as a guide RNA.
- an “endogenous” or “native” nucleic acid sequence refers to a nucleic acid sequence that occurs naturally in the genome of an organism.
- a “gene” is a defined region located within a genome that, in addition to coding nucleic acid sequence, comprises other sequences, primarily regulatory sequences responsible for the control of expression, that is to say the transcription and, where a protein is produced by the gene, translation of the coding portion. Genes can include both coding and non-coding regions (e.g., introns, regulatory elements, promoters, enhancers, termination sequences and 5' and 3' untranslated regions). A gene typically expresses mRNA, functional RNA, or a specific protein, including regulatory sequences. Genes may or may not be capable of being used to produce a functional protein. In some embodiments, a gene refers to only the coding region.
- a gene refers to a gene as found in nature.
- chimeric gene refers to any gene that contains 1) DNA sequences, including regulatory and coding sequences that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources or comprise regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature.
- a gene may be “isolated” by which is meant a nucleic acid molecule that is substantially or essentially free from components normally found in association with the nucleic acid molecule in its natural state.
- nucleic acid and “polynucleotide” are used interchangeably and as used herein refer to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form, as well as to both sense and anti-sense strands of RNA, cDNA, genomic DNA, mitochondrial DNA, and synthetic forms and mixed polymers of the above.
- DNA is the genetic material while RNA is involved in the transfer of information contained within DNA into proteins.
- a “genome” is the entire body of genetic material contained in each cell of an organism. It is understood that when an RNA is described, its corresponding cDNA is also described, wherein uridine is represented as thymidine.
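The convention that a described RNA also describes its corresponding cDNA, with uridine written as thymidine, amounts to a one-letter substitution in sequence notation. The sketch below takes that literal reading; the function name and the example sequence are hypothetical.

```python
def rna_to_dna_notation(rna: str) -> str:
    """Rewrite an RNA sequence in DNA notation by representing U as T.

    Case is preserved; all other characters are left unchanged.
    """
    return rna.replace("U", "T").replace("u", "t")


example_rna = "AUGGCUUAA"  # made-up sequence, for illustration only
print(rna_to_dna_notation(example_rna))  # ATGGCTTAA
```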
- a nucleotide refers to a ribonucleotide, deoxynucleotide or a modified form of either type of nucleotide, or combinations thereof.
- a polynucleotide disclosed herein may include either or both naturally occurring and modified nucleotides linked together by naturally occurring and/or non-naturally occurring nucleotide linkages.
- the nucleic acid molecules may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analogue, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, and the like), charged linkages (e.g., phosphorothioates, phosphorodithioates, and the like), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, and the like), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, and the like).
- a reference to a nucleic acid sequence encompasses its complement unless otherwise specified. Thus, a reference to a nucleic acid molecule having a particular sequence should be understood to encompass its complementary strand, with its complementary sequence. Nucleotide sequences are “complementary” when they specifically hybridize in solution (e.g., according to Watson-Crick base pairing rules). The term also includes codon-optimized nucleic acids that encode the same polypeptide sequence.
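Because a reference to a sequence is said to encompass its complementary strand, it can help to see how that strand is derived under Watson-Crick pairing. This is a minimal sketch for unambiguous DNA bases only (A, C, G, T); the function name and example are hypothetical, and ambiguity codes or modified bases are not handled.

```python
# Translation table for Watson-Crick pairs: A<->T and C<->G (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check on any complement routine.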
- nucleic acids can be unpurified, purified, or attached, for example, to a synthetic material such as a bead or column matrix.
- reference sequence in the context of a nucleic acid sequence refers to a defined nucleotide sequence used as a basis for nucleotide sequence comparison.
- a reference sequence can be a promoter sequence of any one of SEQ ID NOS:1-27, SEQ ID NOS:28-43, or SEQ ID NOS:44-49; a sequence of any one of SEQ ID NOS:50-61 that encodes a guide RNA; or a nucleic acid sequence of any one of SEQ ID NOS:62-72 that encodes a construct.
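The SEQ ID NO ranges named in this section can be collected into a small lookup. The sketch below is only a reading aid: the function and table names are hypothetical, the SCE1 promoter and ubiquitin terminator ranges are taken from the construct description earlier in this section, and identifiers outside the listed ranges return `None` because this section does not describe them.

```python
from typing import Optional

# (first, last, label) ranges as stated in the text of this section.
SEQ_ID_RANGES = [
    (1, 27, "VSP promoter"),
    (28, 43, "RZDP promoter"),
    (44, 49, "U3 promoter"),
    (50, 61, "guide RNA coding sequence"),
    (62, 72, "synthetic DNA construct"),
    (83, 85, "SCE1 promoter"),
    (88, 93, "ubiquitin terminator"),
]


def classify_seq_id(seq_id: int) -> Optional[str]:
    """Return the element type for a SEQ ID NO, or None if not listed here."""
    for first, last, label in SEQ_ID_RANGES:
        if first <= seq_id <= last:
            return label
    return None


for n in (27, 28, 45, 84, 75):
    print(n, "->", classify_seq_id(n))
```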
- the term “corresponding to” in the context of nucleic acid sequences as used in the present disclosure refers to certain positions, or certain regions of a nucleotide sequence of interest that align with these positions or regions of a reference sequence when the two sequences are optimally aligned, but that are not necessarily in these exact numerical positions of the two sequences.
- examples of sequence comparison and multiple sequence alignment algorithms are, respectively, the Basic Local Alignment Search Tool (BLAST) and ClustalW/ClustalW2/Clustal Omega programs available on the Internet (e.g., the website of the EMBL-EBI).
- Other suitable programs include, but are not limited to, GAP, BestFit, Plot Similarity, and FASTA, which are part of the Accelrys GCG Package available from Accelrys, Inc. of San Diego, Calif., United States of America.
- a percentage of sequence identity refers to sequence identity over the full length of a nucleic acid or polypeptide sequence.
- nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions in sequences that encode proteins), alleles, SNPs, and complementary sequences as well as the sequence explicitly indicated.
- polypeptide e.g., a polymer of amino acid residues.
- the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
- identity refers to a sequence that has at least 60% sequence identity to a reference sequence.
- percent identity can be any integer from 60% to 100%.
- Exemplary embodiments include at least: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, as compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below.
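Percent identity over the full length of an alignment, and the 60% floor described above, can be illustrated on a pair of sequences that have already been aligned. This is a sketch under stated assumptions: identity is counted as identical non-gap columns divided by the total number of alignment columns, which is one common convention and not necessarily the one applied by any given program; the function names and sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues.

    Both inputs must be the same length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the pair reaches the given percent identity (default 60%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


pid = percent_identity("ACGT-A", "ACGTTA")  # 5 matching columns out of 6
print(round(pid, 1), meets_threshold("ACGT-A", "ACGTTA"))
```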
- For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared.
- When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated.
- The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
- a “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
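A comparison window, as defined above, is a run of contiguous aligned positions over which two sequences are compared. The sketch below slides a fixed-size window along a pre-aligned pair and reports percent identity in each window. The function name is hypothetical, the 20 to 600 bounds are taken from the definition, and counting gap columns as mismatches is an assumption.

```python
def window_identities(aligned_a: str, aligned_b: str, window: int = 20):
    """Percent identity for each window of contiguous alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not 20 <= window <= 600:
        raise ValueError("window must be between 20 and 600 positions")
    results = []
    for start in range(0, len(aligned_a) - window + 1):
        seg_a = aligned_a[start:start + window]
        seg_b = aligned_b[start:start + window]
        matches = sum(1 for x, y in zip(seg_a, seg_b) if x == y and x != "-")
        results.append(100.0 * matches / window)
    return results


reference = "ACGT" * 6                 # 24 positions, made-up sequence
test_seq = reference[:23] + "A"        # differs only at the last position
print(window_identities(reference, test_seq, window=20))
```

An alignment shorter than the window yields an empty list, since no full window fits.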
- Methods of alignment of sequences for comparison are well-known in the art.
- Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman Proc. Natl.
- Biol. 48(3):443-453 as implemented by the "needle" program, distributed as part of the EMBOSS software package (Rice, P., Longden, I., and Bleasby, A., EMBOSS: The European Molecular Biology Open Software Suite, 2000, Trends in Genetics 16, (6) pp276-277, version 6.3.1 available from EMBnet at embnet.org/resource/emboss and emboss.sourceforge.net, among other sources) using default gap penalties and scoring matrices (EBLOSUM62 for protein and EDNAFULL for DNA). Equivalent programs may also be used.
- an “equivalent program” is any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by needle from EMBOSS version 6.3.1.
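For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched in a few lines. This is a teaching sketch, not the EMBOSS needle program and not an "equivalent program" in the sense defined above: it uses a simple match/mismatch score and a linear gap penalty chosen for illustration, whereas needle uses scoring matrices such as EDNAFULL with separate gap opening and extension penalties. The function name and parameters are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (aligned_a, aligned_b, score).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "ACT")
print(aligned_a)
print(aligned_b)
print(total)
```

The aligned strings returned here are the kind of input from which a percent identity over the full alignment length would then be computed.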
- Additional mathematical algorithms are known in the art and can be utilized for the comparison of two sequences. See, for example, the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the BLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403.
- BLAST nucleotide searches can be performed with the BLASTN program (nucleotide query searched against nucleotide sequences) to obtain nucleotide sequences homologous to nucleic acid molecules of the invention, or with the BLASTX program (translated nucleotide query searched against protein sequences) to obtain protein sequences homologous to nucleic acid molecules of the invention.
- BLAST protein searches can be performed with the BLASTP program (protein query searched against protein sequences) to obtain amino acid sequences homologous to protein molecules of the invention, or with the TBLASTN program (protein query searched against translated nucleotide sequences) to obtain nucleotide sequences homologous to protein molecules of the invention.
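The four BLAST programs named above differ only in the type of query and the type of sequences searched. A minimal lookup sketch, using hypothetical type labels and a hypothetical function name, restates that pairing:

```python
# (query type, searched sequence type) -> BLAST program, as described in the text.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "BLASTN",
    ("translated nucleotide", "protein"): "BLASTX",
    ("protein", "protein"): "BLASTP",
    ("protein", "translated nucleotide"): "TBLASTN",
}


def choose_blast_program(query_type: str, subject_type: str) -> str:
    """Return the program name for a query/subject pairing listed above."""
    try:
        return BLAST_PROGRAMS[(query_type, subject_type)]
    except KeyError:
        raise ValueError(
            f"no program listed for {query_type!r} vs {subject_type!r}"
        ) from None


print(choose_blast_program("translated nucleotide", "protein"))  # BLASTX
```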
- Gapped BLAST in BLAST 2.0
- PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra.
- the default parameters of the respective programs e.g., BLASTX and BLASTN
- Alignment may also be performed manually by inspection.
- an “isolated” nucleic acid molecule is a nucleic acid molecule or nucleotide sequence that is not immediately contiguous with nucleotide sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived. Accordingly, in one embodiment, an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to a coding sequence.
- the term therefore includes, for example, a recombinant nucleic acid that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment), independent of other sequences. It also includes a recombinant nucleic acid that is part of a hybrid nucleic acid molecule encoding an additional RNA or polypeptide sequence.
- An “isolated” nucleic acid molecule can also include a polynucleotide derived from and inserted into the same natural, original cell type, but which is present in a non-natural state, e.g., present in a different copy number, and/or under the control of different regulatory sequences than that found in the native state of the nucleic acid molecule. “Isolated” does not necessarily mean that the preparation is technically pure (homogeneous), but it is sufficiently pure to provide the nucleic acid in a form in which it can be used for the intended purpose.
- SEQ ID NO: 1 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 2 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 3 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 4 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 5 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 6 is the nucleotide sequence of a VSP promoter from Arabidopsis.
- SEQ ID NO: 7 is the nucleotide sequence of a VSP promoter from sunflower.
- SEQ ID NO: 8 is the nucleotide sequence of a VSP promoter from sunflower.
- SEQ ID NO: 9 is the nucleotide sequence of a VSP promoter from sunflower.
- SEQ ID NO: 10 is the nucleotide sequence of a VSP promoter from sunflower.
- SEQ ID NO: 11 is the nucleotide sequence of a VSP promoter from rice.
- SEQ ID NO: 12 is the nucleotide sequence of a VSP promoter from rice.
- SEQ ID NO: 13 is the nucleotide sequence of a VSP promoter from rice.
- SEQ ID NO: 14 is the nucleotide sequence of a VSP promoter from rice.
- SEQ ID NO: 15 is the nucleotide sequence of a VSP promoter from tomato.
- SEQ ID NO: 16 is the nucleotide sequence of a VSP promoter from tomato.
- SEQ ID NO: 17 is the nucleotide sequence of a VSP promoter from tomato.
- SEQ ID NO: 18 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 19 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 20 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 21 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 22 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 23 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 24 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 25 is the nucleotide sequence of a VSP promoter from soybean.
- SEQ ID NO: 26 is the nucleotide sequence of a VSP promoter from maize.
- SEQ ID NO: 27 is the nucleotide sequence of a VSP promoter from maize.
- SEQ ID NO: 28 is the nucleotide sequence of a RZDP promoter from maize.
- SEQ ID NO: 29 is the nucleotide sequence of a RZDP promoter from Arabidopsis.
- SEQ ID NO: 30 is the nucleotide sequence of a RZDP promoter from Arabidopsis.
- SEQ ID NO: 31 is the nucleotide sequence of a RZDP promoter from rice.
- SEQ ID NO: 32 is the nucleotide sequence of a RZDP promoter from rice.
- SEQ ID NO: 33 is the nucleotide sequence of a RZDP promoter from tomato.
- SEQ ID NO: 34 is the nucleotide sequence of a RZDP promoter from tomato.
- SEQ ID NO: 35 is the nucleotide sequence of a RZDP promoter from tomato.
- SEQ ID NO: 36 is the nucleotide sequence of a RZDP promoter from tomato.
- SEQ ID NO: 37 is the nucleotide sequence of a RZDP promoter from soybean.
- SEQ ID NO: 38 is the nucleotide sequence of a RZDP promoter from soybean.
- SEQ ID NO: 39 is the nucleotide sequence of a RZDP promoter from soybean.
- SEQ ID NO: 40 is the nucleotide sequence of a RZDP promoter from soybean.
- SEQ ID NO: 41 is the nucleotide sequence of a RZDP promoter from sunflower.
- SEQ ID NO: 42 is the nucleotide sequence of a RZDP promoter from sunflower.
- SEQ ID NO: 43 is the nucleotide sequence of a RZDP promoter from sunflower.
- SEQ ID NO: 44 is the nucleotide sequence of a U3 promoter from Arabidopsis.
- SEQ ID NO: 45 is the nucleotide sequence of a U3 promoter from rice (prOsU3-01).
- SEQ ID NO: 46 is the nucleotide sequence of a U3 promoter from rice (prOsU3-02).
- SEQ ID NO: 47 is the nucleotide sequence of a U3 promoter from rice (prOsU3-03).
- SEQ ID NO: 48 is the nucleotide sequence of a U3 promoter from wheat.
- SEQ ID NO: 49 is the nucleotide sequence of a U3 promoter from maize.
- SEQ ID NO: 50 is the nucleotide sequence encoding a guide RNA targeting VLHP1-1 & VLHP1-2.
- SEQ ID NO: 51 is the nucleotide sequence encoding a guide RNA targeting VLHP1-1 & VLHP1-2.
- SEQ ID NO: 52 is the nucleotide sequence encoding a guide RNA targeting GW2-1 & GW2-2.
- SEQ ID NO: 53 is the nucleotide sequence encoding a guide RNA targeting SBEIIb.
- SEQ ID NO: 54 is the nucleotide sequence encoding a guide RNA targeting GL2.
- SEQ ID NO: 55 is the nucleotide sequence encoding a guide RNA targeting Waxy1.
- SEQ ID NO: 56 is the nucleotide sequence encoding a guide RNA targeting O2 first exon.
- SEQ ID NO: 57 is the nucleotide sequence encoding a guide RNA targeting O2 second exon.
- SEQ ID NO: 58 is the nucleotide sequence encoding a guide RNA targeting O2 third exon.
- SEQ ID NO: 59 is the nucleotide sequence encoding a guide RNA targeting YellowEndosperm.
- SEQ ID NO: 60 is the nucleotide sequence encoding a guide RNA targeting UBL.
- SEQ ID NO: 61 is the nucleotide sequence encoding a guide RNA targeting UPL3.
- SEQ ID NO: 62 is the nucleotide sequence encoding construct 23396.
- SEQ ID NO: 63 is the nucleotide sequence encoding construct 23397.
- SEQ ID NO: 64 is the nucleotide sequence encoding construct 23399.
- SEQ ID NO: 65 is the nucleotide sequence encoding construct 24520.
- SEQ ID NO: 66 is the nucleotide sequence encoding construct 26258.
- SEQ ID NO: 67 is the nucleotide sequence encoding construct 26296.
- SEQ ID NO: 68 is the nucleotide sequence encoding construct 27145.
- SEQ ID NO: 69 is the nucleotide sequence encoding construct 27146.
- SEQ ID NO: 70 is the nucleotide sequence encoding construct 27226.
- SEQ ID NO: 71 is the nucleotide sequence encoding construct 27234.
- SEQ ID NO: 72 is the nucleotide sequence encoding construct 27241.
- SEQ ID NO: 73 is the nucleotide sequence encoding construct 27680.
- SEQ ID NO: 74 is the nucleotide sequence encoding construct 28255.
- SEQ ID NO: 75 is the nucleotide sequence encoding construct 28291.
- SEQ ID NO: 76 is the nucleotide sequence encoding construct 28292.
- SEQ ID NO: 77 is the nucleotide sequence encoding construct 28293.
- SEQ ID NO: 78 is the nucleotide sequence encoding construct 28510.
- SEQ ID NO: 79 is the nucleotide sequence encoding construct 28520.
- SEQ ID NO: 80 is the nucleotide sequence encoding construct 28560.
- SEQ ID NO: 81 is the nucleotide sequence encoding construct 28825.
- SEQ ID NO: 82 is the nucleotide sequence encoding construct 28834.
- SEQ ID NO: 83 is the nucleotide sequence of a SCE1 promoter from maize.
- SEQ ID NO: 84 is the nucleotide sequence of a SCE1 promoter from maize.
- SEQ ID NO: 85 is the nucleotide sequence of a SCE1 promoter from maize.
- SEQ ID NO: 86 is the nucleotide sequence of a VSP-01 promoter from maize.
- SEQ ID NO: 87 is the nucleotide sequence of a VSP-02 promoter from maize.
- SEQ ID NO: 88 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-04”) from maize.
- SEQ ID NO: 89 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-06”) from maize.
- SEQ ID NO: 90 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-09”) from maize gene GRMZM2G409726.
- SEQ ID NO: 91 is the nucleotide sequence of a ubiquitin terminator from sorghum bicolor.
- SEQ ID NO: 92 is the nucleotide sequence of a ubiquitin terminator from Medicago truncatula.
- SEQ ID NO: 93 is the nucleotide sequence of a ubiquitin terminator from Glycine max.
- DETAILED DESCRIPTION
- the present disclosure features nucleic acid constructs comprising zygote-preferred promoters, such as a Vacuolar Sorting Protein promoter (VSP) or Ring Zinc-Finger Domain Protein promoter (RZDP) or SUMO-conjugating enzyme (SCE1) that preferentially drive expression of a transcript from a nucleic acid operably linked to the promoter in early zygotes and male gametes.
- VSP Vacuolar Sorting Protein promoter
- RZDP Ring Zinc-Finger Domain Protein promoter
- SCE1 SUMO-conjugating enzyme
- zygote-preferred promoters are employed in nucleic acid constructs to drive expression of gene editing system components that are employed in gene editing procedures, such as HI-Edit, that exploit haploid induction to provide a plant with a desired edit.
- a zygote-preferred promoter causes its downstream sequence to be preferentially expressed in the zygote.
- a zygote-preferred promoter of the present disclosure provides a zygote-edit rate of at least 10%. Zygote-edit rate correlates with HI-edit efficiency; and thus a zygote-preferred promoter as described herein can, in some embodiments, be employed in HI-edit methods.
- HI-Edit efficiency refers to the measurement, usually expressed as a percentage, of progeny plants produced from a HI-Edit cross which are both edited and are (or were, if doubled) haploid. “HI-Edit efficiency,” “haploid editing rate,” and “HI-edit rate” are used interchangeably throughout. [0133]
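The percentage described above can be written out as a short calculation. This is a minimal sketch only: the function name and the choice of denominator (all progeny scored from the cross) are assumptions made for illustration and are not taken from the disclosure.

```python
def hi_edit_efficiency(n_edited_haploid: int, n_progeny_scored: int) -> float:
    """Return HI-Edit efficiency as a percentage.

    n_edited_haploid: progeny that are both edited and haploid
                      (or were haploid before doubling).
    n_progeny_scored: all progeny scored from the HI-Edit cross
                      (assumed denominator; hypothetical convention).
    """
    if n_progeny_scored <= 0:
        raise ValueError("n_progeny_scored must be positive")
    if not 0 <= n_edited_haploid <= n_progeny_scored:
        raise ValueError("n_edited_haploid must lie between 0 and n_progeny_scored")
    return 100.0 * n_edited_haploid / n_progeny_scored


if __name__ == "__main__":
    # Illustrative numbers only, not data from the disclosure.
    print(hi_edit_efficiency(3, 200))
```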
- a zygote-preferred promoter for use in HI-Edit methods has a zygote-edit rate of at least about 10%. A zygote-edit rate can be determined using available methods, such as those described in Section three of Example 1.
- the relative chimerism of new edits in diploid F1 offspring is assessed after out-crossing of a transgenic parent to a non-transgenic parent line.
- the transgenic parent expresses a DNA modification enzyme such as a Cas protein, e.g., Cas9 or Cas12a, and guide RNAs to target a gene for modification in the non-transgenic parent.
- Next generation sequencing (NGS) is typically used to determine whether new edits (those produced in the F1, hybrid plant) are made in the target gene of the non-transgenic parent. Because the non-transgenic parent does not contain the required CRISPR-Cas editing machinery, edits in the non-transgenic target gene can only have occurred in the F1 offspring.
- This assay allows measurement of not only how often this editing occurred, but an approximate developmental timing of when the editing occurs in the course of the F1 embryo development. Editing that occurs early, for instance in the 1-cell zygote stage, should produce a pure biallelic outcome, where nearly 50% of the reads are a new type of edit that was not found in the transgenic parental plant and the other 50% of reads match an editing outcome from the E0 transgenic parent generation. Editing that occurs later, in the multicellular early embryo, should not produce such a high read percentage – but rather may show a mixture of editing outcomes at lower read rates (i.e. it would be chimeric or mosaic).
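The read-fraction reasoning above can be sketched as a simple classifier. Everything specific here is an assumption for illustration: the cutoff used for "nearly 50%", the noise floor, the category names, and the reading of zygote-edit rate as the share of F1 plants with a zygotic-like call. The disclosure's own procedure is the one described in Example 1.

```python
# Hypothetical thresholds; the disclosure does not state numeric cutoffs here.
ZYGOTIC_MIN_FRACTION = 0.40  # stand-in for "nearly 50% of the reads"
NOISE_FLOOR = 0.02           # ignore alleles below this read fraction


def classify_f1(read_counts: dict, known_alleles: set) -> str:
    """Classify one F1 plant from NGS read counts at the target site.

    read_counts:   allele label -> number of reads.
    known_alleles: alleles that are not new edits (the unedited allele
                   and any allele already present in the transgenic parent).
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads for this plant")
    new_fractions = {
        allele: count / total
        for allele, count in read_counts.items()
        if allele not in known_alleles and count / total >= NOISE_FLOOR
    }
    if not new_fractions:
        return "no_new_edit"
    # A single new allele near half of the reads suggests editing at the 1-cell stage.
    if len(new_fractions) == 1 and max(new_fractions.values()) >= ZYGOTIC_MIN_FRACTION:
        return "zygotic_like"
    # Several new alleles, or one at a low read fraction, suggests later (mosaic) editing.
    return "chimeric"


def zygote_edit_rate(calls: list) -> float:
    """Percentage of F1 plants called zygotic_like (assumed definition)."""
    if not calls:
        raise ValueError("no plants scored")
    return 100.0 * sum(call == "zygotic_like" for call in calls) / len(calls)
```

Under these assumptions, a promoter would pass the 10% threshold when `zygote_edit_rate` returns 10.0 or more for the scored F1 population.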
- a zygote-preferred promoter that exhibits an edit rate of 10% or higher can then be selected for use in any context in which zygote-preferred expression is of interest.
- a promoter is employed in a HI-edit procedure.
- VSP Promoters [0135]
- a promoter used in a HI-Edit procedure comprises a VSP promoter exhibiting a zygote edit rate of 10% or higher.
- a VSP promoter is from an endogenous rice or maize gene or is an orthologous promoter.
- a VSP promoter is from an endogenous Arabidopsis, sunflower, rice, tomato, or maize gene.
- a VSP promoter is a rice VSP promoter from LOC_Os09g09480/Os09g0267600 that is biparentally expressed in zygotes (Anderson et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358).
- a VSP is a promoter from a maize gene Zm00001d011353/Zm00001eb358220. Additional examples of orthologous VSP genes are provided in Table 1. Table 1. Example orthologues of OsVSP and ZmVSP in diverse crops.
- a VSP promoter comprises a sequence of any one of SEQ ID NOS:1-27 or a functional fragment thereof. In some embodiments, a VSP promoter comprises a functional fragment of any one of SEQ ID NOS:1-27, 86 or 87 of at least 100, 200, 300, 400, or 500 nucleotides in length.
- a VSP promoter comprises a functional fragment of any one of SEQ ID NOS:1-27, 86 or 87 of at least 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length; or at least 2000, 2100, 2200, 2300, 2400, or 2500 nucleotides in length.
- a VSP promoter comprises a variant of a functional fragment of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000; or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800 or 1900 nucleotides in length having at least 80% identity, or at least 85%, at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86 or 87.
- such a functional fragment has at least 96% identity, or at least 97%, 98%, 99%, or greater, identity to the corresponding segment of any one of SEQ ID NOS:1-27.
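The percent-identity thresholds above compare a candidate fragment with the corresponding segment of a reference sequence. A minimal sketch of that arithmetic follows, assuming the two sequences have already been aligned to equal length with "-" marking gaps, and assuming identity is counted over the full alignment length. Both the function name and that denominator convention are illustrative choices, not the disclosure's definition; in practice an alignment tool would produce the aligned strings.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Gap characters ("-") never count as matches. The denominator is the
    alignment length (an assumed convention).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if q == r and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


if __name__ == "__main__":
    # Toy sequences for illustration; not taken from the sequence listing.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))
```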
- a functional fragment of a VSP promoter lacks 5’-untranslated region sequences of the mRNA.
- the 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis.
- Additional functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:1-27, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 10% zygote-editing activity.
- a VSP promoter comprises SEQ ID NO:27, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:27 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length.
- a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:27, or to a segment of SEQ ID NO:27 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:27 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:27 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:27 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0140] In some embodiments, a VSP promoter comprises SEQ ID NO:12, or a functional fragment thereof.
- the fragment comprises a region of SEQ ID NO:12 of at least 100, 150, 200, 250, 300, 350, 400, or 450 nucleotides in length.
- a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity to SEQ ID NO:12, or to a segment of SEQ ID NO:12 of at least 100, 150, 200, 250, 300, 350, 400, or 450 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:12 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:12 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0141] In some embodiments, a VSP promoter comprises SEQ ID NO:14, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:14 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:14 of at least 600, 700, 800, 900, or 1000 nucleotides in length or of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:14, or to a segment of SEQ ID NO:14 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length.
- a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:14 of at least 1100, 1200, 1300, 1400 or 1500 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:14 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:14 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence.
- the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
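The length criterion for terminally truncated fragments can be checked mechanically. The sketch below tests only the structural conditions (a shorter, contiguous stretch of the full-length sequence that retains a minimum share of its length). It says nothing about whether the fragment is functional, which the disclosure ties to zygote-editing activity. The function name and default threshold are illustrative assumptions.

```python
def is_terminal_truncation(fragment: str, full_length: str,
                           min_length_fraction: float = 0.90) -> bool:
    """Check whether `fragment` is `full_length` with one or more nucleotides
    removed from the 5' and/or 3' end, retaining at least
    `min_length_fraction` of the full length.
    """
    fragment = fragment.upper()
    full_length = full_length.upper()
    if not fragment or len(fragment) >= len(full_length):
        return False  # nothing deleted, or nothing left
    if fragment not in full_length:
        return False  # not a contiguous stretch of the full-length sequence
    return len(fragment) >= min_length_fraction * len(full_length)


if __name__ == "__main__":
    # Toy 100-nt sequence for illustration; not from the sequence listing.
    full = "ACGT" * 25
    print(is_terminal_truncation(full[3:97], full))         # 94 nt retained
    print(is_terminal_truncation(full[3:97], full, 0.95))   # below the 95% threshold
```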
- a VSP promoter comprises SEQ ID NO:11 or SEQ ID NO:13 or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 100, 200, 300, 400, or 500 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 500, 700, 800, 900, or 1000 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:11 or SEQ ID NO:13 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:11 or SEQ ID NO:13 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0143] In some embodiments, a VSP promoter comprises SEQ ID NO:26, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:26 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length.
- a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:26, or to a segment of SEQ ID NO:26 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:26 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:26 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:26 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0144] In some embodiments, a VSP promoter comprises any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24, or a functional fragment thereof.
- the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length.
- the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length.
- a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24, or to a segment of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 lacking one or more nucleotides from the 5’ and/or 3’ end of the full-length sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence.
- the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
- a VSP promoter comprises SEQ ID NO:2, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:2 of at least 100, 200, 300, 400, or 450 nucleotides in length.
- a functional VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:2, or to a segment of SEQ ID NO:2 of at least 100, 200, 300, 400, or 450 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:2 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:2 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0146] In some embodiments, a VSP promoter comprises SEQ ID NO:4, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:4 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:4 of at least 600, 700, or 800 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:4, or to a segment of SEQ ID NO:4 of at least 100, 200, 300, 400, or 500 nucleotides in length; or at least 600, 700, or 800 nucleotides. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:4 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:4 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0147] In some embodiments, a VSP promoter comprises SEQ ID NO:9, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:9 of at least 100, 200, 300, 400, 500, or 600 nucleotides in length.
- a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity to SEQ ID NO:9, or to a segment of SEQ ID NO:9 of at least 100, 200, 300, 400, 500, or 600 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:9 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:9 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0148] In some embodiments, a VSP promoter comprises SEQ ID NO:25, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:25 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, or 1700 nucleotides in length.
- a VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:25, or to a segment of SEQ ID NO:25 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or a segment of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, or 1700 nucleotides in length.
- the VSP promoter does not comprise 5’ untranslated region sequences.
- a VSP promoter is a functional fragment of SEQ ID NO:25 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:25 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0149] In some embodiments, a VSP promoter comprises SEQ ID NO:86 or 87, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:86 or 87 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:86 or 87 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the functional fragment comprises a region of at least 2000, 2100, 2200, 2300, 2400, or 2500 nucleotides in length.
- a VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:86 or 87, or to a segment of SEQ ID NO:86 or 87 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length; or a segment of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length; or a segment of at least 2000, 2100, 2200, 2300, 2400 or 2500 nucleotides in length.
- a VSP promoter is a functional fragment of SEQ ID NO:86 or 87 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:86 or 87 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. II.
- a zygote-preferred promoter of the present disclosure comprises an RZDP promoter exhibiting zygote editing of about 10% or higher.
- an RZDP promoter is from an endogenous maize gene, or from an orthologous promoter.
- an RZDP promoter is from an endogenous rice, maize, Arabidopsis, soybean, tomato, or sunflower gene.
- an RZDP promoter is from a Zm00001d050090 gene. Additional examples of orthologous RZDP genes are provided in Table 2. Table 2. Example orthologues of ZmRZDP in diverse crops.
- an RZD promoter comprises a sequence of any one of SEQ ID NOS:28-43, or a functional fragment thereof. In some embodiments, an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 100, 200, 300, 400, or 500 nucleotides in length.
- an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 1600, 1700, 1800, or 1900 nucleotides in length.
- an RZD promoter comprises a variant of a functional fragment of any one of SEQ ID NOS:28-43 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800 or 1900 nucleotides in length, having at least 80% identity, or at least 85%, at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:28-43.
- such a functional fragment has at least 96% identity, or at least 97%, 98%, 99%, or greater, identity to the corresponding segment of any one of SEQ ID NOS:28-43.
- a functional fragment of an RZDP promoter lacks 5’-untranslated region sequences of the mRNA.
- the 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis.
- Additional functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:28-43, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 10% zygote-editing activity.
- an RZD promoter comprises SEQ ID NO:28, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:28 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length.
- an RZD promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:28, or to a segment of SEQ ID NO:28 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:28 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length.
- the RZD promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of SEQ ID NO:28 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:28 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
- an RZDP promoter comprises any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or a functional fragment thereof.
- the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 100, 200, 300, 400, or 500 nucleotides in length.
- the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 600, 700, 800, 900 or 1000 nucleotides in length.
- the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length, or at least 1600, 1700, 1800, or 1900 nucleotides in length.
- an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or to a segment of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 100, 200, 300, 400, or 500 nucleotides in length; or a segment of at least 600, 700, 800, 900, or 1000 nucleotides in length.
- an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or to a segment of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length.
- the RZDP promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 lacking one or more nucleotides from the 5’ and/or 3’ end of the full-length sequence, where the segment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence.
- the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
- an RZDP promoter comprises SEQ ID NO:30, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:30 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:30 of at least 600, 700, or 800 nucleotides in length. In some embodiments, an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:30, or to a segment of SEQ ID NO:30 of at least 100, 200, 300, 400, 500, 600, 700, or 800 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of SEQ ID NO:30 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:30 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0157] In some embodiments, an RZDP promoter comprises SEQ ID NO:34 or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:34 of at least 100, 200 or 300 nucleotides in length.
- an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:34, or to a segment of SEQ ID NO:34 of at least 100, 200, or 300 nucleotides in length.
- the RZDP promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of SEQ ID NO:34 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:34 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0158] In some embodiments, an RZDP promoter comprises SEQ ID NO:38, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:38 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the fragment comprises a region of SEQ ID NO:38 at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, or 1800 nucleotides in length.
- an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:38, or to a segment of SEQ ID NO:38 at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or to a segment of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, or 1800 nucleotides in length.
- the RZDP promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of SEQ ID NO:38 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:38 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0159] In some embodiments, an RZDP promoter comprises SEQ ID NO:43, or a functional fragment thereof.
- the fragment comprises a region of SEQ ID NO:43 of at least 1100 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:43 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or 1100 nucleotides in length.
- an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:43, or to a segment of SEQ ID NO:43 at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or 1100 nucleotides in length.
- the RZDP promoter does not comprise 5’ untranslated region sequences.
- an RZD promoter is a functional fragment of SEQ ID NO:43 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:43 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. III.
- a zygote-preferred promoter of the present disclosure comprises an SCE1 promoter exhibiting zygote editing of about 10% or higher.
- an SCE1 promoter is from an endogenous maize gene, or from an orthologous promoter.
- an SCE1 promoter is from an endogenous rice, maize, Arabidopsis, soybean, tomato or sunflower gene.
- an SCE1 promoter is from a Zm00001d002570 gene.
- an SCE1 promoter comprises a sequence of any one of SEQ ID NOS:83-85 or a functional fragment thereof. In some embodiments, an SCE1 promoter comprises a functional fragment of any one of SEQ ID NOS:83-85 of at least 300, 400, or 500 nucleotides in length. In some embodiments, an SCE1 promoter comprises a functional fragment of any one of SEQ ID NOS:83-85 of at least 600, 700, 800, 900, 1000, 1100, 1200, or 1300 nucleotides in length.
- an SCE1 promoter comprises a functional fragment of SEQ ID NO:84 or 85 of 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, an SCE1 promoter comprises a functional fragment of SEQ ID NO:84 or 85 of 2000, 2100, 2200, 2300, or 2400 nucleotides in length; or in some embodiments, a functional fragment of at least 2500, 2600, 2700, 2800, 2900, or 3000 nucleotides in length. In some embodiments, an SCE1 promoter comprises a sequence having at least 90%, 91%, 92%, 93%, or 94% identity to any one of SEQ ID NOS: 83-85.
- an SCE1 promoter comprises a sequence having at least 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOS: 83-85.
- the SCE1 promoter comprises SEQ ID NO:83, 84, or 85.
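The percent-identity thresholds recited above can be illustrated with a small calculation. This is a sketch under stated assumptions: the two sequences are taken to be already aligned to equal length (with "-" marking gaps), and identity is counted as matching columns divided by total alignment columns, which is only one of several conventions used by alignment tools. The function name and the short test sequences are hypothetical and are not SEQ ID NOS:83-85.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-column alignment with one mismatch (illustrative only).
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In practice the alignment itself would come from an established alignment program, and the identity figure depends on that program's parameters.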
- a functional fragment of an SCE1 promoter lacks 5’- untranslated region sequences of the mRNA. The 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis.
- an SCE1 promoter comprises SEQ ID NO:83, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:83 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:83 of at least 1100 nucleotides in length or at least 1200 or 1300 nucleotides in length.
- an SCE1 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:83, or to a segment of SEQ ID NO:83 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:83 of at least 1100, 1200, or 1300 nucleotides in length.
- an SCE1 promoter is a functional fragment of SEQ ID NO:83 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:83 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0164] In some embodiments, an SCE1 promoter comprises SEQ ID NO:84 or SEQ ID NO:85, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 500, 750, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length; or a region of at least 1600, 1700, 1800, 1900, or 2000 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 2100, 2200, 2300, 2400, or 2500 nucleotides in length; or a region of at least 2600, 2700, 2800, 2900, 3000, or 3100 nucleotides in length.
- an SCE1 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:84 or 85, or to a segment of SEQ ID NO:84 or 85 comprising the functional fragment.
- an SCE1 promoter is a functional fragment of SEQ ID NO:84 or 85 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:84 or 85 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. IV.
- Expression construct comprising promoters
- the present disclosure provides expression constructs and transgenic plant cells that comprise a zygote-preferred promoter, e.g., a VSP promoter or RZD promoter as described herein, to drive expression of a nucleic acid sequence of interest in a plant.
- the nucleic acid sequence of interest is a nucleic acid modification enzyme, e.g., a nuclease.
- an expression construct can be a single vector that encodes two or more expression products of interest.
- expression of one product of interest is driven by a zygote-preferred promoter as described herein, and expression of a second product of interest is driven by a different promoter.
- the expression system comprises a binary vector encoding two expression products of interest, e.g., in which expression of a desired gene product, e.g., a nucleic acid modification enzyme, such as a DNA editing nuclease, is driven by a zygote-preferred promoter and expression of an RNA product of interest, e.g., a guide RNA, is driven by an RNA Polymerase III promoter.
- an expression construct comprises a zygote-preferred promoter as described herein that drives expression of a zinc-finger nuclease (ZFN).
- ZFNs are a fusion between the cleavage domain of FokI and a DNA recognition domain containing 3 or more zinc finger motifs. Examples of ZFNs include, but are not limited to, those described in Urnov et al., Nature Reviews Genetics, 2010, 11:636-646; Gaj et al., Nat Methods, 2012, 9(8):805-7; U.S.
- a zygote-preferred promoter may be used to drive expression of a TAL-effector nuclease (TALEN).
- TALENs are engineered transcription activator-like effector nucleases that contain a central domain of DNA-binding tandem repeats, a nuclear localization signal, and a C-terminal transcriptional activation domain.
- TALENs can be produced by fusing a TAL effector DNA binding domain to a DNA cleavage domain.
- a TALE protein may be fused to a nuclease such as a wild-type or mutated FokI endonuclease or the catalytic domain of FokI.
- TALENs and their uses for gene editing are found, e.g., in U.S.
- a zygote-preferred promoter may be used to drive expression of a meganuclease.
- Meganucleases can be modular DNA-binding nucleases such as any fusion protein comprising at least one catalytic domain of an endonuclease and at least one DNA binding domain or protein specifying a nucleic acid target sequence.
- the DNA-binding domain can contain at least one motif that recognizes single- or double-stranded DNA.
- the meganuclease can be monomeric or dimeric.
- an expression construct may comprise a zygote-preferred promoter as described herein that drives expression of a CRISPR nuclease used in a CRISPR editing system.
- a Cas protein expressed under the control of the zygote-preferred promoter is Cas9, Cas12a (formerly referred to as Cpf1), Cas12b (formerly referred to as C2c1), Cas13a (formerly referred to as C2c2), C2c3, Cas13b, or a Cas protein or orthologous protein from a prokaryotic organism.
- the Cas protein is a (modified) Cas9, such as a (modified) Staphylococcus aureus Cas9 (SaCas9) or (modified) Streptococcus pyogenes Cas9 (SpCas9).
- the Cas protein is Cas12a, optionally from Acidaminococcus sp., such as Acidaminococcus sp. BV3L6 Cpf1 (AsCas12a) or Lachnospiraceae bacterium Cas12a, such as Lachnospiraceae bacterium MA2020 or Lachnospiraceae bacterium MD2006 (LBCas12a). See U.S. Pat.
- the Cas12a protein may be from Moraxella bovoculi AAX08_00205 [Mb2Cas12a] or Moraxella bovoculi AAX11_00205 [Mb3Cas12a]. See WO 2017/189308, incorporated herein by reference in its entirety.
- the Cas protein is (modified) C2c2, such as Leptotrichia wadei C2c2 (LwC2c2) or Listeria newyorkensis FSL M6-0635 C2c2 (LbFSLC2c2).
- the (modified) Cas protein is C2c1.
- the (modified) Cas protein is C2c3. In certain embodiments, the (modified) Cas protein is Cas13b. Other Cas enzymes are available to a person skilled in the art.
- a CRISPR nuclease further comprises a fusion domain, such as a deaminase, a uracil DNA glycosylase, a reverse transcriptase, or an exonuclease.
- modified CRISPR nucleases include chimeric Cas proteins such as dCas9-FokI, dCpf1-FokI, chimeric Cas9-cytidine deaminase, chimeric Cas9-adenine deaminase, a nickase Cas9 (nCas9), chimeric dCas9 non-FokI nuclease and dCpf1 non-FokI nuclease.
- an expression construct, such as a binary vector, comprises a first polynucleotide comprising a zygote-preferred promoter operably linked to a sequence encoding a DNA modification enzyme and a second polynucleotide comprising an RNA Polymerase III promoter operably linked to a nucleic acid sequence encoding at least one guide RNA.
- the zygote-preferred promoter is a VSP or RZD promoter as described herein, e.g., a VSP promoter of any one of SEQ ID NOS:1-27, 86 or 87 or a functional fragment or variant thereof; or an RZD promoter of any one of SEQ ID NOS:28-43, or a functional fragment or variant thereof.
- the zygote-preferred promoter is an SCE1 promoter as described herein, e.g., an SCE1 promoter of any one of SEQ ID NOS:83-85, or a functional fragment or variant thereof.
- the second polynucleotide comprises a U3 promoter operably linked to a sequence encoding one or more guide RNAs.
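The two-cassette layout described above (a zygote-preferred promoter driving a DNA modification enzyme, and an RNA Polymerase III promoter such as U3 driving guide RNA) can be captured in a small data model. This is an illustrative sketch only: the class names, field names, and labels are hypothetical, and the example vector is not one of the constructs of Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    promoter: str        # human-readable promoter label
    promoter_class: str  # e.g. "zygote-preferred" or "RNA Pol III"
    product: str         # what the cassette expresses


@dataclass(frozen=True)
class BinaryVector:
    enzyme_cassette: Cassette  # first polynucleotide
    guide_cassette: Cassette   # second polynucleotide

    def validate(self) -> None:
        """Check that each cassette uses the promoter class described in the text."""
        if self.enzyme_cassette.promoter_class != "zygote-preferred":
            raise ValueError("enzyme cassette needs a zygote-preferred promoter")
        if self.guide_cassette.promoter_class != "RNA Pol III":
            raise ValueError("guide cassette needs an RNA Pol III promoter")

    def describe(self) -> str:
        parts = (self.enzyme_cassette, self.guide_cassette)
        return " | ".join(f"{c.promoter}::{c.product}" for c in parts)


# Hypothetical labels chosen for illustration.
vector = BinaryVector(
    enzyme_cassette=Cassette("VSP promoter", "zygote-preferred", "Cas12a"),
    guide_cassette=Cassette("U3 promoter", "RNA Pol III", "gRNA"),
)
vector.validate()
print(vector.describe())
```

Keeping the promoter class as an explicit field makes it easy to reject a layout in which, for example, the guide RNA is placed under a promoter of the wrong class.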
- U3 promoters [0173]
- the U3 promoter is from rice, wheat, maize, or Arabidopsis.
- the U3 promoter that drives expression of one or more guide RNAs comprises a sequence having at least 70%, or at least 75%, 80%, or 85% identity to a U3 promoter sequence of any one of SEQ ID NOS:44-49.
- the U3 promoter sequence has at least 90% or at least 95% identity to any one of SEQ ID NOS:44-49.
- the U3 promoter sequence has at least 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOS:44-49, or comprises any one of SEQ ID NOS:44-49.
- Functional fragments, e.g., deletions or variants, of any one of SEQ ID NOS:44-49 can be determined by assessing mutagenized and/or truncated regions of the promoter to identify those fragments that exhibit at least about 50% or at least about 70% of the promoter activity of the full-length promoter sequence.
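The activity criterion for functional fragments can be expressed as a ratio of fragment activity to full-length promoter activity. The sketch below assumes activity has been measured as some reporter signal in arbitrary units; the function names, fragment labels, and measurement values are all hypothetical, and the qualifier "about" in the thresholds is not modeled.

```python
def relative_activity(fragment_signal: float, full_length_signal: float) -> float:
    """Fragment activity as a fraction of full-length promoter activity."""
    if full_length_signal <= 0:
        raise ValueError("full-length signal must be positive")
    return fragment_signal / full_length_signal


def functional_fragments(
    measurements: dict[str, float],
    full_length_signal: float,
    min_relative: float = 0.5,
) -> list[str]:
    """Names of fragments whose relative activity reaches the threshold."""
    return [
        name
        for name, signal in measurements.items()
        if relative_activity(signal, full_length_signal) >= min_relative
    ]


# Hypothetical reporter readings (arbitrary units) for three truncations.
readings = {"del_5p_50": 80.0, "del_5p_150": 55.0, "del_5p_250": 20.0}
print(functional_fragments(readings, full_length_signal=100.0, min_relative=0.5))
print(functional_fragments(readings, full_length_signal=100.0, min_relative=0.7))
```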
- a U3 promoter comprises SEQ ID NO:44, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:44 of at least 100, 150, 200, 250, or 300 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:44, or to a segment of SEQ ID NO:44 of at least 100, 150, 200, 250, or 300 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:44 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:44 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0176] In some embodiments, a U3 promoter comprises SEQ ID NO:45, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:45 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:45, or to a segment of SEQ ID NO:45 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:45 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:45 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0177] In some embodiments, a U3 promoter comprises SEQ ID NO:46, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:46 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:46, or to a segment of SEQ ID NO:46 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:46 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:46 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0178] In some embodiments, a U3 promoter comprises SEQ ID NO:47, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:47 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:47, or to a segment of SEQ ID NO:47 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:47 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:47 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0179] In some embodiments, a U3 promoter comprises SEQ ID NO:48, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:48 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:48 of at least 500, 550, 600, 650, 700, 750, 800, or 850 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:48, or to a segment of SEQ ID NO:48 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:48, or to a segment of SEQ ID NO:48 of at least 500, 550, 600, 650, 700, 750, 800, or 850 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:48 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:48 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence.
- the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
- a U3 promoter comprises SEQ ID NO:49, or a functional fragment thereof.
- the functional fragment comprises a region of SEQ ID NO:49 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length.
- the functional fragment comprises a region of SEQ ID NO:49 of at least 500, 550, 600, 650, 700, or 750 nucleotides in length.
- a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:49, or to a segment of SEQ ID NO:49 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:49, or to a segment of SEQ ID NO:49 of at least 500, 550, 600, 650, 700, or 750 nucleotides in length.
- a U3 promoter is a functional fragment of SEQ ID NO:49 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:49 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
- An expression construct as described herein may comprise further regulatory elements.
- a “regulatory element” as used herein includes any sequence that influences construction or function of the expression construct, e.g., influences transcription and/or translation in a cell in which the expression construct is expressed. Such sequences include transcriptional or translational enhancers, and additional promoters or promoter elements that drive expression of other gene products encoded by the construct, such as genes encoding selectable markers. Additional regulatory elements include introns and transcriptional terminators. Such regulatory elements may be endogenous or heterologous to the host cell or to each other. [0182] A variety of transcriptional terminators are available for use in expression constructs. These are responsible for the termination of transcription beyond the transgene and correct mRNA polyadenylation.
- the termination region may be native with the transcriptional initiation region, may be native with the operably linked DNA sequence of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous to the promoter, the DNA sequence of interest, the plant host, or any combination thereof).
- Appropriate transcriptional terminators are those that are known to function in plants and include the CAMV pSOY1 terminator, the tml terminator, the nopaline synthase terminator and the pea rbcs E9 terminator. These can be used in both monocotyledons and dicotyledons.
- a gene's native transcription terminator may be used.
- Termination regions used in the expression cassettes can be obtained from, e.g., the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al.
- an expression construct employed in a HI-edit procedure comprises a binary vector having the promoter driving CRISPR/Cas, CRISPR/Cas, terminator of the CRISPR/Cas cassette; promoter driving gRNA expression, gRNA target gene and gRNA sequence of a construct shown in Table 3.
- the construct comprises the vector components of construct 27145 or 27146 as set forth in Table 3.
- an expression construct employed in a HI-edit procedure comprises SEQ ID NO:68 or SEQ ID NO:69 or a variant thereof having at least 75%, at least 80%, or at least 85% identity to SEQ ID NO:68 or SEQ ID NO:69.
- the expression construct sequence has at least 90% or at least 95% identity to SEQ ID NO:68 or SEQ ID NO:69.
- V. Gene Editing Procedures [0184] In a further aspect, the disclosure provides a method of performing gene editing to obtain a progeny plant having a desired genotype. In some embodiments, the method is HI-Edit. As indicated above, HI-Edit methodology is detailed in WO2018/102816. In brief, in the present disclosure, pollen from a first plant that is transformed with any synthetic expression construct of the present disclosure to express a DNA modification enzyme, e.g., a gene editing nuclease, under the control of a zygote-preferred promoter is used to pollinate a second plant.
- the promoter is a VSP promoter comprising the sequence of any one of SEQ ID NOS:1-27, 86, or 87, or a functional fragment thereof. In some embodiments, the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to any one of SEQ ID NOS:1-27, 86 or 87.
- the promoter comprises a variant of a functional fragment of at least 100, 200, 300, or 500 nucleotides in length; or at least 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length that has at least 70%, 75%, 80%, or 85% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86 or 87, or that has at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86, or 87.
- the VSP promoter comprises SEQ ID NO:1 or is an orthologous promoter to SEQ ID NO:1. In some embodiments, the VSP promoter comprises SEQ ID NO:27 or a functional fragment thereof. In some embodiments, the VSP promoter comprises SEQ ID NO:86 or 87 or a functional fragment thereof, e.g., of at least 2100, 2200, 2300, 2400, or 2500 nucleotides in length. In some embodiments, the promoter is an RZD promoter comprising the sequence of any one of SEQ ID NOS:28-43, or a functional fragment thereof.
- the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to any one of SEQ ID NOS:28-43.
- an RZD promoter comprises a variant of a functional fragment of at least 100, 200, 300, or 500 nucleotides in length; or at least 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length that has at least 70%, 75%, 80%, or 85% identity to the corresponding segment of any one of SEQ ID NOS:28-43, or that has at least 90% or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:28-43.
- the RZD promoter comprises SEQ ID NO:28 or a functional fragment thereof, or is an orthologous promoter to SEQ ID NO:28.
- the first plant is also genetically modified to express one or more gRNA sequences for the target gene to be edited in the second plant.
- the promoter is an SCE1 promoter comprising the sequence of SEQ ID NO: 83, 84, or 85, or a functional fragment thereof.
- the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to SEQ ID NO: 83, 84, or 85 or to a functional fragment thereof, for example, a fragment of SEQ ID NO:83 of at least 1100, 1200 or 1300 nucleotides in length; or a fragment of SEQ ID NO:84 of at least 2700, 2800, 2900, 3000, or 3100 nucleotides in length; or a fragment of SEQ ID NO:85 of at least 2800, 2900, 3000, 3100, 3200, or 3300 nucleotides in length.
- the second plant can contain the genomic DNA to be edited.
- the edited progeny plant is a haploid progeny plant.
- the haploid progeny comprises the genome of the second plant, but not the first plant.
- the first plant is a haploid inducer line.
- the haploid inducer line can be a paternal haploid inducer, e.g., having a tailswap mutation in a CENH3 gene (see, e.g., Ravi and Chan, Nature 464:615-618, 2010) or another CENH3 mutation (see, e.g., Maheshwari et al, Genome Research 27(3), 471-478, 2017).
- the haploid inducer line can be a maternal haploid inducer line, e.g., a line that comprises a mutation in a gene that encodes Patatin-like phospholipase A2α (also referred to herein as a MATRILINEAL (MATL) gene).
- the MATL gene comprises a loss of function mutation.
- the haploid inducer line may be an ig-type haploid induction, which results from a mutation in the INDETERMINATE GAMETOPHYTE1 gene.
- Any monocot or dicot plant can be modified to express a nucleic acid of interest using a zygote-preferred promoter of the present invention.
- the plant is maize, wheat, rice, barley, oat, turf grass, Brassica, tomato, pepper, lettuce, eggplant, soybean, sunflower, sugar beet, cotton, alfalfa, or tobacco, among many others.
- promoters were searched and selected from maize and rice transcriptome databases for use in this study.
- the putatively sperm specific ZmDUO1A promoter was selected, along with two constitutively active promoters, including prSoUBI4 (sugar cane Polyubiquitin4) and prOsACT1 (rice Actin1, i.e., LOC_Os03g50885).
- the prSoUBI4 promoter was used as a positive control, as it induced at least 3% haploid editing (“HI-Edit efficiency”) when operably linked to a CRISPR-Cas9 enzyme in five or six maize lines tested [Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292; see also U.S. Patent No.10,285,348] with a range of guide RNA (gRNA) targets driven by the rice U3 promoter, prOsU3.
- the prSoUBI4:cCas9 and prOsU3:gRNA combination also induced editing of haploid wheat embryos produced via wide crosses with maize pollen—indicating strong expression of the CRISPR-Cas system components (gRNA and protein) in the sperm cell and/or zygote prior to, during, and/or after fertilization.
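HI-Edit efficiency figures such as the "at least 3% haploid editing" reported above for prSoUBI4 are percentages of edited individuals among those screened. The sketch below assumes that the denominator is the number of haploid progeny genotyped and the numerator is the number found to carry an edit; this definition, the function names, and the counts are assumptions for illustration and are not data from the study.

```python
def editing_efficiency(edited: int, screened: int) -> float:
    """Percentage of screened individuals that carry an edit."""
    if screened <= 0:
        raise ValueError("at least one individual must be screened")
    if not 0 <= edited <= screened:
        raise ValueError("edited count must be between 0 and screened")
    return 100.0 * edited / screened


def meets_benchmark(edited: int, screened: int, benchmark_percent: float = 3.0) -> bool:
    """True if the observed efficiency reaches the benchmark percentage."""
    return editing_efficiency(edited, screened) >= benchmark_percent


# Hypothetical counts: 6 edited haploids among 120 screened.
print(editing_efficiency(6, 120), meets_benchmark(6, 120))
```

With small numbers of haploids screened, such a percentage carries wide uncertainty, so comparisons between promoters are better made on pooled counts.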
- AtDUO1A is specifically expressed in sperm cells, and activates sperm-expressed genes in Arabidopsis [Borg, M et al.
- Native maize expression data for prSoUBI4 are not available, since it is from sugar cane, but the homolog ZmUbi1 (Zm00001d015327) is in the 99th percentile of expressed genes in maize sperm and 12-hour zygotes [Chen J et al. (2017) Zygotic Genome Activation Occurs Shortly after Fertilization in Maize. Plant Cell 29(9):2106-2125]. [0190] In addition to these well-characterized promoters, we identified three new promoters that have not previously been characterized in any transgenic context and proceeded to test whether they could drive a higher HI-Edit efficiency than prSoUBI4.
- ZmVSP Volt Sorting Protein
- Zm00001d011353/Zm00001eb358220 the maize ortholog of a rice gene (LOC_Os09g09480/Os09g0267600) that is biparentally expressed in zygotes
- Expression data from that study show that the OsVsp transcript is highly expressed in rice zygote cells 2.5 hours after pollination, and that its transcripts are produced from both sperm- and egg cell-derived chromosomes.
- the prVSP expression profile in rice and maize fits the desired pattern for a male HI-Edit promoter, namely, high expression in male gametes and early zygotes from paternally-inherited chromosomes.
- Table 1 provides illustrative gene names for VSP homologs. The promoters and terminators from these genes may also act as efficient HI-Edit regulatory elements for driving the high expression of the CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types.
- ZmRZDP maize Ring Zinc-Finger Domain Protein gene
- the prZmRZDP expression profile is a good fit for the desired expression pattern for a male HI-Edit promoter.
- the zygote expression may not be from paternal chromosomes, as the OsRZDP ortholog is expressed as a maternal transcript in [Anderson SN et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358]. This has not been tested in maize.
- Illustrative orthologues of ZmRZDP are shown in Table 2.
- the promoters and terminators from these genes may also act as efficient HI-Edit regulatory elements for driving the high expression of the CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types.
- SCE1, characterized in maize as SUMO-conjugating enzyme 1 (“ZmSCE1,” Zm00001d002570), was found by mining mRNASeq datasets, looking for genes that are medium/highly expressed in sperm cells and early zygotes (preferably from the paternal allele, which was previously unknown for this gene). Mined data show the SCE1 transcript is in the 99th percentile of expressed genes in both maize sperm and zygotes. We tested two versions of the promoter.
- Vectors 26296, 27145, 27146 and 27234 also contained a DNA donor to GLOSSY2 (ZmGL2) for inducing homologous recombination with homology arms of 400 bp on either side of the target site.
- the donor is flanked by gRNA cutting sites so it can be liberated from the T-DNA.
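The donor layout described above (400 bp homology arms on either side of the target site, with the whole donor flanked by gRNA cutting sites so it can be liberated from the T-DNA) can be sketched as simple string assembly. Everything specific here is an assumption for illustration: the function name, the synthetic placeholder genome, the cut position, the payload, and the use of the ZmGL2 target sequence as the flanking site. PAM sequences, strand orientation, and any changes needed to prevent re-cutting of the repaired locus are not modeled.

```python
def build_donor(
    genome: str,
    cut_index: int,
    payload: str,
    flank_site: str,
    arm_length: int = 400,
) -> str:
    """Assemble flank_site + left arm + payload + right arm + flank_site."""
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("not enough sequence around the cut site for both arms")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    # The flanking sites let the nuclease excise the donor from the T-DNA.
    return flank_site + left_arm + payload + right_arm + flank_site


# Placeholder inputs (not real maize sequence apart from the target string).
placeholder_genome = "ACGT" * 300              # 1200 nt synthetic sequence
zm_gl2_target = "GTCACAGATCACAAACTTCAAATG"     # ZmGL2 Cas12a target from the text
donor = build_donor(placeholder_genome, cut_index=600, payload="TTTT",
                    flank_site=zm_gl2_target)
print(len(donor))
```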
- the Cas9 gRNA targets were ZmVLHP1/2-01 (5’- GCAGGAGGCGTCGAGCAGCG-3’) (in vector 23396), ZmVLHP1/2-02 (5’- GCTGGAGCTGAGCTTCCGGG-3’) (in vector 23397), ZmGW2-1/2 (5’- AAGCTCGCGCCCTGCTACCC-3’) (in vector 23399), and ZmSBEIIb (5’- ATTGATAGAGCACATGAGCT -3’) (in vector 24520).
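The Cas9 gRNA target sequences listed above can be sanity-checked programmatically before use. The sketch below copies the four sequences from the text and verifies that each contains only A, C, G, and T and is 20 nucleotides long; the 20-nucleotide expectation and the helper names are assumptions for illustration, and GC fraction and reverse complement are included simply as common design checks.

```python
# Cas9 gRNA targets as listed in the text (5' to 3'), keyed by target name.
CAS9_TARGETS = {
    "ZmVLHP1/2-01": "GCAGGAGGCGTCGAGCAGCG",
    "ZmVLHP1/2-02": "GCTGGAGCTGAGCTTCCGGG",
    "ZmGW2-1/2": "AAGCTCGCGCCCTGCTACCC",
    "ZmSBEIIb": "ATTGATAGAGCACATGAGCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def check_targets(targets: dict[str, str], expected_length: int = 20) -> None:
    """Raise ValueError if any target has unexpected characters or length."""
    for name, seq in targets.items():
        if set(seq) - set("ACGT"):
            raise ValueError(f"{name}: unexpected characters")
        if len(seq) != expected_length:
            raise ValueError(f"{name}: length {len(seq)} != {expected_length}")


check_targets(CAS9_TARGETS)
for name, seq in CAS9_TARGETS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```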
- the Cas12a gRNA targets were ZmGL2 (5’-GTCACAGATCACAAACTTCAAATG-3’) (in vectors 26296, 27226, 27145, 27146, and 27234), ZmWaxy1 (5’- GGGAAAGACCGAGGAGAAGATCT-3’) (in vectors 26258, 27226, 27145, 27146, 27234, and 27241), ZmO2-01 (targeting the OPAQUE2 gene) (5’-CTGTATCTCGAGCGTCTGGCTGA-3’) (in vector 26258), ZmYellowEndosperm1 (5’- CTATCTTATCCTAAAGATGGTGG-3’) (in vector 26258), ZmUBL (Ubiquitin ligase) (5’- GGAAGGAAAAGGTATCTGAAGG-3’) (vectors 26258 and 27241), and ZmUPL3 (Ubiquitin ligase) (5’-GGAGGGAAAAGGTGTCTGAGGC
- the inbred line NP2222 and a novel inbred haploid inducer derived from the material 20BD917233 were grown in a glasshouse under a 16:8 photoperiod (light:dark) at 26 °C/16 °C (day/night).
- the material 20BD917233 was an F7 stage inbred line derived from the biparental breeding population SYN-INBB23 x RWKS/Z21S//RWKS. Ears were harvested 10 or 11 days after pollination, dehusked and desilked, and sterilized with sodium hypochlorite solution. Immature embryos were isolated. Vectors from Table 3 were transformed as described in published protocol [Zhong, H et al.
- Zygote editing and HI-Edit of the prSoUbi4 and prOsU6 combination shows correspondence between the zygote editing test and the HI-Edit rate
- Transformation events were acclimated in a growth chamber for 1-2 weeks and seedlings were then transplanted into pots in the glasshouse, which contained high pressure sodium (HPS) lights as supplemental lighting source for plant growth. Growth conditions were as described in Table 9. The E0 plant tassels and ears were bagged before pollen shed and silk emergence. NP2222 plants and other inbred tester lines were also grown under the same conditions and ears and tassels were bagged for reciprocal crossing. Table 5. Greenhouse conditions for E0 plant growth.
- a rapid proxy assay was initially used to assess the potential HI-Edit efficiency of the different promoters in this study, prior to selecting promoters and events that would be tested for HI-Edit efficiency.
- the purpose of the proxy assay was to quickly assess the potential for the promoters to drive good HI-Edit rates without the reduced sample size intrinsic to assessing editing outcomes in haploids which are always a minority of the progeny resulting from a haploid induction cross.
- the relative efficiency of promoters for early zygote editing was able to be assessed in the absence of haploid induction.
- next generation sequencing was used to determine whether new edits (those produced in the F1 hybrid plant) were made in the target gene of the non-transgenic parent. Because the non-transgenic parent does not contain the required CRISPR-Cas editing machinery, edits in the non-transgenic target gene can only have occurred in the F1 offspring. Furthermore, this proxy assay allowed measurement of not only how often this editing occurred, but also the approximate developmental timing at which the editing occurred in the course of F1 embryo development.
- Editing that occurs early, for instance in the 1-cell zygote stage, should produce a pure biallelic outcome, where nearly 50% of the reads are a new type of edit that was not found in the transgenic parental plant and the other 50% of reads match an editing outcome from the E0 transgenic parent generation. Editing that occurs later, in the multicellular early embryo, should not produce such a high read percentage – but rather may show a mixture of editing outcomes at lower read rates (i.e. it would be chimeric or mosaic). [0198] To test the zygotic editing rate, we conducted outcrossing and editing pattern analysis from five T1 events from vector 26258 (see Table 4, except PLANTHIE31) used as either a female (4) or male (1) to the tester SYN-INBG78.
- T2 generation plants homozygous for the CRISPR transgenes were outcrossed as males (pollen donors) onto ears of tester inbred lines from the stiff stalk line NP2222 and the non-stiff stalk line SYN-INBG78. Haploids were color sorted, and the haploid induction rate was about 15.5%. The haploids were submitted for molecular analysis, and edited haploids were called from the result of the TaqMan assay for the editing target site. This assay detects the WT sequence; mutated sequences cannot be amplified or bound by the TaqMan probe, resulting in a haploid copy call of zero.
- the edited haploids are those that have a zero copy call for the target genes.
- the haploid editing rate for the Waxy1 target averages around 0.8%.
- This HI-Edit rate is about 3-4x lower than the HI-Edit rate of the original publication, which corresponds well to the 3-4x lower F1 zygote editing rate of 26258. Therefore, it appears that the zygote editing rate may be a useful assay for predicting HI-Edit efficiency. Table 8.
- HI-Edit rate of the Waxy1 target site from two events (PLANTHIE30 and [0201]
- the different promoters used to express the guide RNAs (26258 uses prOsU6, while the original vectors used prOsU3), the different guide RNAs used, the different Cas enzyme (26258 uses Cas12a; the original vectors used Cas9), the different testers and inducer line genetic backgrounds used in the trials, differences in the environmental conditions, or a combination of some or all of these variables.
- the F1 zygote assay is a good proxy for HI-Edit efficiency based on this comparison.
- vector 27146 has prZmVSP (Zm00001d011353) driving Cas12a and prOsU3 to drive gRNA targeting GL2 and Waxy1.
- vector 27145 prZmRZDP
- 27226 rice Actin1
- 27234 prZmDUO1A
- Haploid induction rate and haploid editing rate summarizing the two inducers and the different promoter combinations and gRNAs.
- event PLANTHIE36- showed a dramatically higher HI-Edit rate in both testers according to both TaqMan and NGS data, and the F1 zygote editing rate was also the highest (75% and 43% for tester 8 and 2 respectively) of any event of that construct.
- event PLANTHIE37- showed the highest HI-Edit rate and highest F1-zygote rate in both testers and gRNA targets.
- event PLANTHIE38- had the highest HI-Edit and F1-zygote rate. This correlation suggests that the F1 zygote assay is an effective proxy for HI-Edit.
- the 27146 VSP promoter result also shows the potential for this promoter, given the right T-DNA insertion site to drive exceptionally high HI-Edit rates compared to the prSoUbi4 standard control.
- Table 11a Haploid induction rates and haploid editing rates of individual events from the materials tested in the new inducer, NP3003RS, along with F1 diploid editing NGS data.
- zygote-preferred promoters were first used for paternal expression in maize sperm, we also tested whether they can be used for maternal expression in maize egg cells. To do so, T2 seeds of event NP3003RS (homozygous for Cas12a and guide RNA) were sown and then crossed as female with Tester1 pollen. The resulting F1 embryos were isolated and genotyped for zygote editing detection. The below table shows the zygote editing rate (“ZER”) results. Table 12. Promoter zygote editing rates (“ZER”) in crosses.
- F1# refers to the number of F1 zygotes obtained from the cross. It is clear that the promoters prZmRZDP, prZmVSP, and prSCE1 can enable efficient zygote editing from the female side.
- Construct Annotations [0208] Construct features are provided in Tables 13–33. The terms “minimum” and “maximum” in the tables refer to the position of the first and last nucleotide of the insertion, respectively, in the construct. Table 13. Construct 23396
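The "minimum" and "maximum" convention described above can be illustrated with a short sketch. It assumes the positions are 1-based and inclusive, which the text implies but does not state outright; the feature names and coordinates below are hypothetical and are not taken from Tables 13–33.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated element of a construct (hypothetical record layout)."""
    name: str
    minimum: int  # position of the first nucleotide (assumed 1-based, inclusive)
    maximum: int  # position of the last nucleotide (assumed inclusive)

    def length(self) -> int:
        # With inclusive coordinates, both end positions count toward the length.
        if self.maximum < self.minimum:
            raise ValueError("maximum must not be smaller than minimum")
        return self.maximum - self.minimum + 1


# Hypothetical coordinates, for illustration only.
features = [
    Feature("promoter (hypothetical)", 101, 2100),
    Feature("terminator (hypothetical)", 6501, 7000),
]

for feature in features:
    print(f"{feature.name}: {feature.minimum}..{feature.maximum} "
          f"({feature.length()} nt)")
```

If the tables instead used 0-based or half-open coordinates, the `+ 1` in `length` would need to change accordingly.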
Abstract
The presently disclosed subject matter relates to using a transformed haploid inducing line so that it contains DNA coding for cellular machinery capable of editing genes. This machinery is preferentially expressed in zygote cells. Exemplary promoters include, but are not limited to VSP promoters, RZDP promoters, SCE1 promoters, and U3 promoters. These promoters may be obtained from Arabidopsis, maize, soybean, wheat, sunflower, rice, and tomato.
Description
ZYGOTE-PREFERRED EXPRESSION FIELD [0001] This disclosure is related to the field of plant biotechnology, specifically agriculture biotechnology and gene editing, as well as plant breeding. The presently disclosed subject matter relates to transforming a haploid inducing line so that it contains DNA coding for cellular machinery capable of editing genes, which is preferentially expressed in zygote cells. CLAIM FOR PRIORITY [0002] This application claims priority to application serial no. PCT/CN2023/110941, filed August 3, 2023, which is incorporated by reference in its entirety. SEQUENCE LISTING [0003] This application is accompanied by a sequence listing entitled 82223PCT12mo.xml, created July 26, 2024, which is approximately 555 kilobytes in size. This sequence listing is incorporated herein by reference in its entirety. This sequence listing is submitted herewith and complies with 37 C.F.R. §§ 1.831–1.835. BACKGROUND [0004] “Haploid-induction editing” (HI-Edit) of a genome employs a haploid-inducer plant modified to express gene editing machinery to deliver the editing machinery to a genome to be edited of a recipient plant. In such a procedure, the first plant is crossed with the second plant to obtain haploid progeny in which the chromosomes of the haploid inducer line are eliminated and the haploid chromosomes of the recipient plant have the desired edit. HI-Edit methods are detailed in PCT publication WO2018/102816. See also, Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292. BRIEF SUMMARY [0005] The present disclosure features nucleic acid constructs comprising zygote-preferred promoters, such as a Vacuolar Sorting Protein promoter (VSP) or Ring Zinc-Finger Domain Protein promoter (RZDP) or SUMO-conjugating enzyme 1 promoter (SCE1) that preferentially drive expression of a transcript from a nucleic acid operably linked to the promoter in early zygotes and male gametes. 
In some embodiments, zygote-preferred promoters are employed in
nucleic acid constructs to drive expression of gene editing system components that are employed in gene editing procedures, such as HI-Edit, that exploit haploid induction to provide a plant with a desired edit. [0006] In one aspect, the disclosure provides a synthetic DNA construct comprising a zygote-preferred promoter operably linked to a first nucleotide sequence of interest (“NSOI”). In some embodiments, the zygote-preferred promoter is a vacuolar sorting protein promoter (“VSP promoter”). In some instances, the VSP promoter comprises a sequence: a) selected from the group consisting of SEQ ID NOs: 1–27 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 1. In some embodiments, the VSP promoter comprises SEQ ID NO:27. In alternative embodiments, the zygote-preferred promoter is a ring zinc-finger domain protein promoter (“RZDP promoter”). In some instances, the RZDP promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 28– 43 or a functional fragment thereof; or (b) orthologous promoter to SEQ ID NO: 28. In some embodiments, the RZDP promoter comprises SEQ ID NO: 28. In some embodiments, the zygote-preferred promoter is a SUMO-conjugating enzyme 1 promoter (“SCE1 promoter”). In some instances, the SCE1 promoter comprises a sequence a) selected from the group consisting of SEQ ID NOs: 83–85 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 83. In some embodiments, the SCE1 promoter comprises SEQ ID NO: 84. In some instances, the SCE1 promoter comprises SEQ ID NO: 85. In some embodiments the synthetic DNA construct further comprises a U3 promoter operably linked to a second NSOI. In some instances, the U3 promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 44–49 or a functional fragment thereof; or (b) orthologous promoter to SEQ ID NO: 44. In some embodiments, the U3 promoter comprises SEQ ID NO: 45. 
In some embodiments, the synthetic DNA construct further comprises a terminator operably linked to the NSOI. In some instances, the terminator is a ubiquitin terminator. In some embodiments, the ubiquitin terminator comprises SEQ ID NOs: 88, 89, 90, 91, 92, or 93. In some embodiments of the synthetic DNA construct, the first NSOI comprises a sequence encoding a nuclease. In some instances, the nuclease is a zinc finger nuclease (“ZFN”), a meganuclease (“MN”), a transcription activator-like effector nuclease (TALEN), or a CRISPR nuclease. In some embodiments, the CRISPR nuclease is selected from the group consisting of Cas9, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12i, Cas12j, Cas12l, Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas10, Cas11, Cas13a, Cas13b, Cas13c, Cas13d, Cas14, any other CRISPR-Cas nuclease, any mutant
thereof, a nickase variant, and a deactivated variant. In some instances, the CRISPR nuclease further comprises a fusion domain, e.g., selected from the group consisting of a deaminase, a uracil DNA glycosylase, a reverse transcriptase, and an exonuclease. In some embodiments, the second NSOI of the synthetic DNA construct comprises a sequence encoding at least one guide RNA. In some embodiments, the at least one guide RNA is encoded by a sequence selected from the group consisting of SEQ ID NO: 50–61. In some embodiments, the synthetic DNA construct comprises a sequence selected from the group consisting of SEQ ID NO: 62–72. [0007] In a further aspect, the disclosure provides a plant cell comprising a synthetic DNA construct as described herein, e.g., in the preceding paragraph. In some embodiments, the plant cell is a pollen cell or an egg cell. In another aspect, the disclosure provides a plant, e.g., a maize plant, comprising the plant cell. [0008] In another aspect, the disclosure provides a method of obtaining an edited progeny plant, comprising: (a) providing a first plant, wherein the first plant is transformed to comprise the synthetic construct of claims 1-18; (b) pollinating a second plant; and (c) selecting at least one progeny produced by the pollination of step (b), wherein the progeny possesses an edit; thereby obtaining an edited progeny plant. In some embodiments, the first plant is a haploid inducer line of the plant. In some instances the haploid inducer line is a paternal haploid inducer, e.g., a paternal haploid inducer line comprising a mutation in a CENH3 gene. In other embodiments, the haploid inducer line is a maternal haploid inducer, e.g., a maternal haploid inducer line comprising a mutation in a MATL gene. In some embodiments, the second plant comprises plant genomic DNA to be edited. In some embodiments, the edited progeny plant is a haploid progeny plant. 
In some embodiments, the haploid progeny plant comprises the genome of the second plant but not the first plant. In some embodiments, the haploid progeny plant is a maize haploid progeny plant. [0009] The Summary above is provided to introduce a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in limiting the scope of the claimed subject matter. DETAILED DESCRIPTION [0010] The following description recites various aspects and embodiments of the present compositions and methods. No particular embodiment is intended to define the scope of the
compositions and methods. Rather, the embodiments merely provide non-limiting examples of various compositions and methods that are at least included within the scope of the disclosed compositions and methods. The description is to be read from the perspective of one of ordinary skill in the art; therefore, information well known to the skilled artisan is not necessarily included. Terminology [0011] All technical and scientific terms used herein, unless otherwise defined below, are intended to have the same meaning as commonly understood by one of ordinary skill in the art. References to techniques employed herein are intended to refer to the techniques as commonly understood in the art, including variations on those techniques and/or substitutions of equivalent techniques that would be apparent to one of skill in the art. While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the presently disclosed subject matter. [0012] As used herein, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. For example, reference to “a cell” includes one or more cells and can include tissues or organs. [0013] The term "about," as used herein when referring to a measurable value such as an amount of mass, weight, time, volume, concentration or percentage is meant to encompass variations of in some embodiments ±20%, in some embodiments ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to perform the disclosed methods and/or employ the disclosed compositions, nucleic acids, polypeptides, etc. 
Accordingly, unless indicated to the contrary, the numerical parameters set forth in this specification and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by the presently disclosed subject matter. [0014] The term "and/or," when used in the context of a list of entities, refers to the entities being present singly or in combination. Thus, for example, the phrase "A, B, C, and/or D" includes A, B, C, and D individually, but also includes any and all combinations and subcombinations of A, B, C, and D (e.g., AB, AC, AD, BC, BD, CD, ABC, ABD, and BCD). In some embodiments, one or more of the elements to which the "and/or" refers can also individually be present in single or multiple occurrences in the combination(s) and/or subcombination(s).
[0015] The term “plant” as used herein can refer to a whole plant, or any part or component of a plant at any stage of development; and includes reference to a cell or tissue culture derived from a plant. Thus, for example, “plant” can refer to components or organs, e.g., leaves, stems, roots, plant tissues, seeds and/or plant cells. [0016] The term “plant cell” as used herein refers to a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in the form of an isolated single cell or a cultured cell, or as a part of a higher organized unit such as, for example, plant tissue, a plant organ, or a whole plant. The plant cell may be derived from or part of an angiosperm or gymnosperm. The plant cell may be a monocotyledonous plant cell (e.g., a maize cell, a rice cell, a sorghum cell, a sugarcane cell, a barley cell, a wheat cell, an oat cell, a turf grass cell, or an ornamental grass cell) or a dicotyledonous plant cell (e.g., a tobacco cell, a pepper cell, an eggplant cell, a sunflower cell, a crucifer cell, a flax cell, a potato cell, a cotton cell, a soybean cell, a sugar beet cell, or an oilseed rape cell). The term “plant cell culture” as used herein refers to cultures of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development. The term “plant tissue” as used herein refers to a group of plant cells organized into a structural and functional unit. Any tissue of a plant in planta or in culture is included. This term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture and any group of plant cells organized into structural and/or functional units. 
The use of this term in conjunction with, or in the absence of, any specific type of plant tissue as listed above or otherwise embraced by this definition is not intended to be exclusive of any other type of plant tissue. The term “plant part” as used herein refers to a part of a plant, including single cells and cell tissues such as plant cells that are intact in plants, cell clumps and tissue cultures from which plants can be regenerated. Examples of plant parts include, but are not limited to, single cells and tissues from pollen, ovules, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, and seeds; as well as pollen, ovules, egg cells, zygotes, leaves, embryos, roots, root tips, anthers, flowers, flower parts, fruits, stems, shoots, cuttings, scions, rootstocks, seeds, protoplasts, calli, and the like. [0017] As used herein, the terms "progeny" and "progeny plant" refer to a plant generated from vegetative or sexual reproduction from one or more parent plants. The term “progeny” can refer to any descendant of a particular cross or parent plant. Typically, progeny plants result from the breeding of two individuals, although some species (particularly some plants and
hermaphroditic animals) can be selfed (i.e., the same plant acts as the donor of both male and female gametes). The descendant(s) can be, for example, of the F1, the F2, or any subsequent generation. In some embodiments, “progeny” plants result from a HI-Edit method. [0018] Haploid induction ("HI") is a class of plant phenomena characterized by loss of one parent's set of chromosomes (the chromosomes from the haploid inducer parent) from the embryo at some time during or after fertilization, often during early embryo development. Haploid induction is also known as gynogenesis if the inducer line is used as the male in the cross, or androgenesis if the inducer line is used as the female in the cross. Haploid induction has been observed in numerous plant species, such as sorghum, barley, wheat, maize, Arabidopsis, and many other species. Commonly, during haploid induction, both parent lines used in the induction cross are diploids, so their gametes (egg cells and sperm cells) are haploids. Haploid induction is frequently a medium to low penetrance trait of the inducer line, so the resulting progeny, depending on the species or situation, may be either diploid (if no genome loss takes place) or haploid (if genome loss does indeed take place). If the parent line that is crossed to the haploid inducer is not diploid, but rather a tetraploid, hexaploid, or other plant of higher ploidy, the "haploid" progeny produced will have a gametic chromosome number, e.g., diploids (if the parent is tetraploid) or triploids (if the parent is hexaploid). Thus, as used herein, "haploids" possess half the number of chromosomes of either parent; thus haploids of diploid organisms (e.g., maize) exhibit monoploidy; haploids of tetraploid organisms (e.g., ryegrasses) exhibit diploidy; haploids of hexaploid organisms (e.g., wheat) exhibit triploidy. 
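The ploidy arithmetic in the definition of "haploids" above can be written out as a minimal sketch. The function name and the lookup table of ploidy terms are illustrative choices made here, not terms defined by the disclosure; the three example species and their ploidy levels are the ones the text itself gives.

```python
def haploid_ploidy(parent_ploidy: int) -> int:
    """Return the ploidy of a 'haploid' (gametic chromosome number) progeny.

    parent_ploidy is the number of chromosome sets in the non-inducer
    parent, e.g. 2 for a diploid. An even, positive value is assumed.
    """
    if parent_ploidy <= 0 or parent_ploidy % 2 != 0:
        raise ValueError("expected an even, positive ploidy level")
    return parent_ploidy // 2


# Illustrative names for the ploidy levels mentioned in the text.
PLOIDY_NAMES = {1: "monoploid", 2: "diploid", 3: "triploid"}

# Parent ploidy levels as stated in the definition above.
parents = {"maize": 2, "ryegrass (tetraploid)": 4, "wheat": 6}

for species, ploidy in parents.items():
    result = haploid_ploidy(ploidy)
    print(f"{species}: parent {ploidy}x -> haploid progeny {result}x "
          f"({PLOIDY_NAMES[result]})")
```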
[0019] The term “HI-Edit” refers to “haploid-induction editing” of a genome, which employs a haploid-inducer plant modified to express gene editing machinery to deliver the editing machinery to the genome to be edited of a recipient plant. In such a procedure, the first plant is crossed with the second plant to obtain haploid progeny in which the chromosomes of the haploid inducer line are eliminated and the haploid chromosomes of the recipient plant have the desired edit. HI-Edit methods are detailed in PCT publication WO2018/102816. See also, Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292. [0020] A plant referred to here as a "doubled haploid" is generated by doubling the haploid set of chromosomes. A plant or seed that is obtained from a doubled haploid plant that is selfed to any number of generations may still be identified as a doubled haploid plant. A
doubled haploid plant is considered a homozygous plant. A plant is considered to be doubled haploid if it is fertile, even if the entire vegetative part of the plant does not consist of the cells with the doubled set of chromosomes; that is, a plant will be considered doubled haploid if it contains viable gametes, even if it is chimeric in vegetative tissues. [0021] A "promoter" refers to a polynucleotide sequence comprising sequences upstream from a 5’ transcription start site that controls transcription of a gene or sequence to which it is operably linked. A promoter includes signals for RNA polymerase binding and transcription initiation. As used herein, a promoter may also contain regulatory elements such as enhancers, repressor binding sites, and the like. In some embodiments, a nucleotide sequence for a promoter may comprise a region encoding 5’ untranslated sequence of a transcript of the native gene from which the promoter is derived. For example, a VSP promoter sequence of any one of SEQ ID NOS:1-27, or a functional fragment thereof, may comprise a region encoding 5’ untranslated sequences of a VSP transcript; or an RZDP promoter sequence of any one of SEQ ID NOS: 28-43, or a functional fragment thereof, may comprise a region encoding 5’ untranslated sequences of an RZDP transcript. In some embodiments, the promoter does not comprise a region encoding 5’ untranslated sequences. By way of further example, a promoter may be a U3 promoter sequence of any one of SEQ ID NOS:44-49, or a functional fragment thereof. In some embodiments, such a promoter does not comprise untranscribed sequences. [0022] As used herein, a “nucleotide sequence of interest” refers to a polynucleotide to be expressed by a synthetic construct comprising the polynucleotide operably linked to a promoter to control transcription of the polynucleotide. 
A transcript produced by the synthetic construct may be a protein-encoding RNA, e.g., that encodes a nuclease, or a non-protein-coding transcript, such as a guide RNA. [0023] As used herein, an “endogenous” or “native” nucleic acid sequence refers to a nucleic acid sequence that occurs naturally in the genome of an organism. [0024] A “gene” is a defined region located within a genome that, in addition to coding nucleic acid sequence, comprises other sequences, primarily regulatory sequences responsible for the control of expression, that is to say the transcription and, where a protein is produced by the gene, translation of the coding portion. Genes can include both coding and non-coding regions (e.g., introns, regulatory elements, promoters, enhancers, termination sequences and 5' and 3' untranslated regions). A gene typically expresses
mRNA, functional RNA, or a specific protein, including regulatory sequences. Genes may or may not be capable of being used to produce a functional protein. In some embodiments, a gene refers to only the coding region. The term “native gene” refers to a gene as found in nature. The term “chimeric gene” refers to any gene that contains 1) DNA sequences, including regulatory and coding sequences that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources or comprise regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature. A gene may be “isolated” by which is meant a nucleic acid molecule that is substantially or essentially free from components normally found in association with the nucleic acid molecule in its natural state. Such components include other cellular material, culture medium from recombinant production, and/or various chemicals used in chemically synthesizing the nucleic acid molecule. [0025] The terms “nucleic acid” and “polynucleotide” are used interchangeably and as used herein refer to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form, as well as to both sense and anti-sense strands of RNA, cDNA, genomic DNA, mitochondrial DNA, and synthetic forms and mixed polymers of the above. In higher plants, DNA is the genetic material while RNA is involved in the transfer of information contained within DNA into proteins. A “genome” is the entire body of genetic material contained in each cell of an organism. It is understood that when an RNA is described, its corresponding cDNA is also described, wherein uridine is represented as thymidine. 
In particular embodiments, a nucleotide refers to a ribonucleotide, deoxynucleotide or a modified form of either type of nucleotide, or combinations thereof. In addition, a polynucleotide disclosed herein may include either or both naturally occurring and modified nucleotides linked together by naturally occurring and/or non-naturally occurring nucleotide linkages. The nucleic acid molecules may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analogue, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, and the like), charged linkages (e.g., phosphorothioates, phosphorodithioates, and the like), pendent moieties (e.g.,
polypeptides), intercalators (e.g., acridine, psoralen, and the like), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, and the like). The above term is also intended to include any topological conformation, including single-stranded, double-stranded, partially duplexed, triplex, hairpinned, circular and padlocked conformations. A reference to a nucleic acid sequence encompasses its complement unless otherwise specified. Thus, a reference to a nucleic acid molecule having a particular sequence should be understood to encompass its complementary strand, with its complementary sequence. Nucleotide sequences are “complementary” when they specifically hybridize in solution (e.g., according to Watson-Crick base pairing rules). The term also includes codon-optimized nucleic acids that encode the same polypeptide sequence. It is also understood that nucleic acids can be unpurified, purified, or attached, for example, to a synthetic material such as a bead or column matrix. [0026] As used herein, the term "reference sequence" in the context of a nucleic acid sequence refers to a defined nucleotide sequence used as a basis for nucleotide sequence comparison. In some embodiments, a reference sequence can be a promoter sequence of any one of SEQ ID NOS:1-27, SEQ ID NOS:28-43; or SEQ ID NOS:44-49; a sequence of any one of SEQ ID NOS:50-61 that encodes a guide RNA; or a nucleic acid sequence of any one of SEQ ID NOS:62-72 that encodes a construct. [0027] The term “corresponding to” in the context of nucleic acid sequences as used in the present disclosure refers to certain positions, or certain regions of a nucleotide sequence of interest that align with these positions or regions of a reference sequence when the two sequences are optimally aligned, but that are not necessarily in these exact numerical positions of the two sequences. 
While optimal alignment and scoring can be accomplished manually, the process is facilitated by a computer-implemented alignment algorithm. Readily available sequence comparison and multiple sequence alignment algorithms are, respectively, the Basic Local Alignment Search Tool (BLAST) and ClustalW/ClustalW2/Clustal Omega programs available on the Internet (e.g., the website of the EMBL-EBI). Other suitable programs include, but are not limited to, GAP, BestFit, Plot Similarity, and FASTA, which are part of the Accelrys GCG Package available from Accelrys, Inc. of San Diego, Calif., United States of America. See also Smith & Waterman, 1981; Needleman & Wunsch, 1970; Pearson & Lipman, 1988; Ausubel et al., 1988; and Sambrook & Russell, 2001.
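The notion of a position "corresponding to" a reference position can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned by some program and are supplied as equal-length strings with "-" marking gaps; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import Dict, Optional


def corresponding_positions(aligned_ref: str, aligned_query: str,
                            gap: str = "-") -> Dict[int, Optional[int]]:
    """Map each 1-based reference position to the 1-based position of the
    aligned residue in the sequence of interest (None if it aligns to a gap)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    query_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if q != gap:
            query_pos += 1  # advance along the sequence of interest
        if r != gap:
            ref_pos += 1    # advance along the reference sequence
            mapping[ref_pos] = query_pos if q != gap else None
    return mapping


# Hypothetical toy alignment: reference position 3 corresponds to
# position 2 of the sequence of interest, not position 3.
toy_map = corresponding_positions("ACG-TA", "A-GCTA")
```

In this toy case the numerical positions differ because of the gaps, which is the situation the definition above is meant to cover.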
[0028] One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990. In some embodiments, a percentage of sequence identity refers to sequence identity over the full length of a nucleic acid or polypeptide sequence. [0029] Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions in sequences that encode proteins), alleles, SNPs, and complementary sequences as well as the sequence explicitly indicated. [0030] The terms “polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds. [0031] The terms “identity” or “substantial identity,” as used in the context of a polynucleotide or polypeptide sequence described herein, refers to a sequence that has at least 60% sequence identity to a reference sequence. Alternatively, percent identity can be any integer from 60% to 100%. Exemplary embodiments include at least: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, as compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. [0032] For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. 
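As a rough illustration of how a percent identity value relative to a reference sequence can be obtained, the following sketch performs a simple global alignment and counts identical columns. The scoring values (match 1, mismatch -1, linear gap -2) and the convention of dividing by alignment length are assumptions chosen for brevity; they are not the default parameters of any program named in this disclosure, so the numbers it produces may differ from those programs.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2):
    """Toy global alignment with a linear gap penalty (illustrative only)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical (non-gap) columns as a percentage of alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(test_seq: str, reference: str,
                   threshold: float = 60.0) -> bool:
    """True if the test sequence reaches the given percent identity."""
    return percent_identity(*global_align(test_seq, reference)) >= threshold
```

For example, under these assumed parameters `meets_identity("ACGTTCGT", "ACGTACGT", threshold=80.0)` evaluates a hypothetical eight-nucleotide test sequence against a hypothetical reference.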
When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. [0033] A “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may
be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988), by computerized implementations of these algorithms (e.g., BLAST), or by manual alignment and visual inspection. [0034] Unless otherwise stated, identity and similarity will be calculated by the Needleman-Wunsch global alignment and scoring algorithms (Needleman and Wunsch (1970) J. Mol. Biol. 48(3):443-453) as implemented by the "needle" program, distributed as part of the EMBOSS software package (Rice, P., Longden, I., and Bleasby, A., EMBOSS: The European Molecular Biology Open Software Suite, 2000, Trends in Genetics 16(6), pp. 276-277; version 6.3.1 available from EMBnet at embnet.org/resource/emboss and emboss.sourceforge.net, among other sources) using default gap penalties and scoring matrices (EBLOSUM62 for protein and EDNAFULL for DNA). Equivalent programs may also be used. By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by needle from EMBOSS version 6.3.1. [0035] Additional mathematical algorithms are known in the art and can be utilized for the comparison of two sequences. See, for example, the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. 
Such an algorithm is incorporated into the BLAST programs of Altschul et al. (1990) J. Mol. Biol.215:403. BLAST nucleotide searches can be performed with the BLASTN program (nucleotide query searched against nucleotide sequences) to obtain nucleotide sequences homologous to nucleic acid molecules of the invention, or with the BLASTX program (translated nucleotide query searched against protein sequences) to obtain protein sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the BLASTP program (protein query searched against protein sequences) to obtain amino acid sequences homologous to protein molecules of the invention, or with the TBLASTN program (protein query searched against translated nucleotide sequences) to obtain nucleotide sequences homologous to protein molecules of the
invention. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., BLASTX and BLASTN) can be used. Alignment may also be performed manually by inspection. [0036] Thus, an “isolated” nucleic acid molecule is a nucleic acid molecule or nucleotide sequence that is not immediately contiguous with nucleotide sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived. Accordingly, in one embodiment, an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to a coding sequence. The term therefore includes, for example, a recombinant nucleic acid that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment), independent of other sequences. It also includes a recombinant nucleic acid that is part of a hybrid nucleic acid molecule encoding an additional RNA or polypeptide sequence. An “isolated” nucleic acid molecule can also include a polynucleotide derived from and inserted into the same natural, original cell type, but which is present in a non-natural state, e.g., present in a different copy number, and/or under the control of different regulatory sequences than that found in the native state of the nucleic acid molecule. 
“Isolated” does not necessarily mean that the preparation is technically pure (homogeneous), but it is sufficiently pure to provide the nucleic acid in a form in which it can be used for the intended purpose. BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING [0037] SEQ ID NO: 1 is the nucleotide sequence of a VSP promoter from Arabidopsis. [0038] SEQ ID NO: 2 is the nucleotide sequence of a VSP promoter from Arabidopsis.
[0039] SEQ ID NO: 3 is the nucleotide sequence of a VSP promoter from Arabidopsis. [0040] SEQ ID NO: 4 is the nucleotide sequence of a VSP promoter from Arabidopsis. [0041] SEQ ID NO: 5 is the nucleotide sequence of a VSP promoter from Arabidopsis. [0042] SEQ ID NO: 6 is the nucleotide sequence of a VSP promoter from Arabidopsis. [0043] SEQ ID NO: 7 is the nucleotide sequence of a VSP promoter from sunflower. [0044] SEQ ID NO: 8 is the nucleotide sequence of a VSP promoter from sunflower. [0045] SEQ ID NO: 9 is the nucleotide sequence of a VSP promoter from sunflower. [0046] SEQ ID NO: 10 is the nucleotide sequence of a VSP promoter from sunflower. [0047] SEQ ID NO: 11 is the nucleotide sequence of a VSP promoter from rice. [0048] SEQ ID NO: 12 is the nucleotide sequence of a VSP promoter from rice. [0049] SEQ ID NO: 13 is the nucleotide sequence of a VSP promoter from rice. [0050] SEQ ID NO: 14 is the nucleotide sequence of a VSP promoter from rice. [0051] SEQ ID NO: 15 is the nucleotide sequence of a VSP promoter from tomato. [0052] SEQ ID NO: 16 is the nucleotide sequence of a VSP promoter from tomato. [0053] SEQ ID NO: 17 is the nucleotide sequence of a VSP promoter from tomato. [0054] SEQ ID NO: 18 is the nucleotide sequence of a VSP promoter from soybean. [0055] SEQ ID NO: 19 is the nucleotide sequence of a VSP promoter from soybean. [0056] SEQ ID NO: 20 is the nucleotide sequence of a VSP promoter from soybean. [0057] SEQ ID NO: 21 is the nucleotide sequence of a VSP promoter from soybean. [0058] SEQ ID NO: 22 is the nucleotide sequence of a VSP promoter from soybean. [0059] SEQ ID NO: 23 is the nucleotide sequence of a VSP promoter from soybean.
[0060] SEQ ID NO: 24 is the nucleotide sequence of a VSP promoter from soybean. [0061] SEQ ID NO: 25 is the nucleotide sequence of a VSP promoter from soybean. [0062] SEQ ID NO: 26 is the nucleotide sequence of a VSP promoter from maize. [0063] SEQ ID NO: 27 is the nucleotide sequence of a VSP promoter from maize. [0064] SEQ ID NO: 28 is the nucleotide sequence of a RZDP promoter from maize. [0065] SEQ ID NO: 29 is the nucleotide sequence of a RZDP promoter from Arabidopsis. [0066] SEQ ID NO: 30 is the nucleotide sequence of a RZDP promoter from Arabidopsis. [0067] SEQ ID NO: 31 is the nucleotide sequence of a RZDP promoter from rice. [0068] SEQ ID NO: 32 is the nucleotide sequence of a RZDP promoter from rice. [0069] SEQ ID NO: 33 is the nucleotide sequence of a RZDP promoter from tomato. [0070] SEQ ID NO: 34 is the nucleotide sequence of a RZDP promoter from tomato. [0071] SEQ ID NO: 35 is the nucleotide sequence of a RZDP promoter from tomato. [0072] SEQ ID NO: 36 is the nucleotide sequence of a RZDP promoter from tomato. [0073] SEQ ID NO: 37 is the nucleotide sequence of a RZDP promoter from soybean. [0074] SEQ ID NO: 38 is the nucleotide sequence of a RZDP promoter from soybean. [0075] SEQ ID NO: 39 is the nucleotide sequence of a RZDP promoter from soybean. [0076] SEQ ID NO: 40 is the nucleotide sequence of a RZDP promoter from soybean. [0077] SEQ ID NO: 41 is the nucleotide sequence of a RZDP promoter from sunflower. [0078] SEQ ID NO: 42 is the nucleotide sequence of a RZDP promoter from sunflower. [0079] SEQ ID NO: 43 is the nucleotide sequence of a RZDP promoter from sunflower. [0080] SEQ ID NO: 44 is the nucleotide sequence of a U3 promoter from Arabidopsis.
[0081] SEQ ID NO: 45 is the nucleotide sequence of a U3 promoter from rice (prOsU3-01). [0082] SEQ ID NO: 46 is the nucleotide sequence of a U3 promoter from rice (prOsU3-02). [0083] SEQ ID NO: 47 is the nucleotide sequence of a U3 promoter from rice (prOsU3-03). [0084] SEQ ID NO: 48 is the nucleotide sequence of a U3 promoter from wheat. [0085] SEQ ID NO: 49 is the nucleotide sequence of a U3 promoter from maize. [0086] SEQ ID NO: 50 is the nucleotide sequence encoding a guide RNA targeting VLHP1-1 & VLHP1-2. [0087] SEQ ID NO: 51 is the nucleotide sequence encoding a guide RNA targeting VLHP1-1 & VLHP1-2. [0088] SEQ ID NO: 52 is the nucleotide sequence encoding a guide RNA targeting GW2-1 & GW2-2. [0089] SEQ ID NO: 53 is the nucleotide sequence encoding a guide RNA targeting SBEIIb. [0090] SEQ ID NO: 54 is the nucleotide sequence encoding a guide RNA targeting GL2. [0091] SEQ ID NO: 55 is the nucleotide sequence encoding a guide RNA targeting Waxy1. [0092] SEQ ID NO: 56 is the nucleotide sequence encoding a guide RNA targeting O2 first exon. [0093] SEQ ID NO: 57 is the nucleotide sequence encoding a guide RNA targeting O2 second exon. [0094] SEQ ID NO: 58 is the nucleotide sequence encoding a guide RNA targeting O2 third exon. [0095] SEQ ID NO: 59 is the nucleotide sequence encoding a guide RNA targeting YellowEndosperm.
[0096] SEQ ID NO: 60 is the nucleotide sequence encoding a guide RNA targeting UBL. [0097] SEQ ID NO: 61 is the nucleotide sequence encoding a guide RNA targeting UPL3. [0098] SEQ ID NO: 62 is the nucleotide sequence encoding construct 23396. [0099] SEQ ID NO: 63 is the nucleotide sequence encoding construct 23397. [0100] SEQ ID NO: 64 is the nucleotide sequence encoding construct 23399. [0101] SEQ ID NO: 65 is the nucleotide sequence encoding construct 24520. [0102] SEQ ID NO: 66 is the nucleotide sequence encoding construct 26258. [0103] SEQ ID NO: 67 is the nucleotide sequence encoding construct 26296. [0104] SEQ ID NO: 68 is the nucleotide sequence encoding construct 27145. [0105] SEQ ID NO: 69 is the nucleotide sequence encoding construct 27146. [0106] SEQ ID NO: 70 is the nucleotide sequence encoding construct 27226. [0107] SEQ ID NO: 71 is the nucleotide sequence encoding construct 27234. [0108] SEQ ID NO: 72 is the nucleotide sequence encoding construct 27241. [0109] SEQ ID NO: 73 is the nucleotide sequence encoding construct 27680. [0110] SEQ ID NO: 74 is the nucleotide sequence encoding construct 28255. [0111] SEQ ID NO: 75 is the nucleotide sequence encoding construct 28291. [0112] SEQ ID NO: 76 is the nucleotide sequence encoding construct 28292. [0113] SEQ ID NO: 77 is the nucleotide sequence encoding construct 28293. [0114] SEQ ID NO: 78 is the nucleotide sequence encoding construct 28510. [0115] SEQ ID NO: 79 is the nucleotide sequence encoding construct 28520. [0116] SEQ ID NO: 80 is the nucleotide sequence encoding construct 28560. [0117] SEQ ID NO: 81 is the nucleotide sequence encoding construct 28825. [0118] SEQ ID NO: 82 is the nucleotide sequence encoding construct 28834.
[0119] SEQ ID NO: 83 is the nucleotide sequence of a SCE1 promoter from maize. [0120] SEQ ID NO: 84 is the nucleotide sequence of a SCE1 promoter from maize. [0121] SEQ ID NO: 85 is the nucleotide sequence of a SCE1 promoter from maize. [0122] SEQ ID NO: 86 is the nucleotide sequence of a VSP-01 promoter from maize. [0123] SEQ ID NO: 87 is the nucleotide sequence of a VSP-02 promoter from maize. [0124] SEQ ID NO: 88 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-04”) from maize. [0125] SEQ ID NO: 89 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-06”) from maize. [0126] SEQ ID NO: 90 is the nucleotide sequence of a ubiquitin terminator (“tUbi1-09”) from maize gene GRMZM2G409726. [0127] SEQ ID NO: 91 is the nucleotide sequence of a ubiquitin terminator from Sorghum bicolor. [0128] SEQ ID NO: 92 is the nucleotide sequence of a ubiquitin terminator from Medicago truncatula. [0129] SEQ ID NO: 93 is the nucleotide sequence of a ubiquitin terminator from Glycine max. DETAILED DESCRIPTION [0130] The present disclosure features nucleic acid constructs comprising zygote-preferred promoters, such as a Vacuolar Sorting Protein (VSP) promoter, a Ring Zinc-Finger Domain Protein (RZDP) promoter, or a SUMO-conjugating enzyme (SCE1) promoter, that preferentially drive expression of a transcript from a nucleic acid operably linked to the promoter in early zygotes and male gametes. In some embodiments, zygote-preferred promoters are employed in nucleic acid constructs to drive expression of gene editing system components that are employed in gene editing procedures, such as HI-Edit, that exploit haploid induction to provide a plant with a desired edit. [0131] A zygote-preferred promoter causes its downstream sequence to be preferentially expressed in the zygote. When used in connection with HI-Edit, a zygote-preferred promoter of the present disclosure provides a zygote-edit rate of at least 10%. Zygote-edit rate
correlates with HI-Edit efficiency; and thus a zygote-preferred promoter as described herein can, in some embodiments, be employed in HI-Edit methods. [0132] The term “HI-Edit efficiency” refers to the measurement, usually expressed as a percentage, of progeny plants produced from a HI-Edit cross which are both edited and are (or were, if doubled) haploid. “HI-Edit efficiency,” “haploid editing rate,” and “HI-Edit rate” are used interchangeably throughout. [0133] For purposes of this disclosure, a zygote-preferred promoter for use in HI-Edit methods has a zygote-edit rate of at least about 10%. A zygote-edit rate can be determined using available methods, such as those described in Section three of Example 1. For example, the relative chimerism of new edits in diploid F1 offspring is assessed after out-crossing of a transgenic parent to a non-transgenic parent line. The transgenic parent expresses a DNA modification enzyme such as a Cas protein, e.g., Cas9 or Cas12a, and guide RNAs to target a gene for modification in the non-transgenic parent. Next generation sequencing (NGS) is typically used to determine whether new edits (those produced in the F1, hybrid plant) are made in the target gene of the non-transgenic parent. Because the non-transgenic parent does not contain the required CRISPR-Cas editing machinery, edits in the non-transgenic target gene can only have occurred in the F1 offspring. This assay allows measurement of not only how often this editing occurred, but also the approximate developmental timing of when the editing occurs in the course of the F1 embryo development. Editing that occurs early, for instance in the 1-cell zygote stage, should produce a pure biallelic outcome, where nearly 50% of the reads are a new type of edit that was not found in the transgenic parental plant and the other 50% of reads match an editing outcome from the E0 transgenic parent generation. 
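The two quantities discussed above can be sketched in code. The first function follows the stated definition of HI-Edit efficiency (the percentage of progeny that are both edited and haploid). The second is only an assumed heuristic: it flags an F1 sample as consistent with zygote-stage editing when the fraction of reads carrying a new edit is close to 50%. The tolerance value, class name, and function names are hypothetical and are not taken from Example 1.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Progeny:
    edited: bool   # target gene carries an edit
    haploid: bool  # plant is (or was, if doubled) haploid


def hi_edit_efficiency(progeny: Sequence[Progeny]) -> float:
    """Percentage of progeny that are both edited and haploid."""
    if not progeny:
        raise ValueError("no progeny were scored")
    hits = sum(1 for p in progeny if p.edited and p.haploid)
    return 100.0 * hits / len(progeny)


def consistent_with_zygote_edit(new_edit_read_fraction: float,
                                tolerance: float = 0.10) -> bool:
    """Assumed heuristic: a new-edit read fraction near 0.5 suggests
    editing at the 1-cell zygote stage; the tolerance is illustrative."""
    if not 0.0 <= new_edit_read_fraction <= 1.0:
        raise ValueError("read fraction must be between 0 and 1")
    return abs(new_edit_read_fraction - 0.5) <= tolerance


# Hypothetical cross: 10 progeny scored, 2 of them edited haploids.
cross = ([Progeny(True, True)] * 2 + [Progeny(True, False)] * 3
         + [Progeny(False, True)] * 1 + [Progeny(False, False)] * 4)
efficiency = hi_edit_efficiency(cross)
```

Any real threshold for calling an early edit would need to be set against the sequencing depth and error profile of the assay actually used.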
Editing that occurs later, in the multicellular early embryo, should not produce such a high read percentage, but rather may show a mixture of editing outcomes at lower read rates (i.e., it would be chimeric or mosaic). [0134] A zygote-preferred promoter that exhibits an edit rate of 10% or higher can then be selected for use in any context in which zygote-preferred expression is of interest. In a preferred embodiment, such a promoter is employed in a HI-Edit procedure. I. VSP Promoters [0135] In some embodiments, a promoter used in a HI-Edit procedure comprises a VSP promoter exhibiting a zygote edit rate of 10% or higher. In some embodiments, a VSP promoter is from an endogenous rice or maize gene or is an orthologous promoter. In some
embodiments, a VSP promoter is from an endogenous Arabidopsis, sunflower, rice, tomato, or maize gene. For example, in some embodiments, a VSP promoter is a rice VSP promoter from LOC_Os09g09480/Os09g0267600 that is biparentally expressed in zygotes (Anderson et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358). In some embodiments, a VSP promoter is a promoter from the maize gene Zm00001d011353/Zm00001eb358220. Additional examples of orthologous VSP genes are provided in Table 1. Table 1. Example orthologues of OsVSP and ZmVSP in diverse crops.
[0136] Promoters and terminators of the genes listed above, or orthologous genes, e.g., those that provide a zygote editing efficiency of at least 10%, may act as efficient HI-Edit regulatory elements for driving robust expression of CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types. [0137] In some embodiments, a VSP promoter comprises a sequence of any one of SEQ ID NOS:1-27 or a functional fragment thereof. In some embodiments, a VSP promoter comprises a functional fragment of any one of SEQ ID NOS:1-27, 86 or 87 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, a VSP promoter comprises a functional fragment of any one of SEQ ID NOS:1-27, 86 or 87 of at least 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in
length; or at least 2000, 2100, 2200, 2300, 2400, or 2500 nucleotides in length. In some embodiments, a VSP promoter comprises a variant of a functional fragment of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000; or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800 or 1900 nucleotides in length having at least 80% identity, or at least 85%, at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86 or 87. In some embodiments, such a functional fragment has at least 96% identity, or at least 97%, 98%, 99%, or greater, identity to the corresponding segment of any one of SEQ ID NOS:1-27. [0138] In some embodiments, a functional fragment of a VSP promoter lacks 5’-untranslated region sequences of the mRNA. The 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis. Additional functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:1-27, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 10% zygote-editing activity. [0139] In some embodiments, a VSP promoter comprises SEQ ID NO:27, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:27 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. 
In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:27, or to a segment of SEQ ID NO:27 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:27 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:27 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:27 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the
functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0140] In some embodiments, a VSP promoter comprises SEQ ID NO:12, or a functional fragment thereof. In some embodiments, the fragment comprises a region of SEQ ID NO:12 of at least 100, 150, 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity to SEQ ID NO:12, or to a segment of SEQ ID NO:12 of at least 100, 150, 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:12 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:12 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0141] In some embodiments, a VSP promoter comprises SEQ ID NO:14, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:14 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:14 of at least 600, 700, 800, 900, or 1000 nucleotides in length or of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:14, or to a segment of SEQ ID NO:14 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length. 
In some embodiments, a functional VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:14 of at least 1100, 1200, 1300, 1400 or 1500 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:14 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:14 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
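The length criterion for terminally deleted fragments recited in the preceding paragraphs can be checked with a short sketch. It only tests the length requirement (for example, at least 90% of the full-length sequence); it says nothing about whether a fragment is functional, which must be established experimentally. The placeholder sequence and function names are hypothetical and do not represent any sequence in the sequence listing.

```python
def terminal_deletion_fragment(full_length: str, trim_5prime: int,
                               trim_3prime: int) -> str:
    """Return the fragment left after removing nucleotides from the
    5' and/or 3' end of the full-length sequence."""
    if trim_5prime < 0 or trim_3prime < 0:
        raise ValueError("trim counts must be non-negative")
    if trim_5prime + trim_3prime >= len(full_length):
        raise ValueError("deletions remove the entire sequence")
    return full_length[trim_5prime:len(full_length) - trim_3prime]


def percent_of_full_length(fragment: str, full_length: str) -> float:
    """Fragment length as a percentage of the full-length sequence."""
    return 100.0 * len(fragment) / len(full_length)


def meets_length_criterion(fragment: str, full_length: str,
                           min_percent: float = 90.0) -> bool:
    """True if the fragment is at least min_percent of the full length."""
    return percent_of_full_length(fragment, full_length) >= min_percent


# Hypothetical 100-nucleotide placeholder standing in for a promoter.
placeholder = "ACGT" * 25
fragment = terminal_deletion_fragment(placeholder, trim_5prime=3,
                                      trim_3prime=2)
```

Here the fragment retains 95 of 100 nucleotides, so it would satisfy both the 90% and the 95% length thresholds.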
[0142] In some embodiments, a VSP promoter comprises SEQ ID NO:11 or SEQ ID NO:13 or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 500, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:11 or SEQ ID NO:13 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:11 or SEQ ID NO:13, or to a segment of SEQ ID NO:11 or SEQ ID NO:13 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:11 or SEQ ID NO:13 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:11 or SEQ ID NO:13 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:11 or SEQ ID NO:13 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0143] In some embodiments, a VSP promoter comprises SEQ ID NO:26, or a functional fragment thereof. 
In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:26 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:26, or to a segment of SEQ ID NO:26 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least
95% identity, to a segment of SEQ ID NO:26 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:26 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:26 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0144] In some embodiments, a VSP promoter comprises any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. 
In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24, or to a segment of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of any one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of one of SEQ ID NOS:1, 3, 5, 6, 7, 8, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 lacking one or more nucleotides from the 5’ and/or 3’ end of the full-length sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the
functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0145] In some embodiments, a VSP promoter comprises SEQ ID NO:2, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:2 of at least 100, 200, 300, 400, or 450 nucleotides in length. In some embodiments, a functional VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:2, or to a segment of SEQ ID NO:2 of at least 100, 200, 300, 400, or 450 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:2 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:2 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0146] In some embodiments, a VSP promoter comprises SEQ ID NO:4, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:4 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:4 of at least 600, 700, or 800 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:4, or to a segment of SEQ ID NO:4 of at least 100, 200, 300, 400, or 500 nucleotides in length; or at least 600, 700, or 800 nucleotides. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. 
In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:4 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:4 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0147] In some embodiments, a VSP promoter comprises SEQ ID NO:9, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:9 of at least 100, 200, 300, 400, 500, or 600 nucleotides in length. In some embodiments, a functional VSP promoter comprises a region having at least 90% identity, or
at least 95% identity to SEQ ID NO:9, or to a segment of SEQ ID NO:9 of at least 100, 200, 300, 400, 500, or 600 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:9 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:9 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0148] In some embodiments, a VSP promoter comprises SEQ ID NO:25, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:25 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, or 1700 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:25, or to a segment of SEQ ID NO:25 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or a segment of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, or 1700 nucleotides in length. In some embodiments, the VSP promoter does not comprise 5’ untranslated region sequences. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:25 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:25 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. 
In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0149] In some embodiments, a VSP promoter comprises SEQ ID NO:86 or 87, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:86 or 87 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:86 or 87 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the functional fragment comprises a region of at least 2000, 2100, 2200, 2300, 2400, or 2500 nucleotides in length. In some embodiments, a VSP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:86 or 87, or to a segment of SEQ ID NO:86 or 87 of at least 500, 600, 700, 800, 900, or
1000 nucleotides in length; or a segment of SEQ ID NO:25 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length; or a segment of at least 2000, 2100, 2200, 2300, 2400, or 2500 nucleotides in length. In some embodiments, a VSP promoter is a functional fragment of SEQ ID NO:86 or 87 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:86 or 87 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
II. RZDP Promoters
[0150] In some embodiments, a zygote-preferred promoter of the present disclosure, e.g., employed in a HI-Edit procedure, comprises an RZDP promoter exhibiting zygote editing of about 10% or higher. In some embodiments, an RZDP promoter is from an endogenous maize gene, or from an orthologous promoter. In some embodiments, an RZDP promoter is from an endogenous rice, maize, Arabidopsis, soybean, tomato or sunflower gene. For example, in some embodiments, an RZDP promoter is from a Zm00001d050090 gene. Additional examples of orthologous RZDP genes are provided in Table 2.
Table 2. Example orthologues of ZmRZDP in diverse crops.
[0151] Promoters and terminators of the genes listed above, or orthologous genes, e.g., those that have a zygote editing efficiency of at least 10%, may act as efficient HI-Edit regulatory elements for driving high expression of the CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types.
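The 10% zygote-editing threshold referred to above can be illustrated with a minimal sketch. The sketch assumes, purely for illustration, that editing efficiency is scored as the fraction of screened progeny carrying an edit; the function names, the candidate names, and the counts are hypothetical and are not data from the disclosure.

```python
def editing_efficiency(edited: int, screened: int) -> float:
    """Fraction of screened progeny that carry an edit (assumed definition)."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    if not 0 <= edited <= screened:
        raise ValueError("edited must be between 0 and screened")
    return edited / screened


def meets_threshold(edited: int, screened: int, threshold: float = 0.10) -> bool:
    """True if the candidate reaches the editing-efficiency threshold."""
    return editing_efficiency(edited, screened) >= threshold


# Hypothetical counts (edited, screened) for three illustrative candidates.
candidates = {
    "promoter_A": (12, 100),
    "promoter_B": (4, 80),
    "promoter_C": (25, 200),
}

# Keep only candidates at or above the 10% threshold.
passing = [name for name, (e, s) in candidates.items() if meets_threshold(e, s)]
print(passing)
```

In practice the scoring method and the population screened would follow the assay actually used; the sketch only shows how a fixed threshold partitions candidates.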
[0152] In some embodiments, an RZD promoter comprises a sequence of any one of SEQ ID NOS:28-43, or a functional fragment thereof. In some embodiments, an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, an RZD promoter comprises a functional fragment of any one of SEQ ID NOS:28-43 of at least 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, an RZD promoter comprises a variant of a functional fragment of any one of SEQ ID NOS:28-43 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800 or 1900 nucleotides in length, having at least 80% identity, or at least 85%, at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:28-43. In some embodiments, such a functional fragment has at least 96% identity, or at least 97%, 98%, 99%, or greater, identity to the corresponding segment of any one of SEQ ID NOS:28-43.
[0153] In some embodiments, a functional fragment of an RZDP promoter lacks 5’-untranslated region sequences of the mRNA. The 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis. Additional functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:28-43, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 10% zygote-editing activity.
[0154] In some embodiments, an RZD promoter comprises SEQ ID NO:28, or a functional fragment thereof. 
In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:28 of at least 1600 nucleotides in length, or at least 1700, 1800, or 1900 nucleotides in length. In some embodiments, an RZD promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:28, or to a segment of SEQ ID NO:28 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:28 of
at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the RZD promoter does not comprise 5’ untranslated region sequences. In some embodiments, an RZD promoter is a functional fragment of SEQ ID NO:28 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:28 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0155] In some embodiments, an RZDP promoter comprises any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 600, 700, 800, 900 or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length, or at least 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or to a segment of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 100, 200, 300, 400, or 500 nucleotides in length; or a segment of at least 600, 700, 800, 900, or 1000 nucleotides in length. 
In some embodiments, an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42, or to a segment of any one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences. In some embodiments, an RZD promoter is a functional fragment of one of SEQ ID NOS:29, 31, 32, 33, 35, 36, 37, 39, 40, 41, or 42 lacking one or more nucleotides from the 5’ and/or 3’ end of the full-length sequence, where the segment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
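The two recurring criteria above, a fragment produced by deleting nucleotides only from the 5’ and/or 3’ end that retains a stated percentage of the full length, and a region with a stated percent identity to a reference segment, can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical, and the identity calculation is a simple ungapped, position-by-position comparison of equal-length strings; a real comparison would typically rely on a sequence alignment program.

```python
def is_terminal_truncation(full: str, fragment: str) -> bool:
    """True if `fragment` is a contiguous stretch of `full`, i.e. it can be
    obtained by deleting nucleotides only from the 5' and/or 3' end."""
    return 0 < len(fragment) <= len(full) and fragment.upper() in full.upper()


def length_fraction(full: str, fragment: str) -> float:
    """Length of the fragment as a fraction of the full-length sequence."""
    return len(fragment) / len(full)


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Toy 100-nt "full-length" sequence (not a sequence from the disclosure).
full_length = "ACGT" * 25
# Delete 3 nt from the 5' end and 2 nt from the 3' end, leaving 95 nt.
fragment = full_length[3:98]

print(is_terminal_truncation(full_length, fragment))
print(length_fraction(full_length, fragment))
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Here the toy fragment retains 95 of 100 nucleotides, so it would fall within the "at least 95% of the length" tier, and the two 10-nt strings differ at one position.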
[0156] In some embodiments, an RZDP promoter comprises SEQ ID NO:30, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:30 of at least 100, 200, 300, 400, or 500 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:30 of at least 600, 700, or 800 nucleotides in length. In some embodiments, an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:30, or to a segment of SEQ ID NO:30 of at least 100, 200, 300, 400, 500, 600, 700, or 800 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences. In some embodiments, an RZD promoter is a functional fragment of SEQ ID NO:30 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:30 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0157] In some embodiments, an RZDP promoter comprises SEQ ID NO:34 or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:34 of at least 100, 200 or 300 nucleotides in length. In some embodiments, an RZDP promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:34, or to a segment of SEQ ID NO:34 of at least 100, 200, or 300 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences. 
In some embodiments, an RZD promoter is a functional fragment of SEQ ID NO:34 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:34 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0158] In some embodiments, an RZDP promoter comprises SEQ ID NO:38, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:38 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the fragment comprises a region of SEQ ID NO:38 of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, or 1800 nucleotides in length. In some embodiments, an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:38, or to a segment of SEQ ID NO:38 of at least 100, 200,
300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides in length; or to a segment of at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, or 1800 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences. In some embodiments, an RZD promoter is a functional fragment of SEQ ID NO:38 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:38 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0159] In some embodiments, an RZDP promoter comprises SEQ ID NO:43, or a functional fragment thereof. In some embodiments, the fragment comprises a region of SEQ ID NO:43 of at least 1100 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:43 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or 1100 nucleotides in length. In some embodiments, an RZD promoter comprises a region having at least 90% identity or at least 95% identity to SEQ ID NO:43, or to a segment of SEQ ID NO:43 of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or 1100 nucleotides in length. In some embodiments, the RZDP promoter does not comprise 5’ untranslated region sequences. In some embodiments, an RZD promoter is a functional fragment of SEQ ID NO:43 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:43 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
III. SCE1 Promoters
[0160] In some embodiments, a zygote-preferred promoter of the present disclosure, e.g., employed in a HI-Edit procedure, comprises an SCE1 promoter exhibiting zygote editing of about 10% or higher. In some embodiments, an SCE1 promoter is from an endogenous maize gene, or from an orthologous promoter. In some embodiments, an SCE1 promoter is from an endogenous rice, maize, Arabidopsis, soybean, tomato or sunflower gene. For example, in some embodiments, an SCE1 promoter is from a Zm00001d002570 gene.
[0161] In some embodiments, an SCE1 promoter comprises a sequence of any one of SEQ ID NOS:83-85 or a functional fragment thereof. In some embodiments, an SCE1 promoter
comprises a functional fragment of any one of SEQ ID NOS:83-85 of at least 300, 400, or 500 nucleotides in length. In some embodiments, an SCE1 promoter comprises a functional fragment of any one of SEQ ID NOS:83-85 of at least 600, 700, 800, 900, 1000, 1100, 1200, or 1300 nucleotides in length. In some embodiments, an SCE1 promoter comprises a functional fragment of SEQ ID NO:84 or 85 of 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length. In some embodiments, an SCE1 promoter comprises a functional fragment of SEQ ID NO:84 or 85 of 2000, 2100, 2200, 2300, or 2400 nucleotides in length; or in some embodiments, a functional fragment of at least 2500, 2600, 2700, 2800, 2900, or 3000 nucleotides in length. In some embodiments, an SCE1 promoter comprises a sequence having at least 90%, 91%, 92%, 93%, or 94% identity to any one of SEQ ID NOS:83-85. In some embodiments, an SCE1 promoter comprises a sequence having at least 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOS:83-85. In some embodiments, the SCE1 promoter comprises SEQ ID NO:83, 84, or 85.
[0162] In some embodiments, a functional fragment of an SCE1 promoter lacks 5’-untranslated region sequences of the mRNA. The 5’ untranslated region can be readily confirmed/determined using known techniques, such as 5’ RACE analysis. Additional functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:83-85, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 10% zygote-editing activity.
[0163] In some embodiments, an SCE1 promoter comprises SEQ ID NO:83, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:83 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:83 of at least 1100 nucleotides in length or at least 1200 or 1300 nucleotides in length. 
In some embodiments, an SCE1 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:83, or to a segment of SEQ ID NO:83 of at least 500, 600, 700, 800, 900, or 1000 nucleotides in length; or comprises at least 90% identity, or at least 95% identity, to a segment of SEQ ID NO:83 of at least 1100, 1200, or 1300 nucleotides in length. In some embodiments, an SCE1 promoter is a functional fragment of SEQ ID NO:83 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:83 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the
functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
[0164] In some embodiments, an SCE1 promoter comprises SEQ ID NO:84 or SEQ ID NO:85, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 500, 750, or 1000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 1100, 1200, 1300, 1400, or 1500 nucleotides in length; or a region of at least 1600, 1700, 1800, 1900, or 2000 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:84 or SEQ ID NO:85 of at least 2100, 2200, 2300, 2400, or 2500 nucleotides in length; or a region of at least 2600, 2700, 2800, 2900, 3000, or 3100 nucleotides in length. In some embodiments, an SCE1 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:84 or 85, or to a segment of SEQ ID NO:84 or 85 comprising the functional fragment. In some embodiments, an SCE1 promoter is a functional fragment of SEQ ID NO:84 or 85 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:84 or 85 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence.
IV. Expression construct comprising promoters
[0165] In a further aspect, the present disclosure provides expression constructs and transgenic plant cells that comprise a zygote-preferred promoter, e.g., a VSP promoter or RZD promoter as described herein, to drive expression of a nucleic acid sequence of interest in a plant. 
In some embodiments, the nucleic acid sequence of interest is a nucleic acid modification enzyme, e.g., a nuclease. [0166] In some embodiments, an expression construct can be a single vector that encodes two or more expression products of interest. In some embodiments, expression of one product of interest is driven by a zygote-preferred promoter as described herein, and expression of a second product of interest is driven by a different promoter. In some embodiments, the expression system comprises a binary vector encoding two expression products of interest, e.g., in which expression of a desired gene product, e.g., a nucleic acid modification enzyme, such as a DNA editing nuclease, is driven by a zygote-preferred
promoter and expression of an RNA product of interest, e.g., a guide RNA, is driven by an RNA Polymerase III promoter.
[0167] In certain embodiments, an expression construct comprises a zygote-preferred promoter as described herein that drives expression of a zinc-finger nuclease (ZFN). ZFNs are a fusion between the cleavage domain of FokI and a DNA recognition domain containing 3 or more zinc finger motifs. Examples of ZFNs include, but are not limited to, those described in Urnov et al., Nature Reviews Genetics, 2010, 11:636-646; Gaj et al., Nat Methods, 2012, 9(8):805-7; U.S. Patent Nos.6,534,261; 6,607,882; 6,746,838; 6,794,136; 6,824,978; 6,866,997; 6,933,113; 6,979,539; 7,013,219; 7,030,215; 7,220,719; 7,241,573; 7,241,574; 7,585,849; 7,595,376; 6,903,185; 6,479,626; and U.S. Application Publication Nos.2003/0232410 and 2009/0203140.
[0168] In some embodiments, a zygote-preferred promoter may be used to drive expression of a TAL-effector nuclease (TALEN). TALENs are engineered transcription activator-like effector nucleases that contain a central domain of DNA-binding tandem repeats, a nuclear localization signal, and a C-terminal transcriptional activation domain. TALENs can be produced by fusing a TAL effector DNA binding domain to a DNA cleavage domain. For instance, a TALE protein may be fused to a nuclease such as a wild-type or mutated FokI endonuclease or the catalytic domain of FokI. Detailed descriptions of TALENs and their uses for gene editing are found, e.g., in U.S. Patent Nos.8,440,431; 8,440,432; 8,450,471; 8,586,363; and 8,697,853; Scharenberg et al., Curr Gene Ther, 2013, 13(4):291-303; Gaj et al., Nat Methods, 2012, 9(8):805-7; Beurdeley et al., Nat Commun, 2013, 4:1762; and Joung and Sander, Nat Rev Mol Cell Biol, 2013, 14(1):49-55.
[0169] In some embodiments, a zygote-preferred promoter may be used to drive expression of a meganuclease. 
Meganucleases can be modular DNA-binding nucleases such as any fusion protein comprising at least one catalytic domain of an endonuclease and at least one DNA binding domain or protein specifying a nucleic acid target sequence. The DNA-binding domain can contain at least one motif that recognizes single- or double-stranded DNA. The meganuclease can be monomeric or dimeric. Detailed descriptions of useful meganucleases and their application in gene editing are found, e.g., in Silva et al., Curr Gene Ther, 2011, 11(1):11-27; Zaslavoskiy et al., BMC Bioinformatics, 2014, 15:191; Takeuchi et al., Proc Natl Acad Sci USA, 2014, 111(11):4061-4066, and U.S. Patent Nos.7,842,489; 7,897,372; 8,021,867; 8,163,514; 8,133,697; 8,021,867; 8,119,361; 8,119,381; 8,124,36; and 8,129,134.
[0170] In some embodiments, an expression construct may comprise a zygote-preferred promoter as described herein that drives expression of a CRISPR nuclease used in a CRISPR editing system. In some embodiments, a Cas protein expressed under the control of the zygote-preferred promoter is Cas9, Cas12a (formerly referred to as Cpf1), Cas12b (formerly referred to as C2c1), Cas13a (formerly referred to as C2c2), C2c3, Cas13b, or a Cas protein or orthologous protein from a prokaryotic organism. In certain embodiments, the Cas protein is a (modified) Cas9, such as a (modified) Staphylococcus aureus Cas9 (SaCas9) or (modified) Streptococcus pyogenes Cas9 (SpCas9). In certain embodiments, the Cas protein is Cas12a, optionally from Acidaminococcus sp., such as Acidaminococcus sp. BV3L6 Cpf1 (AsCas12a) or Lachnospiraceae bacterium Cas12a, such as Lachnospiraceae bacterium MA2020 or Lachnospiraceae bacterium MD2006 (LBCas12a). See U.S. Pat. No. 10,669,540, incorporated herein by reference in its entirety. Alternatively, the Cas12a protein may be from Moraxella bovoculi AAX08_00205 [Mb2Cas12a] or Moraxella bovoculi AAX11_00205 [Mb3Cas12a]. See WO 2017/189308, incorporated herein by reference in its entirety. In certain embodiments, the Cas protein is (modified) C2c2, such as Leptotrichia wadei C2c2 (LwC2c2) or Listeria newyorkensis FSL M6-0635 C2c2 (LbFSLC2c2). In certain embodiments, the (modified) Cas protein is C2c1. In certain embodiments, the (modified) Cas protein is C2c3. In certain embodiments, the (modified) Cas protein is Cas13b. Other Cas enzymes are available to a person skilled in the art. 
[0171] In some embodiments, a CRISPR nuclease further comprises a fusion domain, such as a deaminase, a uracil DNA glycosylase, a reverse transcriptase, or an exonuclease. Examples of a modified CRISPR nuclease include chimeric Cas proteins such as dCas9-FokI, dCpf1-FokI, chimeric Cas9-cytidine deaminase, chimeric Cas9-adenine deaminase, a nickase Cas9 (nCas9), chimeric dCas9 non-FokI nuclease and dCpf1 non-FokI nuclease.
[0172] In some embodiments, an expression construct, such as a binary vector, comprises a first polynucleotide comprising a zygote-preferred promoter operably linked to a sequence encoding a DNA modification enzyme and a second polynucleotide comprising an RNA Polymerase III promoter operably linked to a nucleic acid sequence encoding at least one guide RNA. In some embodiments, the zygote-preferred promoter is a VSP or RZD promoter as described herein, e.g., a VSP promoter of any one of SEQ ID NOS:1-27, 86 or 87 or a functional fragment or variant thereof; or an RZD promoter of any one of SEQ ID NOS:28-43, or a functional fragment or variant thereof. In some embodiments, the zygote-preferred promoter is an SCE1 promoter as described herein, e.g., an SCE1 promoter of any one of
SEQ ID NOS:83-85, or a functional fragment or variant thereof. In some embodiments, the second polynucleotide comprises a U3 promoter operably linked to a sequence encoding one or more guide RNAs.
A. U3 promoters
[0173] In some embodiments, the U3 promoter is from rice, wheat, maize, or Arabidopsis. In some embodiments, the U3 promoter that drives expression of one or more guide RNAs comprises a sequence having at least 70%, or at least 75%, 80%, or 85% identity to a U3 promoter sequence of any one of SEQ ID NOS:44-49. In some embodiments, the U3 promoter sequence has at least 90% or at least 95% identity to any one of SEQ ID NOS:44-49. In some embodiments, the U3 promoter sequence has at least 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOS:44-49, or comprises any one of SEQ ID NOS:44-49.
[0174] Functional fragments, e.g., deletions or variants of any one of SEQ ID NOS:44-49, can be determined by assessing mutagenized and/or truncated regions of the promoter to determine those fragments that exhibit at least about 50% or at least about 70% of promoter activity compared to the full-length promoter sequence.
[0175] In some embodiments, a U3 promoter comprises SEQ ID NO:44, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:44 of at least 100, 150, 200, 250, or 300 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:44, or to a segment of SEQ ID NO:44 of at least 100, 150, 200, 250, or 300 nucleotides in length. In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:44 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:44 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. 
In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0176] In some embodiments, a U3 promoter comprises SEQ ID NO:45, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:45 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:45, or to a segment of SEQ ID NO:45 of at least 100, 150, 200, 250,
300, or 350 nucleotides in length. In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:45 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:45 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0177] In some embodiments, a U3 promoter comprises SEQ ID NO:46, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:46 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:46, or to a segment of SEQ ID NO:46 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length. In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:46 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:46 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0178] In some embodiments, a U3 promoter comprises SEQ ID NO:47, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:47 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:47, or to a segment of SEQ ID NO:47 of at least 100, 150, 200, 250, 300, or 350 nucleotides in length. 
In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:47 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:47 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0179] In some embodiments, a U3 promoter comprises SEQ ID NO:48, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:48 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:48 of at least 500,
550, 600, 650, 700, 750, 800, or 850 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:48, or to a segment of SEQ ID NO:48 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:48, or to a segment of SEQ ID NO:48 of at least 500, 550, 600, 650, 700, 750, 800, or 850 nucleotides in length. In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:48 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:48 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. [0180] In some embodiments, a U3 promoter comprises SEQ ID NO:49, or a functional fragment thereof. In some embodiments, the functional fragment comprises a region of SEQ ID NO:49 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, the functional fragment comprises a region of SEQ ID NO:49 of at least 500, 550, 600, 650, 700, or 750 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:49, or to a segment of SEQ ID NO:49 of at least 200, 250, 300, 350, 400, or 450 nucleotides in length. In some embodiments, a U3 promoter comprises a region having at least 90% identity, or at least 95% identity, to SEQ ID NO:49, or to a segment of SEQ ID NO:49 of at least 500, 550, 600, 650, 700, or 750 nucleotides in length. 
In some embodiments, a U3 promoter is a functional fragment of SEQ ID NO:49 in which one or more nucleotides is deleted from the 5’ and/or 3’ end of the full-length SEQ ID NO:49 sequence, where the functional fragment is at least 90%, at least 91%, at least 92%, at least 93%, or at least 94% of the length of the full-length sequence. In some embodiments, the functional fragment is at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of the length of the full-length sequence. B. Additional Expression Construct Regulatory Elements [0181] An expression construct as described herein may comprise further regulatory elements. A “regulatory element” as used herein includes any sequence that influences construction or function of the expression construct, e.g., influences transcription and/or
translation in a cell in which the expression construct is expressed. Such sequences include transcriptional or translational enhancers, and additional promoters or promoter elements that drive expression of other gene products encoded by the construct, such as genes encoding selectable markers. Additional regulatory elements include introns and transcriptional terminators. Such regulatory elements may be endogenous or heterologous to the host cell or to each other. [0182] A variety of transcriptional terminators are available for use in expression constructs. These are responsible for the termination of transcription beyond the transgene and correct mRNA polyadenylation. The termination region may be native with the transcriptional initiation region, may be native with the operably linked DNA sequence of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous to the promoter, the DNA sequence of interest, the plant host, or any combination thereof). Appropriate transcriptional terminators are those that are known to function in plants and include the CAMV pSOY1 terminator, the tml terminator, the nopaline synthase terminator and the pea rbcs E9 terminator. These can be used in both monocotyledons and dicotyledons. In addition, a gene's native transcription terminator may be used. Termination regions used in the expression cassettes can be obtained from, e.g., the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262: 141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5: 141-149; Mogen et al. (1990) Plant Cell 2: 1261-1272; Munroe et al. (1990) Gene 91: 151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. 15:9627-9639. [0183] Illustrative expression constructs are provided in Table 3. 
Additional vector configurations and regulatory elements are provided in constructs 23396, 23397, 23399, 24520, 26258, 26296, 27145, 27146, 27226, 27234, 27241, 27680, 28255, 28291, 28292, 28293, 28510, 28520, 28560, 28825, and 28834, which are described in Table 3. In some embodiments, an expression construct employed in a HI-Edit procedure comprises a binary vector having the promoter driving CRISPR/Cas, CRISPR/Cas, terminator of the CRISPR/Cas cassette; promoter driving gRNA expression, gRNA target gene and gRNA sequence of a construct shown in Table 3. In some embodiments, the construct comprises the vector components of construct 27145 or 27146 as set forth in Table 3. In some embodiments, an expression construct employed in a HI-Edit procedure comprises SEQ ID NO:68 or SEQ ID NO:69 or a variant thereof having at least 75%, at least 80%, or at least 85% identity to
SEQ ID NO:68 or SEQ ID NO:69. In some embodiments, the expression construct sequence has at least 90% or at least 95% identity to SEQ ID NO:68 or SEQ ID NO:69. V. Gene Editing Procedures [0184] In a further aspect, the disclosure provides a method of performing gene editing to obtain a progeny plant having a desired genotype. In some embodiments, the method is HI-Edit. As indicated above, HI-Edit methodology is detailed in WO2018/102816. In brief, in the present disclosure, pollen from a first plant that is transformed with any synthetic expression construct of the present disclosure to express a DNA modification enzyme, e.g., a gene editing nuclease under the control of a zygote-preferred promoter, is used to pollinate a second plant. In some embodiments, the promoter is a VSP promoter comprising the sequence of any one of SEQ ID NOS:1-27, 86, or 87, or a functional fragment thereof. In some embodiments, the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to any one of SEQ ID NOS:1-27, 86, or 87. In some embodiments, the promoter comprises a variant of a functional fragment of at least 100, 200, 300, or 500 nucleotides in length; or at least 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length that has at least 70%, 75%, 80%, or 85% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86, or 87, or that has at least 90%, or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:1-27, 86, or 87. In some embodiments, the VSP promoter comprises SEQ ID NO:1 or is an orthologous promoter to SEQ ID NO:1. In some embodiments, the VSP promoter comprises SEQ ID NO:27 or a functional fragment thereof. In some embodiments, the VSP promoter comprises SEQ ID NO:86 or 87 or a functional fragment thereof, e.g., of at least 2100, 2200, 2300, 2400, or 2500 nucleotides in length. 
In some embodiments, the promoter is an RZD promoter comprising the sequence of any one of SEQ ID NOS:28-43, or a functional fragment thereof. In some embodiments, the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to any one of SEQ ID NOS:28-43. In some embodiments, an RZD promoter comprises a variant of a functional fragment of at least 100, 200, 300, or 500 nucleotides in length; or at least 600, 700, 800, 900, or 1000 nucleotides in length, or at least 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 nucleotides in length that has at least 70%, 75%, 80%, or 85% identity to the corresponding segment of any one of SEQ ID NOS:28-43, or that has at least 90% or at least 95% identity to the corresponding segment of any one of SEQ ID NOS:28-43. In some embodiments, the RZD promoter comprises SEQ ID NO:28 or a functional fragment thereof,
or is an orthologous promoter to SEQ ID NO:28. In typical embodiments, the first plant is also genetically modified to express one or more gRNA sequences for the target gene to be edited in the second plant. In some embodiments, the promoter is an SCE1 promoter comprising the sequence of SEQ ID NO: 83, 84, or 85, or a functional fragment thereof. In some embodiments, the promoter has at least 70%, 75%, 80%, or 85% identity, or at least 90%, or at least 95% identity to SEQ ID NO: 83, 84, or 85 or to a functional fragment thereof, for example, a fragment of SEQ ID NO:83 of at least 1100, 1200 or 1300 nucleotides in length; or a fragment of SEQ ID NO:84 of at least 2700, 2800, 2900, 3000, or 3100 nucleotides in length; or a fragment of SEQ ID NO:85 of at least 2800, 2900, 3000, 3100, 3200, or 3300 nucleotides in length. [0185] Following pollination of the second plant, selection can be performed for progeny that possess the desired edit using well known technology to detect the edit. In HI-Edit methods, the second plant can contain the genomic DNA to be edited. In some embodiments, the edited progeny plant is a haploid progeny plant. The haploid progeny comprises the genome of the second plant, but not the first plant. [0186] In some embodiments, the first plant is a haploid inducer line. The haploid inducer line can be a paternal haploid inducer, e.g., having a tailswap mutation in a CENH3 gene (see, e.g., Ravi and Chan, Nature 464:615-618, 2010) or another CENH3 mutation (see, e.g., Maheshwari et al, Genome Research 27(3), 471-478, 2017). In other embodiments, the haploid inducer line can be a maternal haploid inducer line, e.g., a line that comprises a mutation in a gene that encodes Patatin-like phospholipase A2α (also referred to herein as a MATRILINEAL (MATL) gene). In some embodiments, the MATL gene comprises a loss of function mutation. 
In some embodiments, the haploid inducer line may be an ig-type haploid inducer, which results from a mutation in the INDETERMINATE GAMETOPHYTE1 gene. [0187] Any monocot or dicot plant can be modified to express a nucleic acid of interest using a zygote-preferred promoter of the present invention. In some embodiments, the plant is maize, wheat, rice, barley, oat, turf grass, Brassica, tomato, pepper, lettuce, eggplant, soybean, sunflower, sugar beet, cotton, alfalfa, or tobacco, among many others.
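The identity and length thresholds recited for promoter fragments in the preceding sections can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned and of equal length (no gaps), which is a simplification; the disclosure's actual alignment method is not restated here. The function names and the 100-nucleotide placeholder sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def length_fraction(fragment: str, full_length: str) -> float:
    """Fragment length as a percentage of the full-length sequence."""
    return 100.0 * len(fragment) / len(full_length)


def is_terminal_truncation(fragment: str, full_length: str) -> bool:
    """True if the fragment is a contiguous stretch of the full-length sequence,
    i.e. it could arise by deleting nucleotides from the 5' and/or 3' end only."""
    return fragment.upper() in full_length.upper()


# Hypothetical 100-nt placeholder standing in for a full-length promoter.
full = "ACGT" * 25

# Delete 4 nt from the 5' end and 2 nt from the 3' end.
fragment = full[4:98]

# Substitute 5 positions (each original base there is A or G, so T always differs).
variant_chars = list(full)
for pos in (0, 10, 20, 30, 40):
    variant_chars[pos] = "T"
variant = "".join(variant_chars)

print(length_fraction(fragment, full))         # share of full length retained
print(is_terminal_truncation(fragment, full))  # end deletions only?
print(percent_identity(variant, full))         # ungapped identity
```

A real comparison would normally use a gapped alignment tool before computing identity; the ungapped version is used here only to keep the thresholds easy to follow.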
EXAMPLES I. Promoter Mining and Selection [0188] To identify promoters with sperm- and/or zygote-specific expression patterns, promoters were searched and selected from maize and rice transcriptome databases for use in this study. The putatively sperm-specific ZmDUO1A promoter was selected, along with two constitutively active promoters, prSoUBI4 (sugar cane Polyubiquitin4) and prOsACT1 (rice Actin1, i.e., LOC_Os03g50885). The prSoUBI4 promoter was used as a positive control, as it induced at least 3% haploid editing (“HI-Edit efficiency”) when operably linked to a CRISPR-Cas9 enzyme in five of six maize lines tested [Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292; see also U.S. Patent No. 10,285,348] with a range of guide RNA (gRNA) targets driven by the rice U3 promoter, prOsU3. In that publication, the prSoUBI4:cCas9 and prOsU3:gRNA combination also induced editing of haploid wheat embryos produced via wide crosses with maize pollen, indicating strong expression of the CRISPR-Cas system components (gRNA and protein) in the sperm cell and/or zygote prior to, during, and/or after fertilization. [0189] While the performance of prZmDUO1A and prOsAct1 has not yet been tested in HI-Edit, AtDUO1A is specifically expressed in sperm cells, and activates sperm-expressed genes in Arabidopsis [Borg, M et al. (2011) The R2R3 MYB Transcription Factor DUO1 Activates a Male Germline-Specific Regulon Essential for Sperm Cell Differentiation in Arabidopsis. Plant Cell 23(2): 534–549], and OsAct1 expression in 2.5-hour-old rice zygote cells is in the 98th percentile of the studied genes [Anderson SN et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358]. 
Native maize expression of prSoUBI4 is not available, since it is from sugar cane, but the homolog ZmUbi1 (Zm00001d015327) is in the 99th percentile of expressed genes in maize sperm and 12-hour zygotes [Chen J et al. (2017) Zygotic Genome Activation Occurs Shortly after Fertilization in Maize. Plant Cell 29(9):2106-2125]. [0190] In addition to these well-characterized promoters, we identified three new promoters that haven’t previously been characterized in any transgenic context and proceeded to test whether they could drive a higher HI-Edit efficiency than prSoUBI4. First, ZmVSP (Vacuolar Sorting Protein; Zm00001d011353/Zm00001eb358220) was identified as the
maize ortholog of a rice gene (LOC_Os09g09480/Os09g0267600) that is biparentally expressed in zygotes [Anderson SN et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358]. Expression data from that study shows that the OsVsp transcript is highly expressed in rice zygote cells 2.5 hours after pollination, and that its transcripts are produced from both sperm- and egg cell-derived chromosomes. This is notable because at 2.5 hours after pollination, most gene transcripts are maternally-inherited. Maize RNA-seq data [Chen J et al. (2017) Zygotic Genome Activation Occurs Shortly after Fertilization in Maize. Plant Cell 29(9):2106-2125] showed that the ZmVsp transcript accumulated to a high level of expression in maize egg cells and extremely high expression in maize sperm cells and zygotes; Vsp expression was in the 75th, 99th, and 97th percentile of all expressed genes in maize eggs, sperm cells, and zygotes, respectively. Thus, the prVSP expression profile in rice and maize fit the desired pattern for a male HI-Edit promoter, namely, high expression in male gametes and early zygotes from paternally-inherited chromosomes. Table 1 provides illustrative gene names for VSP homologs. The promoters and terminators from these genes may also act as efficient HI-Edit regulatory elements for driving the high expression of the CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types. [0191] In addition to VSP, the maize Ring Zinc-Finger Domain Protein gene (“ZmRZDP,” Zm00001d050090) transcript is in the 93rd and 98th percentile of expressed genes in maize sperm and 12-hour zygotes, respectively [Chen J et al. (2017) Zygotic Genome Activation Occurs Shortly after Fertilization in Maize. Plant Cell 29(9):2106-2125]. ZmRZDP expression was not detected in V1 and V3 shoot apical meristems, according to a public maize gene atlas study [Hoopes GM et al. 
(2019) An updated gene atlas for maize reveals organ-specific and stress-induced genes. Plant J 97(6): 1154-1167]. Thus, the prZmRZDP expression profile is a good fit for the desired expression pattern for a male HI-Edit promoter. However, the zygote expression may not be from paternal chromosomes, as the OsRZDP ortholog is expressed as a maternal transcript in [Anderson SN et al. (2017) The Zygotic Transition Is Initiated in Unicellular Plant Zygotes with Asymmetric Activation of Parental Genomes. Dev Cell 43(3):349-358]. This has not been tested in maize. Illustrative orthologues of ZmRZDP are shown in Table 2. The promoters and terminators from these genes may also act as efficient HI-Edit regulatory elements for driving the high expression of the CRISPR enzymes and/or guide RNAs in the sperm and zygote cell types.
[0192] Further, SCE1, characterized in maize as SUMO-conjugating enzyme 1 (“ZmSCE1,” Zm00001d002570), was found by mining mRNA-Seq datasets for genes that are medium/highly expressed in sperm cells and early zygotes (preferably from the paternal allele, which was previously unknown for this gene). Mined data show that the SCE1 transcript is in the 99th percentile of expressed genes in both maize sperm and zygotes. We tested two versions of the promoter. The short version (SEQ ID NO: 83) did not express in pollen. The long version (SEQ ID NO: 85) does express in pollen and, curiously, has a higher zygote editing rate (ZER) when expressed from the female side as compared to the male side. II. Vector Construction and Transformation [0193] Binary vectors were constructed with the coding sequence for CRISPR/Cas enzymes driven by the constitutive, sperm-specific, or zygote-specific promoters. The gRNA cassettes were driven by rice U6 or U3 promoters. The phosphomannose isomerase (PMI) cassette [Negrotto, D et al. (2000) The use of phosphomannose-isomerase as a selectable marker to recover transgenic maize plants (Zea mays L.) via Agrobacterium transformation. Plant Cell Rep. 19, 798–803] was included. In most vectors, the CRISPR enzyme used was LbCas12a optimized for maize (vectors 26296, 27145, 27146 and 27234). All Cas12a vectors used an optimized long linker (6x GGGGS) between the SV40 NLS and LbCas12a sequence at the N-terminus and another linker with two SV40 NLS sequences (GSPKK KRKVS GGSSG GSPKK KRKV) at the C-terminus. Vectors 26296, 27145, 27146 and 27234 also contained a DNA donor to GLOSSY2 (ZmGL2) for inducing homologous recombination with homology arms of 400 bp on either side of the target site. The donor is flanked by gRNA cutting sites so it can be liberated from the T-DNA. 
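As a quick check on the linker description in paragraph [0193], the following Python sketch assembles the two peptide linkers and counts SV40 NLS motifs in the C-terminal one. It assumes that "6x GGGGS" means six tandem GGGGS repeats and that the SV40 NLS motif is the canonical PKKKRKV heptapeptide; the variable and function names are illustrative only.

```python
# Canonical SV40 large T-antigen NLS (assumed motif for this illustration).
SV40_NLS = "PKKKRKV"

# "6x GGGGS" read as six tandem repeats (an assumption about the notation).
N_TERM_LINKER = "GGGGS" * 6

# C-terminal linker as printed in the text, with the grouping spaces removed.
C_TERM_LINKER = "GSPKK KRKVS GGSSG GSPKK KRKV".replace(" ", "")


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlaps."""
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence[i:i + len(motif)] == motif
    )


print(len(N_TERM_LINKER))                    # residues in the N-terminal linker
print(len(C_TERM_LINKER))                    # residues in the C-terminal linker
print(count_motif(C_TERM_LINKER, SV40_NLS))  # NLS copies in the C-terminal linker
```

Under these assumptions the C-terminal linker contains two copies of the motif, which agrees with the description of "two SV40 NLS sequences".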
[0194] The Cas9 gRNA targets were ZmVLHP1/2-01 (5’- GCAGGAGGCGTCGAGCAGCG-3’) (in vector 23396), ZmVLHP1/2-02 (5’- GCTGGAGCTGAGCTTCCGGG-3’) (in vector 23397), ZmGW2-1/2 (5’- AAGCTCGCGCCCTGCTACCC-3’) (in vector 23399), and ZmSBEIIb (5’- ATTGATAGAGCACATGAGCT -3’) (in vector 24520). The Cas12a gRNA targets were ZmGL2 (5’-GTCACAGATCACAAACTTCAAATG-3’) (in vectors 26296, 27226, 27145, 27146, and 27234), ZmWaxy1 (5’- GGGAAAGACCGAGGAGAAGATCT-3’) (in vectors 26258, 27226, 27145, 27146, 27234, and 27241), ZmO2-01 (targeting the OPAQUE2 gene) (5’-CTGTATCTCGAGCGTCTGGCTGA-3’) (in vector 26258), ZmYellowEndosperm1 (5’- CTATCTTATCCTAAAGATGGTGG-3’) (in vector 26258), ZmUBL (Ubiquitin ligase) (5’-
GGAAGGAAAAGGTATCTGAAGG-3’) (vectors 26258 and 27241), and ZmUPL3 (Ubiquitin ligase) (5’-GGAGGGAAAAGGTGTCTGAGGC-3’) (in vectors 26258 and 27241), ZmO2-02 (5’- GGGCGCCTGAGCAACAAGAGTTC -3’) (in vector 27241), ZmO2- 03 (5’- CTCACTCTTTCCTCGGTAG-3’) (in vector 27241) (Table 3).
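For bookkeeping, the Cas9 guide targets listed in paragraph [0194] can be held in a small lookup table and sanity-checked. The sketch below is illustrative only: the dictionary layout and helper names are hypothetical, the sequences and vector numbers are copied from the text (with stray spaces removed), and the length and GC-content checks are generic quality checks rather than criteria stated in the disclosure.

```python
# Cas9 gRNA targets from paragraph [0194]: name -> (protospacer 5'->3', vector).
CAS9_TARGETS = {
    "ZmVLHP1/2-01": ("GCAGGAGGCGTCGAGCAGCG", "23396"),
    "ZmVLHP1/2-02": ("GCTGGAGCTGAGCTTCCGGG", "23397"),
    "ZmGW2-1/2": ("AAGCTCGCGCCCTGCTACCC", "23399"),
    "ZmSBEIIb": ("ATTGATAGAGCACATGAGCT", "24520"),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_valid_dna(seq: str) -> bool:
    """True if the sequence contains only A, C, G and T."""
    return set(seq.upper()) <= set("ACGT")


for name, (protospacer, vector) in CAS9_TARGETS.items():
    print(
        f"{name}\tvector {vector}\t{len(protospacer)} nt\t"
        f"GC {gc_fraction(protospacer):.2f}\tvalid {is_valid_dna(protospacer)}"
    )
```

The Cas12a targets in the same paragraph could be added to the table in the same way; they are omitted here only to keep the example short.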
[0195] To generate transgenic events, the inbred line NP2222 and a novel inbred haploid inducer derived from the material 20BD917233 were grown in a glasshouse under a 16:8 photoperiod (light:dark) at 26°C/16°C (day/night). The material 20BD917233 was an F7 stage inbred line derived from the biparental breeding population SYN-INBB23 x RWKS/Z21S//RWKS. Ears were harvested 10 or 11 days after pollination, dehusked and desilked, and sterilized with sodium hypochlorite solution. Immature embryos were isolated. Vectors from Table 3 were transformed as described in a published protocol [Zhong, H et al. (2018) Advances in Agrobacterium-mediated Maize Transformation. Methods Mol Biol 1676:41-59]. Transgenic E0 events were characterized by Taqman assay for zygosity and editing check. Single and two-copy events (Table 4) were selected and sent to the greenhouse for further growth. Table 4. E0 events sent to GH for out crossing (n.t. = Not Tested)
III. Zygote editing and HI-Edit of the prSoUbi4 and prOsU6 combination shows correspondence between the zygote editing test and the HI-Edit rate [0196] Transformation events were acclimated in a growth chamber for 1-2 weeks and seedlings were then transplanted into pots in the glasshouse, which contained high pressure sodium (HPS) lights as a supplemental lighting source for plant growth. Growth conditions were as described in Table 5. The E0 plant tassels and ears were bagged before pollen shed and silk emergence. NP2222 plants and other inbred tester lines were also grown under the same conditions and ears and tassels were bagged for reciprocal crossing. Table 5. Greenhouse conditions for E0 plant growth.
[0197] A rapid proxy assay was initially used to assess the potential HI-Edit efficiency of the different promoters in this study, prior to selecting promoters and events that would be tested for HI-Edit efficiency. The purpose of the proxy assay was to quickly assess the potential for the promoters to drive good HI-Edit rates without the reduced sample size intrinsic to assessing editing outcomes in haploids, which are always a minority of the progeny resulting from a haploid induction cross. By assessing the relative chimerism of new edits in the diploid F1 offspring after out-crossing of the transgenic E0 parent to a non-transgenic line, the relative efficiency of promoters for early zygote editing could be assessed in the absence of haploid induction. To do this, next generation sequencing (NGS) was used to determine whether new edits (those produced in the F1, hybrid plant) were made in the target gene of the non-transgenic parent. Because the non-transgenic parent does not contain the required CRISPR-Cas editing machinery, edits in the non-transgenic target gene can only have occurred in the F1 offspring. Furthermore, this proxy assay allowed measurement of not only how often this editing occurred, but also the approximate developmental timing of when the editing occurred in the course of the F1 embryo development. Editing that occurs early, for instance in the 1-cell zygote stage, should produce a pure biallelic outcome, where nearly 50% of the reads are a new type of edit that was not found in the transgenic parental
plant and the other 50% of reads match an editing outcome from the E0 transgenic parent generation. Editing that occurs later, in the multicellular early embryo, should not produce such a high read percentage, but rather may show a mixture of editing outcomes at lower read rates (i.e., it would be chimeric or mosaic). [0198] To test the zygotic editing rate, we conducted outcrossing and editing pattern analysis from five T1 events from vector 26258 (see Table 4, except PLANTHIE31) used as either a female (4) or male (1) to the tester SYN-INBG78. F1 ears harvested 17 days after pollination were dehusked, silks were removed, and kernel caps were sliced off. Individual embryos (~3-7 mm) were isolated and put into 96-well blocks for genomic DNA extraction. The T-DNA presence and target site mutations were analyzed by Taqman probes. The editing rates at O2 and YellowEndosperm1 were low in most events, but >50% for Waxy1, UBL, and UPL3, so the edited genotypes for these three targets were determined by next generation sequencing (NGS). The parental edit(s) at the target site were determined and routinely found in the F1 offspring with ~50% read abundance. The criterion for zygote editing was recovery of a new edit having a >30% read abundance in addition to the parental edit. With this criterion the zygote editing rate averaged 17% across the three targets (18/105), with the highest rate for the Waxy1 gRNA (40% zygote editing rate) (Table 6). Table 6. F1 zygote editing rate for the three targets from vector 26258.
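The zygote-editing criterion of paragraph [0198] (a new, non-parental edit with more than 30% read abundance) can be sketched in Python as follows. This is a simplified illustration, not the analysis pipeline actually used: the allele labels, the per-embryo read fractions, and the function names are hypothetical, and only the 30% threshold and the 18/105 tally come from the text.

```python
from typing import Dict, Iterable, Set


def is_zygote_edited(
    allele_fractions: Dict[str, float],
    parental_edits: Set[str],
    wild_type: str = "WT",
    threshold: float = 0.30,
) -> bool:
    """True if any allele that is neither wild type nor a parental edit
    exceeds the read-abundance threshold in this F1 embryo."""
    return any(
        fraction > threshold
        for allele, fraction in allele_fractions.items()
        if allele != wild_type and allele not in parental_edits
    )


def zygote_editing_rate(
    embryos: Iterable[Dict[str, float]], parental_edits: Set[str]
) -> float:
    """Fraction of embryos that meet the zygote-editing criterion."""
    embryos = list(embryos)
    edited = sum(is_zygote_edited(e, parental_edits) for e in embryos)
    return edited / len(embryos)


# Hypothetical NGS read fractions for three F1 embryos at one target site.
parental = {"del7"}  # edit inherited from the transgenic parent
f1_embryos = [
    {"del7": 0.51, "del4": 0.47, "WT": 0.02},  # new edit near 50%: early, biallelic
    {"del7": 0.50, "WT": 0.42, "ins1": 0.08},  # new edit at low abundance: mosaic
    {"del7": 0.49, "WT": 0.51},                # no new edit
]

print(zygote_editing_rate(f1_embryos, parental))
print(f"{100 * 18 / 105:.0f}%")  # tally reported in the text for vector 26258
```

Embryos carrying a new edit below the threshold, like the second one, are treated as late (mosaic) editing and are not counted.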
In contrast, the F1 zygote editing rate of events PLANTHIE23, PLANTHIE24, and PLANTHIE25 (of 23396, 23397, and 23399 respectively) averaged 56% (55/99) (Table 7) using those events as male pollen donors. Table 7. F1 zygote editing rate for the three targets from vectors 23396, 23397, and 23399.
*For line PLANTHIE25, two CRISPR T-DNA insertions were present, possibly increasing the editing rate. [0199] The difference in zygote editing rate could be due to the promoters used, the cutting efficiency of the Cas enzyme or guide RNA, the germplasm, environmental factors, or a combination of these factors. To determine if the zygote editing proxy assay was operating as a solid predictor of HI-Edit efficiency, we endeavored to compare the HI-Edit efficiency of 26258 to the three Cas9 constructs, which utilize prSoUbi4-04 to express Cas9 and prOsU3 to express their guide RNA. These three vectors were originally reported in the HI-Edit publication [Kelliher, T et al. (2019) One-step genome editing of elite crop germplasm during haploid induction. Nature Biotech 37: 287–292] to have an average of ~3-6% HI-Edit efficiency in five out of six maize lines tested. If the F1 zygote editing assay is a good predictor of HI-Edit rates, we would expect the HI-Edit rate of 26258 to be about three-fold lower than that publication, since the F1 zygote editing rate of 26258 (17%) was about three-fold lower than the three vectors from the publication (56%). [0200] To test the HI-Edit efficiency of two events carrying vector 26258, T2 generation plants homozygous for the CRISPR transgenes were outcrossed as males (pollen donors) onto ears of tester inbred lines from the stiff stalk line NP2222 and the non-stiff stalk line SYN-INBG78. Haploids were color sorted, and the haploid induction rate was about 15.5%. The haploids were submitted for molecular analysis and edited haploids are called by the result of the Taqman assay for the editing target site. This assay detects the WT sequence; mutations are not able to amplify or bind the Taqman probe, resulting in a haploid copy call of zero. Thus, the edited haploids are those that have a zero copy call for the target genes. In Table 8, the haploid editing rate for the Waxy1 target averages around 0.8%. 
This HI-Edit rate is about 3-4x lower than the HI-Edit rate of the original publication, which corresponds well to the 3-4x lower F1 zygote editing rate of 26258. Therefore, it appears that the zygote editing rate may be a useful assay for predicting HI-Edit efficiency.
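The proportionality argument in paragraphs [0199] and [0200] can be worked through numerically. The Python sketch below assumes that HI-Edit efficiency scales linearly with the F1 zygote editing rate, which is the working hypothesis being tested rather than an established relationship. All input numbers are taken from the text; the variable names are illustrative.

```python
# F1 zygote editing rates reported in the text.
zer_26258 = 18 / 105          # vector 26258, about 17%
zer_cas9_vectors = 55 / 99    # vectors 23396, 23397 and 23399, about 56%

# Fold difference between the published Cas9 vectors and vector 26258.
fold_lower = zer_cas9_vectors / zer_26258

# Published HI-Edit efficiency range for the Cas9 vectors (percent).
published_hi_edit = (3.0, 6.0)

# Expected HI-Edit range for 26258 if HI-Edit scales with zygote editing rate.
expected_hi_edit = tuple(rate / fold_lower for rate in published_hi_edit)

# Observed average haploid editing rate for the Waxy1 target (percent).
observed_hi_edit = 0.8

print(f"zygote editing fold difference: {fold_lower:.2f}x")
print(
    f"expected HI-Edit for 26258: "
    f"{expected_hi_edit[0]:.2f}% to {expected_hi_edit[1]:.2f}%"
)
print(f"observed HI-Edit for 26258: {observed_hi_edit:.1f}%")
```

The observed value sits close to the lower end of the expected range, so the comparison is sensitive to which published efficiency is taken as the baseline.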
Table 8. HI-Edit rate of the Waxy1 target site from two events (PLANTHIE30 and
[0201] There are many potential variables that could explain the lower HI-Edit rate in 26258: the different promoters used to express the guide RNAs (26258 uses prOsU6, while the original vectors used prOsU3), the different guide RNAs used, the different Cas enzyme (26258 uses Cas12a; the original vectors used Cas9), the different testers and inducer line genetic backgrounds used in the trials, differences in the environmental conditions, or a combination of some or all of these variables. Regardless, it appears that the F1 zygote assay is a good proxy for HI-Edit efficiency based on this comparison. [0202] We suspected that the key factor driving the F1 zygote and HI-Edit rate is the choice of promoter used to express the CRISPR machinery and guide RNAs. To test the guide RNA expression factor, a new vector, 27241, was built which had the same promoter driving the Cas12a enzyme and the same or similar gRNAs compared with vector 26258, but where we replaced the prOsU6 promoter with prOsU3. T2 generation plants homozygous for editing machinery were outcrossed as males (pollen donors) onto female ears of tester inbred lines from different maize genetic backgrounds and the haploid editing rate was determined across the testers. Results did not indicate a significant difference between prOsU3 and prOsU6 in driving zygote editing and HI-Edit. IV. E0 out cross and editing analysis of new sperm, zygote, and constitutive promoters. [0203] T-DNA positive F1s (hybrid embryos with editing machinery) were selected and NGS was conducted to check the editing type as shown in Table 9. Table 9. Analysis of F1 zygote NGS data after E0 out crossing.
[0204] Here we can see the high F1 zygote editing rate of the positive control vector 27680, and the high zygote editing rate for vector 27146, including 90% when used as a male. This vector has prZmVSP (Zm00001d011353) driving Cas12a and prOsU3 driving gRNAs targeting GL2 and Waxy1. Similarly, vector 27145 (prZmRZDP) had high F1 zygote editing as both a male and female, in contrast to 27226 (rice Actin1) and 27234 (prZmDUO1A). This demonstrates the value of the two newly discovered promoters over others that were considered highly constitutive with high zygote expression (rice Actin1) or highly expressed in sperm cells (prZmDUO1A). V. E1 out cross and HI-Edit rate of new sperm, zygote, and constitutive promoters [0205] The E1 seeds of the haploid inducer line transformed with constructs harboring new sperm, zygote and constitutive promoters were sown. The transgene homozygous E1 plants were selected and crossed to WT testers, then the haploids were identified by color marker
observation during early embryo development. Young haploid embryos were then characterized by Taqman and Sanger sequencing. Haploid editing rates are provided in Table 10.
Table 10. Haploid induction rate and haploid editing rate: summarizing the two inducers and the different promoter combinations and gRNAs.
[0206] Detailed results from the testing of prVSP (27146), prRZDP (27145), and prSoUbi4 (27680) including F1 zygote editing rate (examining diploids) and HI-Edit rate (examining haploids) are shown in Table 11. The results in Table 11 are broken down on an event-by-event basis. These data indicate that event-by-event variation is large: one or two events for each vector have much higher HI-Edit rates than the others, indicating that the T-DNA insertion site may play a role in HI-Edit efficiency. Importantly, for each of the following examples, the F1 zygote data was predictive of the HI-Edit efficiency. For example, in construct 27146 (which corresponds to prVSP), event PLANTHIE36- showed a dramatically higher HI-Edit rate in both testers according to both Taqman and NGS data, and the F1 zygote editing rate was also the highest (75% and 43% for tester 8 and 2 respectively) of any event of that construct. In 27145, event PLANTHIE37- showed the highest HI-Edit rate and highest F1 zygote rate in both testers and gRNA targets. And in 27680, event PLANTHIE38-
had the highest HI-Edit and F1-zygote rate. This correlation suggests that the F1 zygote assay is an effective proxy for HI-Edit. The 27146 VSP promoter result also shows the potential for this promoter, given the right T-DNA insertion site to drive exceptionally high HI-Edit rates compared to the prSoUbi4 standard control.
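One way to check the event-level concordance described above is to ask whether the same event ranks first on both measures, and to compute a rank correlation across the events of a construct. The sketch below is illustrative only. It uses Spearman rank correlation, which is not stated in the disclosure, and every function name and event label in it is hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def same_top_event(events):
    """events maps event_id -> (f1_zygote_rate, hi_edit_rate); True if one event tops both."""
    best_f1 = max(events, key=lambda e: events[e][0])
    best_hi = max(events, key=lambda e: events[e][1])
    return best_f1 == best_hi


# Placeholder values for illustration only; these are not data from Table 11.
events = {"event_A": (60.0, 9.0), "event_B": (20.0, 1.0), "event_C": (35.0, 3.0)}
f1_rates = [v[0] for v in events.values()]
hi_rates = [v[1] for v in events.values()]
print(same_top_event(events), spearman(f1_rates, hi_rates))
```

With only a handful of events per construct, a rank-based measure of this kind is descriptive rather than a formal test.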
Table 11a. Haploid induction rates and haploid editing rates of individual events from the materials tested in the new inducer, NP3003RS, along with F1 diploid editing NGS data.
Table 11b. Haploid induction rates and haploid editing rates of individual events from the materials tested in the new inducer, NP3003RS, along with F1 diploid editing NGS data.
VI. Efficient zygote editing via maternal cross
[0207] While zygote-preferred promoters were first used for paternal expression in maize sperm, we also tested whether they can be used for maternal expression in maize egg cells. To do so, T2 seeds of event NP3003RS (homozygous for Cas12a and guide RNA) were sown and then crossed as female with Tester1 pollen. The resulting F1 embryos were isolated and genotyped for zygote editing detection. The table below shows the zygote editing rate (“ZER”) results.
Table 12. Promoter zygote editing rates (“ZER”) in crosses.
“F1#” refers to the number of F1 zygotes obtained from the cross. These results indicate that the promoters prZmRZDP, prZmVSP, and prSCE1 can enable efficient zygote editing from the female side.
VII. Construct Annotations
[0208] Construct features are provided in Tables 13–33. The terms “minimum” and “maximum” in the tables refer to the position of the first and last nucleotide of the insertion, respectively, in the construct.
Table 13. Construct 23396
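The “minimum”/“maximum” convention of the construct annotation tables can be illustrated with a short sketch. It assumes the positions are 1-based and inclusive, which the disclosure does not state explicitly. The function names and the toy sequence are hypothetical.

```python
def feature_length(minimum: int, maximum: int) -> int:
    """Length of a feature whose first and last nucleotides are at minimum and maximum."""
    if minimum < 1 or maximum < minimum:
        raise ValueError("expected 1 <= minimum <= maximum")
    return maximum - minimum + 1


def extract_feature(construct_seq: str, minimum: int, maximum: int) -> str:
    """Return the nucleotides at positions minimum..maximum (assumed 1-based, inclusive)."""
    if not 1 <= minimum <= maximum <= len(construct_seq):
        raise ValueError("coordinates fall outside the construct")
    # Python slices are 0-based and end-exclusive, hence the minimum - 1 offset.
    return construct_seq[minimum - 1:maximum]


# Toy sequence for illustration only; it is not taken from any construct table.
toy_construct = "ACGTACGT"
print(extract_feature(toy_construct, 3, 5), feature_length(3, 5))
```

Under this assumption, a feature annotated with minimum 3 and maximum 5 spans three nucleotides. Under a 0-based or end-exclusive convention, both the slice and the length would shift by one.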
Table 34. Illustrative promoter and terminator sequences. Lower case nucleotide sequence in VSP promoter sequences SEQ ID NOS: 1-27 and RZDP promoter sequences SEQ ID NOS: 28-43: 5’UTR or intron; Upper case nucleotide sequence in VSP promoter sequences
[0209] All patents, patent publications, patent applications, journal articles, books, technical references, and the like discussed in the instant disclosure are incorporated herein by reference in their entirety for all purposes.
[0210] It is to be understood that the descriptions of the disclosure have been simplified to illustrate elements that are relevant for a clear understanding of the disclosure. Omitted details and modifications or alternative embodiments are within the purview of persons of ordinary skill in the art.
[0211] It can be appreciated that, in certain aspects of the disclosure, a single component may be replaced by multiple components, and multiple components may be replaced by a single component, to provide an element or structure or to perform a given function or functions. Except where such substitution would not be operative to practice certain embodiments of the disclosure, such substitution is considered within the scope of the disclosure.
[0212] The examples presented herein are intended to illustrate potential and specific implementations of the disclosure. It can be appreciated that the examples are intended primarily for purposes of illustration of the disclosure for those skilled in the art. There may be variations to these diagrams or the operations described herein without departing from the spirit of the disclosure. For instance, in certain cases, method steps or operations may be performed or executed in differing order, or operations may be added, deleted or modified.
[0213] Where a range of values is provided, it is understood that each intervening value, to the smallest fraction of the unit of the lower limit, unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Any narrower range between any stated values or unstated intervening values in a stated range and any other stated or intervening value in that stated range is encompassed. The upper and lower limits of those smaller ranges may independently be included or excluded in the range, and each range where either, neither, or both limits are included in the smaller ranges is also encompassed within the technology, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included.
[0214] In the foregoing description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the invention described in this disclosure may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention. Embodiments of the disclosure have been described for illustrative and not restrictive purposes. Although the present invention is described primarily with reference to specific embodiments, it is also envisioned that other embodiments will become apparent to those skilled in the art upon reading the present disclosure, and it is intended that such embodiments be contained within the present inventive methods. Accordingly, the present disclosure is not limited to the embodiments described above or depicted in the drawings, and various embodiments and modifications can be made without departing from the scope of the claims below.
Claims
WHAT IS CLAIMED IS:
1. A synthetic DNA construct comprising a zygote-preferred promoter operably linked to a first nucleotide sequence of interest (“NSOI”).
2. The synthetic DNA construct of claim 1, wherein the zygote-preferred promoter is a vacuolar sorting protein (VSP) promoter.
3. The synthetic DNA construct of claim 1, wherein the VSP promoter comprises a sequence: a) selected from the group consisting of SEQ ID NOs: 1–27, 86, 87 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 1.
4. The synthetic DNA construct of claim 1, wherein the VSP promoter comprises SEQ ID NO: 86.
5. The synthetic DNA construct of claim 1, wherein the zygote-preferred promoter is a ring zinc-finger domain protein (RZDP) promoter.
6. The synthetic DNA construct of claim 5, wherein the RZDP promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 28–43 or a functional fragment thereof; or (b) an orthologous promoter to SEQ ID NO: 28.
7. The synthetic DNA construct of claim 6, wherein the RZDP promoter comprises SEQ ID NO: 28.
8. The synthetic DNA construct of claim 1, further comprising a U3 promoter operably linked to a second NSOI.
9. The synthetic DNA construct of claim 8, wherein the U3 promoter comprises a sequence: (a) selected from the group consisting of SEQ ID NOs: 44–49 or a functional fragment thereof; or (b) an orthologous promoter to SEQ ID NO: 44.
10. The synthetic DNA construct of claim 9, wherein the U3 promoter comprises SEQ ID NO: 45.
11. The synthetic DNA construct of claim 1, wherein the first NSOI comprises a sequence encoding a nuclease.
12. The synthetic DNA construct of claim 11, wherein the nuclease is selected from the group consisting of zinc finger nucleases (“ZFNs”), meganucleases (“MNs”), transcription activator-like effector nucleases (TALENs), and CRISPR nucleases.
13. The synthetic DNA construct of claim 12, wherein the CRISPR nuclease is selected from the group consisting of Cas9, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12i, Cas12j, Cas12l, Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas10, Cas11, Cas13a, Cas13b, Cas13c, Cas13d, Cas14, any other CRISPR-Cas nuclease, any mutant thereof, a nickase variant, and a deactivated variant.
14. The synthetic DNA construct of claim 13, wherein the CRISPR nuclease further comprises a fusion domain.
15. The synthetic DNA construct of claim 14, wherein the fusion domain is selected from the group consisting of a deaminase, a uracil DNA glycosylase, a reverse transcriptase, a ubiquitin receptor, and an exonuclease.
16. The synthetic DNA construct of claim 8, wherein the second NSOI comprises a sequence encoding at least one guide RNA.
17. The synthetic DNA construct of claim 16, wherein the at least one guide RNA is encoded by a sequence selected from the group consisting of SEQ ID NOs: 50–61.
18. The synthetic DNA construct of claim 17, comprising a sequence selected from the group consisting of SEQ ID NOs: 62–72.
19. A plant cell comprising the synthetic DNA construct of claims 1-18.
20. The plant cell of claim 19, wherein the plant cell is a pollen cell or an egg cell.
21. A plant comprising the plant cell of claims 19-20.
22. The plant of claim 21, wherein the plant is a maize plant.
23. A method of obtaining an edited progeny plant, comprising: (a) providing a first plant, wherein the first plant is transformed to comprise the synthetic construct of claims 1-18; (b) pollinating a second plant; and (c) selecting at least one progeny produced by the pollination of step (b), wherein the progeny possesses an edit; thereby obtaining an edited progeny plant.
24. The method of claim 23, wherein the first plant is a haploid inducer line of the plant.
25. The method of claim 24, wherein the haploid inducer line is a paternal haploid inducer.
26. The method of claim 25, wherein the paternal haploid inducer line comprises a mutation in a CENH3 gene.
27. The method of claim 24, wherein the haploid inducer line is a maternal haploid inducer.
28. The method of claim 27, wherein the maternal haploid inducer line comprises a mutation in a MATL gene.
29. The method of claim 23, wherein the second plant comprises plant genomic DNA to be edited.
30. The method of claim 23, wherein the edited progeny plant is a haploid progeny plant.
31. The method of claim 23, wherein the haploid progeny plant comprises the genome of the second plant but not the first plant.
32. The method of claim 30, wherein the haploid progeny plant is a maize haploid progeny plant.
33. The synthetic DNA construct of claim 1, wherein the zygote-preferred promoter is a SUMO-conjugating enzyme 1 (SCE1) promoter.
34. The synthetic DNA construct of claim 33, wherein the SCE1 promoter comprises a sequence: a) selected from the group consisting of SEQ ID NOs: 83–85 or a functional fragment thereof; or b) an orthologous promoter to SEQ ID NO: 83.
35. The synthetic DNA construct of claim 33, wherein the SCE1 promoter comprises SEQ ID NO: 84.
36. The synthetic DNA construct of claim 33, wherein the SCE1 promoter comprises SEQ ID NO: 85.
37. The synthetic DNA construct of claim 1, further comprising a terminator operably linked to the NSOI.
38. The synthetic DNA construct of claim 37, wherein the terminator is a ubiquitin terminator.
39. The synthetic DNA construct of claim 37, wherein the ubiquitin terminator comprises SEQ ID NO: 88, 89, 90, 91, 92, or 93.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNPCT/CN2023/110941 | 2023-08-03 | | |
| PCT/CN2023/110941 WO2025025203A1 (en) | 2023-08-03 | 2023-08-03 | Zygote-preferred expression |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025030093A1 (en) | 2025-02-06 |
Family
ID=94393049
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/110941 Pending WO2025025203A1 (en) | 2023-08-03 | 2023-08-03 | Zygote-preferred expression |
| PCT/US2024/040716 Pending WO2025030093A1 (en) | 2023-08-03 | 2024-08-02 | Zygote-preferred expression |
Country Status (2)
| Country | Link |
|---|---|
| AR (1) | AR133449A1 (en) |
| WO (2) | WO2025025203A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130096032A1 (en) * | 2000-03-29 | 2013-04-18 | Monsanto Technology Llc | Plant polymorphic markers and uses thereof |
| US20170016017A1 (en) * | 2014-07-31 | 2017-01-19 | Michael E Fromm | Method for increasing plant yields |
| US20170290279A1 (en) * | 2014-09-22 | 2017-10-12 | Pioneer Hi-Bred International, Inc. | Methods for Reproducing Plants Asexually and Compositions Thereof |
| WO2018102816A1 (en) * | 2016-12-02 | 2018-06-07 | Syngenta Participations Ag | Simultaneous gene editing and haploid induction |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2532186A1 (en) * | 2003-07-14 | 2005-01-27 | Monsanto Technology Llc | Materials and methods for the modulation of cyclin-dependent kinase inhibitor-like polypeptides in maize |
| EP3136842A4 (en) * | 2014-04-28 | 2017-11-29 | Dow AgroSciences LLC | Haploid maize transformation |
| CN111763687B (en) * | 2019-03-12 | 2021-12-07 | 中国农业大学 | Method for rapidly cultivating corn haploid induction line based on gene editing technology |
| CA3131547A1 (en) * | 2019-04-18 | 2020-10-22 | Pioneer Hi-Bred International, Inc. | Embryogenesis factors for cellular reprogramming of a plant cell |
| PH12022553337A1 (en) * | 2020-06-09 | 2024-03-11 | Cold Spring Harbor Laboratory | Heterozygous cenh3 monocots and methods of use thereof for haploid induction and simultaneous genome editing |
- 2023-08-03 WO PCT/CN2023/110941 (WO2025025203A1), active, Pending
- 2024-08-02 WO PCT/US2024/040716 (WO2025030093A1), active, Pending
- 2024-08-02 AR ARP240102054 (AR133449A1), status unknown
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025025203A1 (en) | 2025-02-06 |
| AR133449A1 (en) | 2025-10-01 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24850156; Country of ref document: EP; Kind code of ref document: A1 |