WO2025038948A2 - Alpha-crystalline domain proteins and their use in genome modification - Google Patents

Info

Publication number
WO2025038948A2
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
nucleic acid
target nucleic
polypeptides
acd15
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/042723
Other languages
French (fr)
Other versions
WO2025038948A3 (en)
Inventor
Steve E. Jacobsen
Jason GARDINER
Brandon BOONE
Ming Wang
Trevor John WEISS
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of California Berkeley
University of California San Diego UCSD
Original Assignee
University of California Berkeley
University of California San Diego UCSD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of California Berkeley, University of California San Diego UCSD
Publication of WO2025038948A2
Publication of WO2025038948A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/1024 In vivo mutagenesis using high mutation rate "mutator" host strains by inserting genetic material, e.g. encoding an error prone polymerase, disrupting a gene for mismatch repair
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1003 Transferases (2.) transferring one-carbon groups (2.1)
    • C12N9/1007 Methyltransferases (general) (2.1.1.)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • the present disclosure relates generally to methods of eukaryotic genome modification. More specifically, the present disclosure relates to compositions and methods for targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid of interest to facilitate a genome modification.
  • Genome modification methods such as genome editing, can involve targeting various types of polypeptides to specific nucleic acids.
  • Affecting gene function through the specific targeting of epigenetic modifications, transcriptional regulatory proteins, or gene editing reagents allows for control of gene activity and function as well as cellular function(s).
  • molecular tools designed to implement these processes need to be both efficient and specific.
  • such methods can suffer from low efficiency of producing genome modifications and high occurrence of off-target modifications. While the targeted modification may be made, off-target events across an organism’s genome may occur. These off-target events can lead to unintended consequences and uncontrolled changes to non-targeted cellular pathways. It is of great interest to create molecular tools that maintain, or even increase, efficient modifying processes while also increasing specificity. Therefore, improved genome modification methods are needed that provide improved efficiency and reduced off-target effects.
  • the present disclosure provides a method of modifying a target nucleic acid in a eukaryotic cell, the method including: a) providing a eukaryotic cell including: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide; and b) maintaining the eukaryotic cell under conditions whereby the genetic modifier polypeptide and the α-crystalline domain polypeptide are targeted to the target nucleic acid, thereby modifying the target nucleic acid.
  • At least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is encoded on a recombinant nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the target nucleic acid. In some embodiments, the heterologous targeting domain is a DNA-binding domain. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is targeted to the target nucleic acid via a SunTag-based targeting system involving an RNA-guided DNA endonuclease polypeptide and a guide RNA.
  • the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain. In some embodiments that may be combined with any of the preceding embodiments, at least two different α-crystalline domain polypeptides are targeted to the target nucleic acid. In some embodiments that may be combined with any of the preceding embodiments, the genetic modifier polypeptide includes a DNA methyltransferase polypeptide. In some embodiments, the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3 (SEQ ID NO: 1).
  • the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana (SEQ ID NO: 11 or SEQ ID NO: 13, respectively). In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
  • modification of the target nucleic acid confers a change in expression and/or a change in the target nucleotide sequence of the target nucleic acid as compared to a corresponding control.
  • sf-6059413 Attorney Docket No.: 26223-20027.40
  • the incidence of modification of a non-target nucleic acid is reduced as compared to a corresponding control.
  • the eukaryotic cell is a plant cell or a mammalian cell.
  • the eukaryotic cell is a plant cell and the method further includes regenerating a whole plant from said plant cell.
  • the method further includes (c) crossing the plant with a modified target nucleic acid to a second plant to produce one or more F1 plants.
  • the method further includes (d) selecting from the one or more F1 plants an F1 plant that (i) lacks a recombinant genetic modifier polypeptide and/or a recombinant α-crystalline domain polypeptide, and (ii) has the modified target nucleic acid.
  • the present disclosure provides a recombinant nucleic acid encoding at least one of 1) a genetic modifier polypeptide capable of being targeted to a target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to a target nucleic acid.
  • at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain that facilitates targeting of the polypeptide to the target nucleic acid.
  • the heterologous targeting domain is a DNA-binding domain.
  • the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain.
  • the genetic modifier polypeptide includes a DNA methyltransferase polypeptide.
  • the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3.
  • the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana.
  • the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
  • the present disclosure provides an expression vector including a recombinant nucleic acid encoding at least one of 1) a genetic modifier polypeptide capable of being targeted to a target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to a target nucleic acid.
  • At least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the target nucleic acid.
  • the heterologous targeting domain is a DNA-binding domain.
  • the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain.
  • the genetic modifier polypeptide includes a DNA methyltransferase polypeptide.
  • the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3.
  • the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana.
  • the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
  • the present disclosure provides a plant cell including: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide, and wherein the plant cell includes a modified nucleic acid as compared to a corresponding control nucleic acid.
  • at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is encoded on a recombinant nucleic acid.
  • At least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the modified nucleic acid.
  • the heterologous targeting domain is a DNA-binding domain.
  • the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain.
  • the genetic modifier polypeptide includes a DNA methyltransferase polypeptide.
  • the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3.
  • the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana.
  • the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
  • the modified nucleic acid includes a change in expression and/or a change in nucleotide sequence as compared to a corresponding control nucleic acid.
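The "at least 80% amino acid identity" threshold recited in the embodiments above can be illustrated with a minimal Python sketch. The text excerpted here does not state which alignment program or identity definition applies, so the sketch assumes the two sequences are already aligned (gaps written as "-") and counts identity only over columns where both sequences carry a residue; other conventions divide by the full alignment length or by the length of the shorter sequence. The function names and the toy peptides are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where both sequences have a residue.

    Assumption: the inputs are rows of a pairwise alignment, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (one convention among several)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the pair reaches the given percent identity (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy peptides for illustration only.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQK"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate))
```

Because the denominator changes the result, any real comparison against SEQ ID NO: 1, 11, or 13 would need to follow the identity definition given in the full specification.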
  • FIG.1A shows a heatmap of FLAG-tagged MBD5, MBD6, SLN, ACD15, and ACD21 ChIP-seq enrichment (log2FC over no-FLAG Col0 control) centered at all merged peaks.
  • FIG.1B shows a genome browser image of ChIP-seq data showing two methylated loci co-bound by all MBD5/6 complex members.
  • FIG.1C shows Loess curves showing correlation between ChIP-seq enrichment for a representative replicate and CG methylation density.
  • FIG.1D shows violin plots showing mature pollen RNA-seq data for the indicated mutants, at mbd5 mbd6 upregulated transcripts (6 replicates per genotype).
  • FIG.1E shows a comparison between genotypes of the number of RNA-seq differentially expressed genes (DEGs) with >40% CG methylation levels around the TSS.
  • FIG.1F shows a genome browser image of RNA-seq data at the FWA locus in the indicated genotypes. Wild-type BS-seq data is shown as reference.
  • FIGS.1G-1J show ChIP-seq and RNA-seq analysis of ACD15 and ACD21.
  • FIG.1G shows a Venn diagram of ChIP-seq peaks showing large overlap between samples.
  • FIG.1H shows a scheme of ACD15 and ACD21 genes showing the location of the guide RNAs used for CRISPR/Cas9 mediated mutant generation. The table below shows the mutations obtained in each line.
  • FIG.1I shows bar plots showing the number of differentially expressed TEs (DE-TEs) or differentially expressed genes (DEGs) in the indicated genotypes.
  • FIG.1J shows UpSet plots showing the intersection of the upregulated genes or TEs found for each genotype. The largest intersection group constitutes loci upregulated in all six mutant lines.
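Several of the FIG.1 panels report ChIP-seq enrichment as a log2 fold change (log2FC) over the no-FLAG Col0 control. A minimal Python sketch of that ratio follows. It assumes the sample and control signals are already normalized per region, and it adds a pseudocount to avoid division by zero; both the pseudocount and the function name are assumptions for illustration, not details given in the text.

```python
import math


def log2_fold_change(sample_signal: float, control_signal: float,
                     pseudocount: float = 1.0) -> float:
    """log2 of (sample + pseudocount) / (control + pseudocount).

    Assumption: both signals are non-negative, depth-normalized values
    for the same genomic region.
    """
    if sample_signal < 0 or control_signal < 0:
        raise ValueError("signals must be non-negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log2((sample_signal + pseudocount) / (control_signal + pseudocount))


# Hypothetical values: a region with 7 normalized units in the FLAG-tagged
# sample and 1 unit in the no-FLAG control.
print(log2_fold_change(7.0, 1.0))
```

Positive values indicate enrichment over the control and values near zero indicate no enrichment.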
  • FIGS.2A-2R show ACD15 and ACD21 bridge SLN to MBD5/6 and organization of the MBD5/6 complex structure.
  • FIG.2A shows IP-MS of FLAG-tagged MBD5/6 complex members in the indicated genetic backgrounds (MS/MS counts).
  • FIG.2B shows MBD5/6 complex organization as predicted by IP-MS.
  • FIGS.2G-2O show a correlation between MBD6-RFP signal and either ACD15-YFP, ACD21-CFP, or SLN-CFP signal in the indicated mutant backgrounds (underlined).
  • FIGS.2P-2Q show an AlphaFold Multimer predicted structure of MBD5/6 complex with two copies each of MBD6, ACD15, ACD21, and SLN along with a confidence score map of the predicted complex.
  • FIG.2R shows a cartoon representation of the core dimeric MBD5/6 complex based on the AlphaFold Multimer prediction. The figure was created with Biorender.com.
  • FIGS.3A-3U show that ACD15, ACD21, and SLN regulate MBD6 accumulation and mobility and that SLN regulates the nuclear mobility of MBD5/6 complex members.
  • FIG.3D shows representative image of FRAP experiment.
  • FIG.3F shows box plots of mean intensity values of MBD6 foci (5 individual plants per genotype). Two-tailed t-test (****: P < 0.0001).
  • FIG.3G shows heatmaps and metaplots of MBD6-RFP ChIP-seq signal (log2 ratio over no-FLAG Col0 control) at peaks called in “MBD6-RFP in wild-type” dataset.
  • FIG.3H shows Loess curves showing correlation between MBD6-RFP ChIP-seq enrichment and CG methylation density.
  • FIG.3I shows genome browser tracks showing an example of a high density meCG site bound by MBD6-RFP (ChIP-Seq). Wild-type BS-seq data is shown as reference.
  • FIGS.3K-3L show MBD6-RFP signal within DAPI-stained nuclei.
  • FIG.3M shows a table of extrapolated values from FRAP curve data fitted with one-phase association linear regression using GraphPad Prism.
  • FIGS.3R-3S show the intensity of ACD15 (FIG.3R) and ACD21 (FIG.3S) signals at 100 individual foci from multiple nuclei and plant lines. Comparisons were made using two-tailed t tests (****: P < 0.0001).
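The FRAP values in FIG.3M are described as coming from a one-phase association fit in GraphPad Prism. As a rough illustration of what such a fit estimates, the sketch below uses the standard one-phase association form Y = Y0 + (Plateau - Y0)(1 - exp(-K t)). It is not Prism's algorithm: the rate constant K is scanned on a logarithmic grid and refined by ternary search, while Y0 and Plateau are solved by least squares for each candidate K. All function names, the search bounds, and the synthetic data are assumptions for illustration.

```python
import math


def _solve_for_rate(times, values, k):
    """For a fixed rate k, fit Y0 and Plateau by linear least squares.

    Model: Y = Y0 * exp(-k*t) + Plateau * (1 - exp(-k*t)), which equals
    Y0 + (Plateau - Y0) * (1 - exp(-k*t)).
    Returns (Y0, Plateau, sum of squared errors).
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t, y in zip(times, values):
        f1 = math.exp(-k * t)   # weight on Y0
        f2 = 1.0 - f1           # weight on Plateau
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        b1 += f1 * y
        b2 += f2 * y
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        return float("nan"), float("nan"), math.inf
    y0 = (b1 * s22 - b2 * s12) / det
    plateau = (b2 * s11 - b1 * s12) / det
    sse = 0.0
    for t, y in zip(times, values):
        f1 = math.exp(-k * t)
        sse += (y - (y0 * f1 + plateau * (1.0 - f1))) ** 2
    return y0, plateau, sse


def fit_one_phase_association(times, values, k_min=1e-3, k_max=10.0,
                              grid=200, refine=60):
    """Estimate Y0, Plateau, K and the half-time ln(2)/K."""
    def sse_at(log_k):
        return _solve_for_rate(times, values, math.exp(log_k))[2]

    log_lo, log_hi = math.log(k_min), math.log(k_max)
    step = (log_hi - log_lo) / (grid - 1)
    log_ks = [log_lo + i * step for i in range(grid)]
    best = min(range(grid), key=lambda i: sse_at(log_ks[i]))
    lo = log_ks[max(best - 1, 0)]
    hi = log_ks[min(best + 1, grid - 1)]
    for _ in range(refine):  # ternary search inside the bracketing interval
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sse_at(m1) < sse_at(m2):
            hi = m2
        else:
            lo = m1
    k = math.exp((lo + hi) / 2.0)
    y0, plateau, _ = _solve_for_rate(times, values, k)
    return {"Y0": y0, "Plateau": plateau, "K": k, "half_time": math.log(2.0) / k}


# Synthetic, noise-free recovery curve (hypothetical values, not FIG.3M data).
times = [2.0 * i for i in range(21)]
values = [0.2 + (0.8 - 0.2) * (1.0 - math.exp(-0.15 * t)) for t in times]
fit = fit_one_phase_association(times, values)
print(fit)
```

In FRAP analysis the plateau is commonly read as the recovered (mobile) level and the half-time as a measure of mobility; how the disclosure interprets the fitted values is set out with the figure itself.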
  • FIGS.4A-4R show that the StkyC domain of MBD6 is necessary for function and localization of MBD6.
  • FIG.4A shows a graphical description of MBD6 mutant constructs.
  • FIG.4I shows protein structure representation of AlphaFold Multimer prediction of MBD6 with ACD15. Domains of MBD6 are annotated.
  • FIG.4J shows protein alignment of MBD5, MBD6, and MBD7. MBDs and StkyC regions of MBD5, MBD6, and MBD7 are labelled.
  • FIG.4K shows graphical representation of MBD6 deletion mutants.
  • FIGS.4L-M show RT-qPCR results of FWA expression comparing mbd5 mbd6 plants expressing MBD6 deletion mutants (FIG.4L), along with representative images of MBD6 deletion mutants in root nuclei (FIG.4M).
  • FIG.5A shows a graphical representation of SunTag StkyC system and the hypothesized result. Created with BioRender.com.
  • FIG.5B shows representative nuclear images of SunTag StkyC in different mutant backgrounds.
  • FIG.5F shows leaf counts post flowering of T1 fwa rdr6 SunTag StkyC plants. Brown-Forsythe ANOVA with Dunnett’s multiple comparisons test (****: P < 0.0001).
  • FIG.5G shows representative image of early flowering T2 fwa rdr6 plants expressing SunTag StkyC.
  • FIGS.5H-5N show that SunTag StkyC drives the formation of MBD5/6 nuclear foci.
  • FIG.5K shows correlation of SunTag StkyC FWA expression with leaf counts of individual T1 plants from Figure 5E-F. Correlation coefficient: Pearson.
  • FIGS.5L-5N show RT-qPCR of FWA in mbd5 mbd6, sln, and acd15 acd21 plants with and without SunTag StkyC. Comparisons made using Brown-Forsythe ANOVA with Dunnett’s multiple comparison test for each qPCR experiment.
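FIG.5K reports a Pearson correlation between FWA expression and leaf counts in individual T1 plants. The coefficient itself is standard and can be sketched in a few lines of Python. The function name and the example values below are hypothetical and are not measurements from the figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-plant values for illustration only.
relative_fwa_expression = [1.00, 0.80, 0.55, 0.30, 0.10]
leaf_counts = [38, 33, 27, 21, 15]
r = pearson_r(relative_fwa_expression, leaf_counts)
print(round(r, 3))
```

A coefficient near 1 or -1 indicates a strong linear relationship between the two measurements, and a value near 0 indicates little linear relationship.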
  • FIG.6 shows a model of MBD5/6 oligomerization at high density meCG sites. Pictured is a diagram of proposed model showing ACD15/ACD21-dependent binding and accumulation of MBD5/6 complex members in multimeric assemblies. MBD5/6 recognize DNA methylation through their MBD domain. Although MBD5 or MBD6 can recognize individual meCG sites, regions with high density meCG sites facilitate recruitment of multiple MBD5/6 complexes, which triggers oligomerization: once MBD5/6 are bound to DNA, ACD15/ACD21 drive recruitment of other MBD5/6 complexes to facilitate oligomerization.
  • FIGS.7A-7B show two modules from the SunTag-TRBIP1-MQ1 plasmid, which contains a straight fusion of dCas9 and 10 x GCN4 peptides driven by the UBQ10 promoter (FIG.7A, top) and a straight fusion of scFv antibody, sfGFP, TRBIP1, and MQ1 driven by the UBQ10 promoter (FIG.7A, bottom), and a plasmid map of SunTag-TRBIP1-MQ1 (FIG.7B).
  • FIGS.8A-8E show: (FIG.8A) Dot plots showing the leaf number of Col-0, fwa rdr6, SunTag-MQ1, and SunTag-TRBIP1-MQ1. (FIG.8B) qRT-PCR indicating the relative mRNA level of fwa rdr6, and six T1 transgenic lines of SunTag-MQ1 and SunTag-TRBIP1-MQ1 in fwa rdr6 background, respectively.
  • FIG.8C qPCR result showing the relative McrBC qPCR value of FWA in fwa rdr6, Col-0, and eight T1 transgenic lines of SunTag-MQ1 and SunTag-TRBIP1-MQ1 in fwa rdr6 background, respectively.
  • FIG.8D Relative CG, CHG and CHH DNA methylation level at FWA promoter region using bisulfite PCR-seq (BS-PCR-seq); the pink regions indicate the ZF binding sites.
  • FIG.8E CG DNA methylation of fwa rdr6, SunTag-MQ1 and SunTag-TRBIP1-MQ1 T1 transgenic line in fwa rdr6 background, measured by whole genome bisulfite sequencing (WGBS).
  • FIGS.9A-9B show: (FIG.9A) two modules from the SunTag-TRBIP1-MQ1 plasmid with varied promoters, which contains a dCas9 and 10 x GCN4 peptides straight fusion driven by the UBQ10 promoter, and scFv-GFP-TRBIP1-MQ1 driven by each of 20 different Arabidopsis promoters; and (FIG.9B) a plasmid map of SunTag-TRBIP1-MQ1 with each of the 20 promoters.
  • FIG.10 shows bar charts indicating the relative McrBC qPCR value (orange bars) and relative DNA methylation level of the chloroplast chromosome (blue bars) in two replicates of fwa rdr6 and the T1 transgenic lines of SunTag-TRBIP1-MQ1 with 20 promoters, measured by McrBC and Skim-seq, respectively.
  • FIG.11 shows a screenshot of CG DNA methylation at the FWA locus in two replicates of fwa rdr6 and T1 transgenic lines of SunTag-MQ1, SunTag-TRBIP1-MQ1, as well as SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively, showing reduced off-target methylation.
  • FIG.12 shows screenshots of CG DNA methylation at random loci in two replicates of fwa rdr6 and T1 transgenic lines of SunTag-MQ1, SunTag-TRBIP1-MQ1, as well as SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively.
  • These screenshots show the reduced CG DNA hypermethylation at random sites across the plant genome.
  • FIG.13 shows line charts of the relative CG DNA methylation of SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20; DNA methylation was measured by whole genome bisulfite sequencing (WGBS).
  • the plots show methylation across the entire Arabidopsis genome.
  • the blue and the orange lines represent the two replicates of fwa rdr6, while the grey and yellow lines represent the two replicates of SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively.
  • FIGS.14A-14B show: (FIG.14A) two modules from the SunTag-StkyC-TRBIP1-MQ1 plasmid, which contains a UBQ10 promoter-driven dCas9 and 10 x GCN4 peptides straight fusion, and UBQ10 promoter-driven scFv-StkyC-sfGFP-TRBIP1-MQ1; and (FIG.14B) a plasmid map of SunTag-StkyC-TRBIP1-MQ1.
  • FIG.15 shows a screenshot of CG, CHG and CHH DNA methylation at the FWA locus in fwa rdr6 (two replicates), SunTag-StkyC-MQ1, and SunTag-StkyC-TRBIP1-MQ1, respectively, measured by WGBS. This shows proper on-target methylation of FWA.
  • FIG.16 shows a screenshot of CG, CHG and CHH DNA methylation at a random locus of the genome in fwa rdr6 (two replicates), SunTag-StkyC-MQ1, and SunTag-StkyC-TRBIP1-MQ1, respectively, measured by WGBS.
  • FIG.17A shows CG, CHG and CHH DNA methylation over the plant genome in fwa rdr6 (two replicates), SunTag-StkyC-MQ1 and SunTag-StkyC-TRBIP1-MQ1, respectively, measured by WGBS.
  • Blue and orange lines represent fwa rdr6, the grey line represents SunTag-StkyC-MQ1, and the yellow line represents SunTag-StkyC-TRBIP1-MQ1, as measured by WGBS. This result indicates that adding StkyC removes CG DNA hypermethylation over the plant genome.
  • FIGS.17B-17D show use of the StkyC domain to increase the specificity of CRISPR-based methylation targeting.
  • FIG.17B shows genome-wide CG DNA methylation levels across the five Arabidopsis chromosomes in control non-transgenic fwa, SunTag-TRBIP1-MQ1, or SunTag-StkyC-TRBIP1-MQ1 plants.
  • FIG.17C shows chloroplast methylation levels in the same genotypes.
  • FIG.17D shows a genome browser view of FWA showing DNA methylation targeting of the FWA promoter in SunTag-StkyC-TRBIP1-MQ1 plants, with minimal off-target methylation.
  • the position of the gRNA for dCas9 is shown.
  • the results demonstrate that there is 75% CG methylation of the chloroplast genome in the TRBIP1-MQ1 plants, but zero when StkyC is added.
  • FIGS.18A-18C show construct design for SunTag-TDG-MBD6 StkyC-TET1.
  • FIG.18A shows a cartoon representation of SunTag-TDG-MBD6 StkyC-TET1.
  • FIG.18B shows a plasmid map for SunTag-TDG-MBD6 StkyC-TET1.
  • FIG.18C shows SunTag-TDG-TET1.
  • FIGS.19A-19D show (FIG.19A) a cartoon representation of scFV-GFP-SDG2 construct localizing to dCas9 with 10xGCN4 binding sites (black squares). Made with BioRender; (FIG.19B) a plasmid map of SunTag system containing SDG2; (FIG.19C) a cartoon representation of scFV-GFP-SDG2-MBD6 StkyC construct localizing to dCas9 with 10xGCN4 binding sites (black squares).
  • FIGS.20A-20B show (FIG.20A) a cartoon representation of scFV-GFP-StkyC MBD7 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.20B) a plasmid map of SunTag system containing StkyC MBD7.
  • FIGS.21A-21B show (FIG.21A) a cartoon graphic depicting MBD6-HSPB1-RFP chimeric protein and goal of experiments.
  • FIGS.22A-22B show (FIG.22A) a cartoon graphic depicting MBD6-HSPB3-RFP chimeric protein and goal of experiments. Made using biorender; and (FIG.22B) a plasmid map depicting the expression construct of MBD6-HSPB3-RFP.
  • FIGS.23A-23B show (FIG.23A) a cartoon graphic depicting MBD6-HSPB5-RFP chimeric protein and goal of experiments.
  • FIGS.24A-24B show (FIG.24A) a cartoon graphic depicting MBD6-HSPB8-RFP chimeric protein and goal of experiments. Made using biorender; and (FIG.24B) a plasmid map depicting the expression construct of MBD6-HSPB8-RFP.
  • FIG.25A shows 3D reconstruction of root meristem tissue of mbd5 mbd6 mutant plants expressing MBD6, MBD6 HSPB1, MBD6 HSPB3, MBD6 HSPB5, or MBD6 HSPB8, all with C-terminal RFP. All constructs except for MBD6 HSPB8 show a punctate pattern of RFP fluorescence, indicating concentrated localization of signal at the chromocenters. MBD6 HSPB8 shows diffuse nuclear staining.
  • FIG.25B shows fluorescence confocal imaging of nuclei expressing MBD6-RFP, or MBD6-RFP variants in which the StkyC domain has been replaced by human sHSPs.
  • FIGS.26A-26B show (FIG.26A) a cartoon representation of scFV-GFP-HSPB1 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.26B) a plasmid map of SunTag system containing HSPB1.
  • FIGS.27A-27B show (FIG.27A) a cartoon representation of scFV-GFP-HSPB3 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.27B) a plasmid map of SunTag system containing HSPB3.
  • FIGS.28A-28B show (FIG.28A) a cartoon representation of scFV-GFP-HSPB5 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.28B) a plasmid map of SunTag system containing HSPB5.
  • FIGS.29A-29B show (FIG.29A) a cartoon representation of scFV-GFP-HSPB8 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.29B) a plasmid map of SunTag system containing HSPB8.
  • FIGS.30A-30C show an alignment of the ACD15 ACD domain with full length proteins of ACD orthologs from other plant species (FIGS.30A-30B) and a phylogeny of the species included in the alignment (FIG.30C).
  • FIGS.30D-30E show an alignment of the ACD21 ACD domain with full length proteins of ACD orthologs from other plant species (FIG.30D) and a phylogeny of the species included in the alignment (FIG.30E).
  • FIG.31 shows a comparison of A.
  • FIG.32 shows a comparison of A. thaliana ACD15 and ACD21 α-Crystalline Domains with protein coding sequences of H. sapiens α-Crystalline Domain containing small heat shock proteins including HSPB1-10.
  • FIG.33 shows a comparison of A. thaliana ACD15 and ACD21 α-Crystalline Domains with protein coding sequences of α-Crystalline Domain containing small heat shock proteins from the following species which represent all kingdoms of life: HSPB1 (Homo sapiens) (a mammal), HSP22 (Drosophila melanogaster) (an insect), HSP26 (Saccharomyces cerevisiae) (a fungus), M1URI8 (Cyanidioschyzon merolae) (a red alga), P12811 (Chlamydomonas reinhardtii) (a green alga), Q9RTR5 (Deinococcus radiodurans) (a bacterium), and D0KNS6 (Saccharolobus solfataricus) (an archaebacterium).
  • FIGS.34A-34C show the construct designs discussed in Example 10.
  • FIG.34A shows a cartoon depiction of ZF-RFP constructs.
  • FIG.34B shows a plasmid map of ZF alone-RFP.
  • FIG.34C shows a plasmid map of ZF-MBD6 StkyC-RFP.
  • FIG.35 shows StkyC MBD6 increases protein accumulation of a ZF binding domain to its genomic binding sites. Ratio of ChIP-seq signal of ZF-StkyC MBD6 over ZF alone over ZF binding sites. Increased blue color corresponds to increased signal intensity.
  • FIGS.36A-36B show Cas ⁇ -MBD6 StkyC Construct Design Plasmid maps for Cas ⁇ -MBD6 StkyC (FIG.36A) and Cas ⁇ alone (FIG.36B).
  • FIGS.37A-37D show Cas ⁇ Alone Construct Designs: Cas ⁇ alone plasmid maps with no gRNA (FIG.37A), gRNA9 (FIG.37B), gRNA6 (FIG.37C), gRNA8 (FIG.37D).
  • FIGS.38A-38D show Cas ⁇ -ACD15-ACD21 Construct Designs: Cas ⁇ -ACD15-ACD21 plasmid maps with no gRNA (FIG.38A), gRNA9 (FIG.38B), gRNA6 (FIG.38C), gRNA8 (FIG.38D).
  • FIGS.39A-39B show an alignment of StkyC domain homologs from MBD6 polypeptides (FIG.39A) and an associated phylogenetic tree (FIG.39B).
  • FIGS.40A-40B show an alignment of StkyC domain homologs from MBD7 polypeptides (FIG.40A) and an associated phylogenetic tree (FIG.40B).
  • FIGS.41A-41C show StkyC enhancement of genome editing.
  • FIG.41A Genome browser tracks of the FWA locus demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region.
  • FIG.41B - FIG.41C Editing at guide 4 and guide 17 of FWA locus in wild type plants.
  • FIGS.42A-42C show Cas9-XTEN-StkyC construct design.
  • FIG.42A Graphical cartoon of Cas9-XTEN-StkyC construct (made with biorender).
  • FIG.42B - FIG.42C Plasmid maps of constructs targeting Guide 4 (FIG.42B) or Guide 17 (FIG.42C).
  • FIGS.43A-43B show Cas9-SunTag-1xGCN4 construct design.
  • FIG.43A Graphical cartoon of Cas9-SunTag-1xGCN4 (made with biorender).
  • FIG.43B Plasmid map of constructs targeting Guide 4.
  • FIGS.44A-44C show Cas9-SunTag-4xGCN4 construct design.
  • FIG.44A Graphical cartoon of Cas9-SunTag-4xGCN4 (made with biorender).
  • FIG.44B - FIG.44C Plasmid maps of constructs targeting Guide 4 (FIG.44B) or Guide 17 (FIG.44C).
  • FIGS.45A-45C show Cas9 construct design.
  • FIG.45A Graphical cartoon of Cas9 construct (made with biorender).
  • FIG.45B - FIG.45C Plasmid maps of constructs targeting Guide 4 (FIG.45B) or Guide 17 (FIG.45C).
  • FIGS.46A-46C show an exemplary SunTag-HSPB1 construct.
  • FIG.46A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.46B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB1 construct creating distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.46C Plasmid map of the SunTag-HSPB1 construct expressed in the plants in FIG.46B.
  • [0058] FIGS.47A-47C show an exemplary SunTag-HSPB4 construct.
  • FIG.47A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG. 47B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB4 construct creating distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.47C Plasmid map of the SunTag-HSPB4 construct expressed in the plants in FIG.47B.
  • [0059] FIGS.48A-48C show an exemplary SunTag-HSPB5 construct.
  • FIG.48A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG. 48B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB5 construct creating distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.48C Plasmid map of the SunTag-HSPB5 construct expressed in the plants in FIG.48B.
  • FIGS.49A-49C show an exemplary SunTag-Chlamydomonas reinhardtii ACD construct.
  • FIG.49A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.49B Representative nucleus from live cell imaging analysis of wild type plants expressing a SunTag-Chlamydomonas reinhardtii ACD construct creating distinct GFP foci.
  • FIG.49C Plasmid map of the SunTag-Chlamydomonas reinhardtii ACD construct expressed in the plants in FIG.49B.
  • FIGS.50A-50C show an exemplary SunTag-Saccharolobus solfataricus ACD construct.
  • FIG.50A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.50B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Saccharolobus solfataricus construct creating distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.50C Plasmid map of the SunTag-Saccharolobus solfataricus construct expressed in the plants in FIG.50B.
  • FIGS.51A-51C show an exemplary SunTag-Zea mays ACD construct.
  • FIG. 51A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.51B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Zea mays ACD construct creating distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.51C Plasmid map of the SunTag-Zea mays ACD construct expressed in the plants in FIG.51B.
  • [0063] FIGS.52A-52C show an exemplary SunTag-HSPB8 construct.
  • FIG.52A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG. 52B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB8 construct, which do not form distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.52C Plasmid map of the SunTag-HSPB8 construct expressed in the plants in FIG.52B.
  • FIGS.53A-53C show an exemplary SunTag-Oryza sativa ACD construct.
  • FIG. 53A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.53B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Oryza sativa ACD construct, which do not form distinct GFP foci. White dashes represent nuclear periphery.
  • FIG.53C Plasmid map of the SunTag-Oryza sativa ACD construct expressed in the plants in FIG.53B.
  • FIGS.54A-54C show an exemplary SunTag-Deinococcus radiodurans ACD construct.
  • FIG.54A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.54B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Deinococcus radiodurans ACD construct, which do not form distinct GFP foci. White dashes represent nuclear periphery.
  • FIG. 54C Plasmid map of the SunTag-Deinococcus radiodurans ACD construct expressed in the plants in FIG.54B.
  • FIGS.55A-55C show an exemplary SunTag-Solanum tuberosum ACD construct.
  • FIG.55A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.55B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Solanum tuberosum ACD construct creating distinct GFP foci and staining DNA with DAPI. White dashes represent nuclear periphery.
  • FIG.55C Plasmid map of the SunTag-Solanum tuberosum ACD construct expressed in the plants in FIG.55B.
  • [0067] FIGS.56A-56C show an exemplary SunTag-Solanum lycopersicum ACD construct.
  • FIG.56A Graphical representation of an exemplary SunTag construct (made using BioRender).
  • FIG.56B Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Solanum lycopersicum ACD construct creating distinct GFP foci and staining DNA with DAPI. White dashes represent nuclear periphery.
  • FIG.56C Plasmid map of the SunTag-Solanum lycopersicum ACD construct expressed in the plants in FIG.56B.
  • [0068] FIG.57 shows complementation of FWA expression in mbd5 mbd6 mutant plants through use of human small heat shock proteins.
  • FIG.58 shows editing efficiency for stable transgenic plants in Wild-Type (Col-0) Arabidopsis thaliana.
  • The plot displays editing efficiency comparing Cas9 controls to 1x and 4x GCN4 SunTagStkyC constructs in stably transformed plants.
  • The Cas9 transgene construct design is listed along the X-axis.
  • FIGS.59A-59B show Arabidopsis thaliana protoplast experiments testing improved Cas9 genome editing capability through ACD accumulation technology.
  • FIG.59A shows editing efficiency of SunTag-ACD constructs in protoplasts from experiment 1.
  • FIG. 59B shows editing efficiency of SunTag-ACD constructs in protoplasts from experiment 2.
  • The Cas9 transgene construct design is plotted along the X-axis.
  • FIGS.60A-60C show plasmid maps of constructs used in experiments: Plasmid maps of SunTag-VP64 (FIG.60A), SunTag Sacc -VP64 (FIG.60B), and PiggyBac Transposase plasmid (FIG.60C) construct.
  • FIGS.61A-61B show HEK293 cells showing accumulation by ACD targeting technology. Representative HEK293 cells expressing SunTag-VP64 control construct (FIG. 61A) versus SunTagSacc-VP64 (FIG.61B). Nuclei are visualized by staining DNA with DAPI.
  • FIGS.62A-62B show SunTag Chlamy construct design.
  • FIG.62A Plasmid map of the SunTag Chlamy construct for targeting the siren locus.
  • FIG.62B Representative root nucleus image demonstrating two foci formed by SunTag Chlamy targeted to siren loci.
  • FIGS.63A-63C show design of the vCasΦ-XTEN-StkyC construct.
  • FIG.63A shows genome browser tracks of the FWA locus demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region. gRNA17 and gRNA4 used in this example are shown.
  • FIG.63B shows a plasmid map of vCasΦ negative control to be used in the experiments described in Example 20.
  • FIG.63C shows a plasmid map of vCasΦ-XTEN-StkyC construct to be used in the experiments described in Example 20.
  • FIGS.64A-64B show design of a Cas9-Suntag-HSPB5-4xGCN4-Truncated construct.
  • FIG.64A shows genome browser tracks of FWA loci demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region. The positions of gRNA17 and gRNA4 are shown.
  • FIG.64B shows a plasmid map of Cas9-Suntag-HSPB5-4xGCN4-Truncated construct.
  • [0076] FIGS.65A-65E show exemplary designs of SunTag-ACD-TRBIP1-MQ1 constructs.
  • The present disclosure relates generally to methods of eukaryotic genome modification. More specifically, the present disclosure relates to compositions and methods for targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid of interest to facilitate a genome modification.
  • The present disclosure is based, at least in part, on Applicant’s surprising discovery described herein that recruitment of a sufficient number of α-crystalline domain-containing polypeptides, such as small heat shock proteins (sHSPs), to a genomic locus (for example, by targeting with a dead Cas9) may form a nucleation center that recruits a large number of α-crystalline domain proteins, concentrating them into a nuclear body tethered to the genomic site, and sequestering them away from other locations in the nucleus. Further, Applicant has discovered that genetic modifier polypeptides may be co-targeted with an α-crystalline domain polypeptide to a nucleic acid of interest to facilitate improved modification of the target nucleic acid.
  • The present disclosure is directed to methods and compositions for aggregation of α-crystalline domain polypeptides to concentrate polypeptides of interest near a genomic locus or other target nucleic acid of interest by targeting α-crystalline domain-containing proteins to the genomic locus or other target nucleic acid.
  • The methods described herein are used for, for example, epigenetic editing, genome editing, RNA editing, control of recombination, control of transcription, and/or any other process that occurs at specific regions of chromatin or other nucleic acids.
  • The methods and compositions described herein may be used in the concentration of genome modification activities to one or more sites, which reduces the incidence of off-target activity and increases the efficiency of on-target activity.
  • The methods and compositions described herein may be used to increase efficiency of genome modification of a target nucleic acid by a genetic modifier polypeptide, such as increased editing efficiency by, for example, a Cas protein.
  • Editing efficiency could be measured in a variety of ways. For example, efficiency could be measured relative to a corresponding control.
  • A corresponding control could comprise, for example, the same genetic modifier polypeptide but in which the genetic modifier polypeptide is not co-targeted with an α-crystalline domain polypeptide to the target nucleic acid.
  • Efficiency could be quantified by, for example, providing a sample comprising nucleic acids comprising a plurality of copies of a target nucleic acid sequence, targeting a genetic modifier polypeptide to the target nucleic acid sequence in the sample, and then measuring the proportion of nucleic acids comprising the target sequence in the sample that are modified by the genetic modifier polypeptide. For example, if a sample comprises 100 nucleic acids each comprising one copy of the target nucleic acid sequence (such as, for example, 50 cells with two nucleic acids per cell), and the genetic modifier polypeptide modifies the target sequence in 20 of the nucleic acids, then the editing efficiency could be quantified as 20%.
  • Editing efficiency could also be measured as, for example, a fold change in targeted nucleic acid modifications in a sample comprising a genetic modifier polypeptide co-targeted with an α-crystalline domain polypeptide to a target nucleic acid compared to a corresponding control comprising the same genetic modifier polypeptide targeted to the same target nucleic acid but not co-targeted with an α-crystalline domain polypeptide.
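  • The two measures described above (proportion of target nucleic acids modified, and fold change relative to a corresponding control) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the co-targeted count of 30 is an invented illustration value, not a result from this disclosure; only the 20-of-100 example comes from the text.

```python
def editing_efficiency(modified: int, total: int) -> float:
    """Proportion of target-bearing nucleic acids that were modified."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    if not 0 <= modified <= total:
        raise ValueError("modified must lie between 0 and total")
    return modified / total


def fold_change(co_targeted: float, control: float) -> float:
    """Efficiency with co-targeting divided by efficiency of the control."""
    if control <= 0:
        raise ValueError("control efficiency must be positive")
    return co_targeted / control


# Example from the text: 100 nucleic acids (50 cells x 2 copies), 20 modified.
control_eff = editing_efficiency(modified=20, total=100)

# Hypothetical co-targeted sample (illustrative numbers only).
co_targeted_eff = editing_efficiency(modified=30, total=100)

print(f"control efficiency: {control_eff:.0%}")
print(f"fold change vs control: {fold_change(co_targeted_eff, control_eff):.2f}")
```

  • A percent improvement, as used later in this section, would be (fold change − 1) × 100 under the same assumptions.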
  • Relative editing efficiencies compared to a corresponding control could be measured from various perspectives. For example, editing efficiency could be measured relative to a corresponding control based on relative number of edits that occur per sample within a given time frame.
  • If the genetic modifier polypeptide is expressed in a sample (such as, for example, in a sample of cells) beginning at time 0, with or without co-targeting with an α-crystalline domain polypeptide, and the nucleic acids are collected from the sample after an hour, then editing efficiency could be measured as edits per hour in a co-targeting sample relative to a non-co-targeting sample.
  • Editing efficiency could also be measured as the relative time it takes for, for example, half of the nucleic acids comprising a target sequence in a first sample comprising co-targeting with an α-crystalline domain polypeptide to be edited compared to the time it takes for half of the nucleic acids comprising a target sequence in a second sample lacking co-targeting with an α-crystalline domain polypeptide to be edited.
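  • The time-based comparisons above can be sketched in the same way. This is a minimal illustration with hypothetical function names and invented numbers (edit counts and half-times are assumptions chosen for the example, not data from this disclosure).

```python
def relative_edit_rate(co_edits: int, control_edits: int, hours: float) -> float:
    """Edits per hour with co-targeting, relative to the non-co-targeted control.

    Both samples are assumed to be collected after the same elapsed time.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    if control_edits <= 0:
        raise ValueError("control must contain at least one edit")
    return (co_edits / hours) / (control_edits / hours)


def relative_half_time(t_half_co: float, t_half_control: float) -> float:
    """How many times faster the co-targeted sample reaches 50% edited."""
    if t_half_co <= 0 or t_half_control <= 0:
        raise ValueError("half-times must be positive")
    return t_half_control / t_half_co


# Hypothetical values: 40 vs 20 edits after one hour; 3 h vs 6 h to 50% edited.
print(relative_edit_rate(co_edits=40, control_edits=20, hours=1.0))
print(relative_half_time(t_half_co=3.0, t_half_control=6.0))
```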
  • Editing efficiency could be improved (for example, as compared to a corresponding control) to various degrees.
  • Targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid may improve (e.g., increase) editing efficiency by 0-5%, 5-10%, 10-15%, 15-20%, 20-25%, 25-30%, 30-35%, 35-40%, 40-45%, 45-50%, 50-55%, 55-60%, 60-65%, 65-70%, 70-75%, 75-80%, 80-85%, 85-90%, 90-95%, 95-100%, 100-150%, 150-200%, or more than 200% compared to a corresponding control.
  • The present disclosure also provides methods and compositions that allow for α-crystalline domain polypeptides to be co-opted for use in targeting epigenetic enzymes, gene editing proteins, guide RNAs (gRNAs), template nucleic acids such as DNA, transcription factors, and other factors to 1) specifically accumulate proteins or other molecules of interest; 2) maintain enzymatic functions; and 3) sequester these factors away from other sites in the genome.
  • The terms “isolated” and “purified” refer to a material that is removed from at least one component with which it is naturally associated (e.g., removed from its original environment).
  • The term “isolated,” when used in reference to an isolated protein, refers to a protein that has been removed from the culture medium of the host cell that expressed the protein. As such, an isolated protein is free of extraneous or unwanted compounds (e.g., nucleic acids, native bacterial or other proteins, etc.).
  • CRISPR enzymes guided by sequence specific guide RNAs
  • Because CRISPR systems evolved in bacteria and viruses that do not have nucleosomes, these systems did not evolve mechanisms to efficiently gain access to DNA bound to nucleosomes. Indeed, it has been found that CRISPR-mediated gene editing can be highly inefficient at particular DNA sequences, especially those with tightly associated nucleosomes.
  • CRISPR-mediated gene editing techniques are revolutionizing the fields of crop and livestock improvement, and are the basis of a new class of human therapeutics.
  • The present disclosure provides methods to allow genetic modifier polypeptides, such as CRISPR systems, or other polypeptides of interest, to more readily gain access to nucleic acid (e.g., DNA) target sites in plant and animal genomes.
  • Co-targeting α-crystalline domain proteins with genetic modifier polypeptides, such as, for instance, CRISPR systems, causes the genetic modifier polypeptides to oligomerize and hyperaccumulate at the target locus, while also sequestering them away from non-target sites.
  • The methods described herein involving addition of α-crystalline domain proteins to, for example, a CRISPR-based DNA methylation targeting system have the beneficial effect of increasing specificity to the target locus, thereby reducing or eliminating off-target effects, such as, for instance, off-target methylation.
  • The methods described herein harness the unique properties of α-crystalline domain proteins to develop, in some embodiments, a new class of CRISPR-based gene editing tools that are both more powerful and more specific. Without wishing to be bound by theory, it is believed that dramatically increasing the concentration of CRISPR or other genetic modifier components at a target site will allow them to more effectively compete for DNA binding.
  • The present disclosure relates to genetic modifier polypeptides that are capable of being targeted to the target nucleic acid, and α-crystalline domain polypeptides that are capable of being targeted to the target nucleic acid, wherein the genetic modifier polypeptide and/or the α-crystalline domain polypeptide is a recombinant polypeptide, as well as methods of using these genetic modifier polypeptides and α-crystalline domain polypeptides for modifying a target nucleic acid in a eukaryotic cell.
  • The present disclosure is based, at least in part, on Applicant’s discovery that α-crystalline domain polypeptides can be used to concentrate polypeptides of interest, such as genetic modifier polypeptides, to a target nucleic acid, and that, once targeted, this concentration can reduce the incidence of off-target activity and increase the efficiency of, for instance, the production of genetic modifications by a genetic modifier polypeptide.
  • The present disclosure provides methods for targeting a genetic modifier polypeptide to a target nucleic acid and targeting an α-crystalline domain polypeptide to the target nucleic acid, where the genetic modifier polypeptide and/or the α-crystalline domain polypeptide is a recombinant polypeptide.
  • The genetic modifier polypeptide modifies the target nucleic acid.
  • The present disclosure also provides nucleic acids encoding the genetic modifier polypeptides and α-crystalline domain polypeptides; expression vectors containing nucleic acids that encode the genetic modifier polypeptides and α-crystalline domain polypeptides; cells containing the genetic modifier polypeptides and α-crystalline domain polypeptides; plants, mammals, and other organisms containing the genetic modifier polypeptides and α-crystalline domain polypeptides; and plants, mammals, and other organisms having a target nucleic acid containing a genetic modification as a consequence of having the genetic modifier polypeptides and α-crystalline domain polypeptides targeted to the target nucleic acid.
  • Each one of the genetic modifier polypeptides and α-crystalline domain polypeptides described herein may be expressed in a host cell individually or in various combinations to act to modify a target nucleic acid.
  • Recombinant Polypeptides
  • Certain aspects of the present disclosure relate to recombinant polypeptides containing genetic modifier polypeptides and/or α-crystalline domain polypeptides. These recombinant polypeptides may be targeted to a target nucleic acid to facilitate genetic modifications of the target nucleic acid.
  • Polypeptides may contain other features as described herein and as will be apparent to one of skill in the art.
  • Other amino acid and/or polypeptide sequence features of the recombinant polypeptides may be used to provide additional functionality and/or features to the recombinant polypeptide including e.g. subcellular localization, downstream detection, etc. as will be readily apparent to one of skill in the art.
  • A “polypeptide” is an amino acid sequence including a plurality of consecutive polymerized amino acid residues (e.g., at least about 15 consecutive polymerized amino acid residues).
  • Polypeptide refers to an amino acid sequence, oligopeptide, peptide, protein, or portions thereof, and the terms “polypeptide” and “protein” are used interchangeably.
  • Polypeptides as described herein also include polypeptides having various amino acid additions, deletions, or substitutions relative to the native amino acid sequence of a polypeptide of the present disclosure.
  • Polypeptides that are homologs of a polypeptide of the present disclosure may contain non-conservative changes of certain amino acids relative to the native sequence of a polypeptide of the present disclosure.
  • Polypeptides that are homologs of a polypeptide of the present disclosure may contain conservative changes of certain amino acids relative to the native sequence of a polypeptide of the present disclosure, and thus may be referred to as conservatively modified variants.
  • A conservatively modified variant may include individual substitutions, deletions or additions to a polypeptide sequence which result in the substitution of an amino acid with a chemically similar amino acid.
  • Conservative substitution tables providing functionally similar amino acids are well-known in the art.
  • Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure.
  • The following eight groups contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
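  • The eight groups listed above can be encoded as a simple lookup. The following Python sketch is illustrative only: the function name is hypothetical, and treating two different residues as a conservative substitution whenever they share at least one group (so that methionine, which appears in groups 5 and 8, pairs with members of both) is an assumption about how the list would be applied.

```python
# The eight conservative-substitution groups from the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(native: str, variant: str) -> bool:
    """True if two different residues share at least one group above."""
    a, b = native.upper(), variant.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative_substitution("A", "G"))  # same group (1)
print(is_conservative_substitution("M", "C"))  # same group (8)
print(is_conservative_substitution("A", "D"))  # no shared group
```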
  • A modification of an amino acid to produce a chemically similar amino acid may be referred to as an analogous amino acid.
  • A “recombinant” polypeptide, protein, or enzyme of the present disclosure may be a polypeptide, protein, or enzyme that may be encoded by e.g. a “recombinant nucleic acid” or “heterologous nucleic acid” or “recombinant polynucleotide.”
  • Recombinant polypeptides of the present disclosure that are composed of individual polypeptide domains may be described based on the individual polypeptide domains of the overall recombinant polypeptide.
  • A domain in such a recombinant polypeptide refers to the particular stretches of contiguous amino acid sequences with a particular function or activity.
  • The contiguous amino acids that encode the sequence from the genetic modifier polypeptide may be described as the “genetic modifier domain” in the overall recombinant polypeptide, and the contiguous amino acids that encode the sequence from the α-crystalline domain polypeptide may be described as the “α-crystalline domain polypeptide domain” in the overall recombinant polypeptide.
  • Fusion polypeptides of the present disclosure may contain an individual polypeptide domain that is in various N-terminal or C-terminal orientations relative to other individual polypeptide domains present in the fusion polypeptide. Fusion of individual polypeptide domains in fusion polypeptides may also be direct or indirect fusions. Direct fusions of individual polypeptide domains refer to direct fusion of the coding sequences of each respective individual polypeptide domain.
  • In indirect fusions, a linker domain or other contiguous amino acid sequence may separate the coding sequences of two individual polypeptide domains in a fusion polypeptide.
  • Polypeptides of the present disclosure may be detected using antibodies. Techniques for detecting polypeptides using antibodies include, for example, enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence.
  • An antibody provided herein can be a polyclonal antibody or a monoclonal antibody.
  • An antibody having specific binding affinity for a polypeptide provided herein can be generated using methods well known in the art.
  • Linkers
  • [0105] Various linkers may be used in the construction of recombinant polypeptides as described herein. In general, linkers are short peptides that separate the different domains in a multi-domain protein. They may play an important role in fusion proteins, affecting the crosstalk between the different domains, the yield of protein production, and the stability and/or the activity of the fusion proteins. Linkers are generally classified into two major categories: flexible or rigid.
  • Flexible linkers are typically used when the fused domains require a certain degree of movement or interaction, and these linkers are usually composed of small amino acids such as, for example, glycine (G), serine (S) or proline (P).
  • The degree of movement between domains allowed by flexible linkers is an advantage in some fusion proteins.
  • Rigid linkers may be used since they enforce a fixed distance between domains and promote their independent functions. A thorough description of several linkers has been provided in Chen X et al., Advanced Drug Delivery Reviews 65 (2013) 1357–1369.
  • Linkers may be used in, for example, the construction of recombinant polypeptides as described herein.
  • Linkers may be used to separate the coding sequences of a genetic modifier polypeptide and an α-crystalline domain polypeptide.
  • A variety of wiggly/flexible linkers, stiff/rigid linkers, short linkers, and long linkers may be used as described herein.
  • Various linkers as described herein may be used in the construction of recombinant polypeptides as described herein.
  • A variety of shorter or longer linker regions are known in the art, for example corresponding to a series of glycine residues, a series of adjacent glycine-serine dipeptides, a series of adjacent glycine-glycine-serine tripeptides, or known linkers from other proteins.
  • A flexible linker may include, for example, the amino acid sequence: SSGPPPGTG (SEQ ID NO: 489) and variants thereof.
  • A rigid linker may include, for example, the amino acid sequence: AEAAAKEAAAKA (SEQ ID NO: 490) and variants thereof.
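  • As a minimal sketch of direct versus indirect (linker-separated) fusion at the amino acid sequence level, the following Python example joins domain sequences with the two exemplary linkers above. The function name and the short domain sequences are hypothetical placeholders chosen for illustration; they are not sequences from this disclosure.

```python
FLEXIBLE_LINKER = "SSGPPPGTG"   # SEQ ID NO: 489 (from the text)
RIGID_LINKER = "AEAAAKEAAAKA"   # SEQ ID NO: 490 (from the text)


def fuse_domains(domains: list[str], linker: str = "") -> str:
    """Concatenate domain sequences N- to C-terminus.

    An empty linker models a direct fusion; a non-empty linker is placed
    between each pair of adjacent domains (an indirect fusion).
    """
    if not domains:
        raise ValueError("at least one domain sequence is required")
    return linker.join(domains)


# Placeholder sequences standing in for a genetic modifier domain and an
# alpha-crystalline domain polypeptide domain (hypothetical, illustration only).
modifier_domain = "MKTAYIAK"
acd_domain = "GVLTVEAP"

print(fuse_domains([modifier_domain, acd_domain]))                   # direct
print(fuse_domains([modifier_domain, acd_domain], FLEXIBLE_LINKER))  # flexible
print(fuse_domains([acd_domain, modifier_domain], RIGID_LINKER))     # rigid, reversed order
```

  • Reordering the list changes the N-terminal or C-terminal orientation of each domain, which corresponds to the various orientations described for fusion polypeptides.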
  • Nuclear Localization Signals
  • Recombinant polypeptides of the present disclosure may contain one or more nuclear localization signals (NLS).
  • Nuclear localization signals may also be referred to as nuclear localization sequences, domains, peptides, or other terms readily apparent to those of skill in the art.
  • A nuclear localization signal is a translocation sequence that, when present in a polypeptide, directs that polypeptide to localize to the nucleus of a eukaryotic cell.
  • Various nuclear localization signals may be used in recombinant polypeptides of the present disclosure.
  • One or more SV40-type NLS or one or more REX NLS may be used in recombinant polypeptides.
  • Recombinant polypeptides may also contain two or more tandem copies of a nuclear localization signal.
  • Recombinant polypeptides may contain at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten copies, either tandem or not, of a nuclear localization signal.
  • Tags, Reporters, and Other Features
  • [0111] Recombinant polypeptides of the present disclosure may contain one or more tags that allow for e.g.
  • Recombinant polypeptides of the present disclosure may contain one or more reporters that allow for e.g. visualization and/or detection of the recombinant polypeptide.
  • A reporter polypeptide encodes a protein that may be readily detectable due to its biochemical characteristics such as, for example, enzymatic activity or chemifluorescent features.
  • Reporter polypeptides may be detected in a number of ways depending on the characteristics of the particular reporter.
  • A reporter polypeptide may be detected by its ability to generate a detectable signal (e.g. fluorescence), by its ability to form a detectable product, etc.
  • Various reporters may be used herein and are well-known to those of skill in the art. Exemplary reporters may include GFP, GUS, mCherry, luciferase, etc., and multiple copies of one or more tags may be present in a recombinant polypeptide.
  • Recombinant polypeptides of the present disclosure may contain one or more polypeptide domains that serve a particular purpose depending on the particular goal/need. Recombinant polypeptides may contain translocation sequences that target the polypeptide to a particular cellular compartment or area. Suitable features will be readily apparent to those of skill in the art.
  • Genetic Modifier Polypeptides
  • [0114] Certain aspects of the present disclosure relate to genetic modifier polypeptides that are capable of being targeted to a target nucleic acid. Genetic modifier polypeptides as described herein generally refer to polypeptides that can facilitate, whether directly or indirectly, modification of a feature of a nucleic acid.
  • The nucleic acid may be any type of nucleic acid of any length, including but not limited to DNA or RNA; single- or double-stranded nucleic acids; linear or circular nucleic acids; chromosomal or extra-chromosomal nucleic acids; nuclear, cytoplasmic, or organellar nucleic acids.
  • Features of a nucleic acid that may be modified include but are not limited to, for example, the genetic sequence of the nucleic acid (such as, for example, addition, deletion, or inversion of one or more nucleic acid residues); the chemical structure of one or more nucleic acid residues (such as, for example, methylation, pseudouridylation, and other types of base modifications); the expression of the nucleic acid (such as, for example, the level of expression, the timing of expression, and/or the location of expression); one or more characteristics of a polypeptide (such as, for example, a histone or other scaffold proteins or a transcription factor) that is bound to or otherwise closely associated with the nucleic acid; and structures including the nucleic acid (such as, for example, hybridization state, secondary structure, tertiary structure, quaternary structure, chromatin, chromosomes).
  • Genetic modifier polypeptides may include, for example, transcriptional repressors, transcriptional activators, methyltransferases, demethylases, nucleases, recombinases, topoisomerases, ligases, polynucleotide kinases, uracil DNA glycosylases, and terminal deoxynucleotidyl transferases.
  • Transcriptional repressor polypeptides may include but are not limited to a PHD1 polypeptide, a PIAL1 polypeptide, a PIAL2 polypeptide, a TRB1 polypeptide, a TRB2 polypeptide, a TRB3 polypeptide, a MSI1 polypeptide, a LHP1 polypeptide, a HD2A polypeptide, a HD2B polypeptide, a HD2C polypeptide, an ELF7 polypeptide, a CPL2 polypeptide, a MBD2 polypeptide, a SUVH7 polypeptide, a SSRP1 polypeptide, a SPT16 polypeptide, a JMJ18 polypeptide, a TRBIP1 polypeptide, a TRBIP2 polypeptide, and an ASF1B polypeptide.
  • Transcriptional activator polypeptides may include but are not limited to activating transcription factors and VP64.
  • Methyltransferase polypeptides may include but are not limited to an MQ1 polypeptide and a SDG2 histone methyltransferase polypeptide (an exemplary MQ1 polypeptide is set forth in Example 2 herein).
  • Demethylase polypeptides may include but are not limited to a DEMETER polypeptide, a Tet1 polypeptide, a TDG polypeptide, and a ROS1 polypeptide.
  • DNA methyltransferase polypeptides may include but are not limited to a DRM2 polypeptide, an SssI polypeptide, and a Dnmt3 polypeptide.
  • Nuclease polypeptides may include but are not limited to an endonuclease polypeptide (such as, for example, a Cas9 or CasPhi polypeptide) and an exonuclease polypeptide.
  • Recombinase polypeptides may include but are not limited to a Cre recombinase, a Hin recombinase, a Tre recombinase, and a FLP recombinase.
  • Topoisomerase polypeptides may include but are not limited to a type IA topoisomerase, a type IB topoisomerase, a type IC topoisomerase, a type IIA topoisomerase, and a type IIB topoisomerase.
  • Ligase polypeptides may include but are not limited to a DNA ligase and an RNA ligase.
  • Polynucleotide kinase polypeptides may include but are not limited to a T4 Polynucleotide Kinase (PNK).
  • PNK Polynucleotide Kinase
  • Recombinant genetic modifier polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in reducing the expression of a target nucleic acid, such as a gene, in a eukaryotic organism (e.g. plants or mammals) as described herein.
  • a recombinant genetic modifier polypeptide may comprise, for example, a Cas protein (or other RNA-guided nuclease), which may be used to target the genetic modifier polypeptide to the target nucleic acid and/or to make the desired genome edit.
  • the Cas protein is a “dead” or deactivated (i.e., comprising deficient or at least reduced nuclease activity) Cas protein, in which case the Cas protein may be used to target the genetic modifier polypeptide to the target nucleic acid.
  • the Cas protein may comprise nuclease activity (e.g., nickase or double-stranded break activity), in which case the Cas protein may actively mediate both the targeting to the target nucleic acid and the genetic modification.
  • genome editing involving a Cas protein as described herein may comprise editing with an active Cas or with a deactivated Cas.
  • any Cas protein (or other RNA-guided nuclease) sufficient to target to a specific nucleic acid may be used in the methods described herein (e.g., Cas9, Cas12a, or other Cas proteins, and/or “dead” versions thereof).
  • a genetic modifier polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length genetic modifier polypeptide.
  • genetic modifier polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide. In some embodiments, genetic modifier polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide. In some embodiments, genetic modifier polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide.
  • Suitable genetic modifier polypeptides may be identified from any organism, including but not limited to monocot and dicot plants, algae, fungi, animals (including, but not limited to mammals, such as Homo sapiens, and insects, such as Drosophila melanogaster), bacteria, archaea, and protists.
  • a genetic modifier polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a Cas9 polypeptide (e.g.
  • TRBIP1 polypeptide e.g. SEQ ID NO: 1
  • MQ1 polypeptide e.g. SEQ ID NO: 184
  • Tet1 polypeptide e.g. SEQ ID NO: 272
  • SDG2 histone methyltransferase polypeptide e.g. SEQ ID NO: 288.
  • TRBIP1 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in reducing the expression of a target nucleic acid, such as a gene, in plants.
  • TRBIP1 proteins are known in the art. TRB Interacting Protein 1 (TRBIP1; AT4G35510) interacts with TRB proteins. Additionally, TRBIP1 proteins are annotated as PHD finger-like proteins in The Arabidopsis Information Resource (TAIR) database. However, the endogenous function of TRBIP1 proteins has not been elucidated.
  • a TRBIP1 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length TRBIP1 polypeptide.
  • TRBIP1 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide. In some embodiments, TRBIP1 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide. In some embodiments, TRBIP1 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide.
  • TRBIP1 polypeptides may be identified from monocot and dicot plants. Examples of suitable TRBIP1 polypeptides may include, for example, those listed in Table 1, homologs thereof, and orthologs thereof.
  • TRBIP1 Polypeptides [0124]
  • a TRBIP1 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a TRBIP polypeptide described herein, such as, for example, a TRBIP1 polypeptide described in Table 1, including, for example, the polypeptide encoded by Arabidopsis thaliana NP_195276.3 (SEQ ID NO: 1).
  • Non-Genetic Modifier Polypeptides that are capable of being targeted to a target nucleic acid.
  • Non-genetic modifier polypeptides as described herein generally refer to polypeptides that can be targeted to a nucleic acid but that do not modify the nucleic acid in any of the ways described above for genetic modifier polypeptides.
  • a non-genetic modifier polypeptide may be used as a visual marker, such as, for instance, a green fluorescent protein (GFP).
  • GFP green fluorescent protein
  • the nucleic acid may be any type of nucleic acid of any length, including but not limited to DNA or RNA; single- or double-stranded nucleic acids; linear or circular nucleic acids; chromosomal or extra-chromosomal nucleic acids; nuclear, cytoplasmic, or organellar nucleic acids.
  • α (Alpha)-Crystalline Domain Polypeptides [0126] Certain aspects of the present disclosure relate to molecular chaperone polypeptides. Molecular chaperone polypeptides are highly conserved across species from bacteria to humans to plants.
  • ACD α-crystalline domain
  • sHSPs small heat shock polypeptides
  • ACD containing proteins target the polypeptides containing them to perform a variety of functions including regulation of aggregation, oligomeric species formation, holdase functions, and sequestering of proteins into specific compartments (see, e.g., M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology. 427, 1537–1548 (2015); F. McLoughlin, E. Basha, M. E. Fowler, M. Kim, J.
  • α-crystalline domain (ACD) polypeptides and α-crystalline domain-containing polypeptides, as described herein, are polypeptides that contain an α-crystalline domain.
  • ACDs evolved to be efficient, ATP-independent chaperones with the purpose of utilizing oligomerization capacity to maintain the efficacy of cellular processes (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)).
  • ACDs form the general structure of a β-sandwich composed of anti-parallel 3 and 4 beta sheets (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)). Often, proteins containing ACDs will also contain a variable N-terminus as well as a short C-terminal region. ACDs function to form the primary building block of sHSP oligomers through formation of dimer interfaces between the β6+7 strands (i.e.
  • ACD proteins of various species allow for the formation of ordered, oligomeric species using at minimum the highly conserved ACD domain, augmented by the more variable N- and C-terminal regions.
  • While the general structure of the β-sheets in the α-crystalline domain of ACDs and sHSPs is generally conserved, there is not necessarily exact sequence conservation between diverse species. Most of the conservation is related to hydrophobic or charged regions of the polypeptides, which does not necessarily produce conservation in exact sequences of amino acids (as described in, for example, Narberhaus F. Alpha-crystallin-type heat shock proteins: socializing minichaperones in the context of a multichaperone network.
  • α-crystalline domain polypeptides for use in the methods and compositions of the present disclosure may contain certain amino acid sequences at one or more positions corresponding to positions of the amino acid sequence of ACD15.
  • Such α-crystalline domain polypeptides may contain one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to one or more of the amino acids shown in one or more of the consensus sequences compared to ACD15 shown in FIGS.
  • such α-crystalline domain polypeptides may contain one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to the amino acids: D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), and/or G (127), shown in bold in the ACD15 sequence below (SEQ ID NO: 11): MNAENNQTTTTHSKVISHVFCTGTAKLGSVGPPIGLVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKLGVRIPTS.
  • the α-crystalline domain of the full-length ACD15 polypeptide sequence above has the following sequence (with corresponding bolded amino acids from the full-length sequence reproduced below): LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL (SEQ ID NO: 12).
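The residue positions recited above can be checked directly against the recited sequences. The following is a minimal sketch in Python, assuming the positions are 1-based and that the spaces in the printed sequences are line-wrap artifacts; the function and variable names (`residues_at`, `check_highlighted`, `domain_offset`, `HIGHLIGHTED`) are hypothetical and are not part of the disclosure.

```python
# ACD15 full-length sequence (SEQ ID NO: 11) and its alpha-crystalline
# domain (SEQ ID NO: 12), copied from the text with wrap spaces removed.
ACD15 = (
    "MNAENNQTTTTHSKVISHVFCTGTAKLGSVGPPIGLVDIGVSEVAYIFRVSLPGIEKN"
    "QDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPR"
    "LFSPNFRSDGIFEVVVVKLGVRIPTS"
)
ACD15_DOMAIN = (
    "LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQ"
    "VQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL"
)

# Highlighted residues as (1-based start position, expected residues).
# Assumption: positions in the text count from the initial methionine.
HIGHLIGHTED = [
    (38, "D"), (52, "LPG"), (70, "G"), (76, "G"),
    (104, "F"), (110, "LP"), (127, "G"),
]


def residues_at(seq, start, length):
    """Return `length` residues of `seq` starting at 1-based position `start`."""
    return seq[start - 1:start - 1 + length]


def check_highlighted(seq, sites=HIGHLIGHTED):
    """Report, per site, whether `seq` carries the expected residues there."""
    return {
        (start, motif): residues_at(seq, start, len(motif)) == motif
        for start, motif in sites
    }


def domain_offset(full, domain):
    """Return the 1-based position where `domain` begins in `full`, or None."""
    idx = full.find(domain)
    return None if idx < 0 else idx + 1


if __name__ == "__main__":
    for site, ok in check_highlighted(ACD15).items():
        print(site, ok)
    print("domain starts at position", domain_offset(ACD15, ACD15_DOMAIN))
```

Adding the value returned by `domain_offset`, minus one, to a position within SEQ ID NO: 12 gives the corresponding full-length position in SEQ ID NO: 11.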
  • Polypeptides that contain this sequence, conservative substitutions of the amino acids thereof, or analogous amino acids corresponding thereof, may also be used in the methods and compositions of the present disclosure.
  • an α-crystalline domain polypeptide of the present disclosure may contain at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of the α-crystalline domain from ACD15 (SEQ ID NO: 12).
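Percent amino acid identity thresholds such as those listed above are evaluated over an alignment of the two sequences. The sketch below illustrates one common convention, assuming the alignment has already been computed by some external tool and is supplied as two equal-length gapped strings. The denominator used here (all columns in which at least one sequence has a residue) is an assumption; this passage does not specify one, and other conventions give different values. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a precomputed pairwise alignment.

    Both inputs must be the same length, with `gap` marking alignment gaps.
    Columns where both sequences have a gap are ignored; every other column
    counts toward the denominator (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1  # identical residues (neither can be a gap here)
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold_pct):
    """True if the aligned pair reaches at least `threshold_pct` identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from the disclosure.
    print(percent_identity("LPGI", "LPAI"))
    print(meets_threshold("LPGI", "LPAI", 70))
```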
  • an α-crystalline domain polypeptide of the present disclosure contains at least 10 consecutive amino acids, at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, or at least 60 consecutive amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, of the α-crystalline domain from ACD15 (SEQ ID NO: 12).
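Whether a candidate polypeptide contains at least a given number of consecutive amino acids of a reference sequence can be tested by finding the longest stretch of residues the two share. The following is a simplified sketch using exact matching only; it does not account for the conservative substitutions or analogous amino acids also contemplated above. The names `longest_shared_run` and `has_consecutive` are hypothetical.

```python
def longest_shared_run(candidate, reference):
    """Length of the longest run of consecutive residues found in both sequences.

    Classic longest-common-substring dynamic programme, keeping only the
    previous row so memory stays proportional to len(reference).
    """
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # Extend the run that ended at the previous residue of each.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def has_consecutive(candidate, reference, n):
    """True if `candidate` shares at least `n` consecutive residues with `reference`."""
    return longest_shared_run(candidate, reference) >= n


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(longest_shared_run("AAALPGIEKAAA", "VSLPGIEKNQ"))
```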
  • an α-crystalline domain polypeptide of the present disclosure contains one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to the amino acids bolded in the α-crystalline domain from ACD15 below: LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL (SEQ ID NO: 12).
  • Polypeptides that are homologs of ACD15 or ACD21 may include polypeptides having various amino acid additions, deletions, or substitutions relative to the amino acid sequences of ACD15 or ACD21.
  • polypeptides that are homologs of ACD15 or ACD21 contain non-conservative changes of certain amino acids relative to ACD15 or ACD21.
  • polypeptides that are homologs of ACD15 or ACD21 contain conservative changes of certain amino acids relative to ACD15 or ACD21, and thus may be referred to as conservatively modified variants.
  • a conservatively modified variant may include individual substitutions, deletions or additions to a polypeptide sequence which result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure.
  • the following eight groups contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
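The eight groups above can be represented as a simple lookup. The sketch below is illustrative only and uses hypothetical names (`CONSERVATIVE_GROUPS`, `is_conservative`, `same_or_conservative`). Note that methionine (M) appears in two groups as recited, so it is treated here as a conservative substitution for members of either group.

```python
# The eight conservative-substitution groups as recited, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(a, b):
    """True if `a` and `b` are different residues sharing at least one group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def same_or_conservative(a, b):
    """True if `b` is the same residue as `a` or a conservative substitution for it."""
    return a.upper() == b.upper() or is_conservative(a, b)


if __name__ == "__main__":
    print(is_conservative("D", "E"))
    print(is_conservative("C", "L"))
```

Because group membership is not transitive across groups, cysteine and leucine are not treated as conservative substitutions for one another even though each pairs with methionine.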
  • Polypeptides that are homologs of ACD15 or ACD21 may contain the same amino acid or a conservative amino acid substitution at one or more positions corresponding to positions of the amino acid sequence of ACD15 or ACD21.
  • the homolog contains the same or a conservative amino acid substitution at a position corresponding to D (38) of ACD15.
  • the homolog contains the same or a conservative amino acid substitution at positions corresponding to LPG (52-54) of ACD15.
  • the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (70) of ACD15.
  • the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (76) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to F (104) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at positions corresponding to LP (110-111) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (127) of ACD15. The homolog may contain various combinations (e.g.
  • the homolog contains the same amino acid or conservative amino acid substitutions at a position corresponding to all of D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), and G (127) of ACD15.
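A position in a homolog that corresponds to a given ACD15 position is typically read off a pairwise alignment. The following sketch assumes such an alignment is already available as two equal-length gapped strings with the reference first, walks it to find the homolog residue in the same column as each reference position, and applies the same-or-conservative test using the eight groups recited earlier. All names are hypothetical, and the alignment in the usage example is a toy, not a real homolog.

```python
# Conservative-substitution groups as recited earlier in this section.
GROUPS = [set("AG"), set("DE"), set("NQ"), set("RK"),
          set("ILMV"), set("FYW"), set("ST"), set("CM")]

# 1-based ACD15 positions recited above: D38, LPG52-54, G70, G76, F104,
# LP110-111, G127.
ACD15_SITES = [38, 52, 53, 54, 70, 76, 104, 110, 111, 127]


def same_or_conservative(a, b):
    """True if `b` equals `a` or shares a conservative group with it."""
    return a == b or any(a in g and b in g for g in GROUPS)


def corresponding_residues(aligned_ref, aligned_hom, positions, gap="-"):
    """Map 1-based reference positions to (reference residue, homolog residue).

    The homolog residue is None when the homolog has a gap in that column.
    """
    if len(aligned_ref) != len(aligned_hom):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(positions)
    found = {}
    ref_pos = 0
    for r, h in zip(aligned_ref, aligned_hom):
        if r == gap:
            continue  # insertion in the homolog; no reference position here
        ref_pos += 1
        if ref_pos in wanted:
            found[ref_pos] = (r, None if h == gap else h)
    return found


def sites_conserved(aligned_ref, aligned_hom, positions):
    """Per position, True if the homolog has the same or a conservative residue."""
    pairs = corresponding_residues(aligned_ref, aligned_hom, positions)
    return {pos: h is not None and same_or_conservative(r, h)
            for pos, (r, h) in pairs.items()}


if __name__ == "__main__":
    # Toy alignment: reference "MDLPG" with one insertion in the homolog.
    print(sites_conserved("MD-LPG", "MEALPA", [2, 3, 4, 5]))
```

For a real comparison, the full aligned ACD15 sequence and `ACD15_SITES` would be passed in place of the toy inputs; `all(sites_conserved(...).values())` then corresponds to the case in which every recited position is conserved.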
  • Some small HSPs are known to form dynamic oligomeric assemblies (M. Haslbeck, E.
  • sHSPs are found across all kingdoms of life and generally contain an α-crystalline domain (ACD).
  • sHSP polypeptides include but are not limited to, the plant proteins ACD15 (e.g., SEQ ID NO: 11) and ACD21 (e.g., SEQ ID NO: 13) and homologs and orthologs thereof; the plant sHSPs At1g52560 (e.g., SEQ ID NO: 14), At4g27670 (e.g., SEQ ID NO: 15), At5g51440 (e.g., SEQ ID NO: 16), and At4g25200 (e.g., SEQ ID NO: 17) and homologs and orthologs thereof; the human sHSPs HSPB1 (e.g., SEQ ID NO: 18), HSPB2 (e.g., SEQ ID NO: 19), HSPB3 (e.g., SEQ ID NO: 20), HSPB4 (e.g., SEQ ID NO: 21), HSPB5 (e.g., SEQ ID NO: 22), HSPB6 (e.g
  • an α-crystalline domain polypeptide of the present disclosure may have an amino acid or DNA sequence (as applicable) with at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one or more of SEQ ID NOs: 11-120, 310, 401-409, 413-414, 416-417, 419-420, 422-423, 425-426, 428-429, 431-432, 447, and 472-476.
  • α-crystalline domain polypeptides may be modified in view of a particular intended application.
  • the Arabidopsis sHSPs At1g52560, At4g27670, At5g51440, and At4g25200 contain organelle targeting peptides that target them to the mitochondria or chloroplasts. Such targeting peptides may be modified and/or removed and replaced with, e.g., nuclear localization signals to localize these polypeptides to the nucleus.
  • ACD15 Polypeptides [0141] Certain aspects of the present disclosure relate to ACD15 polypeptides.
  • ACD15 is an α-crystalline domain-containing protein from Arabidopsis.
  • ACD15 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • ACD15 polypeptides are known in the art. As described herein, Arabidopsis ACD15 (along with ACD21) drives accumulation of MBD5/6 complex silencing assemblies at methyl-CG sites.
  • an ACD15 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, or 142 or more consecutive amino acids of an endogenous or wild-type full-length ACD15 polypeptide (e.g., SEQ ID NO: 11).
  • ACD15 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide.
  • ACD15 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide.
  • ACD15 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide.
  • Suitable ACD15 polypeptides may be identified from monocot and dicot plants.
  • an ACD15 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least
  • ACD21 Polypeptides [0148] Certain aspects of the present disclosure relate to ACD21 polypeptides, such as, e.g., SEQ ID NO: 13.
  • ACD21 is an α-crystalline domain-containing protein from Arabidopsis.
  • ACD21 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • ACD21 polypeptides are known in the art.
  • an ACD21 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, or 205 or more consecutive amino acids of an endogenous or wild-type full-length ACD21 polypeptide (e.g., SEQ ID NO: 13).
  • ACD21 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide.
  • ACD21 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide.
  • ACD21 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide.
  • Suitable ACD21 polypeptides may be identified from monocot and dicot plants.
  • suitable ACD21 polypeptides may include, for example, At1g54850 from Arabidopsis (SEQ ID NO: 13), as well as those listed in Table 4 and/or Table 5, and homologs and orthologs thereof.
  • an ACD21 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of an ACD21 polypeptide described here
  • HSPB1 Polypeptides also referred to as hHSP1 polypeptides.
  • HSPB1 (SEQ ID NO: 18) is a sHSP from Homo sapiens.
  • HSPB1 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • an HSPB1 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB1 polypeptide (e.g., SEQ ID NO: 18).
  • HSPB1 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide.
  • HSPB1 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide.
  • HSPB1 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide.
  • a HSPB1 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a HSPB1 polypeptide described herein, such as, for example, the HSPB1 protein represented by Uniprot ID P04792, Gene ID No.
  • HSPB3 polypeptides also referred to as hHSP3 polypeptides.
  • HSPB3 (SEQ ID NO: 20) is a sHSP from Homo sapiens.
  • HSPB3 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • an HSPB3 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB3 polypeptide (e.g., SEQ ID NO: 20).
  • HSPB3 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide.
  • HSPB3 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide.
  • HSPB3 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide.
  • a HSPB3 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a HSPB3 polypeptide described herein, such as, for example, the HSPB3 protein represented by Uniprot ID Q12988, Gene ID No.
  • HSPB5 polypeptides also referred to as hHSP5 polypeptides.
  • HSPB5 (SEQ ID NO: 22) is a sHSP from Homo sapiens.
  • HSPB5 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • an HSPB5 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB5 polypeptide (e.g., SEQ ID NO: 22).
  • HSPB5 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide.
  • HSPB5 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide (e.g., SEQ ID NO: 22).
  • HSPB5 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide.
  • a HSPB5 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a HSPB5 polypeptide described herein, such as, for example, the HSPB5 protein represented by Unipro
  • HSPB8 polypeptides are also referred to as hHSP8 polypeptides.
  • HSPB8 (SEQ ID NO: 25) is a sHSP from Homo sapiens.
  • HSPB8 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
  • an HSPB8 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB8 polypeptide (e.g., SEQ ID NO: 25).
  • HSPB8 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide.
  • HSPB8 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide (e.g., SEQ ID NO: 25).
  • HSPB8 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide.
  • a HSPB8 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a HSPB8 polypeptide described herein, such as, for example, the HSPB8 protein represented by Unipro
  • StykC (STKYC) domains
  • one or more α-crystalline domain polypeptides (e.g., sHSPs)
  • one or more α-crystalline domain polypeptides are not recombinant.
  • a StykC (STKYC) domain may be used to mediate co-targeting of a polypeptide of interest and a α-crystalline domain polypeptide.
  • certain aspects of the present disclosure are related to recombinant polypeptides that contain a heterologous StykC domain.
  • targeting α-crystalline domain polypeptides includes targeting of a polypeptide with a StykC domain (also referred to herein as a “Sticky-C”, “Sticky C”, “StkyC”, “STKYC”, and/or “STC” domain).
  • StykC domains may include, for example, a conserved domain of MBD6 or MBD7 that recruits a α-crystalline domain polypeptide such as, for example, the ACD15 and ACD21 proteins. Accordingly, polypeptides of the present disclosure may contain a StykC domain.
  • a StykC domain that is present in at least one to at least ten copies relative to each dCas9 protein in a SunTag system results in the establishment of a nucleation site for the aggregation of other α-crystalline domain proteins that may or may not be recombinant and are present diffusely around the nucleic acid target.
  • a StykC domain contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length StykC domain.
  • a StykC domain includes sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain.
  • a StykC domain may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain.
  • StykC domains may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain.
  • a StykC domain of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a StykC domain described herein, such as, for example, the StykC domain sequence provided in SEQ ID NO: 182.
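  • The consecutive-residue and percent-identity criteria above can be checked computationally. The following is a minimal sketch, not a method taken from the disclosure: it assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over columns where neither sequence has a gap, which is only one of several common conventions. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned and of equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of consecutive residues found in both sequences."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # extend the run that ended at the previous residue pair
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

  • Under these assumptions, a candidate domain would meet an "at least about 80% identity" threshold when `percent_identity` returns 80.0 or more against the reference, and an "at least 20 consecutive amino acids" threshold when `longest_shared_run` returns 20 or more.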
  • the StykC domain may be derived from an MBD6 polypeptide.
  • Exemplary StykC domains from MBD6 polypeptides are illustrated below in Table 6.
  • sf-6059413 Attorney Docket No.: 26223-20027.40
  • Table 6 – StykC homologies from MBD6 polypeptides See Table XH for associated SEQ ID NOs.
  • the StykC domain may be derived from an MBD7 polypeptide.
  • Exemplary StykC domains from MBD7 polypeptides are illustrated below in Table 7.
  • Certain aspects of the present disclosure relate to co-targeting a target nucleic acid with 1) one or more of a genetic modifier polypeptide (including but not limited to, for example, a TRBIP1 polypeptide and/or an MQ1 polypeptide); and 2) one or more of a α-crystalline domain polypeptide.
  • Certain aspects of the present disclosure relate to co-targeting a target nucleic acid with 1) one or more of a non-genetic modifier polypeptide (including but not limited to, for example, a fluorescent marker polypeptide, such as a GFP); and 2) one or more of a α-crystalline domain polypeptide.
  • Co-targeting a target nucleic acid with 1) one or more of a genetic modifier polypeptide (including but not limited to, for example, a TRBIP1 polypeptide and/or an MQ1 polypeptide); and 2) one or more of a α-crystalline domain polypeptide may result in modification of the nucleic acid.
  • the co-targeting may result in reduced expression of the target nucleic acid.
  • the genetic modifier polypeptide is a DNA methyltransferase polypeptide
  • the co-targeting may result in increased efficiency of methylation.
  • more than one type of genetic modifier polypeptide is co-targeted with one or more α-crystalline domain polypeptides.
  • one or more of a transcriptional repressor polypeptide (e.g. a TRBIP1 polypeptide) and a DNA methyltransferase polypeptide (e.g. an MQ1 polypeptide) are co-targeted along with one or more α-crystalline domain polypeptides to a target nucleic acid.
  • the target nucleic acid may experience an increase in DNA methylation of about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 125%, about 150%, about 175%, about 200%, about 250%, or about 300% or more as compared to a corresponding control (e.g.
  • the target nucleic acid may experience a decrease in DNA methylation of about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 125%, about 150%, about 175%, about 200%, about 250%, or about 300% or more as compared to a corresponding control (e.g. a nucleic acid targeted with only α-crystalline domain polypeptides as described herein).
  • the one or more genetic modifier polypeptides comprises one or more transcriptional repressor polypeptide (e.g. a TRBIP1 polypeptide) and/or a DNA methyltransferase polypeptide (e.g.
  • the target nucleic acid may experience a decrease in transcriptional expression of about 1%, about 2%, about 3%, about 4%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100% as compared to a corresponding control (e.g. a nucleic acid targeted with only α-crystalline domain polypeptides as described herein).
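  • The percentage increases and decreases recited above are stated relative to a corresponding control. As a minimal sketch of that arithmetic, assuming the change is expressed as a fraction of the control value (the disclosure does not spell out the formula, and the function name is hypothetical):

```python
def percent_change(treated: float, control: float) -> float:
    """Signed percent change of a treated measurement relative to its control.

    Positive values are increases, negative values are decreases.
    Assumption: both values are in the same units (e.g. percent methylated
    cytosines at the target, or normalized transcript abundance).
    """
    if control == 0:
        raise ValueError("control level must be non-zero")
    return 100.0 * (treated - control) / control
```

  • For example, a target whose methylation level goes from 20 in the control to 50 when co-targeted would be described under this convention as an increase of about 150%, and a transcript level falling from 40 to 10 as a decrease of about 75%.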
  • the co-targeting results in the formation of a polypeptide aggregate or “cloud” of concentrated polypeptides around the target nucleic acid.
  • the concentrated polypeptides extend no more than about 1 kb on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend no more than about 500 bp, 400 bp, 300 bp, 200 bp, or 100 bp on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend more than about 1 kb on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend more than about 2 kb, 3 kb, 4 kb, 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, 10 kb, or more than 10 kb on either side of the target nucleic acid.
  • the size of the polypeptide aggregate depends on the amount of α-crystalline domain polypeptides available. In some embodiments, the expression level of the one or more co-targeted α-crystalline domain polypeptides is tuned (increased or decreased) in order to modulate the dimensions of the concentrated polypeptides around the target nucleic acid.
  • a genetic modifier polypeptide being targeted “to” the target nucleic acid and/or a α-crystalline domain polypeptide being targeted “to” the target nucleic acid may include the genetic modifier polypeptide and/or the α-crystalline domain polypeptide having activity no more than about 1 kb on either side of the target nucleic acid sequence, including, for example, no more than about 500 bp, 400 bp, 300 bp, 200 bp, 100 bp, 75 bp, 50 bp, 25 bp, 10 bp, 5 bp, 1 bp or fewer on either side of the target nucleic acid sequence.
  • the activity of the genetic modifier polypeptide is limited to the target nucleic acid sequence itself.
  • a genetic modifier polypeptide being targeted “to” the target nucleic acid and/or a α-crystalline domain polypeptide being targeted “to” the target nucleic acid may include the genetic modifier polypeptide and/or the α-crystalline domain polypeptide having activity more than about 1 kb on either side of the target nucleic acid sequence, such as, for example, more than about 2 kb, 3 kb, 4 kb, 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, 10 kb, or more than 10 kb on either side of the target nucleic acid.
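  • The "on either side of the target nucleic acid sequence" windows described above can be expressed as a simple interval test. This is an illustrative sketch only; the coordinate convention (inclusive, same chromosome, same coordinate system) and the function name are assumptions.

```python
def within_flank(position: int, target_start: int, target_end: int,
                 flank_bp: int = 1000) -> bool:
    """True if `position` lies within `flank_bp` of the target interval.

    Assumption: coordinates are inclusive and on the same chromosome.
    The default of 1000 bp mirrors the "about 1 kb on either side" window.
    """
    if target_start > target_end:
        raise ValueError("target_start must not exceed target_end")
    if flank_bp < 0:
        raise ValueError("flank_bp must be non-negative")
    return target_start - flank_bp <= position <= target_end + flank_bp
```

  • With a target spanning positions 2000 to 2100, a modified site at position 1500 falls inside the default 1 kb window but outside a 500 bp window, so narrowing `flank_bp` corresponds to the more tightly localized embodiments.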
  • polypeptides described herein can form different clustering patterns when expressed in cells, which can be observed in some instances by, for example, fluorescent microscopy of fluorescently tagged versions of the polypeptides (such as, e.g., fluorescently-tagged α-crystalline domain polypeptides targeted to a target nucleic acid), immunogold microscopy against, for example, α-crystalline domain polypeptides, or other methods of observing selective localization of polypeptides of interest within a cell.
  • Such nuclear bodies may form (and, in some instances, be visible by microscopy), comprising aggregates of, for example, α-crystalline domain polypeptides targeted to a target nucleic acid as described herein, and in some instances, further comprising additional types of polypeptides, such as, for example, one or more genetic modifier polypeptides.
  • one or more α-crystalline domain polypeptides that aggregate into a nuclear body are derived from one or more species selected from the group consisting of: Homo sapiens, Arabidopsis thaliana, Chlamydomonas reinhardtii, Saccharolobus solfataricus, Zea mays, Solanum tuberosum, Solanum lycopersicum, Oryza sativa, and Deinococcus radiodurans.
  • Such nuclear bodies may take a variety of different forms. For example, in some instances, α-crystalline domain polypeptide-containing aggregates may form foci, which may or may not be clearly visible by microscopy and may vary in number and size per nucleus.
  • α-crystalline domain polypeptide-containing aggregates may form many nuclear foci, while in other embodiments, α-crystalline domain polypeptide-containing aggregates may form few or no apparent nuclear foci.
  • the number of foci per nucleus may be relatively consistent across a sample (for example, across a population of cells collected from the same organism at the same time), while in other embodiments, the number of foci per nucleus may vary cell to cell across a given sample.
  • some or all foci in a nucleus may be relatively large in diameter (e.g., ⁇ 0.5-1, 1-2, or >2 microns in diameter); in other embodiments, some or all foci in a nucleus are relatively small in diameter (e.g., less than about 0.5 microns in diameter).
  • one or a plurality of foci may have a relatively strong signal compared to the background signal when assessed microscopically (for example, in the case of fluorescently tagged α-crystalline domain polypeptides, a relatively bright signal over background), while in other embodiments, one or a plurality of foci may have a relatively dim signal compared to the background signal, even to the point of not being effectively distinguishable from background.
  • the number of foci per cell may vary from, for example, none to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11-15, 16-20, 21-30, 31-40, 41-50, or more than 50 per cell. In some embodiments, there are about 1, 2, or 3 foci per cell. In some embodiments, there are no observable foci per cell. In some embodiments, a cell or nucleus may demonstrate relatively strong diffuse localization signals (i.e., non-foci), indicating ACD polypeptides expressed at relatively high levels but with relatively low levels of aggregation. In some embodiments, one or more foci are observed in a cell nucleus.
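  • When foci are counted per cell from microscopy images, the counts can be summarized using the same ranges recited above. The following is a hedged sketch: the bin labels follow the text, while the function names and the idea of tallying with a counter are illustrative assumptions rather than a procedure described in the disclosure.

```python
from collections import Counter
from typing import Iterable

# Ranges taken from the text: none, 1-10 individually, then grouped ranges.
_GROUPED_BINS = ((11, 15), (16, 20), (21, 30), (31, 40), (41, 50))


def foci_bin(count: int) -> str:
    """Map a per-cell foci count to the bin label used in the text."""
    if count < 0:
        raise ValueError("foci count cannot be negative")
    if count == 0:
        return "none"
    if count <= 10:
        return str(count)
    for low, high in _GROUPED_BINS:
        if low <= count <= high:
            return f"{low}-{high}"
    return ">50"


def summarize_foci(counts: Iterable[int]) -> Counter:
    """Tally how many cells fall into each foci-count bin."""
    return Counter(foci_bin(c) for c in counts)
```

  • A sample in which most cells fall in the "1", "2", or "3" bins would match the embodiments with about 1, 2, or 3 foci per cell, whereas a sample dominated by "none" would match the embodiments with no observable foci.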
  • one or more foci are observed outside the nucleus, such as, for example, in an organelle or in the cytoplasm.
  • relatively large foci and/or foci with relatively strong localization signals represent relatively high levels of ACD polypeptide multimerization.
  • relatively small foci and/or foci with relatively weak localization signals represent relatively low levels of ACD polypeptide multimerization.
  • foci represent aggregation around a particular genetic feature, such as, for example, chromocenters.
  • foci represent aggregation around a target nucleic acid sequence.
  • the size, number, distribution, and/or intensity of foci in a sample may be tuned to, for example, drive more or less multimerization of ACD polypeptides and/or increase, decrease, or otherwise modulate (e.g., concentrate or disperse) targeting of a genome modification polypeptide.
  • different ACD proteins form different localization patterns—e.g., different patterns of nuclear bodies.
  • ACD proteins from different organisms may form different patterns of nuclear bodies.
  • ACD proteins from Chlamydomonas reinhardtii, Zea mays, Saccharolobus solfataricus, and/or human HSPB1 form distinct foci with about 1, 2, 3, 4, or 5 foci per nucleus and lead to high multimerization and localization of the majority of multimerized proteins to a target.
  • HSPB4, HSPB5, HSPB8, and ACD proteins from Oryza sativa and Deinococcus radiodurans lead to relatively lower levels of multimerization and form, for example, many smaller clusters.
  • many smaller clusters could be efficient in, for example, scanning a genome for target sites.
  • having many smaller foci may increase gene editing efficiency.
  • one or more foci overlap with chromocenters. In some embodiments, one or more foci partially overlap with chromocenters. In some embodiments, one or more foci do not overlap with chromocenters. In some embodiments, partial overlapping of foci and chromocenters indicates interaction between the ACD proteins and chromocenter complexes, such as, for example, interaction with polypeptides in the chromocenter complexes having sequence homology to an ACD protein (such as, for example, ACD15 or ACD21).
  • At least partial overlapping of foci and chromocenters indicates potential for targeting and/or increasing editing efficiency of nucleic acid sequences present in heterochromatin.
  • Targeting Domains
  • Certain aspects of the present disclosure relate to recombinant polypeptides that contain a targeting domain and are capable of being targeted to a target nucleic acid.
  • a targeting domain generally refers to a polypeptide or amino acid sequence that is able to facilitate or is involved in facilitating, either directly or indirectly, targeting of a recombinant polypeptide to a target nucleic acid sequence.
  • the targeting domain may directly confer the specific targeting functionality of the genetic modifier polypeptide or α-crystalline domain polypeptide to the target nucleic acid, or the targeting domain may be associated with or interact with another agent that confers the specific targeting functionality of the genetic modifier polypeptide or α-crystalline domain polypeptide to the target nucleic acid.
  • the targeting domain may associate with a DNA-binding polypeptide that is able to be targeted to a target nucleic acid. Suitable targeting domains for use in the present disclosure are described herein and will be readily apparent to one of skill in the art.
  • the targeting domain is or may include a DNA-binding domain or have DNA-binding activity. In some embodiments, this DNA-binding activity is achieved through a heterologous DNA-binding domain (e.g. binds with a sequence affinity other than that of a DNA-binding domain that may be present in the endogenous protein).
  • recombinant polypeptides of the present disclosure including, for example, recombinant genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides, contain a DNA-binding domain.
  • Recombinant polypeptides of the present disclosure may contain one DNA-binding domain or they may contain more than one DNA-binding domain.
  • Heterologous DNA-binding domains may be recombinantly fused to a genetic modifier polypeptide, α-crystalline domain polypeptides, and/or a non-genetic modifier polypeptide of the present disclosure such that the polypeptide is then targeted to a specific nucleic acid sequence and can facilitate reduced expression and/or silencing of the specific nucleic acid.
  • the DNA-binding domain is a zinc finger domain.
  • a zinc finger domain generally refers to a DNA-binding protein domain that contains zinc fingers, which are small protein structural motifs that can coordinate one or more zinc ions to help stabilize their protein folding.
  • Zinc fingers were first identified as DNA-binding motifs (Miller et al., 1985), and numerous other variations of them have been characterized. Recent progress has been made that allows the engineering of DNA-binding proteins that specifically recognize any desired DNA sequence. For example, it was shown that a three-finger zinc finger protein could be constructed to block the expression of a human oncogene that was transformed into a mouse cell line (Choo and Klug, 1994).
  • Zinc fingers can generally be classified into several different structural families and typically function as interaction modules that bind DNA, RNA, proteins, or small molecules. Suitable zinc finger domains of the present disclosure may contain two, three, four, five, six, seven, eight, or nine zinc fingers.
  • suitable zinc finger domains may include, for example, Cys2His2 (C2H2) zinc finger domains, C-x8-C-x5-C-x3-H (CCCH) zinc finger domains, multi-cysteine zinc finger domains, and zinc binuclear cluster domains.
  • the DNA-binding domain binds a specific nucleic acid sequence.
  • the DNA-binding domain may bind a sequence that is at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, or a higher number of nucleotides in length.
  • a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide of the present disclosure further contains two N-terminal CCCH zinc finger domains.
  • the zinc finger domain is an engineered zinc finger array, such as a C2H2 zinc finger array.
  • Engineered arrays of C2H2 zinc fingers can be used to create DNA-binding proteins capable of targeting desired genomic DNA sequences. Methods of engineering zinc finger arrays are well known in the art, and include, for example, combining smaller zinc fingers of known specificity.
  • An exemplary zinc finger is ZF108 which targets the FWA locus of Arabidopsis and whose amino acid sequence is provided in SEQ ID NO: 486.
  • genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides of the present disclosure may contain a DNA-binding domain other than a zinc finger domain.
  • DNA-binding domains may include, for example, TAL (transcription activator-like) effector targeting domains, helix-turn-helix family DNA-binding domains, basic domains, ribbon-helix-helix domains, TBP (TATA-box binding protein) domains, barrel dimer domains, RHD domains (Rel homology domain), BAH (bromo-adjacent homology) domains, SANT domains, Chromodomains, Tudor domains, Bromodomains, PHD domains (plant homeo domain), WD40 domains, and MBD domains (methyl-CpG-binding domain).
  • the DNA-binding domain is a TAL effector targeting domain.
  • TAL effectors generally refer to secreted bacterial proteins, such as those secreted by Xanthomonas or Ralstonia bacteria when infecting various plant species.
  • TAL effectors are capable of binding promoter sequences in the host plant and activating the expression of plant genes that aid in bacterial infection.
  • TAL effectors recognize plant DNA sequences through a central repeat targeting domain that contains a variable number of approximately 34 amino acid repeats.
  • TAL effector targeting domains can be engineered to target specific DNA sequences.
  • RNA-Guided DNA-Binding Proteins and Systems
  • the targeting domain is or may include an RNA-guided DNA binding protein.
  • the targeting domain may be an RNA-guided DNA binding protein (e.g.
  • CRISPR systems naturally use small base-pairing guide RNAs to target and cleave foreign DNA elements in a sequence-specific manner (Wiedenheft et al., 2012).
  • There are CRISPR systems in different organisms that may be used to target proteins of the present disclosure to a target nucleic acid.
  • One of the simplest systems is the type II CRISPR system from Streptococcus pyogenes.
  • crRNA CRISPR RNA
  • tracrRNA partially complementary trans-acting RNA
  • dCAS9 programmable RNA-dependent DNA-binding protein
  • a duplex gRNA-dCAS9 that binds target sequences without endonuclease activity has been used to tether regulatory proteins, such as transcriptional activators or repressors, to promoter regions in order to modify gene expression (Gilbert et al., 2013), and CAS9 transcriptional activators have been used for target specificity screening and paired nickases for cooperative genome engineering (Mali et al., 2013, Nature Biotechnology 31:833-838).
  • dCAS9 may be used as a modular RNA-guided platform to recruit different proteins to DNA in a highly specific manner.
  • One of skill in the art would recognize other RNA-guided DNA binding protein/RNA complexes that can be used equivalently to CRISPR-CAS9.
  • the CAS polypeptide may be a Cas9 polypeptide having an amino acid sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of dCas9 (SEQ ID NO: 487).
  • the CAS polypeptide may be a CasΦ polypeptide (also known as CasPhi and Cas12J) having an amino acid sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of CasPhi (SEQ ID NO: 488).
  • Targeting using CRISPR-based systems may be beneficial over other genome targeting techniques in certain instances. For example, one need only change the guide RNAs in order to target fusion proteins to a new genomic location, or even multiple locations simultaneously.
  • guide RNAs can be extended to include sites for binding to proteins, such as the MS2 protein, which can be fused to proteins of interest.
  • Variations of CRISPR-based targeting may also be used herein (e.g. a SunTag system) to facilitate targeting of recombinant polypeptides to a target nucleic acid, as will be readily apparent to one of skill in the art.
  • Suitable CRISPR-based targeting systems and variations thereof are well-known in the art and may be used in the embodiments of the present disclosure in view of the guidance provided herein.
  • WO2018/136783 describes a SunTag-based targeting system for use in plants.
  • WO2018/136783 is incorporated herein by reference in its entirety.
  • SunTag-based targeting in the context of the present disclosure may involve the recruitment of multiple copies of a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide to a target nucleic acid in plants via CRISPR-based targeting.
  • this specific targeting involves the use of a system that includes (1) a nuclease-deficient CAS9 (dCAS9) polypeptide that is recombinantly fused to a multimerized epitope, (2) a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide that is recombinantly fused to an affinity polypeptide, and (3) a guide RNA (gRNA).
  • dCAS9 portion of the dCAS9-multimerized epitope fusion protein is involved with targeting a target nucleic acid as directed by the guide RNA.
  • the multimerized epitope portion of the dCAS9-multimerized epitope fusion protein is involved with binding to the affinity polypeptide (which is recombinantly fused to a transcriptional repressor).
  • the affinity polypeptide portion of the genetic modifier polypeptide-, α-crystalline domain polypeptide-, and/or non-genetic modifier polypeptide-affinity polypeptide fusion protein is involved with binding to the multimerized epitope so that the genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide can be in association with dCAS9.
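  • The three-component SunTag arrangement described above can be modeled as a small data structure to make the recruitment logic explicit. This is a conceptual sketch only: the class and function names are hypothetical, and the assumption that each epitope copy recruits at most one affinity-polypeptide fusion is a simplification, not a statement from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DCas9EpitopeFusion:
    """Component (1): dCAS9 fused to a multimerized epitope."""
    epitope: str          # e.g. "GCN4"
    epitope_copies: int   # number of epitope repeats in the fusion


@dataclass(frozen=True)
class EffectorFusion:
    """Component (2): an effector fused to an affinity polypeptide."""
    effector: str         # e.g. an alpha-crystalline domain polypeptide
    binds_epitope: str    # epitope recognized by the affinity polypeptide


@dataclass(frozen=True)
class GuideRNA:
    """Component (3): the guide RNA that directs dCAS9 to the target."""
    name: str
    target_sequence: str


def max_recruited_copies(scaffold: DCas9EpitopeFusion,
                         effector: EffectorFusion) -> int:
    """Upper bound on effector copies recruited per dCAS9 molecule.

    Assumption: one effector fusion per epitope copy, and no recruitment
    when the affinity polypeptide does not recognize the epitope.
    """
    if effector.binds_epitope != scaffold.epitope:
        return 0
    return scaffold.epitope_copies
```

  • For instance, a scaffold with ten GCN4 epitope copies paired with an effector fused to an anti-GCN4 scFv would, under this simplification, bring up to ten effector copies to the site specified by the guide RNA.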
  • targeting α-crystalline domain polypeptides includes targeting of a polypeptide with a StykC domain.
  • StykC domains include, for example, a conserved domain of MBD6 that recruits the ACD15 and ACD21 proteins.
  • a StykC domain that is present in at least one to at least ten copies relative to each dCas9 protein in a SunTag system results in the establishment of a nucleation site for the aggregation of other α-crystalline domain proteins that may or may not be recombinant and are present diffusely around the nucleic acid target.
  • certain aspects of the present disclosure involve CRISPR-based targeting of a target nucleic acid, which may involve use of a CRISPR-CAS9 targeting system.
  • CRISPR-CAS9 systems may involve the use of a CRISPR RNA (crRNA), a trans-activating CRISPR RNA (tracrRNA), and a CAS9 protein.
  • the crRNA and tracrRNA aid in directing the CAS9 protein to a target nucleic acid sequence, and these RNA molecules can be specifically engineered to target specific nucleic acid sequences.
  • certain aspects of the present disclosure involve the use of a single guide RNA (gRNA) that reconstitutes the function of the crRNA and the tracrRNA.
  • certain aspects of the present disclosure involve a CAS9 protein that does not exhibit DNA cleavage activity (dCAS9).
  • gRNA molecules may be used to direct a dCAS9 protein to a target nucleic acid sequence.
  • Certain aspects of the present disclosure involving SunTag-based targeting relate to recombinant polypeptides that contain an affinity polypeptide.
  • Affinity polypeptides of the present disclosure may bind to one or more epitopes (e.g. a multimerized epitope).
  • an affinity polypeptide is present in a recombinant polypeptide that contains a transcriptional repressor polypeptide and an affinity polypeptide.
  • affinity polypeptides are known in the art and may be used herein.
  • the affinity polypeptide should be stable in the conditions present in the intracellular environment of a target cell of interest, such as, for example, a plant cell or an animal cell. Additionally, the affinity polypeptide should specifically bind to its corresponding epitope with minimal cross-reactivity.
  • the affinity polypeptide may be an antibody such as, for example, an scFv.
  • the antibody may be optimized for stability in the plant intracellular environment.
  • a suitable affinity polypeptide that is an antibody may contain an anti-GCN4 scFv domain.
  • affinity polypeptides include, for example, proteins with SH2 domains or the domain itself, 14-3-3 proteins, proteins with SH3 domains or the domain itself, the Alpha-Syntrophin PDZ protein interaction domain, the PDZ signal sequence, or proteins from plants which can recognize AGO hook motifs (e.g. AGO4 from Arabidopsis thaliana).
  • SunTag-based targeting relate to genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides that contain an epitope or a multimerized epitope. Epitopes of the present disclosure may bind to an affinity polypeptide.
  • an epitope or multimerized epitope is present in a recombinant polypeptide that contains a dCAS9 polypeptide.
  • Epitopes of the present disclosure may be used for recruiting affinity polypeptides (and any polypeptides they may be recombinantly fused to) to a dCAS9 polypeptide.
  • the dCAS9 polypeptide may be fused to one copy of an epitope, multiple copies of an epitope, more than one different epitope, or multiple copies of more than one different epitope as further described herein.
  • epitopes and multimerized epitopes are known in the art and may be used herein.
  • the epitope or multimerized epitope may be any polypeptide sequence that is specifically recognized by an affinity polypeptide of the present disclosure.
  • Exemplary epitopes may include a c-Myc affinity tag, an HA affinity tag, a His affinity tag, an S affinity tag, a methionine-His affinity tag, an RGD-His affinity tag, a FLAG octapeptide, a strep tag or strep tag II, a V5 tag, a VSV-G epitope, and a GCN4 epitope.
  • exemplary amino acid sequences that may serve as epitopes and multimerized epitopes include, for example, phosphorylated tyrosines in specific sequence contexts recognized by SH2 domains, characteristic consensus sequences containing phosphoserines recognized by 14-3-3 proteins, proline rich peptide motifs recognized by SH3 domains, the PDZ protein interaction domain or the PDZ signal sequence, and the AGO hook motif from plants.
  • Epitopes described herein may also be multimerized.
  • Multimerized epitopes may include at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, or at least 24 or more copies of an epitope.
  • Multimerized epitopes may be present as tandem copies of an epitope, or each individual epitope may be separated from another epitope in the multimerized epitope by a linker or other amino acid sequence. Suitable linker regions are known in the art and are described herein.
  • the linker may be configured to allow the binding of affinity polypeptides to adjacent epitopes without, or without substantial, steric hindrance.
  • Linker sequences may also be configured to provide an unstructured or linear region of the polypeptide to which they are recombinantly fused.
  • the linker sequence may comprise e.g. one or more glycines and/or serines.
  • the linker sequences may be e.g. at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 or more amino acids in length.
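The multimerized epitope arrangement described above (several epitope copies, optionally separated by short glycine/serine linkers) can be illustrated with a minimal sketch. The function name, the placeholder epitope string, and the default linker below are hypothetical and chosen only for illustration; they are not sequences taken from the present disclosure.

```python
def build_multimerized_epitope(epitope: str, copies: int, linker: str = "GGSGG") -> str:
    """Return `copies` of `epitope` joined by `linker` between adjacent copies.

    An empty linker yields tandem copies with no spacer.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if not epitope:
        raise ValueError("epitope must be a non-empty amino acid string")
    return linker.join([epitope] * copies)


# Hypothetical placeholder epitope (not a sequence from the disclosure).
EXAMPLE_EPITOPE = "EELLSK"

# Ten copies separated by a five-residue glycine/serine linker.
example_array = build_multimerized_epitope(EXAMPLE_EPITOPE, copies=10, linker="GGSGG")
```

Only adjacent copies are separated by a linker, so an array of n copies contains n - 1 linkers; linker length can then be varied to probe steric hindrance between bound affinity polypeptides.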
  • Recombinant Nucleic Acids [0214] Certain aspects of the present disclosure relate to recombinant nucleic acids encoding recombinant polypeptides.
  • As used herein, the terms “polynucleotide,” “nucleic acid,” and variations thereof shall be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), to any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base, and to other polymers containing non-nucleotidic backbones, provided that the polymers contain nucleobases in a configuration that allows for base pairing and base stacking, as found in DNA and RNA.
  • nucleic acid sequence modifications for example, substitution of one or more of the naturally occurring nucleotides with an analog, and inter-nucleotide modifications.
  • symbols for nucleotides and polynucleotides are those recommended by the IUPAC-IUB Commission of Biochemical Nomenclature.
  • the present disclosure provides recombinant nucleic acids that encode 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and/or 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid.
  • Sequences of the polynucleotides of the present disclosure may be prepared by various suitable methods known in the art, including, for example, direct chemical synthesis or cloning.
  • formation of a polymer of nucleic acids typically involves sequential addition of 3'-blocked and 5'-blocked nucleotide monomers to the terminal 5'-hydroxyl group of a growing nucleotide chain, wherein each addition is effected by nucleophilic attack of the terminal 5'-hydroxyl group of the growing chain on the 3'-position of the added monomer, which is typically a phosphorus derivative, such as a phosphotriester, phosphoramidite, or the like.
  • the desired sequences may be isolated from natural sources by splitting DNA using appropriate restriction enzymes, separating the fragments using gel electrophoresis, and thereafter, recovering the desired polynucleotide sequence from the gel via techniques known to those of ordinary skill in the art, such as utilization of polymerase chain reactions (PCR; e.g., U.S. Pat. No. 4,683,195).
  • nucleic acids employed in the methods and compositions described herein may be codon optimized relative to a parental template for expression in a particular host cell.
  • Cells differ in their usage of particular codons, and codon bias corresponds to relative abundance of particular tRNAs in a given cell type.
  • codon optimization/deoptimization can provide control over nucleic acid expression in a particular cell type (e.g. bacterial cell, plant cell, mammalian cell, etc.).
  • Methods of codon optimizing a nucleic acid for tailored expression in a particular cell type are well-known to those of skill in the art.
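As a minimal sketch of the codon optimization idea described above, the following back-translates a protein sequence by choosing one preferred codon per amino acid. The preference table is a toy, hypothetical example covering only a few residues and does not reflect measured codon usage for any host; a practical workflow would draw on a full codon usage table for the intended host cell and would typically weigh additional constraints.

```python
# Toy, hypothetical preference table: amino acid -> codon assumed to be
# preferred in the intended host. Illustration only, not real usage data.
HYPOTHETICAL_PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "S": "TCC",
    "*": "TGA",  # stop codon
}


def back_translate(protein: str, preferred_codons: dict) -> str:
    """Encode `protein` using a single preferred codon per residue."""
    codons = []
    for residue in protein:
        if residue not in preferred_codons:
            raise ValueError(f"no preferred codon defined for residue {residue!r}")
        codons.append(preferred_codons[residue])
    return "".join(codons)


optimized_cds = back_translate("MKLS*", HYPOTHETICAL_PREFERRED_CODONS)
```

Always picking the single most preferred codon is the simplest strategy; deoptimization could be sketched the same way by supplying a table of disfavored codons.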
  • Methods of Identifying Sequence Similarity [0219] Various methods are known to those of skill in the art for identifying similar (e.g. homologs, orthologs, paralogs, etc.) polypeptide and/or polynucleotide sequences, including phylogenetic methods, sequence similarity analysis, and hybridization methods.
  • Phylogenetic trees may be created for a gene family by using a program such as CLUSTAL (Thompson et al. Nucleic Acids Res. 22: 4673-4680 (1994); Higgins et al. Methods Enzymol. 266: 383-402 (1996)) or MEGA (Tamura et al. Mol. Biol. & Evo. 24:1596-1599 (2007)).
  • Homologous sequences may also be identified by a reciprocal BLAST strategy. Evolutionary distances may be computed using the Poisson correction method (Zuckerkandl and Pauling, pp. 97-166 in Evolving Genes and Proteins, edited by V. Bryson and H.J. Vogel. Academic Press, New York (1965)). [0221] In addition, evolutionary information may be used to predict gene function. Functional predictions of genes can be greatly improved by focusing on how genes became similar in sequence (i.e. by evolutionary processes) rather than on the sequence similarity itself (Eisen, Genome Res. 8: 163-167 (1998)).
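The Poisson correction mentioned above estimates the number of substitutions per site as d = -ln(1 - p), where p is the proportion of aligned sites that differ. A minimal sketch follows; the function name and the choice to skip any column containing a gap character are assumptions made for illustration.

```python
import math


def poisson_corrected_distance(aligned_a: str, aligned_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) for two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    sites = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no ungapped sites to compare")
    p = sum(x != y for x, y in sites) / len(sites)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)
```

For small p the corrected distance is close to p itself; the correction grows as sequences diverge, reflecting multiple substitutions at the same site.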
  • sub-sequences that are particular to the clade. These sub-sequences, known as consensus sequences, can not only be used to define the sequences within each clade, but define the functions of these genes; genes within a clade may contain paralogous sequences, or orthologous sequences that share the same function (see also, for example, Mount, Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., page 543 (2001)).
  • Gapped BLAST in BLAST 2.0
  • Altschul et al. (1997) Nucleic Acids Res. 25:3389.
  • PSI-BLAST in BLAST 2.0
  • PSI-BLAST can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra.
  • the default parameters of the respective programs e.g., BLASTN for nucleotide sequences, BLASTX for proteins
  • Methods for the alignment of sequences and for the analysis of similarity and identity of polypeptide and polynucleotide sequences are well-known in the art.
  • sequence identity refers to the percentage of residues that are identical in the same positions in the sequences being analyzed.
  • sequence similarity refers to the percentage of residues that have similar biophysical / biochemical characteristics in the same positions (e.g. charge, size, hydrophobicity) in the sequences being analyzed.
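The two definitions above can be made concrete with a short sketch that scores a pairwise alignment column by column. The residue groupings used to call two residues "similar" and the choice of ungapped columns as the denominator are assumptions for illustration; alignment programs generally rely on substitution matrices and may define the denominator differently.

```python
# Hypothetical groupings of residues with similar biophysical character.
SIMILARITY_GROUPS = [
    set("ILVM"),   # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # positively charged
    set("DE"),     # negatively charged
    set("STNQ"),   # polar, uncharged
    set("AG"),     # small
]


def percent_identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILARITY_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

For example, aligning `MKLV-D` against `MRLIAE` compares five ungapped columns, of which two are identical and the remaining three fall within the same assumed group.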
  • Methods of alignment of sequences for comparison are well-known in the art, including manual alignment and computer assisted sequence alignment and analysis. This latter approach is a preferred approach in the present disclosure, due to the increased throughput afforded by computer assisted methods. As noted below, a variety of computer programs for performing sequence alignment are available, or can be produced by one of skill. [0227] The determination of percent sequence identity and/or similarity between any two sequences can be accomplished using a mathematical algorithm.
  • Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity and/or similarity.
  • Such implementations include, for example: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the AlignX program, version 10.3.0 (Invitrogen, Carlsbad, CA) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis., USA). Alignments using these programs can be performed using the default parameters.
  • the CLUSTAL program is well described by Higgins et al.
  • Single stranded polynucleotides hybridize when they associate based on a variety of well characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like.
  • the stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc.
  • [0230]
  • polynucleotide sequences that are capable of hybridizing to the disclosed polynucleotide sequences and fragments thereof under various conditions of stringency (see, for example, Wahl and Berger, Methods Enzymol. 152: 399-407 (1987); and Kimmel, Methods Enzymol. 152: 507-511 (1987)).
  • Full length cDNA, homologs, orthologs, and paralogs of polynucleotides of the present disclosure may be identified and isolated using well-known polynucleotide hybridization methods.
  • Amino acid and polypeptide sequences of the present disclosure may also be compared to other amino acid or polypeptide sequences based on their three-dimensional structure. Homologs of polypeptide sequences may be those that have a similar folded structure as compared to a reference polypeptide sequence of the present disclosure. Programs such as AlphaFold or other similar folding algorithms known in the art may be used for such comparisons.
  • Target Nucleic Acids and Sequences [0232] Recombinant polypeptides of the present disclosure may be targeted to specific target nucleic acids to, for example, modify the target nucleic acid. [0233] Certain aspects of the present disclosure relate to target sites on target nucleic acids.
  • a target site generally refers to a location of a target nucleic acid that is targeted by a genetic modifier polypeptide and/or an α-crystalline domain polypeptide of the present disclosure (e.g. a nucleotide sequence of a target nucleic acid that can be bound by a targeting agent, such as e.g. a DNA-binding domain, in a recombinant polypeptide).
  • the target site may include both the nucleotide sequence targeted as well as at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, or at least 50 nucleotides or more on the 3’ side, the 5’ side, or both the 3’ and 5’ side of the nucleotide sequence in the target nucleic acid that is targeted.
  • the target site may contain at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, or at least 200 or more nucleotides.
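The notion of a target site that spans the targeted nucleotide sequence plus flanking nucleotides on the 5’ and/or 3’ side can be sketched as a simple coordinate calculation. The function name, the 0-based half-open coordinate convention, and the clipping at sequence ends are assumptions made for illustration.

```python
def target_site_window(sequence: str, start: int, end: int, flank: int = 50):
    """Return (window_start, window_end, subsequence) for a target site.

    `start` and `end` are 0-based, half-open coordinates of the targeted
    nucleotide sequence; `flank` nucleotides are added on both sides and the
    window is clipped to the bounds of `sequence`.
    """
    if not 0 <= start < end <= len(sequence):
        raise ValueError("start/end must satisfy 0 <= start < end <= len(sequence)")
    if flank < 0:
        raise ValueError("flank must be non-negative")
    window_start = max(0, start - flank)
    window_end = min(len(sequence), end + flank)
    return window_start, window_end, sequence[window_start:window_end]


# Toy sequence: a 7-nt targeted sequence with 10 nt on either side.
toy_sequence = "A" * 10 + "GATTACA" + "C" * 10
window = target_site_window(toy_sequence, start=10, end=17, flank=5)
```

Asymmetric flanks (for example, only on the 3’ side) could be handled by splitting `flank` into separate upstream and downstream parameters.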
  • a recombinant polypeptide is targeted to a particular locus.
  • a locus generally refers to a specific position on a chromosome or other nucleic acid molecule.
  • a locus may contain, for example, a polynucleotide that encodes a protein or an RNA.
  • a locus may also contain, for example, a non-coding RNA, a gene, a promoter, a 5’ untranslated region (UTR), an exon, an intron, a 3’ UTR, or combinations thereof.
  • a locus may contain a coding region for a gene.
  • a recombinant polypeptide is targeted to a gene.
  • a gene generally refers to a polynucleotide that can produce a functional unit (for example, a protein or a noncoding RNA molecule).
  • a gene may contain a promoter, an enhancer sequence, a leader sequence, a transcriptional start site, a transcriptional stop site, a polyadenylation site, one or more exons, one or more introns, a 5’ UTR, a 3’ UTR, or combinations thereof.
  • a gene sequence may contain a polynucleotide sequence encoding a promoter, an enhancer sequence, a leader sequence, a transcriptional start site, a transcriptional stop site, a polyadenylation site, one or more exons, one or more introns, a 5’ UTR, a 3’ UTR, or combinations thereof.
  • the target nucleic acid sequence may be located within the coding region of a target gene or upstream or downstream thereof. Moreover, the target nucleic acid sequence may reside endogenously in a target gene or may be inserted into the gene, e.g., heterologous, for example, using techniques such as homologous recombination.
  • a target gene of the present disclosure can be operably linked to a control region, such as a promoter, that contains a sequence that can be recognized by a targeting agent (e.g. a DNA-binding domain) or other factor in association with a targeting agent (e.g. a guide RNA) such that a recombinant polypeptide may be targeted to that sequence.
  • the target nucleic acid sequence may be located in a region of chromatin.
  • the target nucleic acid sequence may be in a region of open chromatin or similar region of DNA that is generally accessible to transcriptional machinery.
  • Regions of open chromatin may be characterized by nucleosome depletion, nucleosome disruption, accessibility to transcriptional machinery, and/or a transcriptionally active state. Regions of open chromatin will be readily understood and identifiable by one of skill in the art.
  • Target genes or nucleic acid regions to be targeted for modification by a genetic modifier polypeptide of the present disclosure will be readily apparent to those of skill in the art depending on the particular application and/or purpose. For example, genes with particular agricultural importance may be targeted for reduced expression according to the methods of the present disclosure. Exemplary genes to be targeted for reduced expression may include, for example, those involved in light perception (e.g. PHYB, etc.), those involved in the circadian clock (e.g.
  • the target nucleic acid is endogenous to the organism in which the expression of one or more genes is to be reduced according to the methods described herein.
  • the target nucleic acid is a transgene of interest that has been inserted into a plant, an alga, a fungus, an animal (including, but not limited to a mammal and an insect), a bacterium, an archaeon, or a protist.
  • Suitable target nucleic acids will be readily apparent to one of skill in the art depending on the particular need or outcome.
  • the target nucleic acid sequence may be in e.g. a region of euchromatin (e.g. highly expressed gene), or the target nucleic acid sequence may be in a region of heterochromatin (e.g. centromere DNA).
  • the target nucleic acid may be in a region of repressive chromatin.
  • Repressive chromatin generally refers to regions of chromatin where transcription is repressed or otherwise generally transcriptionally inactive. Exemplary regions of repressive chromatin include, for example, regions with repressive DNA methylation, compact chromatin, and/or no transcription.
  • Recombinant Expression Recombinant nucleic acids and/or recombinant polypeptides of the present disclosure may be present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells).
  • recombinant nucleic acids are present in an expression vector and may encode a recombinant polypeptide, and the expression vector may be present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells).
  • recombinant nucleic acids and/or recombinant polypeptides are present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells) via direct introduction into the cell (e.g. via RNPs).
  • the genes encoding the recombinant polypeptides in the host cell may be heterologous to the host cell.
  • the host cell does not naturally produce one or more polypeptides of the present disclosure, and contains heterologous nucleic acid constructs capable of expressing one or more genes necessary for producing those molecules.
  • the host cell does not naturally produce one or more polypeptides of the present disclosure, and is provided the one or more polypeptides through exogenous delivery of the polypeptides directly to the host cell without the need to express a recombinant nucleic acid encoding the recombinant polypeptide in the host cell.
  • Recombinant polypeptides of the present disclosure may be introduced into host cells (e.g., plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells) via any suitable methods known in the art.
  • a recombinant polypeptide can be exogenously added to host cells and the host cells are maintained under conditions such that the recombinant polypeptide is targeted to one or more target nucleic acids to reduce expression of the target nucleic acids in the host cells.
  • a recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be expressed in host cells and the host cells are maintained under conditions such that the recombinant polypeptide is targeted to one or more target nucleic acids to reduce expression of the target nucleic acids in the host cells.
  • a recombinant polypeptide of the present disclosure may be transiently expressed in a host via viral infection of the host, or by introducing a recombinant polypeptide-encoding RNA into a host to facilitate reduced expression of a target nucleic acid of interest. Methods of introducing recombinant proteins via viral infection or via the introduction of RNAs into various host organisms are well known in the art.
  • Tobacco rattle virus (TRV) and other appropriate viruses may be used herein to facilitate editing in plant cells.
  • a recombinant polypeptide and a guide RNA may be exogenously and directly supplied to a host cell as a ribonucleoprotein (RNP) complex. This particular form of delivery is useful for facilitating transgene-free editing in various organisms.
  • Modified guide RNAs which are resistant to nuclease digestion could also be used in this approach.
  • transgene-free calli from plant cells provided with an RNP could be used to regenerate whole plants.
  • a recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be expressed in a host with any suitable plant expression vector.
  • Typical vectors useful for expression of recombinant nucleic acids in higher plants are well known in the art and include, for example, vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens (e.g., see Rogers et al., Meth. in Enzymol. (1987) 153:253-277). These vectors are plant integrating vectors in that on transformation, the vectors integrate a portion of vector DNA into the genome of the host plant.
  • Exemplary A. tumefaciens vectors useful herein are plasmids pKYLX6 and pKYLX7 (e.g., see Schardl et al., Gene (1987) 61:1-11; and Berger et al., Proc. Natl. Acad. Sci. USA (1989) 86:8402-8406); and plasmid pBI 101.2 that is available from Clontech Laboratories, Inc. (Palo Alto, CA).
  • Typical vectors useful for expression of recombinant nucleic acids in mammalian cells are known in the art.
  • recombinant polypeptides of the present disclosure can be expressed as a fusion protein that is coupled to, for example, a maltose binding protein ("MBP"), glutathione S transferase (GST), hexahistidine, c-myc, or the FLAG epitope for ease of purification, monitoring expression, or monitoring cellular and subcellular localization.
  • a recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be modified to improve expression of the recombinant protein in host cells by using codon preference/codon optimization to target preferential expression in host cells.
  • When the recombinant nucleic acid is prepared or altered synthetically, advantage can be taken of known codon preferences of the intended host where the nucleic acid is to be expressed.
  • recombinant nucleic acids of the present disclosure can be modified to account for the specific codon preferences and GC content preferences of monocotyledons and dicotyledons, as these preferences have been shown to differ (Murray et al., Nucl. Acids Res. (1989) 17: 477-498).
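When adjusting a coding sequence toward the GC content preferences of an intended host, a simple check is to compute overall GC content and GC content at third codon positions. The sketch below is illustrative only; the function names are hypothetical, and no target values for monocotyledons or dicotyledons are implied.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G or C nucleotides in `sequence` (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3_content(coding_sequence: str) -> float:
    """Fraction of G or C at the third position of each codon."""
    if len(coding_sequence) == 0 or len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    third_positions = coding_sequence[2::3]  # every third nucleotide, starting at index 2
    return gc_content(third_positions)
```

Comparing these values before and after codon changes gives a quick indication of whether a redesigned sequence has moved toward the intended host's preferences.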
  • the present disclosure further provides expression vectors encoding recombinant polypeptides of the present disclosure.
  • a nucleic acid sequence coding for the desired recombinant polypeptide of the present disclosure can be used to construct a recombinant expression vector which can be introduced into the desired host cell.
  • a recombinant expression vector will typically contain a nucleic acid encoding a recombinant protein of the present disclosure, operably linked to transcriptional initiation regulatory sequences which will direct the transcription of the nucleic acid in the intended host cell, such as tissues of a transformed plant or plant cell, or transformed mammal or mammalian cell.
  • Recombinant nucleic acids e.g.
  • plant expression vectors may include (1) a cloned gene under the transcriptional control of 5' and 3' regulatory sequences and (2) a dominant selectable marker.
  • plant expression vectors may also contain, if desired, a promoter regulatory region (e.g., one conferring inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific/selective expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.
  • expression of a nucleic acid of the present disclosure may be driven (in operable linkage) with a promoter (e.g. a promoter functional in plants or a plant-specific promoter).
  • a promoter generally refers to a DNA sequence that contains an RNA polymerase binding site, transcription start site, and/or TATA box and assists or promotes the transcription and expression of an associated transcribable polynucleotide sequence such as, for example, a gene.
  • a plant promoter, or functional fragment thereof can be employed to e.g.
  • a mammalian promoter, or functional fragment thereof can be employed to, for example, control the expression of a recombinant nucleic acid of the present disclosure in transformed mammalian cells.
  • the selection of the promoter used in expression vectors will determine the spatial and temporal expression pattern of the recombinant nucleic acid in the modified host, e.g., the nucleic acid encoding the recombinant polypeptide of the present disclosure is only expressed in the desired tissue or at a certain time in the host’s development or growth.
  • promoters will express recombinant nucleic acids in all host tissues and are active under most environmental conditions and states of development or cell differentiation (i.e., constitutive promoters).
  • Other promoters will express recombinant nucleic acids in specific cell types (such as, for example in the context of plant hosts, leaf epidermal cells, mesophyll cells, or root cortex cells; or, for example in the context of mammalian hosts, skin epidermal cells, neurons, or immune cells) or in specific tissues or organs (such as, for example in the context of plant hosts, roots, leaves, or flowers; or, for example in the context of mammals, lung cells, bladder cells, or ovary cells), and the selection will reflect the desired location of accumulation of the gene product.
  • the selected promoter may drive expression of the recombinant nucleic acid under various inducing conditions.
  • suitable constitutive plant promoters may include, for example, the core promoter of the Rsyn7, the core CaMV 35S promoter (Odell et al., Nature (1985) 313:810-812), CaMV 19S (Lawton et al., 1987), rice actin (Wang et al., 1992; U.S. Pat. No. 5,641,876; and McElroy et al., Plant Cell (1985) 2:163-171); ubiquitin (Christensen et al., Plant Mol. Biol.
  • No. 5,683,439 the Nos promoter, the pEmu promoter, the rubisco promoter, the GRP1-8 promoter, and other transcription initiation regions from various plant genes known to skilled artisans, and constitutive promoters described in, for example, U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; and 5,608,142.
  • suitable constitutive mammalian promoters may include, for example, CMV, SV40, UBC, PGK, EF1A, and CAGG in the context of mammalian systems, and COPIA and ACT5C in the context of Drosophila systems (see, e.g., Qin JY, et al. (2010) Systematic Comparison of Constitutive Promoters and the Doxycycline-Inducible Promoter. PLOS ONE 5(5): e10611. https://doi.org/10.1371/journal.pone.0010611).
  • Recombinant nucleic acids of the present disclosure may be expressed using an RNA Polymerase III (Pol III) promoter such as, for example, the U6 promoter or the H1 promoter (eLife 2013, 2:e00471).
  • BMC Plant Biology 2014, 14:327
  • additional Pol III promoters could be utilized to, for example, simultaneously express many guide RNAs to many different locations in the genome simultaneously.
  • RNA Polymerase II
  • CmYLCV promoter
  • 35S promoter
  • tissue specific promoters in plants may include, for example, the lectin promoter (Vodkin et al., 1983; Lindstrom et al., 1990), the corn alcohol dehydrogenase 1 promoter (Vogel et al., 1989; Dennis et al., 1984), the corn light harvesting complex promoter (Simpson, 1986; Bansal et al., 1992), the corn heat shock protein promoter (Odell et al., Nature (1985) 313:810-812; Rochester et al., 1986), the pea small subunit RuBP carboxylase promoter (Poulsen et al., 1986; Cashmore et al., 1983), the Ti plasmid mannopine synthase promoter (Langridge et al., 1989), the Ti plasmid nopaline synthase promoter (
  • tissue specific promoters in mammals may include, for example, cytokeratin 18 and 19 for epithelial cell-specificity; the tissue kallikrein promoter for ductal cell-specificity in salivary glands; and the amylase 1C and aquaporin-5 (AQP5) promoters for acinar cell-specificity (Zheng C, Baum BJ. Evaluation of promoters for use in tissue-specific gene delivery. Methods Mol Biol. 2008;434:205-19. doi: 10.1007/978-1-60327-248-3_13. PMID: 18470647; PMCID: PMC2685069).
  • the promoter can direct expression of a recombinant nucleic acid of the present disclosure in a specific tissue or may be otherwise under more precise environmental or developmental control.
  • Such promoters are referred to here as “inducible” promoters.
  • Environmental conditions that may affect transcription by inducible promoters include, for example, pathogen attack, anaerobic conditions, or the presence of light.
  • inducible plant promoters include, for example, the AdhI promoter which is inducible by hypoxia or cold stress, the Hsp70 promoter which is inducible by heat stress, and the PPDK promoter which is inducible by light.
  • promoters under developmental control include, for example, promoters that initiate transcription only, or preferentially, in certain tissues, such as, in plants, leaves, roots, fruit, seeds, or flowers.
  • An exemplary promoter is the anther specific promoter 5126 (U.S. Pat. Nos. 5,689,049 and 5,689,051).
  • inducible mammalian systems include, for example, Tet operator (TetO)-based systems, cumate-controlled operator systems, and rapamycin-induced interaction between FKBP12 (FK506 binding protein 12) and mTOR-based systems (Kallunki T, et al. How to Choose the Right Inducible Gene Expression System for Mammalian Studies? Cells. 2019 Jul 30;8(8):796).
  • the operation of a promoter may also vary depending on its location in the genome. Thus, an inducible promoter may become fully or partially constitutive in certain locations. [0258] Moreover, any combination of a constitutive or inducible promoter, and a non-tissue specific or tissue specific promoter may be used to control the expression of various recombinant polypeptides of the present disclosure. [0259]
  • the recombinant nucleic acids of the present disclosure and/or a vector housing a recombinant nucleic acid of the present disclosure may also contain a regulatory sequence that serves as a 3’ terminator sequence.
  • a terminator sequence generally refers to a nucleic acid sequence that marks the end of a gene or transcribable nucleic acid during transcription.
  • terminators that may be used in the recombinant nucleic acids of the present disclosure.
  • a recombinant nucleic acid of the present disclosure may contain a 3’ NOS terminator.
  • recombinant nucleic acids of the present disclosure contain a transcriptional termination site. Transcription termination sites may include, for example, OCS terminators, rbcS-E9 terminators, NOS terminators, HSP18.2 terminators, and poly-T terminators.
  • Recombinant nucleic acids of the present disclosure may include one or more introns. Introns may be included in e.g. recombinant nucleic acids being expressed on a vector in a host cell. The inclusion of one or more introns in a recombinant nucleic acid to be expressed may be particularly helpful to increase expression in plant cells. [0261] Recombinant nucleic acids of the present disclosure may also contain selectable markers. A selectable marker can be used to assist in the selection of transformed cells or tissue due to the presence of a selection agent, such as an antibiotic or herbicide, where the selectable marker gene provides tolerance or resistance to the selection agent.
  • the selection agent can bias or favor the survival, development, growth, proliferation, etc., of transformed cells expressing the selectable marker gene.
  • Selectable marker genes may include, for example, those conferring tolerance or resistance to antibiotics, such as kanamycin and paromomycin (nptII), hygromycin B (aph IV), streptomycin or spectinomycin (aadA) and gentamycin (aac3 and aacC4), or those conferring tolerance or resistance to herbicides such as glufosinate (bar or pat), dicamba (DMO) and glyphosate (aroA or Cp4-EPSPS).
  • Selectable marker genes which provide an ability to visually screen for transformants may also be used such as, for example, luciferase or green fluorescent protein (GFP), or a gene expressing a beta glucuronidase or uidA gene (GUS) for which various chromogenic substrates are known.
  • a nucleic acid molecule provided herein contains a selectable marker gene selected from the group consisting of nptII, aph IV, aadA, aac3, aacC4, bar, pat, DMO, EPSPS, aroA, luciferase, GFP, and GUS.
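The marker/selection-agent pairings listed above can be kept as a simple lookup table when planning a selection scheme. The following is a minimal sketch only: the dictionary name `SELECTABLE_MARKERS` and the helper `markers_for_agent` are hypothetical names introduced for illustration, the neomycin phosphotransferase gene is written as nptII, and the pairings are limited to those stated in the text.

```python
from typing import Dict, List

# Marker gene -> selection agents, as paired in the passage above.
# Hypothetical helper data; not part of the disclosure itself.
SELECTABLE_MARKERS: Dict[str, List[str]] = {
    "nptII": ["kanamycin", "paromomycin"],
    "aph IV": ["hygromycin B"],
    "aadA": ["streptomycin", "spectinomycin"],
    "aac3": ["gentamycin"],
    "aacC4": ["gentamycin"],
    "bar": ["glufosinate"],
    "pat": ["glufosinate"],
    "DMO": ["dicamba"],
    "aroA": ["glyphosate"],
    "Cp4-EPSPS": ["glyphosate"],
}


def markers_for_agent(agent: str) -> List[str]:
    """Return the marker genes paired with `agent` (case-insensitive match)."""
    wanted = agent.strip().lower()
    return sorted(
        marker
        for marker, agents in SELECTABLE_MARKERS.items()
        if wanted in (a.lower() for a in agents)
    )
```

For example, `markers_for_agent("glufosinate")` returns the two genes paired with that herbicide, and an agent that is not listed returns an empty list.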
  • Eukaryotes and Eukaryotic Cells
  • Certain aspects of the present disclosure relate to eukaryotes and eukaryotic cells that contain recombinant polypeptides that are targeted to one or more target nucleic acids in the host/host cell in order to reduce expression of the target nucleic acid.
  • eukaryotes and eukaryotic cells refers to any of various species in the domain of Eukaryota, in which the cells contain a nucleus, including, for example, plant, algal, fungal, and animal (including, but not limited to mammalian and insect) species.
  • a “plant” refers to any of various photosynthetic, eukaryotic multicellular organisms of the kingdom Plantae, characteristically producing embryos, containing chloroplasts, having cellulose cell walls and lacking locomotion.
  • a “plant” includes any plant or part of a plant at any stage of development, including seeds, suspension cultures, plant cells, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, microspores, and progeny thereof. Also included are cuttings, and cell or tissue cultures.
  • plant tissue includes, for example, whole plants, plant cells, plant organs, e.g., leaves, stems, roots, meristems, plant seeds, protoplasts, callus, cell cultures, and any groups of plant cells organized into structural and/or functional units.
  • Various eukaryotic cells may be used in the present disclosure so long as they remain viable after being transformed or otherwise modified to express recombinant nucleic acids or house recombinant polypeptides.
  • the eukaryotic cell is not adversely affected by the transduction of the necessary nucleic acid sequences, the subsequent expression of the proteins or the resulting intermediates.
  • a broad range of species may be modified to incorporate recombinant polypeptides and/or polynucleotides of the present disclosure.
  • Suitable plants that may be modified include both monocotyledonous (monocot) plants and dicotyledonous (dicot) plants.
  • Suitable animals that may be modified include, for example, mammalian cells and insect cells.
  • suitable plants may include, for example, species of the Family Gramineae, including Sorghum bicolor and Zea mays; species of the genera: Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus
  • plant cells may include, for example, those from corn (Zea mays), canola (Brassica napus, Brassica rapa ssp.), Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), duckweed (Lemna), soybean (Glycine max), tobacco (Nicotiana tab
  • suitable vegetable plants may include, for example, tomatoes (Lycopersicon esculentum), lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus), cantaloupe (C. cantalupensis), and musk melon (C. melo).
  • Examples of suitable ornamental plants include azalea (Rhododendron spp.), hydrangea (Hydrangea macrophylla), hibiscus (Hibiscus rosa-sinensis), roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida), carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum.
  • suitable conifer plants may include, for example, loblolly pine (Pinus taeda), slash pine (Pinus elliottii), ponderosa pine (Pinus ponderosa), lodgepole pine (Pinus contorta), Monterey pine (Pinus radiata), Douglas-fir (Pseudotsuga menziesii), Western hemlock (Tsuga canadensis), Sitka spruce (Picea glauca), redwood (Sequoia sempervirens), silver fir (Abies amabilis), balsam fir (Abies balsamea), Western red cedar (Thuja plicata), and Alaska yellow-cedar (Chamaecyparis nootkatensis).
  • leguminous plants may include, for example, guar, locust bean, fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava bean, lentils, chickpea, peanuts (Arachis sp.), crown vetch (Vicia sp.), hairy vetch, adzuki bean, lupine (Lupinus sp.), trifolium, common bean (Phaseolus sp.), field bean (Pisum sp.), clover (Melilotus sp.), Lotus, trefoil, lens, and false indigo.
  • Examples of suitable forage and turf grass include alfalfa (Medicago sp.), orchard grass, tall fescue, perennial ryegrass, creeping bent grass, and redtop.
  • Examples of suitable crop plants and model plants may include, for example, Arabidopsis, corn, rice, alfalfa, sunflower, canola, soybean, cotton, peanut, sorghum, wheat, tobacco, and lemna.
  • Examples of suitable animal cells include, for example, mammalian cells (such as, for instance, human cells), insect cells, and/or stem cells, such as, for example iPSCs.
  • iPSCs are reprogrammed using the methods described herein.
  • the eukaryotes and eukaryotic cells of the present disclosure may be genetically modified in that recombinant nucleic acids have been introduced into the eukaryotes and eukaryotic cells, and as such the genetically modified eukaryotes and/or eukaryotic cells do not occur in nature.
  • a suitable host of the present disclosure is e.g. one capable of expressing one or more nucleic acid constructs encoding one or more recombinant proteins.
  • transgenic and “genetically modified” are used interchangeably and refer to a eukaryote and/or eukaryotic cell that contains within its genome a recombinant nucleic acid.
  • the recombinant nucleic acid is stably integrated within the genome such that the polynucleotide is passed on to successive generations.
  • the recombinant nucleic acid is transiently expressed in the eukaryote and/or eukaryotic cell.
  • the recombinant nucleic acid may be integrated into the genome alone or as part of a recombinant expression cassette.
  • Transgenic is used herein to include any cell, cell line, callus, tissue, or whole or part of an organism, the genotype of which has been altered by the presence of exogenous nucleic acid including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic.
  • Plant transformation protocols as well as protocols for introducing recombinant nucleic acids of the present disclosure into plants may vary depending on the type of plant or plant cell, e.g., monocot or dicot, targeted for transformation.
  • Suitable methods of introducing recombinant nucleic acids of the present disclosure into plant cells and subsequent insertion into the plant genome include, for example, microinjection (Crossway et al., Biotechniques (1986) 4:320-334), electroporation (Riggs et al., Proc. Natl. Acad. Sci. USA (1986) 83:5602-5606), Agrobacterium-mediated transformation (U.S. Pat. No. 5,563,055), direct gene transfer (Paszkowski et al., EMBO J. (1984) 3:2717-2722), and ballistic particle acceleration (U.S. Pat. No. 4,945,050; Tomes et al. (1995)).
  • recombinant polypeptides of the present disclosure can be targeted to a specific organelle within a eukaryotic cell. Targeting can be achieved by providing the recombinant protein with an appropriate targeting peptide sequence.
  • targeting peptides include, for example, secretory signal peptides (for secretion or cell wall or membrane targeting), plastid transit peptides, chloroplast transit peptides, mitochondrial target peptides, vacuole targeting peptides, nuclear targeting peptides, and the like (e.g., see Reiss et al., Mol. Gen. Genet. (1987) 209(1):116-121; Settles and Martienssen, Trends Cell Biol (1998) 12:494-501; Scott et al., J Biol Chem (2000) 10:1074; and Luque and Correas, J Cell Sci (2000) 113:2485-2495).
  • Modified eukaryotes and eukaryotic cells may be grown in accordance with conventional methods.
  • modified plants may be grown in accordance with conventional methods (e.g., see McCormick et al., Plant Cell Reports (1986) 81-84). These plants may then be grown, and pollinated with either the same transformed strain or different strains, with the resulting hybrid having the desired phenotypic characteristic. Two or more generations may be grown to ensure that the subject phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure the desired phenotype or other property has been achieved.
  • the present disclosure also provides plants derived from plants having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure.
  • a plant having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure may be crossed with itself or with another plant to produce an F1 plant.
  • one or more of the resulting F1 plants may also have modified expression of a target nucleic acid.
  • Progeny plants may also have an altered or modified phenotype as compared to a corresponding control plant.
  • the derived plants e.g. F1 or F2 plants resulting from or derived from crossing the plant having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure with another plant
  • the derived plants can be selected from a population of derived plants.
  • methods of selecting one or more of the derived plants that (i) lack recombinant nucleic acids, and (ii) have modified expression of a target nucleic acid are provided.
  • because the modified expression of the target nucleic acid may be heritable, progeny plants as described herein do not necessarily need to contain a recombinant polypeptide in order to maintain the modified expression of the target nucleic acid.
  • Methods of Modifying a Target Nucleic Acid
  • [0283] Growing and/or cultivation conditions sufficient for the recombinant polypeptides and/or polynucleotides of the present disclosure to be expressed and/or maintained in the eukaryote/eukaryotic cell and to be targeted to and to modify one or more target nucleic acids of the present disclosure are well known in the art and include any suitable growing conditions disclosed herein.
  • the cell is grown under conditions sufficient to express a recombinant polypeptide of the present disclosure, and for the expressed recombinant polypeptides to be localized to the nucleus in order to be targeted to and modify the target nucleic acids (if those target nucleic acids are present in the nucleus).
  • nucleic acids present outside the nucleus such as, for example, in the cytoplasm or in an organelle, may be targeted for modification.
  • the conditions sufficient for the expression of the recombinant polypeptide (if being encoded from a recombinant nucleic acid) will depend on the promoter used to control the expression of the recombinant polypeptide.
  • As noted above, growing conditions sufficient for the recombinant polypeptides of the present disclosure to be expressed and/or maintained in the eukaryote/eukaryotic cells and to be targeted to one or more target nucleic acids to modify one or more target nucleic acids may vary depending on a number of factors (e.g. species, use of inducible promoter, etc.).
  • Suitable growing conditions may include, for example, ambient environmental conditions, standard laboratory conditions, standard greenhouse conditions, growth in long days under standard environmental conditions (e.g., 16 hours of light, 8 hours of dark), growth in 12 hour light : 12 hour dark day/night cycles, etc.
  • Various time frames may be used to observe modification of a target nucleic acid according to the methods of the present disclosure.
  • Eukaryotes and/or eukaryotic cells may be observed/assayed for modified expression of a target nucleic acid after, for example, about 30 minutes, about 45 minutes, about 1 hour, about 2.5 hours, about 5 hours, about 7.5 hours, about 10 hours, about 15 hours, about 20 hours, about 1 day, about 5 days, about 10 days, about 15 days, about 20 days, about 25 days, about 30 days, about 35 days, about 40 days, about 45 days, about 50 days, or about 55 days or more after being cultivated/grown in conditions sufficient for a recombinant polypeptide to facilitate modification of a target nucleic acid.
  • a target nucleic acid of the present disclosure may have its expression modified as compared to a corresponding control nucleic acid.
  • a target nucleic acid of the present disclosure in a eukaryote/eukaryotic cell housing recombinant polypeptides of the present disclosure may have its expression decreased/downregulated by at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% as compared to a corresponding control.
  • a target nucleic acid of the present disclosure in a eukaryote/eukaryotic cell housing recombinant polypeptides of the present disclosure may have its expression increased/upregulated by at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% as compared to a corresponding control.
  • a control may be a corresponding eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cell).
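The "at least about N%" comparisons above reduce to a percent change of the target relative to its corresponding control. The sketch below illustrates that arithmetic under stated assumptions: the function names `percent_change` and `meets_threshold` are hypothetical, and expression levels are assumed to be positive, already-normalized measurements (for example, relative transcript abundance) taken in the same units for target and control.

```python
def percent_change(target_level: float, control_level: float) -> float:
    """Percent change of the target relative to the control.

    Negative values indicate decreased expression; positive values indicate
    increased expression. Both inputs are assumed to be in the same units.
    """
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return (target_level - control_level) / control_level * 100.0


def meets_threshold(target_level: float, control_level: float,
                    threshold_pct: float, direction: str = "decrease") -> bool:
    """Check whether expression changed by at least `threshold_pct` percent."""
    change = percent_change(target_level, control_level)
    if direction == "decrease":
        return -change >= threshold_pct
    if direction == "increase":
        return change >= threshold_pct
    raise ValueError("direction must be 'decrease' or 'increase'")
```

Under this convention, a target measured at 25 units against a control at 100 units corresponds to a 75% decrease and would satisfy, for example, an "at least about 50%" decrease.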
  • a target nucleic acid may have its expression modified (e.g.
  • a control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain a nucleic acid encoding a recombinant polypeptide of the present disclosure.
  • Comparisons in the present disclosure may also be in reference to corresponding control eukaryotes/eukaryotic cells.
  • a control plant or plant cell may be a plant or plant cell that does not contain a recombinant polypeptide (e.g. a wild-type plant) of the present disclosure.
  • nucleic acid-containing sample e.g. plants, animals, plant tissues, animal tissues, animal cells, or plant cells.
  • recombinant polypeptides of the present disclosure may facilitate an epigenetic change or other chromatin modification at the target nucleic acid that does not involve a change to the actual nucleic acid nucleotide sequence.
  • Such epigenetic changes and/or chromatin modifications at the target nucleic acid may include, for example, increased DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g. H3K9, H3K14, H3K27, and H4K16 deacetylation).
  • Target nucleic acids of the present disclosure may exhibit one or more of increased: DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g. H3K9, H3K14, H3K27, and H4K16 deacetylation) at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% higher as compared to a corresponding control nucleic acid.
  • Target nucleic acids of the present disclosure may exhibit one or more of decreased: DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g. H3K9, H3K14, H3K27, and H4K16 deacetylation) at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% reduced as compared to a corresponding control nucleic acid.
  • a control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cells).
  • recombinant polypeptides of the present disclosure may interfere with transcription of the target nucleic acid. Such interference may include, e.g. interference with RNA Polymerase II transcription elongation and RNA Polymerase II Serine 5 (Ser-5) dephosphorylation.
  • Target nucleic acids of the present disclosure may exhibit one or more of interference with RNA Polymerase II transcription elongation and RNA Polymerase II Serine 5 (Ser-5) dephosphorylation at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% higher as compared to a corresponding control nucleic acid.
  • control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cell).
  • the method of modifying a target nucleic acid in a eukaryotic cell includes a) providing a cell comprising 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) one or more polypeptides comprising an α-crystallin domain (ACD), such as, for example, one or more small heat shock polypeptides (sHSPs) capable of being targeted to the target nucleic acid; and b) maintaining the cell under conditions whereby the genetic modifier polypeptide and the small HSP are targeted to the target nucleic acid, thereby modifying the target nucleic acid.
  • methods of targeting or aggregating polypeptides of interest to a target nucleic acid in a eukaryotic cell include a) providing a cell comprising 1) a polypeptide of interest capable of being targeted to the target nucleic acid, and 2) one or more polypeptides comprising an α-crystallin domain (ACD), such as, for example, one or more small heat shock polypeptides (sHSPs) capable of being targeted to the target nucleic acid; and b) maintaining the cell under conditions whereby the polypeptide of interest and the small HSP are targeted to the target nucleic acid.
  • the polypeptide of interest comprises a transcription factor, a transcriptional repressor polypeptide, and/or a visualizable marker protein.
  • the visualizable marker protein is a fluorescent protein, such as, for example, a GFP.
  • the method involves targeting a polypeptide of interest, such as a genetic modifier polypeptide, to a target nucleic acid in a eukaryotic cell, thereby modifying the target nucleic acid.
  • the genetic modifier polypeptide may include, for example, a sequence specific endonuclease, a demethylation enzyme (such as, for example, TET1, or LSD1), a methyltransferase (such as, for example, TRBIP1-MQ1 or a Dnmt3 protein), a component of a methylation binding complex (such as, for example, MBD5 and/or MBD6), and/or a sequence specific recombinase.
  • Exemplary sequence-specific recombinases include, for example, a CRISPR protein (such as, for example, a Cas protein), a TALEN protein, and a zinc finger nuclease (ZFN) protein.
  • the method may further entail, for example, providing a gRNA that targets a Cas protein to the target nucleic acid.
  • Modifying a target nucleic acid may include various different mechanisms, such as, for example, epigenetic editing, genome editing, RNA editing (including, for example, A-to-I and C-to-U editing; see, e.g., https://www.frontiersin.org/articles/10.3389/fendo.2018.00762/full), targeted recombination, regulation of transcription, or modifications of any other process that occurs at specific regions of chromatin, or any combinations thereof.
  • α-crystalline domain polypeptides, e.g. sHSPs
  • the small HSP may be a plant small HSP or an animal small HSP.
  • the sHSP may be a bacterial small HSP, a fungal small HSP, a protist small HSP, or an archaeal small HSP.
  • the sHSP may be a modified version of any natural sHSP, such as, for example, a recombinant and/or chimeric sHSP.
  • the methods described herein may entail providing more than one type of α-crystalline domain polypeptide (for example, ACD15 and ACD21) capable of being targeted to the target nucleic acid together or independently.
  • the methods described herein may entail providing more than one type of α-crystalline domain polypeptide, in which one or more types are targeted to the target nucleic acid, and one or more types are expressed diffusely.
  • the different types of α-crystalline domain polypeptides are from the same species.
  • the different types of α-crystalline domain polypeptides are from different species.
  • the different types of α-crystalline domain polypeptides are non-naturally occurring, such as modifications of naturally occurring α-crystalline domain polypeptides. [0300]
  • more than one type of genetic modifier polypeptide capable of being targeted to the target nucleic acid is provided, sequentially or simultaneously.
  • one or more genetic modifier polypeptides is tethered to one or more of the α-crystalline domain polypeptides.
  • additional, non-tethered α-crystalline domain polypeptides are co-expressed with α-crystalline domain polypeptides that are tethered to a genetic modifier polypeptide.
  • the genetic modifier polypeptide and/or the α-crystalline domain polypeptide comprises a StkyC domain.
  • the genetic modifier polypeptide and/or the α-crystalline domain polypeptide does not comprise a StkyC domain.
  • the methods and compositions of the present disclosure involving targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid may be used for making improved transcription factors for reprogramming of human stem cells, or making improved transcription factors for reprogramming plant cells to more easily regenerate plants from tissue culture or other cells.
  • a dCas9 is tethered to a location in the genome that is attached to ACDs in a SunTag system, and then another CRISPR of another type (e.g., CasPhi) is sent to an adjacent location that is also fused with the same ACD so that it becomes concentrated there. This may, for example, result in higher frequency of editing.
  • a SunTag-ACD forms a condensate, such that, for example, anything fused to the ACD (e.g., a nuclease and/or one or more other peptides, nucleic acids, or other domains or attachments of interest) will then concentrate along with the ACD condensate.
  • the ACD could be attached to a gRNA and thus concentrated there, or the ACD could be attached to a protein that is fused to a piece of DNA that could be used as a repair template.
  • a CRISPR system comprising a Cas polypeptide and a guide RNA (gRNA) is targeted to a target nucleic acid using the constructs provided herein with a non-truncated (e.g., about 20 bp long) gRNA that provides “normal” DNA cleavage for a corresponding CRISPR system using the same Cas polypeptide.
  • a CRISPR system is targeted to a target nucleic acid using the constructs provided herein with a truncated (e.g., less than about 20 bp long, such as, for example, up to 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 bp long) gRNA that provides reduced DNA cleavage compared to a corresponding non-truncated gRNA in a CRISPR system using the same Cas polypeptide.
  • the truncated gRNA is 14 bp long.
  • the truncated gRNA is 15 bp long.
  • both truncated and non-truncated gRNAs are co-targeted within the same cell.
  • both the co-targeted truncated and non-truncated gRNAs bind to the same type of Cas polypeptide (e.g., Cas9 or CasΦ). In some embodiments, the co-targeted truncated and non-truncated gRNAs bind to different types of Cas polypeptides.
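The distinction drawn above between non-truncated (about 20 bp) and truncated (shorter, e.g. 14 or 15 bp) gRNA targeting sequences can be sketched as a small helper. This is illustrative only: the function names are hypothetical, the 20-nt reference length follows the "about 20 bp" figure in the text, and the end from which a spacer is shortened is an assumption exposed as the `trim_from` parameter, since the text does not specify it.

```python
DNA_BASES = set("ACGT")


def _normalize(spacer: str) -> str:
    """Upper-case the spacer and express it in DNA letters (U -> T)."""
    seq = spacer.strip().upper().replace("U", "T")
    if not seq or set(seq) - DNA_BASES:
        raise ValueError("spacer must be a non-empty A/C/G/T(U) sequence")
    return seq


def classify_spacer(spacer: str, full_length: int = 20) -> str:
    """Label a spacer 'truncated' if shorter than `full_length`, else 'non-truncated'."""
    return "truncated" if len(_normalize(spacer)) < full_length else "non-truncated"


def truncate_spacer(spacer: str, length: int = 14,
                    trim_from: str = "5prime", full_length: int = 20) -> str:
    """Shorten a spacer to `length` nt.

    ASSUMPTION: which end to trim is not stated in the text; '5prime' removes
    bases from the 5' end, '3prime' removes bases from the 3' end.
    """
    seq = _normalize(spacer)
    if not 0 < length < full_length:
        raise ValueError("length must be positive and shorter than full_length")
    if length > len(seq):
        raise ValueError("spacer is already shorter than the requested length")
    if trim_from == "5prime":
        return seq[len(seq) - length:]
    if trim_from == "3prime":
        return seq[:length]
    raise ValueError("trim_from must be '5prime' or '3prime'")
```

A paired design for co-targeting could then keep the full-length spacer for one site and pass a second spacer through `truncate_spacer(..., length=14)` or `length=15`, matching the lengths named above.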
  • amino acid and/or nucleotide sequences (as applicable) having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one or more of SEQ ID NOs: 1 – 492.
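Percent sequence identity, as used in the thresholds above, can be illustrated with a short calculation over two sequences that have already been aligned. This is a sketch under stated assumptions: the function name `percent_identity` is hypothetical, the inputs are assumed to be pre-aligned to equal length with `-` marking gaps, and identity is taken as matching positions divided by aligned columns (columns that are gaps in both sequences are skipped). Other conventions for the denominator exist, and the text does not say which one applies.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    ASSUMPTION: a column with a gap in only one sequence counts as a mismatch;
    a column with gaps in both sequences is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # skip columns that are empty in both sequences
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared
```

A candidate sequence would meet an "at least 90%" threshold, for instance, when `percent_identity(candidate, reference) >= 90.0` for the chosen alignment.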
  • Kits
  • [0307] Certain aspects of the present disclosure relate to an article of manufacture or kit comprising a polynucleotide, vector, cell, and/or composition described herein.
  • the kit further comprises a packed insert comprising instructions for the use of the polynucleotide, vector, cell, and/or composition.
  • the article of manufacture or kit further comprises one or more buffers, e.g., for storing, transferring, or otherwise using the polynucleotide, vector, cell, and/or composition.
  • the kit further comprises one or more containers for storing the polynucleotide, vector, cell, and/or composition.
  • Example 1 ACD15, ACD21, and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements Summary
  • the Examples provided herein illustrate experiments on recruitment of small heat shock proteins to various genomic loci (for example, with a dead Cas9).
  • Some small heat shock proteins are known to form dynamic oligomeric assemblies (M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology.427, 1537–1548 (2015)).
  • DNA methylation mediates silencing of transposable elements (TEs) and genes in part via recruitment of the Arabidopsis MBD5/6 complex, which contains the methyl-CpG- binding domain (MBD) proteins MBD5 and MBD6, and the J-domain containing protein SILENZIO (SLN).
  • TEs transposable elements
  • MBD methyl-CpG- binding domain
  • SLN J-domain containing protein
  • Arabidopsis ACD21 and ACD15 drive accumulation of MBD5/6 complex silencing assemblies at methyl-CG sites and recruit SLN to maintain protein mobility in these assemblages.
  • sf-6059413 Attorney Docket No.: 26223-20027.40 Materials and Methods Plant materials and growth conditions [0314] All plants used in this study were in the Columbia-0 ecotype (Col-0) and were grown on soil in a greenhouse under long-day conditions (16h light / 8h dark). Plants grown for microscopy were plated on 1/2 MS plates in growth rooms at room temperature (~25°C), with 16h of light and 8h of dark.
  • mutant lines were previously described: mbd5 mbd6 T-DNA double mutant composed of mbd5 T-DNA line SAILseq_750_A09.1 and mbd6 T-DNA line SALK_043927 (29); mbd5 mbd6 double mutant composed of mbd5 CRISPR/Cas9-generated indel and mbd6 T-DNA mutation SALK_043927 (29); sln (SALK_090484) (29), fwa rdr6-15 (41), lil-1 (30). Novel mutants and transgenic lines were generated as described below.
  • CRISPR/Cas9 mutants for ACD15.5 and ACD21.4 were generated with the pYAO::hSpCas9 system (44).
  • FOG.1H The guide RNAs were cloned sequentially in the AtU6-26-sgRNA cassette by overlapping PCR.
  • the PCR product was cloned into the SpeI site of the pYAO::hSpCas9 destination plasmid by In-Fusion (Takara, 639650). The procedure was repeated four times (two guides for each gene). The final vector was electroporated into AGL0 agrobacteria and transformed in Col0 or sln mutant plants (SALK_090484). T1 plants were selected on 1/2 MS agar plates with hygromycin B and were genotyped by PCR and by Sanger sequencing of PCR-amplified genomic regions surrounding each guide RNA. The lines containing the desired mutations were propagated to identify null segregants for the Cas9 transgene, and to obtain homozygous mutations.
  • transgenic lines expressing FLAG-tagged constructs used for IP-MS and ChIP-seq were generated as follows. Genomic DNA was cloned into pENTR/D-TOPO vectors (Thermo Fisher), including endogenous promoters and introns, until the last base before the STOP codon.
  • the MBD5 gene was cloned starting from 1094 bp before the TSS, MBD6 from 294 bp before the TSS, SLN from 2351 bp before the TSS, ACD15.5 from 644 bp before the TSS, and ACD21.4 from 266 bp before the TSS.
  • the genes were then transferred via a Gateway LR Clonase II reaction (Invitrogen, 11791020) into a pEG302 based binary destination vector including a C-terminal 3xFLAG epitope tag.
  • T1 transgenic plants were selected with hygromycin B on 1/2 MS agar medium or with Basta (Glufosinate) on soil.
  • IP-MS and ChIP-seq experiments were done in T2 or T3 generation.
  • Transgenic plants expressing fluorescently tagged proteins were created using the pGWB553 (https://www.addgene.org/74883/), pGWB540 (https://www.addgene.org/74874/), and pGWB543 (https://www.addgene.org/74877/) vectors.
  • ACD15, ACD21, and SLN promoters and coding sequences were PCR amplified from genomic DNA (as explained above) and cloned into pENTR vectors. These coding sequences were then inserted into final destination vectors using Gateway LR Clonase II Enzyme mix (Catalog number: 11791020, ThermoFisher).
  • the SunTag StkyC was targeted using two guides (Guide 4 (ACGGAAAGATGTATGGGCTT; SEQ ID NO: 152) and Guide 17 (AAAACTAGGCCATCCATGGA; SEQ ID NO: 153)), which were cloned as previously described (46).
  • This plasmid was electroporated into AGL0 and transformed into Col0, mbd5 mbd6 (SALK_043927), acd15 acd21, acd21, acd15, sln (SALK_090484), and fwa rdr6-15 (41).
  • Frozen tissue was ground with a tissue lyser and resuspended in IP buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 5 mM EDTA, 20% glycerol, 0.1% Tergitol, 0.5 mM DTT, and cOmplete EDTA-free Protease Inhibitor Cocktail [Roche]). Samples were filtered with miracloth, disrupted with a Dounce homogenizer, and centrifuged for 10 min at 4°C at 20,000 g.
  • CMMB carboxylate-modified magnetic beads
  • the protein was digested overnight with 0.1 µg LysC (Promega) and 0.8 µg trypsin (Thermo Scientific, 90057) at 37 °C. Following digestion, 1.2 ml of 100% acetonitrile was added to each sample to increase the final acetonitrile concentration to over 95% to induce peptide binding to CMMB. CMMB were then washed 3 times with 100% acetonitrile and the peptide was eluted with 65 µl of 2% DMSO. Eluted peptide samples were dried by vacuum centrifugation and reconstituted in 5% formic acid before analysis by LC-MS/MS.
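  • The acetonitrile step above implies a limit on the digest volume, which the following arithmetic sketch makes explicit. It assumes, purely for illustration, that the digest contains no acetonitrile before the addition and that volumes are additive; the starting digest volume is not given in the text, so the sketch only computes the largest volume compatible with a final concentration above 95%. Function names are hypothetical.

```python
def final_fraction(added_ul: float, initial_ul: float) -> float:
    """Final acetonitrile fraction after adding pure acetonitrile.

    Assumes the initial sample contains no acetonitrile and volumes add.
    """
    return added_ul / (added_ul + initial_ul)


def max_initial_volume_ul(added_ul: float, target_fraction: float) -> float:
    """Largest initial volume that still reaches `target_fraction`."""
    return added_ul * (1.0 - target_fraction) / target_fraction


added = 1200.0  # 1.2 ml of 100% acetonitrile, from the text
limit = max_initial_volume_ul(added, 0.95)
print(f"digest volume must stay below about {limit:.1f} ul")
print(f"a hypothetical 50 ul digest gives {final_fraction(added, 50.0):.3f}")
```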
  • ChIP-seq Chromatin Immunoprecipitation-sequencing
  • ChIP-seq libraries were prepared with the Ovation Ultralow System V2 1-16 kit (NuGEN, 0344NB-A01) following the manufacturer’s instructions, with 15 cycles of PCR. Final libraries were sequenced with the Illumina NovaSeq 6000 System. ChIP-seq analysis [0327] Raw reads were filtered based on quality score and trimmed to remove Illumina adapters using Trim Galore (Babraham Institute).
  • ChIP-seq peaks were called with MACS2 (v 2.1.0) (51) using an FDR cutoff of 0.01.
  • the FLAG and RFP associated hyperchippable regions defined as peaks called in the anti-FLAG Col0 or anti-RFP Col0 controls, were subtracted from the peak sets of each sample.
  • the peaks of individual replicates for ACD15 and ACD21 were merged with homer mergePeaks using the option -d given (52). Overlap analysis of different ChIP-seq peak sets was performed with homer mergePeaks using the options -d given and -venn (52).
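  • The subtraction of hyperchippable regions described above can be sketched as an interval filter. The version below drops any sample peak that overlaps a control peak; whether the original analysis removed whole peaks or trimmed them is not stated, so this is an assumption, and the peak coordinates are invented for illustration.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share any base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def subtract_control_peaks(sample: List[Peak], control: List[Peak]) -> List[Peak]:
    """Keep only sample peaks that overlap no control (hyperchippable) peak.

    Assumption: an overlapping sample peak is removed entirely.
    """
    return [p for p in sample if not any(overlaps(p, c) for c in control)]


# Hypothetical peak sets.
sample_peaks = [("Chr1", 100, 200), ("Chr1", 500, 600), ("Chr2", 100, 200)]
control_peaks = [("Chr1", 150, 250)]
filtered = subtract_control_peaks(sample_peaks, control_peaks)
print(filtered)
```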
  • RNA samples for RT-qPCR experiments were purified using Direct-zol RNA miniprep kit (catalog number: R2052, Zymo Research) from unopened flower bud tissue or leaf tissue used in Figure 5.
  • cDNA samples were prepared using Superscript IV mastermix (catalog number: 11760500, Invitrogen) from ~400 ng of RNA and qPCR was performed using Bio-Rad SYBR Green mastermix (catalog number: 1708882, Bio-Rad). Each qPCR experiment contained 2 technical replicates for each gene (either FWA or IPP2 housekeeping control).
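  • One common way to turn such measurements into a relative expression value is the 2^(-ΔCt) calculation sketched below, in which the technical replicates of the target (FWA) and the housekeeping control (IPP2) are averaged before subtraction. The quantification method is not stated in the text, so this is an assumed approach, and the Ct values are invented.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_cts: Sequence[float],
                        reference_cts: Sequence[float]) -> float:
    """Relative expression as 2^-(mean target Ct - mean reference Ct).

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_ct = mean(target_cts) - mean(reference_cts)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: two technical replicates per gene.
fwa_cts = [24.0, 26.0]
ipp2_cts = [19.0, 21.0]
fwa_relative = relative_expression(fwa_cts, ipp2_cts)
print(fwa_relative)
```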
  • RNA-Sequencing was performed on mature pollen samples isolated as previously described (53), with 6 biological replicates per genotype, grown and processed in 2 batches (3 replicates each).
  • 700-1000 µL of open flowers were harvested in 2-mL protein low bind tubes (Eppendorf).
  • 700 µL of Galbraith buffer (45 mM MgCl2, 30 mM C6H5Na3O7·2H2O [trisodium citrate dihydrate], 20 mM MOPS, 0.1% [v/v] Triton X-100, pH 7), supplemented with 70 mM 2-mercaptoethanol, was added to the tube, and the flowers were vortexed for 3 min at max speed in the cold room to release the pollen from the anthers.
  • the extraction procedure was repeated two times, and the two aliquots of pollen in solution were combined.
  • RNA extraction was performed with the Zymo Direct-zol RNA MiniPrep kit (Zymo Research), with in-column DNase digestion. ~500 ng of RNA were used as input for library preparation using the TruSeq Stranded mRNA Library Prep Kit (Illumina), according to the manufacturer’s instructions. The final libraries were sequenced with the Illumina NovaSeq 6000 System.
  • RNA-sequencing reads were filtered based on quality score and trimmed to remove Illumina adapters using Trim Galore (Babraham Institute). The filtered reads were mapped to the Arabidopsis reference genome (TAIR10) using STAR (54), allowing 5% of mismatches (--outFilterMismatchNoverReadLmax 0.05) and unique mapping (--outFilterMultimapNmax 1). MarkDuplicates from the Picard Tools suite was used to remove PCR duplicates.
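  • For orientation, the two STAR settings named above could be assembled into a command line as in the sketch below. Only the two filter options and their values come from the text; the paths, thread count, and helper name are hypothetical, and the sketch builds the argument list without running STAR.

```python
from typing import List


def star_command(genome_dir: str, fastq: str, out_prefix: str,
                 threads: int = 4) -> List[str]:
    """Build a STAR argument list with the mismatch and multimapping filters.

    Paths and thread count are placeholders, not values from the source.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq,
        # Allow mismatches in up to 5% of the read length.
        "--outFilterMismatchNoverReadLmax", "0.05",
        # Keep only reads that map to a single location.
        "--outFilterMultimapNmax", "1",
        "--outFileNamePrefix", out_prefix,
    ]


cmd = star_command("tair10_index", "pollen_rep1.trimmed.fastq", "pollen_rep1_")
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a machine where STAR and a genome index are available.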
  • the HTseq gene counts were used to perform the differential gene expression analysis using the R package DEseq2 (56) with a cutoff for significance of adjusted p-value ⁇ 0.05 and
  • Amino Acid Alignment [0335] Amino acid alignments of MBD5, MBD6, and MBD7 were performed using Clustal Omega multiple sequence alignment tool (https://www.ebi.ac.uk/Tools/msa/clustalo/). Amino acid sequences MBD5 (Accession No. Q9SNC0), MBD6 (Accession No. Q9LTJ1), MBD7 (Accession No. Q9FJF4) were obtained from UniProt protein database. The alignment was run with default settings.
  • Leaf counting was performed as mentioned previously (29) where total numbers of rosette and cauline leaves were counted in T1 generation of plants grown side-by-side under the same conditions.
  • Confocal Microscopy [0338] All confocal microscopy experiments were performed using the LSM 980 confocal microscope. Unless otherwise stated, all experiments were performed using a 40x magnification water objective lens. For all experiments using multiple fluorescent tags, we manually gated the excitation and emission spectrum to limit any cross reactivity of the samples.
  • Live plant samples were prepared as follows: 2-week-old seedlings were grown on 1/2 MS plates at room temperature (~25°C) and then transferred using forceps onto 1 mm thick glass slides (FisherScientific, Cat No. 12-550-08) containing de-ionized water (room temperature). Seedlings were oriented such that root tips were on the middle of the slide while leaves were extending from the top of the slides. #1.5 coverslips (FisherScientific, Cat No. 12-544-EP) were placed gently on top of the plant so as not to destroy or stress the seedling. Usually, 1-4 plants were placed on one slide for imaging.
  • FRAP Experiment and Analysis [0340] FRAP experiments were performed on an LSM 980 using a 40x magnification water objective lens. An image of the field was first obtained as a “snap” in order to circle a region of interest to be bleached. Then an experiment was run such that 5 images were taken, followed by a bleaching event of 300 iterations using 100% laser power at the excitation wavelength appropriate for the fluorescent protein being imaged. Signal was then tracked post bleaching for the indicated amount of time. FRAP analysis was performed using EasyFRAP online analysis software (https://easyfrap.vmnet.upatras.gr/).
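  • To make the notion of a recovery half time concrete, the sketch below estimates t1/2 from a single intensity trace with 5 pre-bleach frames. It is a simplified stand-in and not the EasyFRAP procedure: it rescales the trace between the first post-bleach value and the pre-bleach mean, takes the mean of the last few points as the plateau, and linearly interpolates the time at which half of the plateau is reached. The function name, parameters, and data are hypothetical.

```python
from statistics import mean
from typing import Sequence


def frap_half_time(times: Sequence[float], intensities: Sequence[float],
                   n_prebleach: int = 5, plateau_points: int = 3) -> float:
    """Estimate recovery half time (same units as `times`) by interpolation."""
    pre = mean(intensities[:n_prebleach])
    post_t = list(times[n_prebleach:])
    post_i = list(intensities[n_prebleach:])
    t0, i0 = post_t[0], post_i[0]  # first frame after the bleach
    if pre <= i0:
        raise ValueError("no bleaching detected")
    # Rescale so that 0 = just after bleaching and 1 = pre-bleach level.
    norm = [(v - i0) / (pre - i0) for v in post_i]
    plateau = mean(norm[-plateau_points:])
    if plateau <= 0:
        raise ValueError("no recovery detected")
    half = plateau / 2.0
    for k in range(1, len(norm)):
        if norm[k] >= half:
            frac = (half - norm[k - 1]) / (norm[k] - norm[k - 1])
            return post_t[k - 1] + frac * (post_t[k] - post_t[k - 1]) - t0
    raise ValueError("half-recovery not reached")


# Invented trace: 5 pre-bleach frames, then recovery to a partial plateau.
t = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
signal = [1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.4, 0.6, 0.8, 0.8, 0.8, 0.8]
half_time = frap_half_time(t, signal)
print(round(half_time, 3))
```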
  • HSPs heat shock proteins
  • ACD conserved α-crystalline domains
  • sHSPs further recruit J-domain containing proteins (JDPs) which act as co-chaperones for HSP70 proteins to maintain protein homeostasis (14–17). Both sHSPs and JDP/HSP70 pairs have been shown to associate with and regulate disease related cellular condensates across species (18–21).
  • JDPs J-domain containing proteins
  • ACD15 and ACD21 are necessary and sufficient for the accumulation of high density MBD6 at methylated CG sites to silence genes and TEs, while also bridging SLN to MBD5/6 to maintain the high mobility of all complex components.
  • MBD5/6 complex assemblies can be formed at discrete foci outside of chromocenters, in an ACD15 and ACD21 dependent manner, to cause gene silencing.
  • ACD15 and ACD21 colocalize with MBD5 and MBD6 genome-wide and are essential for silencing [0344]
  • MBD5 and MBD6 pulled down peptides of ACD15 and ACD21 in the absence of SLN, while SLN did not pull down MBD5 and MBD6 in the absence of ACD15 and ACD21, suggesting that ACD15 and ACD21 bridge the interaction between MBD5/6 and SLN (FIGS.2A-2B). Consistent with this model, ACD15 and ACD21 pulled down MBD5 and MBD6 in the sln mutant background, and SLN pulled down ACD15 and ACD21 in the mbd5 mbd6 mutant background (FIG.2A).
  • ACD15 also pulled-down MBD5 and MBD6 but not SLN in the absence of ACD21, while ACD21 did not pull down MBD5 and MBD6 in the absence of ACD15 (FIG.2A).
  • MBD5/6 complex is organized such that MBD5 or MBD6 interact with ACD15, ACD15 interacts with ACD21, and ACD21 interacts with SLN (29) (FIG.2B).
  • FIG.2B To further study the organization and localization of MBD5/6 complex components we used live confocal imaging of root tips to determine the cellular localization of fluorescent-protein-tagged ACD15, ACD21, SLN, and MBD6.
  • ACD15, ACD21, and SLN all showed clear nuclear localization which correlated strongly with nuclear MBD6 (FIGS.2C-2E and FIGS.2G-2I).
  • ACD21, ACD15, and SLN all showed an increase in cytosolic signal in mbd5 mbd6 mutant plants which was rescued by coexpressing MBD6, demonstrating that all members of the complex require genetically redundant MBD5 or MBD6 for proper nuclear localization (FIGS.2C-2E).
  • the reduction of nuclear localization of SLN is also consistent with previous ChIP-seq experiments showing loss of chromatin bound SLN in the absence of MBD5 and MBD6(29).
  • ACD15 maintained nuclear localization and correlation with MBD6 in acd15 acd21 and sln mutant plants whereas ACD21 lost nuclear localization and correlation with MBD6 in acd15 and acd15 acd21 mutants, but not in the sln mutant (FIGS.2C-2D, FIGS.2J-2M). Finally, SLN nuclear localization and correlation with MBD6 decreased in acd15, acd21, and acd15 acd21 mutant plants (FIG.2E, FIGS.2N-2O).
  • MBD6 nuclear localization and mobility of MBD6 in root cells using live-cell, fluorescence, confocal microscopy.
  • MBD6 formed foci, which colocalized with ACD15, ACD21, and SLN foci (FIGS.3A-B, FIG.3J). MBD6 foci also overlapped with DAPI-staining chromocenters, as previously shown when MBD6 was overexpressed in leaf cells (FIG.3K) (36).
  • FRAP fluorescence recovery after photobleaching
  • FRAP in wild-type plants revealed that MBD6 moves rapidly within nuclei with a FRAP recovery half time (t1/2) of ~3.60 seconds back into chromocenters after bleaching (FIGS.3C-3D, FIG.3M).
  • [0350]
  • MBD6 nuclear distribution or mobility was altered in sln mutants. Although MBD6 formed a similar number of nuclear foci in sln compared to wild-type plants, these foci showed somewhat reduced fluorescence intensity, suggesting that MBD6 was accumulating less efficiently within heterochromatin (FIGS.3A, 3E, 3F).
  • FRAP of MBD6 in sln mutant plants revealed a dramatic reduction in mobility and a lack of full recovery of signal post bleaching (FIGS.3C-3D and 3M). Similar FRAP experiments on ACD15 and ACD21 nuclear foci showed that both were highly mobile in wild-type (t1/2 of 3.63 and 4.30 seconds respectively), but were much less mobile and failed to recover full signal in sln mutant plants (FIGS.3M-3Q), and also showed decreased fluorescence intensity of foci in sln compared to wild-type (FIGS.3R-3S). SLN thus regulates the mobility, and to a lesser extent the accumulation of the MBD5/6 complex.
  • a decreased number of MBD6 foci and a lack of overlap of these foci with DAPI stained chromocenters was also observed in acd15, acd21, and acd15 acd21 sln mutant plants (FIGS. 3A, 3L).
  • ACD15 and ACD21 are required for MBD6 to efficiently concentrate into nuclear foci. This effect was specific to ACD15 and ACD21 since loss of IDM3 (LIL), an ACD protein in the MBD7 complex (28), did not affect the MBD6 nuclear foci (FIGS.3T- 3U).
  • ACD15/ACD21 are not necessary for MBD6 to bind meCG sites, they are needed for high accumulation of MBD6 at high density meCG sites, which is consistent with the decrease of observable MBD6 foci in acd mutants (FIGS.3A-3B).
  • [0353]
  • the StkyC domain of MBD6 is required for gene silencing and recruits ACD15 to the complex [0354]
  • the AlphaFold-predicted structure of MBD6 reveals two structured domains, the MBD and a C-terminal domain of unknown function, as well as two intrinsically disordered regions (IDRs) (FIGS.4A, 4I).
  • the C-terminal folded domain shares amino acid similarity with the C-terminus of two related MBD proteins, MBD5 and MBD7 (FIG.4J).
  • This region of MBD7 has been termed the StkyC domain, and is the proposed binding site for the ACD containing IDM3 protein, which belongs to the same family as ACD15 and ACD21(28).
  • MBD6C ⁇ also showed a dramatic reduction in nuclear foci compared to full length MBD6, a phenotype similar to that observed in acd15 acd21 mutants and consistent with loss of the ACD15 binding site (FIGS.4B-4C, 4M).
  • StkyC domain amino acids 167-224
  • MBD6C ⁇ +StkyC MBD6C ⁇ +StkyC was able to rescue MBD6 nuclear foci counts, and complemented the derepression of FWA in the mbd5 mbd6 mutant (FIGS.4B-4E).
  • MBD6C ⁇ +StkyC expressed in acd15 acd21 mutant plants formed very few nuclear foci, similar to the low number of MBD6C ⁇ foci in wild-type plants, demonstrating that foci localization rescue by the StkyC domain requires ACD15 and ACD21 (FIG.4D).
  • [0357] To determine if the StkyC domain is responsible for localizing ACD15 and ACD21 to the MBD5/6 complex, we performed fluorescent protein colocalization experiments by co-expressing ACD21-CFP or ACD15-YFP with MBD6-RFP, MBD6C ⁇ - RFP, or MBD6C ⁇ +StkyC RFP in mbd5 mbd6 mutants.
  • ACD15 and ACD21 also showed visibly higher cytosolic signal and lower nuclear signal when co-expressed with MBD6C ⁇ in mbd5 mbd6 (FIGS.4F-4G and 4N-4O).
  • ACD21 showed a reduced correlation with MBD6C ⁇ +StkyC in acd15 acd21 plants compared to wild type (0.48 vs 0.79), a reduction of colocalization with MBD6 across nuclei, and a visible increase in ACD21 cytosolic localization, suggesting that ACD21 requires ACD15 to associate properly with MBD6C ⁇ +StkyC (FIGS.4I and 4P-4Q).
  • StkyC domain of MBD6 is required for the function of the MBD5/6 complex, is needed for proper localization of ACD15 and ACD21, and mediates the accumulation of MBD6 at heterochromatic foci through ACD15 and ACD21.
  • ACD15 and ACD21 can mediate functional and targeted gene silencing foci.
  • ACD containing sHSP proteins are known to form dynamic oligomeric assemblies as part of their function in maintaining protein homeostasis (16), which could explain how ACD15/ACD21 drive high levels of MBD5/6 complex accumulation at meCG dense heterochromatin.
  • SunTag system (38) composed of a dead Cas9 protein (dCas9) fused to ten single-chain variable fragment (scFv) binding sites, targeted to the promoter of the euchromatic FWA gene (39).
  • SunTag StkyC now only displayed diffuse nucleoplasmic GFP signal, lacking detectable foci (FIGS.5B, 5C). This pattern was similar to control plants expressing a SunTag-TET1 system (40), in which the scFv was fused to GFP and the human TET1 protein, suggesting that the GFP foci observed in SunTag StkyC are not a general property or artifact of the SunTag system (FIG.5J).
  • SunTag StkyC was able to silence FWA in mbd5 mbd6 mutant plants demonstrating that the tethering function of MBD6 could be largely replaced by targeting with the StkyC domain, and that silencing can occur without the methyl binding proteins (FIG.5L).
  • SunTag StkyC could also partially complement FWA derepression in the sln mutant background, while SunTag StkyC could not complement FWA derepression in the acd15 acd21 mutant background (FIGS.5M-5N).
  • ACD domain containing small HSPs are found in all eukaryotic lineages and are most well known for their role in regulating the aggregation of proteins (14, 15, 17, 34, 43).
  • the oligomerization capacities of ACD15 and ACD21 are specifically co-opted to control complex multimerization and silencing function. It seems likely that ACD proteins in other systems may also play important roles outside of general protein homeostasis.
  • TRBIP1 was identified from TRB protein Immunoprecipitation and Mass Spectrometry (IP-MS), and can be used to silence target genes through H3K4me3 demethylation and H3K27me3 deposition.
  • MQ1 is a bacterial DNA methyltransferase carrying a Q147L point mutation, which can be used to target DNA methylation in the SunTag system (Ghoshal et al., 2021).
  • FIGS.7A-7B The plasmid map and modules are shown in FIGS.7A-7B and the sequences of each module are shown in Table 2A.
  • Materials and Methods Construct Design [0367] To construct the SunTag-TRBIP1-MQ1, the original SunTag-MQ1 plasmid from (Ghoshal et al., 2021) was digested with BsiWI (ThermoFisher). Arabidopsis cDNA was used as a template to amplify TRBIP1 CDS fragment, using oligo 26987 (SEQ ID NO: 158) and oligo 27079 (SEQ ID NO: 159).
  • SunTag-TRBIP1-MQ1 (Ghoshal et al., 2021) was used as a template to amplify MQ1 PCR fragment, using oligo 26987 (SEQ ID NO: 160) and oligo 26988 (SEQ ID NO: 161).
  • TRBIP1-MQ1 fragment was amplified by using the TRBIP1 and MQ1 PCR fragments as template and using oligo 26987 and 26988 as primers. The SV40 sequence was thus synthesized in the oligos.
  • the TRBIP1-MQ1 PCR fragment was cloned into the SunTag vector by infusion (Takara).
  • TRBIP1 and MQ1 in the fusion protein may impact function
  • different constructs were prepared such that TRBIP1 and MQ1 are oriented either N-terminal or C-terminal to the position of the GFP protein in the fusion protein.
  • the protocol described in Ghoshal et al., 2021 was followed using the plasmid SunTag-MQ1.
  • gRNA FWA-17 that targets the sequence “AAAACTAGGCCATCCATGGA” (SEQ ID NO: 162) in the FWA Promoter
  • two consecutive PCRs using the plasmid SunTag-MQ1 as a template and the oligos 27141 (SEQ ID NO: 163) and oligo 27296 (SEQ ID NO: 164) for PCR1; oligo 27297 (SEQ ID NO: 165) and oligo 27142 (SEQ ID NO: 166) for PCR2 were performed.
  • the overlapping PCR was conducted by using PCR1 and PCR2 as templates, and oligo 27141 and 27142 as primers.
  • the plasmid was digested with KpnI and MauBI, purified with column and used to perform infusion reaction (Takara) together with the overlapping PCR fragment.
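  • The overlapping PCR step can be pictured as joining two fragments through a shared end sequence, as in the following string-level sketch. It assumes, for illustration only, that the products of PCR1 and PCR2 overlap across the new 20 nt target sequence; the flanking sequences and the function name are invented, and real overlap-extension PCR involves strand annealing and polymerase extension that this toy model ignores.

```python
def join_by_overlap(frag1: str, frag2: str, min_overlap: int = 15) -> str:
    """Join two fragments where the end of frag1 equals the start of frag2.

    Uses the longest shared end/start sequence of at least `min_overlap` bases.
    """
    longest = min(len(frag1), len(frag2))
    for n in range(longest, min_overlap - 1, -1):
        if frag1[-n:] == frag2[:n]:
            return frag1 + frag2[n:]
    raise ValueError("fragments do not share a sufficient overlap")


target = "AAAACTAGGCCATCCATGGA"  # FWA-17 target sequence from the text
pcr1 = "GGTACCTTTT" + target      # hypothetical upstream flank + target
pcr2 = target + "GTTTTAGAGC"      # target + hypothetical downstream flank
joined = join_by_overlap(pcr1, pcr2)
print(joined, len(joined))
```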
  • Table 2A gRNA Molecules Targeting the FWA Promoter
  • Table 2B The sequence of each module in the SunTag-TRBIP1-MQ1 plasmid is listed in Table 2B. Transformation of fwa rdr6 Plants [0372] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Arabidopsis fwa rdr6 plants were transformed using floral dip methods well-known in the art.
  • Flowering Time Measurements Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporate the T-DNA into the Arabidopsis genome, which confers resistance to hygromycin.
  • flowering time was measured and compared to early-flowering wild-type Col0 and late-flowering fwa rdr6 plants. Flowering time was measured by counting the total number of leaves (rosette and cauline) of each individual plant.
  • Plants transformed with the fusion constructs described above were evaluated for phenotypic differences as compared to corresponding control plants (e.g. fwa rdr6, SunTag-MQ1, and SunTag-TRBIP1-MQ1) which were suggestive of successful fusion protein targeting to the locus of interest and subsequent silencing at the locus.
  • Other analyses included measuring the expression level of the targeted locus in the transformed plants, measuring the degree of DNA methylation at the targeted locus in the transformed plants, and other assays well-known to those of skill in the art.
  • Table 2B Parameters and Sequences for Fusion Construct Modules Results
  • Example 3 Replacing the UBQ10 promoter with 20 different Arabidopsis promoters with weaker transcription activity resulted in reduced CG DNA hypermethylation in SunTag-TRBIP1-MQ1 transgenic lines Summary [0376] This Example describes experiments demonstrating a method of reducing the CG DNA hypermethylation in SunTag-TRBIP1-MQ1 by using weaker promoters to drive the expression of TRBIP1-MQ1. However, the UBQ10 promoter was still maintained to drive the expression of dCAS9-GCN4.
  • This Example describes experiments replacing UBQ10 promoter with 20 different Arabidopsis promoters with different activity to drive the expression of scFv antibody, GFP, TRBIP1 and MQ1 expression cassette (scFv-GFP-TRBIP1-MQ1).
  • the plasmid map and modules are shown in FIG.9A and 9B, and the sequences of each module are shown in Table 2B.
  • This Example describes experiments replacing the UBQ10 promoter with APX1 promoter to drive the expression of scFv-GFP-TRBIP1-MQ1.
  • the SunTag-TRBIP1-MQ1 was digested with NruI and AleI and purified with column.
  • fragment1 TBS insulator
  • oligo 28056 SEQ ID NO: 191
  • oligo 28057 SEQ ID NO: 192
  • the fragment2 was amplified by using Arabidopsis genomic DNA as template, as well as oligo 28058 (SEQ ID NO: 193) and oligo 28059 (SEQ ID NO: 194) as primers.
  • Fragment 3 was amplified by using the plasmid SunTag-TRBIP1-MQ1 as a template and oligo 28098 (SEQ ID NO: 195) and oligo 28061 (SEQ ID NO: 196) as primers. These fragments were gel purified and cloned into the NruI/AleI digested plasmid by infusion. By using this method, PacI digestion sites were introduced at both ends of the APX1 promoter through the synthesized oligos, which can be used for the further construction of the remaining plasmids with the other 19 promoters. [0379] This Example also describes construction of plasmids with the other 19 promoters.
  • SunTag-TRBIP1-MQ1-ProAPX1 was digested with PacI and purified by using a Qiagen column. The 19 promoters were amplified by using Arabidopsis genomic DNA as template and the corresponding oligos listed in Table 4A as primers. The 19 PCR products were gel purified and cloned into the PacI digested SunTag-TRBIP1-MQ1 by using infusion (Takara).
  • SEQ ID NO: 235 Promoter 1 AT4G32020.
  • SEQ ID NO: 236 Promoter 2 AT1G19770.
  • SEQ ID NO: 237 Promoter 3 AT5G11770 900bp.
  • SEQ ID NO: 238 Promoter 4 AT1G57720.
  • SEQ ID NO: 239 Promoter 5 AT1G06570.
  • SEQ ID NO: 240 Promoter 6 AT3G16100.
  • SEQ ID NO: 241 Promoter 7 AT4G28220.
  • SEQ ID NO: 242 Promoter 8 AT2G48020.
  • SEQ ID NO: 243 Promoter 9 AT3G50410.
  • SEQ ID NO: 244 Promoter 10 AT1G16640.
  • SEQ ID NO: 245 Promoter 11 AT1G79400.
  • SEQ ID NO: 246 Promoter 12 AT2G28860.
  • SEQ ID NO: 247 Promoter 13 AT1G07890 APX1.
  • SEQ ID NO: 248 Promoter 14 AT2G45190.
  • SEQ ID NO: 250 Promoter 16 AT4G18960.
  • SEQ ID NO: 251 Promoter 17 AT1G55480 MET1.
  • SEQ ID NO: 252 Promoter 18 AT2G33830 DRM2.
  • SEQ ID NO: 253 Promoter 19 AT4G19020 CMT2.
  • SEQ ID NO: 254 Promoter 20 AT1G69770 CMT3. Transformation of fwa rdr6 Plants [0382] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis fwa rdr6 plants were transformed using floral dip methods well-known in the art. Flowering Time Measurements [0383] Progeny of transformed plants (T1s) were planted and screened for hygromycin- resistant plants that incorporated the T-DNA into the Arabidopsis genome, which confers resistance to hygromycin.
  • FIGS.9A-9B, 10, 11, 12, and 13 demonstrate that using weaker promoters still maintained DNA methylation at the target locus, while notably reducing the genome wide hyper CG DNA methylation. However, the hyper CG DNA methylation throughout the nuclear genome and also in the chloroplast genome was still not completely removed, which is addressed in Example 4 below.
  • Example 4 Removing the CG DNA hypermethylation in SunTag-TRBIP1-MQ1 and SunTag-MQ1 transgenic lines by using StkyC domain
  • This Example describes construction of fusion constructs containing StkyC directly fused to MQ1 and TRBIP1-MQ1, as an individual effector protein, which aimed to further remove the genome wide CG DNA hypermethylation caused by SunTag-TRBIP1- MQ1.
  • StkyC is a conserved domain of MBD6 that recruits the ACD15 and ACD21 proteins.
  • FIG.14A and FIG.14B Structures of the fusion constructs used in the SunTag-TRBIP1-MQ1 system are presented in FIG.14A and FIG.14B. In these figures, different regions of the construct are labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, and the sequences are described in Table 2B.
  • SunTag-StkyC-TRBIP1-MQ1 the original SunTag-TRBIP1-MQ1 plasmid was digested with BsiWI (ThermoFisher).
  • SunTag-TRBIP1-MQ1 was used as a template to amplify TRBIP1-MQ1, using oligo 23069 (SEQ ID NO: 255) and oligo 27102 (SEQ ID NO: 256).
  • the sequence of the StkyC-XTEN linker was ordered from IDT (see Table 2B) and, together with the TRBIP1-MQ1 PCR fragment, was cloned into the SunTag vector by infusion (Takara).
  • the UBQ10 promoter was used to drive the expression of scFv-GFP-TRBIP1- MQ1. [0390] In order to change the target sequence present in the different gRNAs, the protocol described in Ghoshal et al., 2021 was followed.
  • gRNA FWA- 17 that targets the sequence “AAAACTAGGCCATCCATGGA” (SEQ ID NO: 257) in the FWA Promoter
  • two consecutive PCRs using the plasmid SunTag-TRBIP1-MQ1 as a template and the oligos 27141 (SEQ ID NO: 258) and oligo 27296 ( SEQ ID NO: 259) as primers for PCR1; oligo 27297 (SEQ ID NO: 260) and oligo 27142 ( SEQ ID NO: 261) for PCR2 were performed.
  • the overlapping PCR was conducted by using PCR1 and PCR2 as templates, and oligo 27141 and 27142 as primers.
  • the plasmid was digested with KpnI and MauBI and, after column purification, was used to perform the infusion reaction (Takara) together with the overlapping PCR fragment.
  • a tRNA-gRNA expression cassette (Xie, X et al, 2015, Proc Natl Acad Sci U S A.2015 Mar 17;112(11):3570-5) was used to deliver multiple gRNAs simultaneously with high expression levels. Due to the repetitive nature of these modules, gene synthesis, instead of traditional cloning, was used to generate the cassettes.
  • various alternative gRNA sequences described were tested, as presented in Table 2B.
  • Various other loci in the genome were also targeted to demonstrate the ability of the fusion protein to target a locus of interest.
  • Exemplary loci that were targeted include GA1, FLC, and RITA.
  • a series of different gRNA molecules were designed that target these loci.
  • the crRNA portions of these gRNAs are presented below in Table 2B.
  • the gRNA was a fusion of the crRNA and tracrRNA. Transformation of fwa rdr6 Plants [0394]
  • Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Plants transformed with the fusion constructs described above were evaluated for phenotypic differences as compared to corresponding control plants (e.g., fwa rdr6, SunTag-StkyC-MQ1 and SunTag-StkyC-TRBIP1-MQ1) for evidence suggestive of successful fusion protein targeting to the locus of interest and subsequent silencing at the locus.
  • the phenotype that was evaluated varied depending on the locus targeted.
  • Example 5 ACD15 and ACD21 chaperone system for use to increase specificity of TDG-TET1 mediated DNA demethylation in plants Summary [0398] This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag targeting system combined with plant ACD15-ACD21 mediated recruitment using the plant StkyC domain of MBD6 to increase the specificity of the TDG-TET1 DNA demethylase enzymes for their intended genomic targets.
  • These constructs will be used to target TDG-TET1 to a specific locus of the genome using dCas9 targeting and to decrease off-target demethylation through accumulation of excess TDG-TET1 at the target site through ACD15/ACD21 oligomerization recruited by the StkyC MBD6 domain.
  • The dCas9 system will be targeted to the promoter of the CACTA gene in heterochromatin. It is expected that this technology will cause the formation of foci in nuclei of cells corresponding to the dCas9 binding sites.
  • a GFP tag will be added to the dCAS9 targeting system.
  • Exemplary structures of these fusion constructs to be used in the CRISPR-CAS9 system are presented in FIGS. 18A-18C. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 5A.
  • Table 5A Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design [0402]
  • a SunTag plasmid will be used which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to CACTA. Coding sequences for the StkyC MBD6 domain will be amplified from genomic DNA while TDG-TET1 will be amplified from an existing SunTag plasmid and will be cloned into a separate SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes.
  • This will add the TDG-StkyC MBD6-TET1 directly after the GFP and before the HA tag.
  • Features of SunTag-TDG-StkyC MBD6 -TET1 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4_OCS, gRNA-CACTA sequences, single chain variable fragment (scFV)_GFP_NLS_1xHA, StkyC MBD6 Domain, TDG, TET1, and the XTEN Linker.
  • All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
  • Transformation of Plants Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
  • Microscopy Experiments [0405] Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear bodies.
  • ACD containing proteins are known to oligomerize and therefore GFP foci are expected to form in nuclei of cells.
  • a SunTag system expressing the scFV-GFP-TDG-TET1 lacking the StkyC MBD6 will be used as a control in all experiments.
  • GFP foci are expected to represent accumulation of scFV-GFP-TDG-StkyC MBD6-TET1 targeted to the promoter of CACTA.
  • Multiple plant lines will be used for the whole-genome bisulfite experiments. Leaf tissue will be harvested for these experiments. Downstream analysis will be performed according to previous protocols.
  • This technology will demonstrate the functional use of ACD15 and ACD21 to specifically accumulate TDG-TET1 at a targeted locus through a dCas9 targeting system and reduce off-target demethylation events.
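For the whole-genome bisulfite comparison, on-target and off-target demethylation are typically summarized from per-cytosine read counts. The following is a minimal sketch under stated assumptions: each site is represented as a (methylated reads, unmethylated reads) pair, and the function names and any counts supplied are hypothetical. It is not the downstream analysis protocol referred to above.

```python
def methylation_level(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads supporting methylation at a single cytosine."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("site has no read coverage")
    return methylated_reads / total


def mean_methylation(sites) -> float:
    """Average per-site methylation over an iterable of (meth, unmeth) pairs."""
    levels = [methylation_level(m, u) for m, u in sites]
    if not levels:
        raise ValueError("no sites supplied")
    return sum(levels) / len(levels)


def methylation_change(control_sites, treated_sites) -> float:
    """Treated minus control mean methylation; negative values mean loss."""
    return mean_methylation(treated_sites) - mean_methylation(control_sites)
```

Under these assumptions, applying `methylation_change` separately to sites inside the targeted promoter and to sites elsewhere in the genome would give one number for on-target demethylation and one for off-target demethylation, for the construct with and without the StkyC MBD6 domain.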
  • SEQ ID NO: 262 UBQ10 promoter (DNA sequence).
  • SEQ ID NO: 263 dCas9_1xHA_3xNLS_10xGCN4_OCS (DNA sequence).
  • SEQ ID NO: 264 dCas9_1xHA_3xNLS_10xGCN4_OCS (protein sequence).
  • SEQ ID NO: 265 gRNA-FWA sequences (gRNA8 and scaffold).
  • SEQ ID NO: 266 gRNA-FWA sequences (gRNA17 and scaffold).
  • SEQ ID NO: 267 single chain variable fragment (scFV)_GFP_NLS_1xHA (DNA sequence).
  • SEQ ID NO: 268 single chain variable fragment (scFV)_GFP_NLS_1xHA (protein sequence).
  • SEQ ID NO: 269 StkyC MBD6 (DNA sequence).
  • SEQ ID NO: 270 StkyC MBD6 (protein sequence).
  • SEQ ID NO: 272 TET1 (protein sequence).
  • SEQ ID NO: 273 TDG (DNA sequence).
  • SEQ ID NO: 274 TDG (protein sequence).
  • SEQ ID NO: 275 XTEN (DNA sequence).
  • SEQ ID NO: 276 SunTag-TDG-StkyC MBD6-TET1 Plasmid Sequence.
  • SEQ ID NO: 277 SunTag-TDG-TET1 Plasmid Sequence.
  • Example 6 ACD15 and ACD21 chaperone system for use to increase specificity of SDG2 histone methyltransferase Summary [0412] This Example describes planned experiments involving dCas9-epitope tail + antibody-GFP-SDG2-StkyC constructs, with control constructs lacking StkyC.
  • These constructs will be used to target SDG2 to a specific locus of the genome using dCas9 targeting and decrease off target histone methylation through accumulation of excess SDG2 at the target site through ACD15/ACD21 oligomerization.
  • To demonstrate the efficacy of this technology we will create this dCas9 system, which will be targeted to the promoter of the FWA gene.
  • Exemplary structures of these fusion constructs to be used in the CRISPR-CAS9 system are presented in FIGS. 19A-19D. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 2A.
  • Table 2A Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design [0416]
  • The current SunTag plasmid will be used, which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of the StkyC MBD6 domain and SDG2 will be amplified from genomic DNA and will be cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes.
  • This will add the SDG2-StkyC MBD6 coding sequence directly after the GFP and before the HA tag.
  • Features of SunTag-SDG2-StkyC MBD6 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, the StkyC MBD6 domain, and SDG2.
  • ACD containing proteins are known to oligomerize and therefore GFP foci are expected to form in nuclei of cells.
  • a SunTag system expressing the scFV-GFP-SDG2 without StkyC MBD6 will be used as a control in all experiments.
  • ChIP-Seq [0420] To determine the specificity of the technology, ChIP-Seq experiments will be performed using antibodies that recognize H3K4me3. This experiment will be performed using both the SunTag-SDG2-StkyC MBD6 technology and the SunTag-SDG2 control.
  • Data Analysis [0421] Multiple seedlings of each SunTag-SDG2-StkyC MBD6 will be imaged using confocal microscopy to determine expression of GFP. If GFP signal is concentrated in nuclei of the cells, then the construct will be assumed to properly localize without any misfolding. Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-SDG2-StkyC MBD6 construct. ImageJ software will be used to analyze the images and quantify the foci across multiple cells. H3K4me3 will be measured by ChIP-seq and gene expression of the target will be measured by RT-PCR and RNA-seq.
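Foci quantification of this kind generally reduces to thresholding the fluorescence signal and counting connected bright regions. The sketch below is a simplified two-dimensional stand-in, written only to illustrate the idea; it is not ImageJ's algorithm, and the function name `count_foci`, the threshold, and the `min_pixels` filter are assumptions.

```python
from collections import deque


def count_foci(image, threshold, min_pixels=1):
    """Count connected bright regions in a 2D grid of pixel intensities.

    Pixels with intensity >= threshold are foreground. Regions are built with
    4-connectivity, and regions smaller than `min_pixels` are ignored.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    foci = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill one region starting from this unvisited bright pixel.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                foci += 1
    return foci
```

A size filter such as `min_pixels` is one simple way to keep single noisy pixels from being scored as foci; the appropriate threshold would need to be chosen per experiment and applied identically to the StkyC-containing construct and its control.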
  • This technology will demonstrate the functional use of the SunTag-StkyC MBD6 targeting system to specifically accumulate SDG2 at a targeted locus through a dCas9 system and reduce off-target histone methylation events. By comparing to a control construct that does not contain the StkyC domain, we expect to see enhanced specificity of targeting of H3K4 trimethylation to the FWA locus.
  • it may also be beneficial to fuse the StkyC domain of MBD7 (StkyC MBD7 ) to SDG2. It may also be beneficial to fuse the human heat shock proteins HSPB1, HSPB3, or HSPB5 to SDG2.
  • SEQ ID NO: 278 UBQ10 promoter DNA sequence.
  • SEQ ID NO: 279 dCas9_1xHA_3xNLS_10xGCN4 DNA sequence.
  • SEQ ID NO: 280 dCas9_1xHA_3xNLS_10xGCN4 protein sequence.
  • SEQ ID NO: 281 gRNA-FWA sequences; gRNA4 and scaffold.
  • SEQ ID NO: 282 gRNA-FWA sequences; gRNA17 and scaffold.
  • SEQ ID NO: 283 single chain variable fragment (scFV)_GFP_NLS_1xHA; DNA sequence.
  • SEQ ID NO: 284 single chain variable fragment (scFV)_GFP_NLS_1xHA; protein sequence.
  • SEQ ID NO: 285 StkyC MBD6 DNA sequence.
  • SEQ ID NO: 286 StkyC MBD6 protein sequence.
  • SEQ ID NO: 287 SDG2 DNA sequence.
  • SEQ ID NO: 288 SDG2 protein sequence.
  • SEQ ID NO: 289 SunTag-SDG2-StkyC MBD6 Plasmid Sequence.
  • SEQ ID NO: 290 SunTag-SDG2 Plasmid Sequence.
  • Example 7 dCas9 directed accumulation of protein through ACD proteins of the MBD7 complex using the StkyC domain of MBD7 Summary [0426] This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag system combined with the plant StkyC domain of MBD7 (StkyC MBD7 ).
  • Exemplary structures of these fusion constructs to be used in the CRISPR-CAS9 system are presented in FIGS. 20A-20B. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 7A.
  • Table 7A Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design [0429]
  • The current SunTag plasmid will be used, which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of the StkyC MBD7 domain will be amplified from genomic DNA and will be cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the StkyC MBD7 domain directly after the GFP and before the HA tag.
  • Features of SunTag-StkyC MBD7 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, and the StkyC MBD7 domain.
  • All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
  • Transformation of Plants Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA.
  • SEQ ID NO: 292 dCas9_1xHA_3xNLS_10xGCN4, DNA sequence.
  • SEQ ID NO: 293 dCas9_1xHA_3xNLS_10xGCN4, protein sequence.
  • SEQ ID NO: 294 gRNA-FWA sequences: gRNA4 and scaffold.
  • SEQ ID NO: 295 gRNA-FWA sequences: gRNA17 and scaffold.
  • SEQ ID NO: 296 single chain variable fragment (scFV)_GFP_NLS_1xHA, DNA sequence.
  • SEQ ID NO: 297 single chain variable fragment (scFV)_GFP_NLS_1xHA, protein sequence.
  • SEQ ID NO: 298 StkyC MBD7 ; DNA sequence.
  • SEQ ID NO: 299 StkyC MBD7 ; protein sequence.
  • SEQ ID NO: 300 SunTag-StkyC MBD7 Plasmid Sequence.
  • Example 8 Human small heat shock proteins cause accumulation of MBD6 at chromocenters in plants Summary [0437] This example describes experiments in which we created fusions of human small heat shock proteins HSPB1, HSPB3, HSPB5, and HSPB8 to the plant methyl-CpG-binding domain (MBD) protein 6 (MBD6), while also deleting the MBD6 StkyC domain that is known to be required for interaction with ACD15 and ACD21, and subsequent localization of MBD6 at Arabidopsis chromocenters.
  • human α-crystalline domain containing proteins
  • human sHSPs can functionally replace the accumulation function of plant α-crystalline domain proteins ACD15 and ACD21 and provide evidence for the use of α-crystalline domain containing proteins from other organisms in plants to accumulate proteins in a targeted manner.
  • human sHSPs can replace the function of MBD5/6 complex specific α-crystalline domain containing proteins ACD15 and ACD21.
  • sHSPs HSPB1, HSPB3, HSPB5, and HSPB8 were added to the C-terminus of the MBD6 protein in place of the StkyC domain of MBD6 (amino acids 168-225).
  • the promoter of MBD6 as well as the other regions of the protein, including the MBD of MBD6, were left intact.
  • These fusion constructs also contain a C-terminal RFP tag in order to observe their cellular localization using fluorescence confocal microscopy.
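The domain swap described above (replacing the StkyC domain, amino acids 168-225 of MBD6, with an sHSP) can be pictured as a splice on the protein sequence using 1-based, inclusive residue coordinates. The sketch below only illustrates that coordinate arithmetic; `replace_region` is a hypothetical helper and any sequences passed to it in a quick check are placeholders, not SEQ ID NO sequences from this document.

```python
def replace_region(protein: str, start: int, end: int, insert: str) -> str:
    """Replace residues start..end (1-based, inclusive) with `insert`."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("region lies outside the protein sequence")
    # Residues before `start` are kept, the region is dropped, and any
    # residues after `end` are kept after the inserted sequence.
    return protein[:start - 1] + insert + protein[end:]
```

With real sequences, a call of the form `replace_region(mbd6_protein, 168, 225, hspb1_protein)` would return the MBD6 sequence with the StkyC region exchanged for the sHSP, to which a C-terminal tag sequence could then be appended.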
  • Exemplary structures of these fusion constructs used are presented in FIGS. 21A-21B, 22A-22B, 23A-23B, and 24A-24B. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, as described below in Table 8A.
  • Table 8A Parameters for Fusion Construct Modules Exemplary Construct Design
  • a PENTR_D (Invitrogen) plasmid that contains the genomic DNA of MBD6 without the C-terminal StkyC domain (promoter and coding sequence) was cloned into a separate PENTR_D (Invitrogen) vector along with the cDNA of human HSPB1, HSPB3, HSPB5, and HSPB8 using an infusion reaction (Takara).
  • Features of the MBD6-human sHSP constructs include the MBD6 promoter, the MBD6 StkyCΔ coding sequence, HSPB1 coding sequence, HSPB3 coding sequence, HSPB5 coding sequence, HSPB8 coding sequence, mRFP sequence, and Nos terminator.
  • Z-stacks of root meristem tissue were imaged across multiple plant lines to confirm RFP signal and acquire data of protein localization.
  • Data Analysis was performed using ImageJ image analysis software. Using the 3D-projection application of ImageJ, reconstructions of root meristems were created using Z-stack data from microscopy experiments. These images allow for the direct comparison of MBD6-HSP phenotypes. Further, using the 3D objects counter application of ImageJ, the numbers of foci were quantified across multiple plant lines for direct comparison with wild-type MBD6. [0445] Chromocenters are located on the nuclear periphery of plants and it has been shown that MBD6 localizes strongly to chromocenters when analyzed using microscopy.
  • Imaging of MBD6 HSPB1, MBD6 HSPB3, MBD6 HSPB5, and MBD6 HSPB8 RFP proteins across multiple plant lines revealed clear nuclear localization of all of these fusion proteins. This is consistent with correct folding of the MBD6 fusion proteins, allowing them to properly localize to the nuclei of cells.
  • The MBD6 HSPB8 RFP construct did not demonstrate clear nuclear foci across cells, but instead resulted in a diffuse RFP signal throughout nuclei (FIG. 25, far right panel). This phenotype is similar to the wild-type MBD6 localization without the plant sHSPs ACD15 and ACD21. Therefore, these data suggest that MBD6 HSPB8 RFP was not able to replace the function of α-crystalline domain proteins in the MBD5/6 complex. It is known from the literature that HSPB1, HSPB3, and HSPB5 form oligomers, while HSPB8 can form only dimers (B. Tedesco et al., Insights on Human Small Heat Shock Proteins and Their Alterations in Diseases.
  • MBD6 HSPB8 RFP could be expressed with another human sHSP to form oligomers consistent with functions of some α-crystalline domain proteins that work together with other α-crystalline domain partner proteins (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)).
  • SEQ ID NO: 301 MBD6 promoter, DNA sequence.
  • SEQ ID NO: 302 MBD6 StkyCΔ coding DNA sequence.
  • SEQ ID NO: 303 MBD6 StkyCΔ protein sequence.
  • SEQ ID NO: 304 HSPB1 DNA sequence.
  • SEQ ID NO: 305 HSPB1 protein sequence.
  • SEQ ID NO: 306 HSPB3 DNA sequence.
  • SEQ ID NO: 307 HSPB3 protein sequence.
  • SEQ ID NO: 308 HSPB5 DNA sequence.
  • SEQ ID NO: 309 HSPB5 protein sequence.
  • SEQ ID NO: 310 HSPB8 DNA sequence.
  • SEQ ID NO: 311 HSPB8 protein sequence.
  • SEQ ID NO: 312 mRFP DNA sequence.
  • SEQ ID NO: 313 mRFP protein sequence.
  • SEQ ID NO: 314 Nos terminator DNA sequence.
  • SEQ ID NO: 315 MBD6-HSPB8 Plasmid Sequence.
  • SEQ ID NO: 316 MBD6-HSPB5 Plasmid Sequence.
  • SEQ ID NO: 317 MBD6-HSPB3 Plasmid Sequence.
  • SEQ ID NO: 318 MBD6-HSPB1 Plasmid Sequence.
  • Example 9 Human small heat shock protein target accumulation using CRISPR-CAS9 system Summary
  • This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag and human α-crystalline domain containing small heat shock proteins (sHSPs). These constructs may be used to target a protein of interest to a specific locus of the genome using dCas9 specific targeting and oligomerization through the α-crystalline domain of the human sHSPs.
  • the StkyC domain (which normally recruits the plant α-crystalline domain containing proteins) of the existing system will be replaced by different human sHSPs.
  • Features of SunTag_sHSPs include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, HSPB1, HSPB3, HSPB5, and HSPB8.
  • All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
  • Transformation of fwa-4 Plants Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
  • Microscopy Experiments Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear phenotypes.
  • sHSPs are known to oligomerize using α-crystalline domains, and therefore GFP foci are expected to form in nuclei of cells.
  • a SunTag system expressing the scFV-GFP without any human sHSPs will be used as a control in all experiments.
  • Data Analysis [0459] Multiple seedlings of each SunTag-sHSPs will be imaged using confocal microscopy to determine expression of GFP. If GFP signal is concentrated in nuclei of the cells, then the construct will be assumed to properly localize without any misfolding. Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-sHSPs construct.
  • SEQ ID NO: 320 dCas9_1xHA_3xNLS_10xGCN4, DNA sequence.
  • SEQ ID NO: 321 dCas9_1xHA_3xNLS_10xGCN4, protein sequence.
  • SEQ ID NO: 322 gRNA-FWA sequences, gRNA4 and scaffold.
  • SEQ ID NO: 323 gRNA-FWA sequences, gRNA17 and scaffold.
  • SEQ ID NO: 324 single chain variable fragment (scFV)_GFP_NLS_1xHA, DNA sequence.
  • SEQ ID NO: 325 single chain variable fragment (scFV)_GFP_NLS_1xHA, protein sequence.
  • SEQ ID NO: 326 HSPB1, DNA sequence.
  • SEQ ID NO: 327 HSPB1, protein sequence.
  • SEQ ID NO: 328 HSPB3, DNA sequence.
  • SEQ ID NO: 329 HSPB3, protein sequence.
  • SEQ ID NO: 331 HSPB5, protein sequence.
  • SEQ ID NO: 332 HSPB8, DNA sequence.
  • SEQ ID NO: 333 HSPB8, protein sequence.
  • SEQ ID NO: 335 SunTag-HSPB5 Plasmid Sequence.
  • SEQ ID NO: 336 SunTag-HSPB1 Plasmid Sequence.
  • SEQ ID NO: 337 SunTag-HSPB3 Plasmid Sequence.
  • Example 10 Targeted hyperaccumulation of Zinc Finger binding domains using small heat shock protein fusion Summary
  • This example describes experiments in which we created a fusion of a Zinc finger domain to the StkyC domain of MBD6 (StkyC MBD6 ). These constructs were used to demonstrate that the StkyC MBD6 domain can cause hyperaccumulation of fusion proteins at the zinc finger binding site, even though only a single StkyC MBD6 domain has been added to the fusion.
  • StkyC MBD6 can lead to the hyperaccumulation of ZF domains at their binding sites.
  • multiple fusion constructs were prepared. The StkyC domain of MBD6 (amino acids 168-225) was added to the C-terminus of the ZF domain.
  • This fusion construct contains a C-terminal 3x Flag for western blots and immunoprecipitation experiments and an RFP tag in order to observe cellular localization using fluorescence confocal microscopy.
  • a control construct was also prepared which is lacking the StkyC MBD6 .
  • Materials and Methods Cloning of Fusion Proteins [0465] Structures of the fusion constructs used are presented in FIGS. 34A-34C. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. The different modules of these constructs are presented below in Table 10A.
  • Table 10A Parameters for Fusion Construct Modules Construct Design
  • Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporate the T-DNA into the Arabidopsis genome, which confers resistance to hygromycin.
  • Chromatin Immunoprecipitation Experiments (ChIP-Seq) [0469] Seedlings expressing ZF alone or ZF-StkyC MBD6, in both wild type and mbd5 mbd6 mutant plants, were grown on plates for ~2 weeks and then harvested. ChIP-Seq experiments were performed using anti-Flag tag beads.
  • α-crystalline domain protein-enhanced transcription factors could be used to make developmental regulators more potent, for example in stem cell reprogramming for human health or for morphoregulator factors used in plant regeneration processes.
  • SEQ ID NO: 338 UBQ10 promoter (DNA sequence).
  • SEQ ID NO: 339 StkyC MBD6 (DNA sequence).
  • SEQ ID NO: 340 StkyC MBD6 (protein sequence).
  • SEQ ID NO: 342 6xZF-3xFlag-2xSV40 NLS (protein sequence).
  • SEQ ID NO: 343 mRFP sequence (DNA sequence).
  • SEQ ID NO: 344 mRFP sequence (protein sequence).
  • SEQ ID NO: 345 Nos terminator (DNA sequence).
  • SEQ ID NO: 346 ZF-RFP Alone Plasmid Sequence.
  • SEQ ID NO: 347 ZF-StkyC MBD6 RFP Plasmid Sequence.
  • Example 11 ACD15 and ACD21 chaperone system for use to increase editing efficiency of CasΦ.
  • This Example describes exemplary experimental guidelines for constructing a CasΦ (also known as CasPhi and Cas12J) nuclease combined with plant ACD15-ACD21 mediated accumulation technology using the plant StkyC domain of MBD6 to increase accumulation of CasΦ at the targeted locus and therefore increase editing efficiency.
  • Exemplary structures of these fusion constructs to be used in the CasΦ-StkyC MBD6 system are presented in FIGS. 36A-36B.
  • Coding sequences for the StkyC MBD6 domain will be amplified from genomic DNA and will be cloned into a separate CasΦ plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the StkyC MBD6 directly after the CasΦ.
  • Features of CasΦ-StkyC MBD6 include a UBQ10 promoter, CasΦ, gRNA-FWA sequence, StkyC MBD6 Domain, U6 promoter, Terminator-PolyT, and RbcS-E9t.
  • CasΦ-StkyC MBD6 will create double strand breaks, which will need to be repaired through nonhomologous end joining (NHEJ), leading to insertion and deletion mutations at the guide 4 site of the FWA promoter.
  • CasΦ-StkyC MBD6 will demonstrate a higher editing frequency compared to CasΦ alone due to the increased accumulation of CasΦ-StkyC MBD6.
  • CasΦ currently shows extremely low editing efficiency in protoplasts from wild type plants, in part because the FWA gene is methylated and the DNA is relatively inaccessible.
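Editing frequency at a guide site is often summarized as the fraction of amplicon reads carrying an insertion or deletion. The sketch below is a deliberately crude illustration under stated assumptions: reads are taken to be already trimmed to the same amplicon window as the reference, a read is scored as edited only if its length differs from the reference (so substitutions are not counted), and the function names are hypothetical. It is not the analysis method of this Example.

```python
def indel_frequency(reads, reference: str) -> float:
    """Fraction of reads whose length differs from the reference amplicon."""
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return edited / len(reads)


def fold_change(test_frequency: float, control_frequency: float) -> float:
    """Ratio of editing frequencies, e.g. fusion construct over nuclease alone."""
    if control_frequency == 0:
        raise ValueError("control frequency is zero; the ratio is undefined")
    return test_frequency / control_frequency
```

Under these assumptions, comparing `indel_frequency` for the StkyC-fused nuclease against the nuclease alone, at the same guide site, gives the kind of relative editing frequency anticipated above.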
  • SEQ ID NO: 348 UBQ10 promoter (DNA sequence).
  • SEQ ID NO: 349 gRNA- FWA sequences (gRNA4 and scaffold).
  • SEQ ID NO: 350 StkyCMBD6 (DNA sequence).
  • SEQ ID NO: 351 StkyCMBD6 (protein sequence).
  • SEQ ID NO: 352 U6 Promoter, DNA.
  • SEQ ID NO: 353 CasΦ, DNA. Terminator-PolyT, DNA: tttttttt.
  • SEQ ID NO: 355: RbcS-E9t, DNA.
  • SEQ ID NO: 356 CasΦ-StkyCMBD6 Plasmid DNA Sequence.
  • SEQ ID NO: 357 CasΦ alone Plasmid Sequence.
  • Example 12 ACD15 and ACD21 chaperone system for use to increase editing efficiency of CasΦ in HEK293 Cells Summary
  • This Example describes exemplary experimental guidelines for constructing a CasΦ (Cas12J) nuclease combined with plant ACD15-ACD21 mediated accumulation technology to increase accumulation of CasΦ at the targeted locus and therefore increase editing efficiency.
  • CasΦ-ACD15-ACD21 which will be targeted to a GFP gene inserted into the genome of the HEK293 cells expressed using an EF1alpha promoter (P. Pausch, B. Al-Shayeb, E. Bisom-Rapp, C. A.
  • Exemplary structures of these fusion constructs to be used in the CasΦ-ACD15-ACD21 system are presented in FIGS. 37A-37D and FIGS. 38A-38D. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct.
  • Coding sequences for the ACD15-ACD21 will be amplified from cDNA and will be cloned into a separate CasΦ plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the ACD15-ACD21 directly after the CasΦ with an XTEN linker in between.
  • Features of CasΦ-ACD15-ACD21 include a Chicken β-actin Promoter, CasΦ, gRNA-GFP sequences, ACD15, ACD21, U6 promoter, Terminator-PolyT, XTEN Linker, and bGH poly(A) signal.
  • CasΦ-ACD15-ACD21 will demonstrate a higher editing percentage compared to CasΦ alone due to the increased accumulation of CasΦ-ACD15-ACD21. It is therefore anticipated that this will result in a higher proportion of GFP negative cells when cells are transfected with the ACD15-ACD21 containing fusions compared to cells transfected with CasΦ that is not fused with ACD15-ACD21.
  • This example will demonstrate the functional use of ACD15 and ACD21 to specifically accumulate CasΦ at an editing locus in human cells.
  • Example 13a Targeting gRNAs and template DNAs for enhanced genome editing.
  • This example outlines proposed future experiments for targeting gRNAs and template DNAs for enhanced genome editing.
  • the assembly of Cas nuclease proteins with their gRNA sequences is a key limiting biochemical step in genome editing (2). For this reason, researchers commonly achieve higher editing efficiencies when Cas protein is preassembled with its gRNA in a high concentration in vitro reaction to form ribonucleoprotein (RNP), after which the RNPs are delivered to cells for editing (2).
  • An important genome editing technique involves delivering a DNA template together with the gRNA and Cas protein to induce either insertion of the DNA into the genomic cut site, or homologous recombination to create precise sequence replacements (5).
  • oligonucleotides can be inserted into CRISPR/gRNA cut sites in protoplasts, or in plants by particle bombardment, but the efficiency is very low (5, 6). We propose to increase this efficiency by increasing the concentration of oligonucleotides at the target site.
  • oligonucleotides in vitro to the HUH endonuclease protein (6-8), which has been purified as a fusion protein with StkyC domains or sHSPs that have been shown to be effective.
  • the HUH-DNA complex will then be delivered along with the other reagents for genome editing to protoplasts and tested for DNA insertion efficiency.
  • SEQ ID NO: 358 Chicken β-actin Promoter (DNA sequence).
  • SEQ ID NO: 359 U6 Promoter (DNA sequence).
  • SEQ ID NO: 360 CasΦ (DNA sequence).
  • SEQ ID NO: 361 CasΦ (protein sequence).
  • SEQ ID NO: 362 ACD15 DNA sequence.
  • SEQ ID NO: 363 ACD15 protein sequence.
  • SEQ ID NO: 364 ACD21 DNA sequence.
  • SEQ ID NO: 13 ACD21 protein sequence.
  • SEQ ID NO: 365 XTEN Linker DNA sequence.
  • Terminator-PolyT DNA: tttttttt.
  • SEQ ID NO: 367 bGH poly(A) signal, DNA.
  • SEQ ID NO: 368 gRNA9 DNA sequence.
  • SEQ ID NO: 369 gRNA6 DNA sequence.
  • SEQ ID NO: 370 gRNA8 DNA.
  • SEQ ID NO: 371 No gRNA DNA sequence.
  • SEQ ID NO: 372 CRISPR Repeat DNA Sequence.
  • SEQ ID NO: 373 CasΦ-No gRNA Plasmid DNA Sequence.
  • SEQ ID NO: 374 CasΦ-gRNA9 DNA sequence.
  • SEQ ID NO: 375 CasΦ-gRNA6 DNA sequence.
  • SEQ ID NO: 376 CasΦ-gRNA8 DNA sequence.
  • SEQ ID NO: 377 CasΦ-ACD15-ACD21 no gRNA DNA sequence.
  • SEQ ID NO: 378 CasΦ-ACD15-ACD21-gRNA9 DNA sequence.
  • SEQ ID NO: 379 CasΦ-ACD15-ACD21-gRNA6 DNA sequence.
  • SEQ ID NO: 380 CasΦ-ACD15-ACD21-gRNA8 DNA sequence.
  • Example 13b ACD15 and ACD21 chaperone system used to increase editing efficiency of Cas9 Summary
  • This Example describes that addition of the StkyC domain can increase the efficiency of genome editing.
  • this example describes experiments wherein ACD15- and ACD21-accumulation technology was used to increase the editing efficiency of the Cas9 nuclease. This was achieved using the plant StkyC domain of MBD6 (StkyC MBD6 ) to cause multimerization and accumulation of Cas9 at the target locus and therefore increase editing efficiency.
  • Table 2A Parameters for Fusion Construct Modules Construct Design
  • Three different constructs were created using the StkyC MBD6: Cas9-XTEN-StkyC MBD6 (FIGS. 42A-42C), Cas9-SunTag-1xGCN4 (FIGS. 43A-43B), and Cas9-SunTag-4xGCN4 (FIGS. 44A-44C). All constructs were created using Golden Gate Assembly to assemble each expression cassette into a binary backbone using Cermak et al. (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/).
  • a current Cas9 sequence lacking the XTEN-StkyC MBD6 domain was ligated with coding sequences for the XTEN-StkyC MBD6 domain, amplified from plasmids containing those sequences, after cutting the cassettes and final plasmid with the appropriate restriction enzymes. This added the XTEN-StkyC MBD6 directly after the Cas9.
  • the Cas9-1xGCN4 and 4xGCN4 constructs were amplified from current SunTag plasmids and cloned into plasmids for the Golden Gate Assembly, which were cut with the necessary restriction enzymes.
  • Features of these constructs include a UBQ10 promoter, Cas9, gRNA-FWA sequence, StkyC MBD6 Domain, CaMV 35S promoters, GCN4 sites, and scFv-GFP-StkyC.
  • Cas9 was expected to create double strand breaks, which would need to be repaired through nonhomologous end joining (NHEJ), leading to insertion and deletion mutations at the guide 4 or guide 17 site of FWA promoter.
  • Cas9 constructs containing StkyC domains were designed with the goal of demonstrating a higher editing rate compared to Cas9 alone due to the increased accumulation of Cas9-StkyCMBD6.
  • Cas9 alone is known to show low editing efficiency in protoplasts, in part because the FWA gene is methylated and the DNA is relatively inaccessible (FIGS. 41A-41C).
  • the constructs described in this Example were designed such that the addition of the StkyC domain would allow Cas9 to gain more frequent access to its target sites.
  • SEQ ID NO: 384 gRNA-FWA: StkyCMBD6 (protein sequence).
  • SEQ ID NO: 385 CaMV 35S promoter (DNA sequence).
  • SEQ ID NO: 386 Cas9 (DNA sequence).
  • SEQ ID NO: 387 Cas9-XTEN-StkyC MBD6 Plasmid Sequence (guide 4; DNA sequence).
  • SEQ ID NO: 388 Cas9-XTEN-StkyCMBD6 Plasmid Sequence (guide 17).
  • SEQ ID NO: 389 Cas9 Plasmid Sequence (Guide 4; DNA sequence).
  • SEQ ID NO: 390 Cas9 Plasmid Sequence (Guide 17; DNA sequence).
  • SEQ ID NO: 391 Cas9-SunTag-1xGCN4 (Guide 4; DNA sequence).
  • SEQ ID NO: 392 Cas9-SunTag-4xGCN4 Plasmid Sequence (Guide 4; DNA sequence).
  • SEQ ID NO: 393 Cas9-SunTag-4xGCN4 Plasmid Sequence (Guide 17; DNA sequence).
  • Example 14 Targeted protein accumulation of CRISPR-Cas9 system using α-crystalline domain proteins Summary [0510] This Example describes a number of additional ACD proteins from different organisms and shows that many, but not all, can multimerize highly like the ACD15/21 proteins.
  • this Example describes experiments in which a genome targeting system was constructed utilizing a dCas9 SunTag system and α-crystalline domain (ACD)-containing proteins. These constructs were used to demonstrate the specific accumulation of a protein of interest to a specific locus of the genome using dCas9-specific targeting and oligomerization through ACD proteins.
  • the StkyC domain which normally recruits the plant ACD proteins
  • the SunTag system in which ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036.
  • Candidate ACD proteins were discovered through sequence homology analysis of the existing Arabidopsis thaliana ACD proteins (ACD21 and ACD15), which were shown to cause higher order multimerization of the MBD5/6 complex (ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036. doi: 10.1126/sciadv.adi9036) and are capable of causing hyperaccumulation of the SunTag-StkyC system discussed previously (ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036).
  • Structures of the fusion constructs used in the CRISPR-Cas9 system are presented in FIGS. 46A-56C. In FIGS. 46A-56C, different regions of each construct are labeled, with each region representing a respective module of the construct.
  • Fusion constructs containing different variants of the modules presented in FIG.46A-FIG.56C were also prepared as described below in Table 2A.
  • Table 2A Parameters for Fusion Construct Modules Construct Design [0513]
  • a SunTag plasmid was used that contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA.
  • Coding sequences of the ACD proteins from module 3 were amplified from cDNA, or from gene blocks ordered with the cDNA sequences, and were cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This process added the ACD proteins directly after the GFP and before the HA tag.
  • SunTag_ACD proteins include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, single chain variable fragment (scFV)_GFP_NLS_1xHA, HSPB1, HSPB4, HSPB5, HSPB8, Chlamydomonas reinhardtii ACD, Saccharolobus solfataricus ACD, Oryza sativa ACD, Solanum tuberosum ACD, Solanum lycopersicum ACD, Deinococcus radiodurans ACD, and Zea mays ACD.
  • the SunTag_HSPB8 construct, known to be unable to form oligomeric assemblies and shown in Example 8 herein to be unable to create visible nuclear foci when fused with Arabidopsis thaliana MBD6 protein (FIGS. 25A-25B), was used as a negative control.
  • Data Analysis [0516] Multiple seedlings of each SunTag-ACD protein were imaged using confocal microscopy to determine expression of GFP. If GFP signal was concentrated in nuclei of the cells, this was interpreted as the construct having properly localized without any misfolding. Z-stacks of root meristems were obtained across multiple plant lines to acquire images across many cells for each SunTag-ACD protein construct. ImageJ software was used to analyze the images and quantify the foci across multiple cells.
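  • The foci quantification described above can be illustrated with a minimal sketch. It assumes that per-nucleus foci counts have already been exported from ImageJ as plain lists of integers; the function name `summarize_foci`, the dictionary layout, and the example counts are hypothetical and are not data or methods from this Example.

```python
from statistics import mean


def summarize_foci(foci_per_nucleus):
    """Summarize per-nucleus foci counts for one construct.

    foci_per_nucleus: list of non-negative integers, one entry per scored nucleus.
    """
    if not foci_per_nucleus:
        raise ValueError("no nuclei were scored")
    if any(n < 0 for n in foci_per_nucleus):
        raise ValueError("foci counts must be non-negative")
    nuclei_with_foci = sum(1 for n in foci_per_nucleus if n > 0)
    return {
        "nuclei": len(foci_per_nucleus),
        "fraction_with_foci": nuclei_with_foci / len(foci_per_nucleus),
        "mean_foci_per_nucleus": mean(foci_per_nucleus),
    }


# Hypothetical counts for illustration only (not results from this Example).
example_counts = {
    "SunTag_HSPB1": [2, 1, 2, 0, 2],
    "SunTag_HSPB8": [0, 0, 0, 0, 0],
}
summary = {name: summarize_foci(counts) for name, counts in example_counts.items()}
```

  • Reporting both the fraction of nuclei with at least one focus and the mean foci per nucleus is one possible design choice; it keeps a construct that forms rare but bright foci distinguishable from one that forms none.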
  • Example 8 in which some human ACD proteins functionally replaced the StkyC domain of MBD6, we expected that HSPB1, HSPB5, and HSPB4 would cause localization of the fusion proteins to the dCas9 sites, while HSPB8 would fail to cause this aggregation and lead to diffuse localization of the fusion protein throughout the nucleus.
  • These human ACD proteins were therefore chosen as positive and negative controls to compare to the novel ACD proteins from a diverse range of species.
  • This example demonstrates the functional use of different ACD proteins to specifically accumulate at a targeted locus utilizing a dCas9 system. By fusing the ACD proteins to other proteins, they would be anticipated to similarly be targeted into the nuclear bodies present at the target locus.
  • the ACD protein should therefore be useful for concentrating a variety of proteins to a genomic site of interest. It is also anticipated that ACD proteins from many other organisms throughout all kingdoms of life may be similarly useful in targeting and concentrating proteins of interest to a genomic locus. Results [0519] To determine the impact of ACD proteins on the localization of the SunTag targeting system, seedling root meristems were imaged across multiple plant lines. [0520] Microscopic analysis of seedlings demonstrated clear foci formation of all ACD proteins tested, with the exceptions of HSPB8 (FIG.52B), Oryza sativa ACD (FIG.53B), and Deinococcus radiodurans ACD (FIG.54B).
  • the use of these ACDs may be desirable when very high levels of multimerization are desired, for example, to drive the majority of ACD fusion proteins to dCas9 sites.
  • HSPB4, and HSPB5 created very clear nuclear bodies, but they were less intense.
  • ACDs do not multimerize sufficiently to cause the formation of visible nuclear bodies. These ACDs may multimerize to some extent, however, and may be desirable when lower levels of multimerization are desirable, for example in active Cas genome editing where it would be desirable to have many smaller multimerized clusters for optimal genome scanning and editing.
  • the Solanum tuberosum and Solanum lycopersicum ACD proteins resulted in foci which both overlap with and do not overlap with chromocenters (DAPI stained bodies), suggesting possible interaction with Arabidopsis thaliana chromocenter complexes.
  • Due to the high sequence homology with ACD15 and ACD21, the most likely interactions were with the endogenous MBD5/6 complex components. These ACD proteins would be useful when localization to both Cas9 binding sites and chromocenters is desirable, for example genome editing or dCas9 binding of sites present in heterochromatin.
  • [0525] the results of this initial screen demonstrated the use of ACD proteins from a diverse range of organisms to multimerize and cause the accumulation of fusion proteins through the SunTag targeting system. These results further suggest conservation in the ability of ACD proteins of different organisms to multimerize and accumulate in the genome and suggest that many ACD proteins from many organisms can be used for this purpose.
  • SEQ ID NO: 394 UBQ10 promoter (DNA sequence).
  • SEQ ID NO: 395 dCas9_1xHA_3xNLS_10xGCN4 (DNA sequence).
  • SEQ ID NO: 396 dCas9_1xHA_3xNLS_10xGCN4 (protein sequence).
  • SEQ ID NO: 397 gRNA-FWA sequences: gRNA4 and scaffold.
  • SEQ ID NO: 398 gRNA-FWA sequences: gRNA17 and scaffold.
  • SEQ ID NO: 399 single chain variable fragment (scFV)_GFP_NLS_1xHA (DNA sequence).
  • SEQ ID NO: 400 single chain variable fragment (scFV)_GFP_NLS_1xHA (protein sequence).
  • SEQ ID NO: 401 HSPB1 (DNA sequence).
  • SEQ ID NO: 402 HSPB1 (protein sequence).
  • SEQ ID NO: 403 HSPB5 (DNA sequence).
  • SEQ ID NO: 404 HSPB5 (protein sequence).
  • SEQ ID NO: 405 HSPB4 (DNA sequence).
  • SEQ ID NO: 406 HSPB4 (protein sequence).
  • SEQ ID NO: 407 SunTag HSPB4 Plasmid Sequence.
  • SEQ ID NO: 408 HSPB8 (DNA sequence).
  • SEQ ID NO: 409 HSPB8 (protein sequence).
  • SEQ ID NO: 410 SunTag-HSPB8 Plasmid Sequence.
  • SEQ ID NO: 411 SunTag-HSPB5 Plasmid Sequence.
  • SEQ ID NO: 412 SunTag-HSPB1 Plasmid Sequence.
  • SEQ ID NO: 413 Chlamydomonas reinhardtii ACD protein (DNA sequence).
  • SEQ ID NO: 414 Chlamydomonas reinhardtii ACD protein (protein sequence).
  • SEQ ID NO: 415 SunTag-Chlamydomonas reinhardtii ACD Plasmid Sequence.
  • SEQ ID NO: 416 Saccharolobus solfataricus ACD Protein (DNA sequence).
  • SEQ ID NO: 417 Saccharolobus solfataricus ACD Protein (protein sequence).
  • SEQ ID NO: 418 SunTag-Saccharolobus solfataricus ACD Plasmid Sequence.
  • SEQ ID NO: 419 Oryza sativa ACD Protein Sequence (DNA sequence).
  • SEQ ID NO: 420 Oryza sativa ACD Protein Sequence (protein sequence).
  • SEQ ID NO: 421 SunTag-Oryza sativa ACD Plasmid Sequence.
  • SEQ ID NO: 422 Solanum tuberosum ACD Protein (DNA sequence).
  • SEQ ID NO: 423 Solanum tuberosum ACD Protein (protein sequence).
  • SEQ ID NO: 424 SunTag-Solanum tuberosum ACD Plasmid Sequence.
  • SEQ ID NO: 425 Solanum lycopersicum ACD Protein Sequence (DNA sequence).
  • SEQ ID NO: 426 Solanum lycopersicum ACD Protein Sequence (protein sequence).
  • SEQ ID NO: 427 SunTag-Solanum lycopersicum ACD Plasmid Sequence.
  • SEQ ID NO: 428 Deinococcus radiodurans ACD Protein Sequence (DNA sequence).
  • SEQ ID NO: 429 Deinococcus radiodurans ACD Protein Sequence (protein sequence).
  • SEQ ID NO: 430 SunTag-Deinococcus radiodurans ACD Plasmid Sequence.
  • SEQ ID NO: 431 Zea mays ACD Protein Sequence (DNA sequence).
  • SEQ ID NO: 432 Zea mays ACD Protein Sequence (protein sequence).
  • SEQ ID NO: 433 SunTag-Zea mays ACD Plasmid Sequence.
  • Example 15 Use of MBD6-human small heat shock protein chimeric proteins to silence FWA Summary [0527] This Example describes experiments that demonstrated that three mammalian ACD proteins could functionally substitute for the silencing function of ACD15/21. This Example is related to Example 8 herein.
  • HSPB1, HSPB3, HSPB5, and HSPB8 are all ACD-containing proteins, but only HSPB1, HSPB3, and HSPB5 are able to highly multimerize.
  • Data Analysis FWA expression was determined by normalizing FWA expression across multiple plant lines and technical replicates relative to the expression of a control gene, IPP2, and was compared to FWA expression in the mbd5 mbd6 mutant plants, which was centered to a value of 1. These normalized data were then plotted using GraphPad Prism, where the results were statistically compared using a one-way ANOVA with corrections for multiple comparisons.
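  • The normalization described above can be sketched as follows. This is a minimal illustration that assumes expression is available as linear-scale values per replicate; the function names `relative_expression` and `center_to_control`, the sample names, and all numbers are hypothetical. The one-way ANOVA step, which the Example performs in GraphPad Prism, is not reproduced here.

```python
from statistics import mean


def relative_expression(target, reference):
    """Divide target-gene signal (e.g. FWA) by control-gene signal (e.g. IPP2) per replicate."""
    if len(target) != len(reference):
        raise ValueError("target and reference must have the same number of replicates")
    return [t / r for t, r in zip(target, reference)]


def center_to_control(samples, control_name):
    """Rescale every sample so that the mean of the control sample equals 1."""
    control_mean = mean(samples[control_name])
    return {
        name: [value / control_mean for value in values]
        for name, values in samples.items()
    }


# Hypothetical linear-scale values for illustration only.
raw = {
    "mbd5 mbd6": relative_expression([8.0, 12.0], [4.0, 4.0]),   # -> [2.0, 3.0]
    "chimera line": relative_expression([2.0, 3.0], [4.0, 4.0]),  # -> [0.5, 0.75]
}
centered = center_to_control(raw, "mbd5 mbd6")
```

  • Centering on the mbd5 mbd6 mutant means that values below 1 for a transgenic line indicate lower FWA expression than in the mutant background.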
  • Example 16 Leveraging ACD accumulation technology, through the StkyC domain of MBD6, to increase genome editing efficiency in stable transgenic Arabidopsis thaliana plants Summary [0533] This example describes experiments wherein ACD15 and ACD21 accumulation technology was used to increase the editing efficiency of the Cas9 nuclease. This was achieved using the StkyC domain of MBD6 (StkyC) to cause the accumulation of Cas9 at the target locus and therefore increase editing efficiency. This example is related to Example 13b.
  • StkyC domain can be used to increase Cas9 editing efficiency.
  • this technology will be transferable to other genome editing nucleases such as Cas12, TnpB or other nucleases, as well as other plant species.
  • Example 17 Using alpha crystalline domain (ACD) proteins from various organisms to increase genome editing efficiency through ACD accumulation technology. Summary [0539] This example describes experiments wherein novel ACDs were used to increase the editing efficiency of the Cas9 nuclease. This example is related to Examples 13 and 14 above.
  • In Example 14, we demonstrated that novel ACDs from various organisms could behave similarly to the StkyC domain of MBD6, forming dCas9, SunTag GFP foci at the promoter of FWA. Based on these results, we predicted that this accumulation of an enzyme, such as a nuclease, through ACD accumulation technology would increase its enzymatic activity. Therefore, we moved from a dead Cas9 to an active Cas9 SunTag system and tested the impact of ACD accumulation technology on the editing efficiency of Cas9.
  • Protoplast Transfection and Sequencing The constructs were transfected into Arabidopsis thaliana wild-type (Col-0) mesophyll protoplast cells. Protoplast cells were incubated at 26°C for 48 hours. At 48 hours post-transfection, protoplasts were harvested for genomic DNA extraction and targeted mutagenesis analysis using next-generation amplicon sequencing. Data Analysis [0543] Based on the results described in the above Examples, we hypothesized that the ACD proteins from other organisms will generally increase the editing efficiency of Cas9 constructs in protoplasts. Sequencing results were analyzed for the number of insertions and deletions created at the target site. This was plotted as a percentage to allow for statistical comparison among experiments.
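  • The percentage calculation described above can be sketched as follows, assuming that an upstream amplicon-analysis step has already classified reads as carrying an insertion or deletion at the target site. The function name `indel_percentage`, the sample labels, and the read counts are hypothetical and are not results from this Example.

```python
def indel_percentage(indel_reads, total_reads):
    """Return the percentage of amplicon reads carrying an insertion or deletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return 100.0 * indel_reads / total_reads


# Hypothetical read counts for illustration only.
read_counts = {
    "Cas9 control": (150, 10000),
    "Cas9 with ACD": (600, 10000),
}
editing = {
    name: indel_percentage(indels, total)
    for name, (indels, total) in read_counts.items()
}
```

  • Expressing each sample as a percentage of its own read depth is what allows samples sequenced to different depths to be compared.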
  • Example 18 Saccharolobus solfataricus ACD protein SunTag (SunTagSacc) system for locus-specific accumulation in human cells. Summary [0546] This example describes experiments wherein ACD accumulation technology was used to accumulate the SunTag system in human embryonic kidney cells (HEK293T).
  • Exemplary structures of this construct to be used in the SunTag-VP64 control construct and the SunTag Sacc -VP64 system are presented in FIGS. 60A-60B. In these Figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, as described below in Table 18A.
  • Table 18A Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design
  • the construct that was created using the Saccharolobus solfataricus ACD protein was the SunTag Sacc -VP64 (FIG.60B).
  • the SunTag plasmid was cut using restriction enzyme digest to make a linear plasmid.
  • a gene block was ordered containing the Saccharolobus solfataricus ACD coding sequence with homology to the cut site and used to ligate into the cut site.
  • the SunTag plasmid was cut using a restriction enzyme and the gRNA sequence was ligated into the cut restriction site.
  • a PiggyBac Transposase (FIG.60C) was co-transfected with the SunTag constructs.
  • constructs include a CMV Enhancer with Chicken β Promoter and Chimeric Intron (SEQ ID NO: 434), dCas9 with GCN4 (SEQ ID NO: 435), U6 promoter+NLRC5 gRNA+scaffold RNA (SEQ ID NO: 438), P2A-scFV-sfGFP-VP64-Saccharolobus solfataricus ACD (SEQ ID NO: 437), P2A-scFV-sfGFP-VP64 (SEQ ID NO: 436), CMV Enhancer with CMV Promoter (SEQ ID NO: 439), and Synthetic PiggyBac Transposase with SV40 PolyA (SEQ ID NO: 440).
  • HEK293T cells were collected and imaged using confocal microscopy wherein both GFP and DAPI were measured across multiple cells. Data Analysis [0553] Based on the results described in the above Examples, we hypothesized that Saccharolobus solfataricus ACD protein would oligomerize in nuclei of human cells.
  • SEQ ID NO: 444 PiggyBac Transposase Plasmid Sequence.
  • Example 19 Chlamydomonas reinhardtii ACD protein SunTag (SunTag Chlamy ) system for locus-specific labeling through microscopy Summary [0558] This example describes experiments wherein Chlamydomonas reinhardtii ACD protein accumulation technology was used to accumulate the SunTag system for the goal of locating a specific locus within nuclei by microscopy. This was achieved using the full-length coding sequence of the ACD containing, small heat shock protein (sHSP) of Chlamydomonas reinhardtii to concentrate GFP after binding the dCas9, SunTag targeting system at specific loci.
  • Table 19A Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design
  • the construct that was created using the Chlamydomonas reinhardtii ACD protein was the SunTag Chlamy (FIG.62A). All constructs were created using golden gate to assemble (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/).
  • constructs include a UBQ10 promoter (SEQ ID NO: 445), Cas9 (SEQ ID NO: 448), gRNA-siren sequence (SEQ ID NO: 447), Chlamydomonas reinhardtii ACD gene (Example 14), and scFv-HA-GFP (SEQ ID NO: 492). Transformation of plants [0562] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants were transformed using floral dip methods. Microscopy Experiment [0563] The root meristem region of hygromycin-resistant seedlings was analyzed using an LSM980 confocal microscope.
  • the GFP reporter allowed for observations of cellular localization and nuclear phenotypes. ACD proteins are known to oligomerize and therefore GFP foci were expected to form in nuclei of cells.
  • Data Analysis [0564] Multiple seedlings expressing SunTag Chlamy were imaged using confocal microscopy to determine localization of GFP. If GFP signal was concentrated in nuclei of the cells, then the construct was assumed to properly localize without any misfolding. Z-stacks of root meristems were obtained across multiple plant lines to acquire images across many cells for each SunTag Chlamy . ImageJ software was used to analyze the images and quantify the foci across multiple cells.
  • Chlamydomonas reinhardtii ACD protein would oligomerize in nuclei of plants, which we assume leads to the accumulation of the SunTag constructs, through oligomerization of bound Chlamydomonas reinhardtii ACD protein, at the site of interest (siren locus). This will lead to the creation of ⁇ 2 GFP foci in plant nuclei.
  • This technology demonstrates the functional use of Chlamydomonas reinhardtii ACD protein to specifically accumulate dCas9 at a specific locus and the use of targeted accumulation of Chlamydomonas reinhardtii ACD protein as a mechanism for locating genes through microscopy.
  • Results [0567] To determine the impact of Chlamydomonas reinhardtii ACD protein on the localization of SunTag targeting system, seedling root meristems were imaged across multiple plant lines. Root meristem tissue contains a high density of nuclei providing a large number of nuclear data in a single region. [0568] Microscopic analysis of seedlings demonstrated clear foci formation for the SunTag Chlamy construct (FIG.62B).
  • SEQ ID NO: 445 UBQ10 promoter.
  • SEQ ID NO: 446 Chlamydomonas reinhardtii ACD protein.
  • SEQ ID NO: 447 gRNA-siren sequence, DNA (5’-3’).
  • SEQ ID NO: 448 Cas9 DNA.
  • SEQ ID NO: 449 SunTagChlamy Plasmid Sequence (siren loci), DNA.
  • Example 20 Co-targeting vCasΦ with dead Cas9 through ACD accumulation technology for improved genome editing efficiency of CasΦ nuclease Summary
  • This Example describes experiments designed to increase vCasΦ genome editing efficiency in plants.
  • In plants expressing the previously described SunTag StkyC ACD mediated accumulation technology (Example 1), we propose to concentrate vCasΦ to its target site at the promoter of FWA through ACD mediated multimerization.
  • the SunTag StkyC system has already been shown to be localized to the previously characterized guide RNA 17 (gRNA 17) and guide RNA 4 (gRNA4) sites at the FWA promoter (FIG.63A) showing nuclear bodies which indicate high concentration of the components to the genomic site.
  • vCasΦ, on the other hand, will be localized to a guide RNA site downstream of those sites called guide RNA 1 (gRNA1).
  • the SunTag StkyC accumulation technology contains accumulated dCas9 at FWA, this will create an optimal scaffold to localize other proteins that are similarly interacting with the ACDs.
  • vCasΦ will be expressed containing a C-terminal fusion of the MBD6 StkyC domain (StkyC) with an XTEN linker (vCasΦ-XTEN-StkyC). As described in Example 1, the StkyC domain mediates the accumulation of proteins through ACD15 and ACD21 interactions.
  • the vCasΦ and vCasΦ-XTEN-StkyC will be transfected into wild-type (Col-0) protoplasts and wild-type protoplasts expressing the SunTag StkyC system. Then the protoplasts will be collected and next generation sequencing (NGS) will be performed to measure editing efficiencies. In addition, transgenic plants will be created with these same constructs, and we again expect that in the presence of the SunTag StkyC , the vCasΦ-XTEN-StkyC will show a higher editing efficiency as compared to vCasΦ.
  • Exemplary structures of these fusion constructs to be used in the vCasΦ-XTEN-StkyC system are presented in FIGS. 63B-C. In these Figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 20.
  • Table 20 Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design
  • the construct created using the StkyC domain was vCasΦ-XTEN-StkyC (FIG. 63C). All constructs were created using golden gate to assemble each expression cassette into (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/).
  • For the vCasΦ-XTEN-StkyC, the vCasΦ construct (FIG. 63B) containing gRNA1 was cut with a restriction enzyme downstream of vCasΦ.
  • a gene block was ordered with homology to the cut site containing the XTEN linker with the StkyC and was ligated at the cut site. This will add the XTEN-StkyC directly after the vCasΦ.
  • Features of these constructs include a UBQ10 promoter (SEQ ID NO: 450), vCasΦ (SEQ ID NO: 452), XTEN-StkyC (SEQ ID NO: 453), gRNA1 sequence (SEQ ID NO: 451), RbcS-E9t (SEQ ID NO: 455), U6 Promoter (SEQ ID NO: 454).
  • Protoplasts were created following current protocols. Plasmids were directly transformed into protoplasts made from wild-type (Col-0) plants. Sequencing [0578] To determine if the promoter of FWA was successfully edited we will perform amplicon sequencing on protoplasts transfected with the Cas9 constructs. Data Analysis and expected results [0579] We hypothesize that ACD15/ACD21 will oligomerize in nuclei of plants. This will lead to the accumulation of the vCasΦ-XTEN-StkyC constructs at the promoter of FWA where the SunTag StkyC scaffold will be accumulated. This will lead to increased opportunity for an editing event to occur and therefore will demonstrate increased editing compared to vCasΦ control.
  • This technology will demonstrate the functional use of ACD accumulation technology to increase editing of vCasΦ in a unique, ACD mediated mechanism. This technology will further demonstrate the use of targeted accumulation of a genome editing enzyme as a mechanism for increasing editing efficiency, without needing extensive optimization of the nuclease enzyme itself. Additional Sequences: [0581] SEQ ID NO: 456: vCasΦ Plasmid Sequence, DNA. SEQ ID NO: 457: vCasΦ-XTEN-StkyC, DNA. Example 21: Truncated guide RNA (gRNA) co-targeting through ACD accumulation technology for improved genome editing efficiency.
  • the CRISPR-Cas9 genome editing system relies on a guide RNA (gRNA) to direct the Cas9 protein to the genomic DNA target site.
  • the standard gRNA length for optimal CRISPR-Cas9 genome editing is 20 base pairs (bp).
  • DNA cleavage capabilities are hindered while still allowing for Cas9 to bind genomic DNA (Pan et al. 2022; Kiani et al. 2015).
  • This example describes experimental guidelines to improve editing efficiency using ACD accumulation technology (ACD proteins) in combination with editing (20 bp) and non-editing (14 bp truncated) Cas9 gRNAs.
  • the following experiments will be performed in wild-type (Col-0) Arabidopsis thaliana protoplast cells and stable transgenic plants.
  • We hypothesize that simultaneously expressing a 20 bp gRNA (for editing) and a 14 bp gRNA (for binding) targeting nearby genomic locations would be an effective strategy to localize more Cas9 to the site being targeted for editing.
  • the Cas9 fusion proteins would accumulate at the site corresponding to the truncated gRNA4 site because of ACD mediated multimerization.
  • Exemplary structures of this construct to be used in the Cas9-SunTag-HSPB5-4xGCN4-Truncated system are presented in FIG. 64B. In this Figure, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in this Figure will also be prepared, as described below in Table 21.
  • Table 21 Exemplary Parameters for Fusion Construct Modules Exemplary Construct Design
  • the construct to be created using HSPB5 is the Cas9-SunTag-HSPB5-4xGCN4-Truncated construct (FIG. 64B). All constructs were created using golden gate to assemble each expression cassette into a binary backbone following Cermak et al. (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/).
  • the previously used Cas9-SunTag-HSPB5-4xGCN4 construct was used to clone the Cas9-SunTag-HSPB5-4xGCN4-Truncated.
  • These constructs include a UBQ10 promoter (SEQ ID NO: 458), Cas9-SunTag-4xGCN4 (SEQ ID NO: 460), 35S promoter (SEQ ID NO: 459), HSP Terminator (SEQ ID NO: 461), AtU6 promoter (SEQ ID NO: 462), FWA Guide 17 (SEQ ID NO: 463: AAAACTAGGCCATCCATGGA), FWA Guide 4 Truncated (SEQ ID NO: 464: GACGGAAAGATGTAT), scFV-sfGFP-HSPB5 (SEQ ID NO: 465), Rbcs E9 terminator (SEQ ID NO: 466).
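  • The distinction between the editing guide and the truncated, binding-only guide can be illustrated with a minimal sketch. The two spacer sequences are copied from the feature list above; the function name `describe_guide` and the rule of labeling any spacer shorter than 20 characters as truncated are assumptions made for illustration only. Note that the truncated sequence as printed above contains 15 characters, while the text describes a 14 bp truncated guide.

```python
FULL_LENGTH = 20  # spacer length the text gives for editing guides

# Spacer sequences as printed in the feature list of this Example.
GUIDES = {
    "FWA Guide 17": "AAAACTAGGCCATCCATGGA",
    "FWA Guide 4 Truncated": "GACGGAAAGATGTAT",
}


def describe_guide(spacer, full_length=FULL_LENGTH):
    """Return (length, role) for a spacer; the length cutoff is an illustrative assumption."""
    spacer = spacer.upper()
    unexpected = set(spacer) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in spacer: {sorted(unexpected)}")
    role = "editing" if len(spacer) >= full_length else "binding-only (truncated)"
    return len(spacer), role


guide_roles = {name: describe_guide(seq) for name, seq in GUIDES.items()}
```

  • A check of this kind may be useful before cloning, since a guide that is unintentionally full length would cut at the site intended only for Cas9 binding.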
  • Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
  • Plant selection [0589] Plants will be selected for on appropriate selection plates. Selection positive plants will then be transferred to soil to grow under normal greenhouse conditions.
  • Plant selection [0590] Protoplasts will also be created from wild-type (Col-0) plants while transgenic plants are growing. These protoplasts will be prepared following usual protocols.
  • Next Generation Sequencing [0591] Plant tissue will then be collected, and next-generation sequencing will be performed to determine the rates of insertion and deletion at the guide site.
  • the second control will be the Cas9-SunTag-HSPB5-4xGCN4-Truncated plasmid without truncated FWAg4 gRNA, just the FWAg17 gRNA used for editing.
  • Sequences and References [0593] SEQ ID NO: 467: Cas9-SunTag-HSPB5-4xGCN4-Truncated Plasmid Sequence. SEQ ID NO: 468: Cas9-SunTag-HSPB5 Without 4xGCN4-Truncated Plasmid Sequence. Kiani, Samira, Alejandro Chavez, Marcelle Tuttle, Richard N. Hall, Raj Chari, Dmitry Ter- Ovanesyan, Jason Qian, et al.2015.
  • Example 22 Targeting DNA methylation by TRBIP1-MQ1 DNA methylation system using ACD mediated multimerization technology. Summary [0594] This Example describes experimental guidelines to improve the genomic targeting of DNA methylation in plants through efficient localization of DNA methylation enzymes using ACD-mediated multimerization technology. This example expands on the results found in Example 2 herein wherein ACD-mediated multimerization technology, mediated by the StkyC domain of MBD6, was able to induce very specific targeting of DNA methylation. This example details plans to utilize different ACD proteins from different organisms to similarly induce very specific genomic targeting of DNA methylation.
  • This example combines the use of SunTag-TRBIP1-MQ1 DNA methylation targeting technology from Example 2 along with the ACD proteins described in Example 14 for the specific targeting of DNA methylation and reducing off-target methylation (SunTag-ACD-TRBIP1-MQ1).
  • the SunTag-ACD-TRBIP1-MQ1 will be targeted to the unmethylated promoter of FWA in the fwa epiallele in the rdr-6 background.
  • SunTag- TRBIP1-MQ1 construct (https://pubmed.ncbi[dot]nlm[dot]nih[dot]gov/28522548/).
  • the previously used SunTag-TRBIP1-MQ1 construct (Example 2) will be the template to create the SunTag-ACD-TRBIP1-MQ1 constructs.
  • SunTag-TRBIP1-MQ1 will be cut by restriction enzyme digest to make the plasmid linear. Then the ACD proteins will be ordered as gene blocks with homology to the cut site and ligated into the linear SunTag-TRBIP1-MQ1 plasmid.
  • UBQ10 promoter SEQ ID NO: 469
  • Cas9-10xGCN4 SEQ ID NO: 470
  • scFV-sfGFP SEQ ID NO: 471
  • Chlamydomonas reinhardtii ACD protein SEQ ID NO: 472
  • Zea mays ACD protein SEQ ID NO: 473
  • HSPB1 ACD protein SEQ ID NO: 474
  • HSPB4 ACD protein SEQ ID NO: 475
  • HSPB8 ACD protein SEQ ID NO: 476
  • fwa Guide 4 SEQ ID NO: 478
  • fwa Guide 17 SEQ ID NO: 477
  • XTEN Linker- TRBIP1-MQ1 SEQ ID NO: 479
  • Nos Terminator SEQ ID NO: 480
  • Exemplary plasmid sequences are provided in SEQ ID NO: 481 (SunTag-Chlamy-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 482 (SunTag-Zea Mays-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 483 (SunTag-HSPB1-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 484 (SunTag-HSPB4-TRBIP1-MQ1 Plasmid Sequence), and SEQ ID NO: 485 (SunTag-HSPB8-TRBIP1-MQ1 Plasmid Sequence).
  • Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA.
  • Arabidopsis fwa rdr-6 (Col-0) plants will be transformed using floral dip methods.
  • Plant selection [0600] Plants will be selected for on appropriate selection plates. Selection positive plants will then be transferred to soil to grow under normal greenhouse conditions.
  • Flowering Time Assay [0601] One way to determine the impact of the ACD proteins on the efficacy of the SunTag-TRBIP1-MQ1 DNA methylation system is by following the silencing of FWA. FWA is usually methylated and the plants have normal flowering time.
  • Example 23 ADDITIONAL SEQUENCE INFORMATION [0605] The amino acid sequence of ACD15 (tr
  • ACD21 (>sp
  • Table XA ACD15 Close Plant Homologs
  • Table XB ACD15 Orthologs
  • An alignment of the ACD15 ACD domain with full length proteins of ACD orthologs from other plant species and a phylogeny of the species included in the alignment are provided in FIGS. 30A-30C.
  • Table XC ACD21 Close Plant Homologs
  • Table XD ACD21 Orthologs.
  • Table XE ACD15 and ACD21 ACD domains compared to broader plant species [0614] Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XE are presented in FIG.31. Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XF are presented in FIG.32. Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XG are presented in FIG.33.
  • Table XF Protein coding sequences of H. sapiens α-Crystalline Domain containing small heat shock proteins HSPB1-10.
  • Table XG Protein coding sequences of α-Crystalline Domain containing small heat shock proteins from the following species which represent all kingdoms of life: HSPB1 (Homo sapiens) (a mammal), HSP22 (Drosophila melanogaster) (an insect), HSP26 (Saccharomyces cerevisiae) (a fungus), M1URI8 (Cyanidioschyzon merolae) (a red algae), P12811 (Chlamydomonas reinhardtii) (a green algae), Q9RTR5 (Deinococcus radiodurans) (a bacterium), and D0KNS6 (Saccharolobus solfataricus) (an archaebacterium).
  • NP_195276.3 PHD finger-like protein [Arabidopsis thaliana]: SEQ ID NO: 111; XP_003517132.1 uncharacterized protein LOC100794366 [Glycine max]: SEQ ID NO: 112; XP_004966523.1 uncharacterized protein LOC101783772 [Setaria italica]: SEQ ID NO: 113; XP_021839485.1 uncharacterized protein LOC110779264 [Spinacia oleracea] SEQ ID NO: 114; KAG0516341.1 hypothetical protein BDA96_10G353700 [Sorghum bicolor] SEQ ID NO: 115; XP_011650244.1 uncharacterized protein LOC101214022 [Cucumis sativus] SEQ ID NO: 116; XP_002277317.1 PREDICTED: uncharacterized protein LOC10024
  • Table XH Amino acid sequences of MBD6 StykC domain homologs
  • Table XI Amino acid sequences of MBD7 StykC domain homologs

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Cell Biology (AREA)
  • Botany (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present disclosure relates generally to methods of eukaryotic genome modification. More specifically, the present disclosure relates to compositions and methods for targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid of interest to facilitate a genome modification.

Description

Attorney Docket No.: 26223-20027.40

ALPHA-CRYSTALLINE DOMAIN PROTEINS AND THEIR USE IN GENOME MODIFICATION

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority to U.S. Provisional Application No. 63/553,478, filed February 14, 2024, and to U.S. Provisional Application No. 63/533,017, filed August 16, 2023, the entire contents of each of which are hereby incorporated by reference in their entireties.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

[0002] The content of the electronic sequence listing (262232002740SEQLIST.xml; Size: 1,781,338 bytes; and Date of Creation: August 14, 2024) is incorporated herein by reference in its entirety.

FIELD

[0003] The present disclosure relates generally to methods of eukaryotic genome modification. More specifically, the present disclosure relates to compositions and methods for targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid of interest to facilitate a genome modification.

BACKGROUND

[0004] Genome modification methods, such as genome editing, can involve targeting various types of polypeptides to specific nucleic acids. Affecting gene function through the specific targeting of epigenetic modifications, transcriptional regulatory proteins, or gene editing reagents allows for control of gene activity and function as well as cellular function(s). In order for these processes to occur accurately, molecular tools designed to implement these processes need to be both efficient and specific. However, such methods can suffer from low efficiency of producing genome modifications and high occurrence of off-target modifications. While the targeted modification may be made, off-target events across an organism’s genome may occur. These off-target events can lead to unintended consequences and uncontrolled changes to non-targeted cellular pathways.
It is of great interest to create molecular tools that maintain, or even increase, efficient modifying processes while also increasing specificity. Therefore, improved genome modification methods are needed that provide improved efficiency and reduced off-target effects.

BRIEF SUMMARY

[0005] In one aspect, the present disclosure provides a method of modifying a target nucleic acid in a eukaryotic cell, the method including: a) providing a eukaryotic cell including: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide; and b) maintaining the eukaryotic cell under conditions whereby the genetic modifier polypeptide and the α-crystalline domain polypeptide are targeted to the target nucleic acid, thereby modifying the target nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is encoded on a recombinant nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the target nucleic acid. In some embodiments, the heterologous targeting domain is a DNA-binding domain. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is targeted to the target nucleic acid via a SunTag-based targeting system involving an RNA-guided DNA-endonuclease polypeptide and a guide RNA. In some embodiments, the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain.
In some embodiments that may be combined with any of the preceding embodiments, at least two different α-crystalline domain polypeptides are targeted to the target nucleic acid. In some embodiments that may be combined with any of the preceding embodiments, the genetic modifier polypeptide includes a DNA methyltransferase polypeptide. In some embodiments, the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3 (SEQ ID NO: 1). In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana (SEQ ID NO: 11 or SEQ ID NO: 13, respectively). In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5. In some embodiments that may be combined with any of the preceding embodiments, modification of the target nucleic acid confers a change in expression and/or a change in the target nucleotide sequence of the target nucleic acid as compared to a corresponding control. In some embodiments that may be combined with any of the preceding embodiments, the incidence of modification of a non-target nucleic acid is reduced as compared to a corresponding control. In some embodiments that may be combined with any of the preceding embodiments, the eukaryotic cell is a plant cell or a mammalian cell. In some embodiments that may be combined with any of the preceding embodiments, the eukaryotic cell is a plant cell and the method further includes regenerating a whole plant from said plant cell.
In some embodiments, the method further includes (c) crossing the plant with a modified target nucleic acid to a second plant to produce one or more F1 plants. In some embodiments, the method further includes (d) selecting from the one or more F1 plants an F1 plant that (i) lacks a recombinant genetic modifier polypeptide and/or a recombinant α-crystalline domain polypeptide, and (ii) has the modified target nucleic acid.

[0006] In another aspect, the present disclosure provides a recombinant nucleic acid encoding at least one of 1) a genetic modifier polypeptide capable of being targeted to a target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to a target nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain that facilitates targeting of the polypeptide to the target nucleic acid. In some embodiments, the heterologous targeting domain is a DNA-binding domain. In some embodiments, the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain. In some embodiments that may be combined with any of the preceding embodiments, the genetic modifier polypeptide includes a DNA methyltransferase polypeptide. In some embodiments, the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
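
As an illustration only, the "at least 80% amino acid identity" criterion recited above could be checked along the following lines. This is a minimal sketch, not the method of the disclosure: it assumes the two sequences have already been pairwise aligned (gaps written as "-"), and it assumes identity is counted as identical residues divided by all alignment columns that are not gaps in both sequences. Other denominators are in common use and give different values. The function names `percent_identity` and `meets_identity_threshold` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns that are
    not gaps in both sequences; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" and res_b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy five-residue sequences, not taken from the sequence listing.
    print(percent_identity("MKTAY", "MKSAY"))
    print(meets_identity_threshold("MKTAY", "MASAY"))
```

In practice the alignment itself would come from an established alignment tool, and the identity definition given in the specification controls; the sketch only shows the arithmetic behind a percentage threshold.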
[0007] In another aspect, the present disclosure provides an expression vector including a recombinant nucleic acid encoding at least one of 1) a genetic modifier polypeptide capable of being targeted to a target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to a target nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the target nucleic acid. In some embodiments, the heterologous targeting domain is a DNA-binding domain. In some embodiments, the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain. In some embodiments that may be combined with any of the preceding embodiments, the genetic modifier polypeptide includes a DNA methyltransferase polypeptide. In some embodiments, the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.

[0008] In another aspect, the present disclosure provides a plant cell including: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide, and wherein the plant cell includes a modified nucleic acid as compared to a corresponding control nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is encoded on a recombinant nucleic acid. In some embodiments, at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the modified nucleic acid. In some embodiments, the heterologous targeting domain is a DNA-binding domain. In some embodiments, the genetic modifier polypeptide includes a heterologous Sticky-C (StkyC) domain. In some embodiments that may be combined with any of the preceding embodiments, the genetic modifier polypeptide includes a DNA methyltransferase polypeptide. In some embodiments, the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide includes an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana. In some embodiments that may be combined with any of the preceding embodiments, the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
In some embodiments that may be combined with any of the preceding embodiments, the modified nucleic acid includes a change in expression and/or a change in nucleotide sequence as compared to a corresponding control nucleic acid.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.

[0010] FIGS.1A-1J show data supporting that ACD15 and ACD21 are required for silencing. FIG.1A shows a heatmap of FLAG-tagged MBD5, MBD6, SLN, ACD15, and ACD21 ChIP-seq enrichment (log2FC over no-FLAG Col0 control) centered at all merged peaks. FIG.1B shows a genome browser image of ChIP-seq data showing two methylated loci co-bound by all MBD5/6 complex members. FIG.1C shows Loess curves showing correlation between ChIP-seq enrichment for a representative replicate and CG methylation density. FIG.1D shows violin plots showing mature pollen RNA-seq data for the indicated mutants, at mbd5 mbd6 upregulated transcripts (6 replicates per genotype). FIG.1E shows a comparison between genotypes of the number of RNA-seq differentially expressed genes (DEGs) with >40% CG methylation levels around the TSS. FIG.1F shows a genome browser image of RNA-seq data at the FWA locus in the indicated genotypes. Wild-type BS-seq data is shown as reference. FIGS.1G-1J show ChIP-seq and RNA-seq analysis of ACD15 and ACD21. FIG.1G shows a Venn diagram of ChIP-seq peaks showing large overlap between samples. The peak sets indicated with circles (putative MBD5/6 unique, ACD15 unique, or MBD5/6/ACD15 unique peaks) were selected and visualized with heatmaps (right).
We noted that in each peak set group, enrichment of most proteins was observed, thus suggesting that these regions are bound by all components of the MBD5/6 complex, despite not reaching our stringent significance threshold to be called as peaks. The heatmap shows log2(fold-change) over no-FLAG control. FIG.1H shows a scheme of ACD15 and ACD21 genes showing the location of the guide RNAs used for CRISPR/Cas9 mediated mutant generation. The table below shows the mutations obtained in each line. FIG.1I shows bar plots showing the number of differentially expressed TEs (DE-TEs) or differentially expressed genes (DEGs) in the indicated genotypes. FIG.1J shows upset plots showing the intersection of the upregulated genes or TEs found for each genotype. The largest intersection group constitutes loci upregulated in all six mutant lines.

[0011] FIGS.2A-2R show ACD15 and ACD21 bridge SLN to MBD5/6 and organization of the MBD5/6 complex structure. FIG.2A shows IP-MS of FLAG-tagged MBD5/6 complex members in the indicated genetic backgrounds (MS/MS counts). FIG.2B shows MBD5/6 complex organization as predicted by IP-MS. Created with BioRender.com. FIGS.2C-2E show 3D reconstructions of root meristems of plants expressing fluorescently tagged ACD15, ACD21, or SLN in wild-type (Col0) and mutant backgrounds. Scale bar = 20 μm. FIG.2F shows a predicted structure of MBD5/6 complex from AlphaFold Multimer (33). MBD6=Blue, ACD15=magenta, ACD21=maroon, SLN=gold. FIGS.2G-2O show a correlation between MBD6-RFP signal and either ACD15-YFP, ACD21-CFP, or SLN-CFP signal in the indicated mutant backgrounds (underlined). Images represent individual z-stack slices of roots from plants co-expressing MBD6 with either ACD15, ACD21, or SLN. Scatter plots indicate signal intensity for each fluorescent protein at each pixel of the image shown. Correlation coefficient: Pearson. Scale bars = 20 μm.
FIGS.2P-2Q show an AlphaFold Multimer predicted structure of MBD5/6 complex with two copies each of MBD6, ACD15, ACD21, and SLN along with confidence score map of the predicted complex. FIG.2R shows a cartoon representation of the core dimeric MBD5/6 complex based on the AlphaFold Multimer prediction. The figure was created with BioRender.com.

[0012] FIGS.3A-3U show that ACD15, ACD21, and SLN regulate MBD6 accumulation and mobility and that SLN regulates the nuclear mobility of MBD5/6 complex members. FIG.3A shows representative MBD6-RFP nuclear images in mutant backgrounds. Scale bar = 2 μm. FIG.3B shows 3D reconstruction of MBD6-RFP root meristem z-stacks. Scale bar = 20 μm. FIG.3C shows FRAP recovery curves comparing MBD6 signal in WT and sln plants. Shaded area: 95% confidence interval of FRAP data (N=25 from 5 plant lines), dots: mean values, line: fitted one-phase, non-linear regression. FIG.3D shows representative image of FRAP experiment. White circles indicate foci chosen for bleaching. Scale bars = 2 μm. FIG.3E shows MBD6 foci counts across 50 slice Z-stacks of root meristems from five plant lines per genotype. Welch’s ANOVA and Dunnett’s T3 multiple comparisons test (**: P<0.01, NS: P>=0.05). FIG.3F shows box plots of mean intensity values of MBD6 foci (5 individual plants per genotype). Two-tailed t-test (****: P<0.0001). FIG.3G shows heatmaps and metaplots of MBD6-RFP ChIP-seq signal (log2 ratio over no-FLAG Col0 control) at peaks called in “MBD6-RFP in wild-type” dataset. FIG.3H shows Loess curves showing correlation between MBD6-RFP ChIP-seq enrichment and CG methylation density. FIG.3I shows genome browser tracks showing an example of a high density meCG site bound by MBD6-RFP (ChIP-Seq). Wild-type BS-seq data is shown as reference. FIG.3J shows representative root nuclei images demonstrating MBD6-RFP overlap with ACD15-YFP, ACD21-CFP, and SLN-CFP. Scale bar = 2 μm.
FIGS.3K-3L show MBD6-RFP signal within DAPI-stained nuclei. Scale bar = 2 μm. FIG.3M shows a table of extrapolated values from FRAP curve data fitted with one-phase association non-linear regression using GraphPad Prism. FIGS.3N-3Q show FRAP curves of ACD15 and ACD21 along with representative nuclei images of FRAP experiments. Shaded area: 95% confidence interval of FRAP data (N=25 from 5 plant lines), dots: mean values, line: fitted one-phase, non-linear regression. Scale bars = 2 μm. FIGS.3R-3S show the intensity of ACD15 (FIG.3R) and ACD21 (FIG.3S) signals at 100 individual foci from multiple nuclei and plant lines. Comparisons were made using two-tailed t-tests (****: P<0.0001). FIG.3T shows representative nuclei showing MBD6-RFP foci in lil-1 mutant and control plants. Scale bars = 2 μm. FIG.3U shows MBD6 foci counts across 50 slice Z-stacks of root meristem from five plant lines per genotype. Two-tailed t-test (NS: P>=0.05).

[0013] FIGS.4A-4R show that the StkyC domain of MBD6 is necessary for function and localization of MBD6. FIG.4A shows a graphical description of MBD6 mutant constructs. FIG.4B shows representative nuclei showing MBD6-RFP signal. Scale bar = 2 μm. FIGS.4C-4D show the number of MBD6 nuclear foci (5 different plant lines per sample, Z-stack of 50 slices). Brown-Forsythe ANOVA with Tukey’s multiple comparisons test (****: P<0.0001, NS: P>=0.05). FIG.4E shows FWA expression from RT-qPCR of flower bud RNA. Brown-Forsythe ANOVA with Dunnett’s multiple comparisons test (*: P<0.05, NS: P>=0.05). FIGS.4F-4H show representative images and nuclear profile plots of MBD6-RFP mutants with either ACD15-YFP or ACD21-CFP. White lines indicate the region plotted in the graphs (“nuclear distance”). Scale bars = 10 μm. FIG.4I shows protein structure representation of AlphaFold Multimer prediction of MBD6 with ACD15. Domains of MBD6 are annotated. FIG.4J shows protein alignment of MBD5, MBD6, and MBD7. MBDs and StkyC regions of MBD5, MBD6, and MBD7 are labelled.
FIG.4K shows graphical representation of MBD6 deletion mutants. FIGS.4L-4M show RT-qPCR results of FWA expression comparing mbd5 mbd6 plants expressing MBD6 deletion mutants (FIG.4L), along with representative images of MBD6 deletion mutants in root nuclei (FIG.4M). Comparisons made using Brown-Forsythe ANOVA with Dunnett’s multiple comparisons test. ***: P<0.001, **: P<0.01, NS: P>=0.05. Scale bars = 2 μm. FIGS.4N-4R show a correlation of MBD6-RFP domain deletion mutants with ACD15-YFP or ACD21-CFP along with representative nuclear images. Correlation coefficient: Pearson. Scale bars = 2 μm.

[0014] FIGS.5A-5N show that ACD15 and ACD21 drive the formation of MBD5/6 multimeric assemblies and that SunTagStkyC drives the formation of MBD5/6 nuclear foci. FIG.5A shows a graphical representation of SunTagStkyC system and the hypothesized result. Created with BioRender.com. FIG.5B shows representative nuclear images of SunTagStkyC in different mutant backgrounds. FIG.5C shows SunTagStkyC GFP foci counts per nucleus (N=100 per genotype). Compared using Welch’s ANOVA with Dunnett’s T3 multiple comparisons test (****: P<0.0001, NS: P>=0.05). Scale bar = 2 μm. FIG.5D shows volume of SunTagStkyC GFP foci from 5 plant lines per genotype (WT: n=1461, mbd5 mbd6: n=1371). Two-tailed t-test (****: P<0.0001, NS: P>=0.05). FIG.5E shows RT-qPCR showing FWA expression in leaf tissue from T1 or control plants. Brown-Forsythe ANOVA with Tukey’s multiple comparisons test (**: P<0.01, NS: P>=0.05). FIG.5F shows leaf counts post flowering of T1 fwa rdr-6 SunTagStkyC plants. Brown-Forsythe ANOVA with Dunnett’s multiple comparisons test (****: P<0.0001). FIG.5G shows representative image of early flowering T2 fwa rdr-6 plants expressing SunTagStkyC. FIGS.5H-5N show that SunTagStkyC drives the formation of MBD5/6 nuclear foci. FIGS.5H-5I show SunTagStkyC expressing root nuclei stained with DAPI. Scale bars = 2 μm.
FIG.5J shows a representative image of a root nucleus expressing SunTagTET1. Scale bar = 2 μm. FIG.5K shows correlation of SunTagStkyC FWA expression with leaf counts of individual T1 plants from FIGS.5E-5F. Correlation coefficient: Pearson. FIGS.5L-5N show RT-qPCR of FWA in mbd5 mbd6, sln, and acd15 acd21 plants with and without SunTagStkyC. Comparisons made using Brown-Forsythe ANOVA with Dunnett’s multiple comparisons test for each qPCR experiment. ****: P<0.0001, ***: P<0.001, **: P<0.01, *: P<0.05, NS: P>=0.05.

[0015] FIG.6 shows a model of MBD5/6 oligomerization at high density meCG sites. Pictured is a diagram of the proposed model showing ACD15/ACD21-dependent binding and accumulation of MBD5/6 complex members in multimeric assemblies. MBD5/6 recognize DNA methylation through their MBD domain. Although MBD5 or MBD6 can recognize individual meCG sites, regions with high density meCG sites facilitate recruitment of multiple MBD5/6 complexes, which triggers oligomerization: once MBD5/6 are bound to DNA, ACD15/ACD21 drive recruitment of other MBD5/6 complexes to facilitate oligomerization. This accumulation of proteins leads to higher-than-expected binding events and dwell time at meCG dense regions. SLN directly interacts with ACD21, accumulates with the complex, and acts to maintain the mobility of proteins within the oligomeric assembly. Created with BioRender.com.

[0016] FIGS.7A-7B show two modules from the SunTag-TRBIP1-MQ1 plasmid, which contains a straight fusion of dCAS9 and 10 x GCN4 peptides that is driven by the UBQ10 promoter (FIG.7A, top), and a straight fusion of scFv antibody, sfGFP, TRBIP1, and MQ1 that is driven by the UBQ10 promoter (FIG.7A, bottom), and a plasmid map of SunTag-TRBIP1-MQ1 (FIG.7B).

[0017] FIGS.8A-8E show: (FIG.8A) Dot plots showing the leaf number of Col-0, fwa rdr6, SunTag-MQ1 and SunTag-TRBIP1-MQ1.
(FIG.8B) qRT-PCR indicating the relative mRNA level of fwa rdr6, and six T1 transgenic lines of SunTag-MQ1 and SunTag-TRBIP1-MQ1 in fwa rdr6 background, respectively. (FIG.8C) qPCR result showing the relative McrBC qPCR value of FWA in the fwa rdr6, Col-0, and eight T1 transgenic lines of SunTag-MQ1 and SunTag-TRBIP1-MQ1 in fwa rdr6 background, respectively. (FIG.8D) Relative CG, CHG and CHH DNA methylation level at FWA promoter region using bisulfite PCR-seq (BS-PCR-seq); the pink regions indicate the ZF binding sites. (FIG.8E) CG DNA methylation of fwa rdr6, SunTag-MQ1 and SunTag-TRBIP1-MQ1 T1 transgenic line in fwa rdr6 background, measured by whole genome bisulfite sequencing (WGBS).

[0018] FIGS.9A-9B show: (FIG.9A) two modules from the SunTag-TRBIP1-MQ1 plasmid with varied promoters, which contains a straight fusion of dCAS9 and 10 x GCN4 peptides driven by the UBQ10 promoter, and scFv-GFP-TRBIP1-MQ1 that is driven by 20 different Arabidopsis promoters, respectively; and (FIG.9B) a plasmid map of the SunTag-TRBIP1-MQ1 with 20 promoters, respectively.

[0019] FIG.10 shows bar charts indicating the relative McrBC qPCR value (orange bars) and relative DNA methylation level at Chromosome Chloroplast (blue bars) of two replicates of fwa rdr6 and the T1 transgenic lines of SunTag-TRBIP1-MQ1 with 20 promoters, measured by McrBC and Skim-seq, respectively. This shows that the off-target methylation on the chloroplast genome was lower (but not eliminated) when the TRBIP1-MQ1 fusion was driven by weaker promoters, but the silencing of FWA was maintained.

[0020] FIG.11 shows a screenshot of CG DNA methylation at FWA locus in two replicates of fwa rdr6 and T1 transgenic lines of SunTag-MQ1, SunTag-TRBIP1-MQ1, as well as SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively, showing reduced off-target methylation.

[0021] FIG.12 shows screenshots of CG DNA methylation at random loci in two replicates of fwa rdr6 and T1 transgenic lines of SunTag-MQ1, SunTag-TRBIP1-MQ1, as well as SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively. These screenshots show the reduced CG DNA hypermethylation at a random site in the plant genome.

[0022] FIG.13 shows line charts showing the relative CG DNA methylation of SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20; the DNA methylation was measured by whole genome bisulfite DNA sequencing (WGBS). The plots show methylation across the entire Arabidopsis genome. The blue and the orange lines represent the two replicates of fwa rdr6, while the grey and yellow lines represent the two replicates of SunTag-TRBIP1-MQ1 with Promoter 6, Promoter 7, Promoter 9, and Promoter 20, respectively.

[0023] FIGS.14A-14B show: (FIG.14A) two modules from the SunTag-StykC-TRBIP1-MQ1 plasmid, which contains a UBQ10 promoter driven straight fusion of dCAS9 and 10 x GCN4 peptides, and UBQ10 promoter driven scFv-StykC-sfGFP-TRBIP1-MQ1; and (FIG.14B) a plasmid map of SunTag-StykC-TRBIP1-MQ1.

[0024] FIG.15 shows a screenshot of CG, CHG and CHH DNA methylation at FWA locus in fwa rdr6 (two replicates), SunTag-StykC-MQ1, and SunTag-StykC-TRBIP1-MQ1, respectively, measured by WGBS. This shows the proper on-target methylation of FWA.

[0025] FIG.16 shows a screenshot of CG, CHG and CHH DNA methylation at a random locus of the genome in fwa rdr6 (two replicates), SunTag-StykC-MQ1, and SunTag-StykC-TRBIP1-MQ1, respectively, measured by WGBS. This shows that there is no detectable off-target methylation.

[0026] FIG.17A shows CG, CHG and CHH DNA methylation over plant genome in fwa rdr6 (two replicates), SunTag-StykC-MQ1 and SunTag-StykC-TRBIP1-MQ1, respectively, measured by WGBS.
The blue and orange lines represent fwa rdr6, the grey line represents SunTag-StykC-MQ1, and the yellow line represents SunTag-StykC-TRBIP1-MQ1, as measured by WGBS. This result indicates that adding StykC removes CG DNA hypermethylation over the plant genome.

[0027] FIGS.17B-17D show use of the StkyC domain to increase the specificity of CRISPR-based methylation targeting. FIG.17B shows genome wide CG DNA methylation levels across the five Arabidopsis chromosomes in control non-transgenic fwa, SunTag-TRBIP1-MQ1, or SunTag-StkyC-TRBIP1-MQ1 plants. FIG.17C shows chloroplast methylation levels in the same genotypes. FIG.17D shows genome browser view of FWA showing DNA methylation targeting of the FWA promoter in SunTag-StkyC-TRBIP1-MQ1 plants, with minimal off-target methylation. The position of the gRNA for dCas9 is shown. The results demonstrate that there is 75% CG methylation of the chloroplast genome in the SunTag-TRBIP1-MQ1 plants, but none when StkyC is added.

[0028] FIGS.18A-18C show construct design for SunTag-TDG-MBD6 StkyC-TET1. FIG.18A shows a cartoon representation of SunTag-TDG-MBD6 StkyC-TET1. FIG.18B shows a plasmid map of SunTag-TDG-MBD6 StkyC-TET1. FIG.18C shows SunTag-TDG-TET1.

[0029] FIGS.19A-19D show (FIG.19A) a cartoon representation of scFV-GFP-SDG2 construct localizing to dCas9 with 10xGCN4 binding sites (black squares). Made with BioRender; (FIG.19B) a plasmid map of SunTag system containing SDG2; (FIG.19C) a cartoon representation of scFV-GFP-SDG2-MBD6 StkyC construct localizing to dCas9 with 10xGCN4 binding sites (black squares). Made with BioRender; and (FIG.19D) a plasmid map of SunTag system containing SDG2-MBD6 StkyC.

[0030] FIGS.20A-20B show (FIG.20A) a cartoon representation of scFV-GFP-StkyCMBD7 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.20B) a plasmid map of SunTag system containing StkyCMBD7.

[0031] FIGS.21A-21B show (FIG.21A) a cartoon graphic depicting MBD6-HSPB1-RFP chimeric protein and goal of experiments. Made using BioRender; and (FIG.21B) a plasmid map depicting the expression construct of MBD6-HSPB1-RFP.

[0032] FIGS.22A-22B show (FIG.22A) a cartoon graphic depicting MBD6-HSPB3-RFP chimeric protein and goal of experiments. Made using BioRender; and (FIG.22B) a plasmid map depicting the expression construct of MBD6-HSPB3-RFP.

[0033] FIGS.23A-23B show (FIG.23A) a cartoon graphic depicting MBD6-HSPB5-RFP chimeric protein and goal of experiments. Made using BioRender; and (FIG.23B) a plasmid map depicting the expression construct of MBD6-HSPB5-RFP.

[0034] FIGS.24A-24B show (FIG.24A) a cartoon graphic depicting MBD6-HSPB8-RFP chimeric protein and goal of experiments. Made using BioRender; and (FIG.24B) a plasmid map depicting the expression construct of MBD6-HSPB8-RFP.

[0035] FIG.25A shows 3D reconstruction of root meristem tissue of mbd5 mbd6 mutant plants expressing MBD6, MBD6HSPB1, MBD6HSPB3, MBD6HSPB5, or MBD6HSPB8 all with C-terminal RFP. All constructs except for MBD6HSPB8 show a punctate pattern of RFP fluorescence, indicating concentrated localization of signal at the chromocenters. MBD6HSPB8 shows diffuse nuclear staining.

[0036] FIG.25B shows fluorescence confocal imaging of nuclei expressing MBD6-RFP, or MBD6-RFP variants in which the StkyC domain has been replaced by human sHSPs. All constructs except for MBD6-HSPB8 and MBD6 in the acd15 acd21 background show a punctate pattern of RFP fluorescence, indicating concentrated localization of signal at the chromocenters.

[0037] FIGS.26A-26B show (FIG.26A) a cartoon representation of scFV-GFP-HSPB1 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.26B) a plasmid map of SunTag system containing HSPB1.
[0038] FIGS.27A-27B show (FIG.27A) a cartoon representation of scFV-GFP-HSPB3 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.27B) a plasmid map of SunTag system containing HSPB3. [0039] FIGS.28A-28B show (FIG.28A) a cartoon representation of scFV-GFP-HSPB5 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.28B) a plasmid map of SunTag system containing HSPB5. [0040] FIGS.29A-29B show (FIG.29A) a cartoon representation of scFV-GFP-HSPB8 construct localizing to dCas9 with 10xGCN4 binding sites (black squares); and (FIG.29B) a plasmid map of SunTag system containing HSPB8. [0041] FIGS.30A-30C show an alignment of the ACD15 ACD domain with full length proteins of ACD orthologs from other plant species (FIGS.30A-30B) and a phylogeny of the species included in the alignment (FIG.30C). FIGS.30D-30E show an alignment of the ACD21 ACD domain with full length proteins of ACD orthologs from other plant species (FIG.30D) and a phylogeny of the species included in the alignment (FIG.30E). [0042] FIG.31 shows a comparison of A. thaliana ACD15 and ACD21 α-Crystalline Domains with protein coding sequences of α-Crystalline Domain containing small heat shock proteins of different plant species: corn (Zea mays), soybean (Glycine max), wheat (Triticum aestivum), rice (Oryza sativa subspecies japonica), strawberry (Fragaria ananassa), sugar beet (Beta vulgaris), and Norway spruce (Picea abies). [0043] FIG.32 shows a comparison of A. thaliana ACD15 and ACD21 α-Crystalline Domains with protein coding sequences of H. sapiens α-Crystalline Domain containing small heat shock proteins including HSPB1-10. 
[0044] FIG.33 shows a comparison of A. thaliana ACD15 and ACD21 α-Crystalline Domains with protein coding sequences of α-Crystalline Domain containing small heat shock proteins from the following species which represent all kingdoms of life: HSPB1 (Homo sapiens) (a mammal), HSP22 (Drosophila melanogaster) (an insect), HSP26 (Saccharomyces cerevisiae) (a fungus), M1URI8 (Cyanidioschyzon merolae) (a red alga), P12811 (Chlamydomonas reinhardtii) (a green alga), Q9RTR5 (Deinococcus radiodurans) (a bacterium), and D0KNS6 (Saccharolobus solfataricus) (an archaebacterium). [0045] FIGS.34A-34C show the construct designs discussed in Example 10. FIG.34A shows a cartoon depiction of ZF-RFP constructs. FIG.34B shows a plasmid map of ZF alone-RFP. FIG.34C shows a plasmid map of ZF-MBD6 StkyC-RFP. [0046] FIG.35 shows that StkyCMBD6 increases protein accumulation of a ZF binding domain at its genomic binding sites. Ratio of ChIP-seq signal of ZF-StkyCMBD6 over ZF alone over ZF binding sites. Increased blue color corresponds to increased signal intensity. [0047] FIGS.36A-36B show Cas^-MBD6 StkyC construct design: plasmid maps for Cas^-MBD6 StkyC (FIG.36A) and Cas^ alone (FIG.36B). [0048] FIGS.37A-37D show Cas^ alone construct designs: Cas^ alone plasmid maps with no gRNA (FIG.37A), gRNA9 (FIG.37B), gRNA6 (FIG.37C), gRNA8 (FIG.37D). [0049] FIGS.38A-38D show Cas^-ACD15-ACD21 construct designs: Cas^-ACD15-ACD21 plasmid maps with no gRNA (FIG.38A), gRNA9 (FIG.38B), gRNA6 (FIG.38C), gRNA8 (FIG.38D). [0050] FIGS.39A-39B show an alignment of StkyC domain homologs from MBD6 polypeptides (FIG.39A) and an associated phylogenetic tree (FIG.39B). [0051] FIGS.40A-40B show an alignment of StkyC domain homologs from MBD7 polypeptides (FIG.40A) and an associated phylogenetic tree (FIG.40B). [0052] FIGS.41A-41C show StkyC enhancement of genome editing. 
FIG.41A: Genome browser tracks of the FWA locus demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region. FIG.41B - FIG.41C: Editing at guide 4 and guide 17 of FWA locus in wild type plants. [0053] FIGS.42A-42C show Cas9-XTEN-StkyC construct design. FIG.42A: Graphical cartoon of Cas9-XTEN-StkyC construct (made with biorender). FIG.42B - FIG.42C: Plasmid maps of constructs targeting Guide 4 (FIG.42B) or Guide 17 (FIG.42C). [0054] FIGS.43A-43B show Cas9-SunTag-1xGCN4 construct design. FIG.43A: Graphical cartoon of Cas9-SunTag-1xGCN4 (made with biorender). FIG.43B: Plasmid map of constructs targeting Guide 4. [0055] FIGS.44A-44C show Cas9-SunTag-4xGCN4 construct design. FIG.44A: Graphical cartoon of Cas9-SunTag-4xGCN4 (made with biorender). FIG.44B - FIG.44C: Plasmid maps of constructs targeting Guide 4 (FIG.44B) or Guide 17 (FIG.44C). [0056] FIGS.45A-45C show Cas9 construct design. FIG.45A: Graphical cartoon of Cas9 construct (made with biorender). FIG.45B - FIG.45C: Plasmid maps of constructs targeting Guide 4 (FIG.45B) or Guide 17 (FIG.45C). [0057] FIGS.46A-46C show an exemplary SunTag-HSPB1 construct. FIG.46A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG. 46B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB1 construct creating distinct GFP foci. White dashes represent nuclear periphery. FIG.46C: Plasmid map of the SunTag-HSPB1 construct expressed in the plants in FIG.46B. [0058] FIGS.47A-47C show an exemplary SunTag-HSPB4 construct. FIG.47A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG. 47B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB4 construct creating distinct GFP foci. White dashes represent nuclear periphery. FIG.47C: Plasmid map of the SunTag-HSPB4 construct expressed in the plants in FIG.47B. 
[0059] FIGS.48A-48C show an exemplary SunTag-HSPB5 construct. FIG.48A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.48B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB5 construct creating distinct GFP foci. White dashes represent nuclear periphery. FIG.48C: Plasmid map of the SunTag-HSPB5 construct expressed in the plants in FIG.48B. [0060] FIGS.49A-49C show an exemplary SunTag-Chlamydomonas reinhardtii ACD construct. FIG.49A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.49B: Representative nucleus from live cell imaging analysis of wild type plants expressing a SunTag-Chlamydomonas reinhardtii ACD construct creating distinct GFP foci. FIG.49C: Plasmid map of the SunTag-Chlamydomonas reinhardtii ACD construct expressed in the plants in FIG.49B. [0061] FIGS.50A-50C show an exemplary SunTag-Saccharolobus solfataricus ACD construct. FIG.50A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.50B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Saccharolobus solfataricus construct creating distinct GFP foci. White dashes represent nuclear periphery. FIG.50C: Plasmid map of the SunTag-Saccharolobus solfataricus construct expressed in the plants in FIG.50B. [0062] FIGS.51A-51C show an exemplary SunTag-Zea mays ACD construct. FIG.51A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.51B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Zea mays ACD construct creating distinct GFP foci. White dashes represent nuclear periphery. FIG.51C: Plasmid map of the SunTag-Zea mays ACD construct expressed in the plants in FIG.51B. [0063] FIGS.52A-52C show an exemplary SunTag-HSPB8 construct. 
FIG.52A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.52B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-HSPB8 construct, which do not form distinct GFP foci. White dashes represent nuclear periphery. FIG.52C: Plasmid map of the SunTag-HSPB8 construct expressed in the plants in FIG.52B. [0064] FIGS.53A-53C show an exemplary SunTag-Oryza sativa ACD construct. FIG.53A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.53B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Oryza sativa ACD construct, which do not form distinct GFP foci. White dashes represent nuclear periphery. FIG.53C: Plasmid map of the SunTag-Oryza sativa ACD construct expressed in the plants in FIG.53B. [0065] FIGS.54A-54C show an exemplary SunTag-Deinococcus radiodurans ACD construct. FIG.54A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.54B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Deinococcus radiodurans ACD construct, which do not form distinct GFP foci. White dashes represent nuclear periphery. FIG.54C: Plasmid map of the SunTag-Deinococcus radiodurans ACD construct expressed in the plants in FIG.54B. [0066] FIGS.55A-55C show an exemplary SunTag-Solanum tuberosum ACD construct. FIG.55A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.55B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Solanum tuberosum ACD construct creating distinct GFP foci and staining DNA with DAPI. White dashes represent nuclear periphery. FIG.55C: Plasmid map of the SunTag-Solanum tuberosum ACD construct expressed in the plants in FIG.55B. [0067] FIGS.56A-56C show an exemplary SunTag-Solanum lycopersicum ACD construct. 
FIG.56A: Graphical representation of an exemplary SunTag construct (made using BioRender). FIG.56B: Representative nuclei from live cell imaging analysis of wild type plants expressing a SunTag-Solanum lycopersicum ACD construct creating distinct GFP foci and staining DNA with DAPI. White dashes represent nuclear periphery. FIG.56C: Plasmid map of the SunTag-Solanum lycopersicum ACD construct expressed in the plants in FIG.56B. [0068] FIG.57 shows complementation of FWA expression in mbd5 mbd6 mutant plants through use of human small heat shock proteins. The plot shows relative FWA expression by RT-qPCR of mbd5 mbd6 mutant plants expressing MBD6HSPB1, MBD6HSPB3, MBD6HSPB5, or MBD6HSPB8. Data were compared using a one-way ANOVA with corrections for multiple comparisons (*P < 0.05). [0069] FIG.58 shows editing efficiency for stable transgenic plants in Wild-Type (Col-0) Arabidopsis thaliana. The plot displays editing efficiency comparing Cas9 controls to 1x and 4x GCN4 SunTagStkyC constructs in stably transformed plants. The Cas9 transgene construct design is listed along the X-axis. The editing efficiency (percent indel reads (%)) is plotted on the Y-axis. The standard error of the mean (SEM) was calculated for each target site. P values were calculated using an unpaired t-test. [0070] FIGS.59A-59B show Arabidopsis thaliana protoplast experiments testing improved Cas9 genome editing capability through ACD accumulation technology. FIG.59A shows editing efficiency of SunTag-ACD constructs in protoplasts from experiment 1. FIG.59B shows editing efficiency of SunTag-ACD constructs in protoplasts from experiment 2. The Cas9 transgene construct design is plotted along the X-axis. Chlamy is an abbreviation for Chlamydomonas reinhardtii ACD protein, and HSP14sacc is an abbreviation for Saccharolobus solfataricus ACD protein. The editing efficiency (percent indel reads (%)) is plotted on the Y-axis. 
Each dot represents an individual transfection. The standard error of the mean (SEM) was calculated for each target site. [0071] FIGS.60A-60C show plasmid maps of constructs used in experiments: plasmid maps of SunTag-VP64 (FIG.60A), SunTagSacc-VP64 (FIG.60B), and PiggyBac Transposase (FIG.60C) constructs. [0072] FIGS.61A-61B show accumulation by ACD targeting technology in HEK293 cells. Representative HEK293 cells expressing SunTag-VP64 control construct (FIG.61A) versus SunTagSacc-VP64 (FIG.61B). Nuclei are visualized by staining DNA with DAPI. [0073] FIGS.62A-62B show SunTagChlamy construct design. FIG.62A: Plasmid map of the SunTagChlamy construct for targeting the siren locus. FIG.62B: Representative root nucleus image demonstrating two foci formed by SunTagChlamy targeted to siren loci. [0074] FIGS.63A-63C show design of the vCasĭ-XTEN-StkyC construct. FIG.63A shows genome browser tracks of the FWA locus demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region. gRNA17 and gRNA4 used in this example are shown. FIG.63B shows a plasmid map of vCasĭ negative control to be used in the experiments described in Example 20. FIG.63C shows a plasmid map of vCasĭ-XTEN-StkyC construct to be used in the experiments described in Example 20. [0075] FIGS.64A-64B show design of a Cas9-Suntag-HSPB5-4xGCN4-Truncated construct. FIG.64A shows genome browser tracks of FWA loci demonstrating methylation state of wild type or fwa mutant epiallele as well as accessibility of that region. The positions of gRNA17 and gRNA4 are shown. FIG.64B shows a plasmid map of Cas9-Suntag-HSPB5-4xGCN4-Truncated construct. [0076] FIGS.65A-65E show exemplary designs of SunTag-ACD-TRBIP1-MQ1 constructs. 
Construct design of SunTag-Chlamy-TRBIP1-MQ1 (FIG.65A), SunTag-Zea mays-TRBIP1-MQ1 (FIG.65B), SunTag-HSPB1-TRBIP1-MQ1 (FIG.65C), SunTag-HSPB4-TRBIP1-MQ1 (FIG.65D), SunTag-HSPB8-TRBIP1-MQ1 (FIG.65E). DETAILED DESCRIPTION [0077] The following description is presented to enable a person of ordinary skill in the art to make and use the various embodiments. Descriptions of specific devices, techniques, methods, and applications are provided only as examples. Various modifications to the examples described herein will be readily apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other examples and applications without departing from the spirit and scope of the various embodiments. Thus, the various embodiments are not intended to be limited to the examples described herein and shown. [0078] The present disclosure relates generally to methods of eukaryotic genome modification. More specifically, the present disclosure relates to compositions and methods for targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid of interest to facilitate a genome modification. [0079] The present disclosure is based, at least in part, on Applicant’s surprising discovery described herein that recruitment of a sufficient number of α-crystalline domain-containing polypeptides, such as small heat shock proteins (sHSPs), to a genomic locus (for example, by targeting with a dead Cas9) may form a nucleation center that recruits a large number of α-crystalline domain proteins, concentrating them into a nuclear body tethered to the genomic site, and sequestering them away from other locations in the nucleus. Further, Applicant has discovered that genetic modifier polypeptides may be co-targeted with an α-crystalline domain polypeptide to a nucleic acid of interest to facilitate improved modification of the target nucleic acid. 
[0080] In view of Applicant’s discovery, the present disclosure is directed to methods and compositions for aggregation of α-crystalline domain polypeptides to concentrate polypeptides of interest near a genomic locus or other target nucleic acid of interest by targeting α-crystalline domain-containing proteins to the genomic locus or other target nucleic acid. In some embodiments, the methods described herein are used, for example, for epigenetic editing, genome editing, RNA editing, control of recombination, control of transcription, and/or any other process that occurs at specific regions of chromatin or other nucleic acids. [0081] The methods and compositions described herein may be used in the concentration of genome modification activities to one or more sites, which reduces the incidence of off-target activity and increases the efficiency of on-target activity. Thus, in some embodiments, the methods and compositions described herein may be used to increase efficiency of genome modification of a target nucleic acid by a genetic modifier polypeptide, for example, increased editing efficiency by a Cas protein. Editing efficiency could be measured in a variety of ways. For example, efficiency could be measured relative to a corresponding control. A corresponding control could comprise, for example, the same genetic modifier polypeptide but in which the genetic modifier polypeptide is not co-targeted with an α-crystalline domain polypeptide to the target nucleic acid. Efficiency could be quantified by, for example, providing a sample comprising nucleic acids comprising a plurality of copies of a target nucleic acid sequence, targeting a genetic modifier polypeptide to the target nucleic acid sequence in the sample, and then measuring the proportion of nucleic acids comprising the target sequence in the sample that are modified by the genetic modifier polypeptide. 
For example, if a sample comprises 100 nucleic acids each comprising one copy of the target nucleic acid sequence (such as, for example, 50 cells with two nucleic acids per cell), and the genetic modifier polypeptide modifies the target sequence in 20 of the nucleic acids, then the editing efficiency could be quantified as 20%. Editing efficiency could also be measured as, for example, a fold change in targeted nucleic acid modifications in a sample comprising a genetic modifier polypeptide co-targeted with an α-crystalline domain polypeptide to a target nucleic acid compared to a corresponding control comprising the same genetic modifier polypeptide targeted to the same target nucleic acid but not co-targeted with an α-crystalline domain polypeptide. [0082] Relative editing efficiencies compared to a corresponding control could be measured from various perspectives. For example, editing efficiency could be measured relative to a corresponding control based on relative number of edits that occur per sample within a given time frame. For example, if the genetic modifier polypeptide is expressed in a sample (such as, for example, in a sample of cells) beginning at time 0 with or without co-targeting with an α-crystalline domain polypeptide, and then the nucleic acids are collected from the sample after an hour, then editing efficiency could be measured as edits per hour in a co-targeting sample relative to a non-co-targeting sample. Alternatively, editing efficiency could be measured as the relative time it takes for, for example, half of the nucleic acids comprising a target sequence in a first sample comprising co-targeting with an α-crystalline domain polypeptide to be edited compared to the time it takes for half of the nucleic acids comprising a target sequence in a second sample lacking co-targeting with an α-crystalline domain polypeptide to be edited. 
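The efficiency measures described above can be illustrated with a short calculation. The following is a minimal sketch only, not part of the disclosed methods: the function names are hypothetical, and the control values (10 of 100 nucleic acids modified) are invented purely for illustration. Only the 20-of-100 example comes from the text.

```python
def editing_efficiency(modified: int, total: int) -> float:
    """Percent of target-bearing nucleic acids that were modified."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= modified <= total:
        raise ValueError("modified must be between 0 and total")
    return 100.0 * modified / total


def fold_change(co_targeted: float, control: float) -> float:
    """Ratio of a co-targeted measurement to the corresponding control."""
    if control <= 0:
        raise ValueError("control must be positive to compute a fold change")
    return co_targeted / control


def edits_per_hour(modified: int, hours: float) -> float:
    """Edits accumulated per hour since expression began at time 0."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return modified / hours


# Example from the text: 50 cells x 2 nucleic acids = 100 targets, 20 modified.
co_targeted_pct = editing_efficiency(20, 50 * 2)

# Hypothetical control (assumed values): 10 of 100 targets modified.
control_pct = editing_efficiency(10, 100)

print(co_targeted_pct)                            # 20.0
print(fold_change(co_targeted_pct, control_pct))  # 2.0

# Relative rate after one hour, using the same assumed counts.
print(fold_change(edits_per_hour(20, 1.0), edits_per_hour(10, 1.0)))  # 2.0
```

The same `fold_change` helper applies to the time-based comparison as well, for example by passing the times needed to edit half of the target-bearing nucleic acids in the control and co-targeted samples.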
[0083] Editing efficiency could be improved (for example, as compared to a corresponding control) to various degrees. In some embodiments, targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid may improve (e.g., increase) editing efficiency by 0-5%, 5-10%, 10-15%, 15-20%, 20-25%, 25-30%, 30-35%, 35-40%, 40-45%, 45-50%, 50-55%, 55-60%, 60-65%, 65-70%, 70-75%, 75-80%, 80-85%, 85-90%, 90-95%, 95-100%, 100-150%, 150-200%, or more than 200% compared to a corresponding control. [0084] The present disclosure also provides methods and compositions that allow for α-crystalline domain polypeptides to be co-opted for use in targeting epigenetic enzymes, gene editing proteins, guide RNAs (gRNAs), template nucleic acids such as DNA, transcription factors, and other factors to 1) specifically accumulate proteins or other molecules of interest; 2) maintain enzymatic functions; and 3) sequester these factors away from other sites in the genome. General Techniques [0085] The techniques and procedures described or referenced herein are generally well understood and commonly employed using conventional methodology by those skilled in the art, such as, for example, the widely utilized methodologies described in Sambrook et al., Molecular Cloning: A Laboratory Manual 3d edition (2001) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Current Protocols in Molecular Biology (F.M. Ausubel, et al. eds., (2003)); the series Methods in Enzymology (Academic Press, Inc.): PCR 2: A Practical Approach (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)), Harlow and Lane, eds. (1988); Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Cellis, ed., 1998) Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J.P. Mather and P.E. 
Roberts, 1998) Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-8) J. Wiley and Sons; Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); PCR: The Polymerase Chain Reaction, (Mullis et al., eds., 1994); Short Protocols in Molecular Biology (Wiley and Sons, 1999). General Terms [0086] The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. [0087] The use of the terms “a,” “an,” and “the,” and similar referents in the context of describing the disclosure (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if the range 10-15 is disclosed, then 11, 12, 13, and 14 are also disclosed. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the embodiments of the disclosure and does not pose a limitation on the scope of the disclosure unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the embodiments of the disclosure. 
[0088] Reference to “about” a value or parameter herein refers to the usual error range for the respective value readily known to the skilled person in this technical field. Reference to “about” a value or parameter herein includes (and describes) aspects that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.” [0089] The term “and/or” as used herein in a phrase such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone). Likewise, the term “and/or” as used herein in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone). [0090] The terms “isolated” and “purified” as used herein refer to a material that is removed from at least one component with which it is naturally associated (e.g., removed from its original environment). The term “isolated,” when used in reference to an isolated protein, refers to a protein that has been removed from the culture medium of the host cell that expressed the protein. As such, an isolated protein is free of extraneous or unwanted compounds (e.g., nucleic acids, native bacterial or other proteins, etc.). [0091] It is understood that aspects and embodiments of the present disclosure described herein include “comprising,” “consisting,” and “consisting essentially of” aspects and embodiments. [0092] It is to be understood that one, some, or all of the properties of the various embodiments described herein may be combined to form other embodiments of the present disclosure. These and other aspects of the present disclosure will become apparent to one of skill in the art. These and other embodiments of the present disclosure are further described by the detailed description that follows. 
Overview [0093] A key question in eukaryotic biology is how DNA binding proteins are able to gain access to DNA, given that DNA is packaged into nucleosomes and therefore frequently inaccessible. An important example of this is gene editing, where CRISPR enzymes, guided by sequence specific guide RNAs, must gain access to specific DNA sequences in the genome in order to make changes in these sequences. Because CRISPR systems evolved in bacteria and viruses that do not have nucleosomes, these systems did not evolve mechanisms to efficiently gain access to DNA bound to nucleosomes. Indeed, it has been found that CRISPR-mediated gene editing can be highly inefficient at particular DNA sequences, especially those with tightly associated nucleosomes. CRISPR-mediated gene editing techniques are revolutionizing the fields of crop and livestock improvement, and are the basis of a new class of human therapeutics. Therefore, the development of methods to allow CRISPR systems to more efficiently edit eukaryotic genomes could have a profound effect on many areas of basic and applied research. [0094] In certain aspects, the present disclosure provides methods to allow genetic modifier polypeptides, such as CRISPR systems, or other polypeptides of interest, to more readily gain access to nucleic acid (e.g. DNA) target sites in plant and animal genomes. Co-targeting α-crystalline domain proteins with genetic modifier polypeptides, such as, for instance, CRISPR systems, causes the genetic modifier polypeptides to oligomerize and hyperaccumulate at the target locus, while also sequestering them away from non-target sites. 
Further, the methods described herein involving addition of α-crystalline domain proteins to, for example, a CRISPR-based DNA methylation targeting system, have the beneficial effect of increasing specificity to the target locus, thereby reducing or eliminating off-target effects, such as, for instance, off-target methylation. The methods described herein harness the unique properties of α-crystalline domain proteins to develop, in some embodiments, a new class of CRISPR-based gene editing tools that are both more powerful and more specific. Without wishing to be bound by theory, it is believed that dramatically increasing the concentration of CRISPR or other genetic modifier components at a target site will allow them to more effectively compete for DNA binding. [0095] The present disclosure relates to genetic modifier polypeptides that are capable of being targeted to the target nucleic acid, and α-crystalline domain polypeptides that are capable of being targeted to the target nucleic acid, wherein the genetic modifier polypeptide and/or the α-crystalline domain polypeptide is a recombinant polypeptide, as well as methods of using these genetic modifier polypeptides and α-crystalline domain polypeptides for modifying a target nucleic acid in a eukaryotic cell. The present disclosure is based, at least in part, on Applicant’s discovery that α-crystalline domain polypeptides can be used to concentrate polypeptides of interest, such as genetic modifier polypeptides, to a target nucleic acid, and that, once targeted, this concentration can reduce the incidence of off-target activity and increase the efficiency of, for instance, the production of genetic modifications by a genetic modifier polypeptide. 
[0096] Accordingly, the present disclosure provides methods for targeting a genetic modifier polypeptide to a target nucleic acid and targeting an α-crystalline domain polypeptide to the target nucleic acid, where the genetic modifier polypeptide and/or the α-crystalline domain polypeptide is a recombinant polypeptide. Once targeted to the target nucleic acid, the genetic modifier polypeptide modifies the target nucleic acid. Also provided are nucleic acids encoding the genetic modifier polypeptides and α-crystalline domain polypeptides, expression vectors containing nucleic acids that encode the genetic modifier polypeptides and α-crystalline domain polypeptides, cells containing the genetic modifier polypeptides and α-crystalline domain polypeptides, plants, mammals, and other organisms containing the genetic modifier polypeptides and α-crystalline domain polypeptides, and plants, mammals, and other organisms having a target nucleic acid containing a genetic modification as a consequence of having the genetic modifier polypeptides and α-crystalline domain polypeptides targeted to the target nucleic acid. [0097] Each one of the genetic modifier polypeptides and α-crystalline domain polypeptides described herein may be expressed in a host cell individually or in various combinations to act to modify a target nucleic acid. Recombinant Polypeptides [0098] Certain aspects of the present disclosure relate to recombinant polypeptides containing genetic modifier polypeptides and/or α-crystalline domain polypeptides. These recombinant polypeptides may be targeted to a target nucleic acid to facilitate genetic modifications of the target nucleic acid. In addition to a genetic modifier polypeptide and/or an α-crystalline domain polypeptide, these recombinant polypeptides may contain other features as described herein and as will be apparent to one of skill in the art. 
Other amino acid and/or polypeptide sequence features of the recombinant polypeptides may be used to provide additional functionality and/or features to the recombinant polypeptide including e.g. subcellular localization, downstream detection, etc. as will be readily apparent to one of skill in the art. [0099] As used herein, a “polypeptide” is an amino acid sequence including a plurality of consecutive polymerized amino acid residues (e.g., at least about 15 consecutive polymerized amino acid residues). “Polypeptide” refers to an amino acid sequence, oligopeptide, peptide, protein, or portions thereof, and the terms “polypeptide” and “protein” are used interchangeably. [0100] Polypeptides as described herein also include polypeptides having various amino acid additions, deletions, or substitutions relative to the native amino acid sequence of a polypeptide of the present disclosure. In some embodiments, polypeptides that are homologs of a polypeptide of the present disclosure contain non-conservative changes of certain amino acids relative to the native sequence of a polypeptide of the present disclosure. In some embodiments, polypeptides that are homologs of a polypeptide of the present disclosure contain conservative changes of certain amino acids relative to the native sequence of a polypeptide of the present disclosure, and thus may be referred to as conservatively modified variants. A conservatively modified variant may include individual substitutions, deletions or additions to a polypeptide sequence which result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well-known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure. 
The following eight groups contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)). A modification of an amino acid to produce a chemically similar amino acid may be referred to as an analogous amino acid.
[0101] A “recombinant” polypeptide, protein, or enzyme of the present disclosure may be a polypeptide, protein, or enzyme that may be encoded by e.g. a “recombinant nucleic acid” or “heterologous nucleic acid” or “recombinant polynucleotide.”
[0102] Recombinant polypeptides of the present disclosure that are composed of individual polypeptide domains may be described based on the individual polypeptide domains of the overall recombinant polypeptide. A domain in such a recombinant polypeptide refers to the particular stretches of contiguous amino acid sequences with a particular function or activity. For example, for a recombinant polypeptide that includes a sequence from a genetic modifier polypeptide and/or a sequence from an α-crystalline domain polypeptide, the contiguous amino acids that encode the sequence from the genetic modifier polypeptide may be described as the “genetic modifier domain” in the overall recombinant polypeptide, and the contiguous amino acids that encode the sequence from the α-crystalline domain polypeptide may be described as the “α-crystalline domain polypeptide domain” in the overall recombinant polypeptide. Individual domains in an overall recombinant polypeptide may also be referred to as units of the recombinant polypeptide. Recombinant polypeptides that are composed of individual polypeptide domains may also be referred to as fusion polypeptides.
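As an illustration only, the eight conservative substitution groups listed in paragraph [0100] can be expressed as a small lookup. The sketch below is not part of the disclosure; the names CONSERVATIVE_GROUPS and is_conservative_substitution are hypothetical. Because methionine (M) appears in both group 5 and group 8, the groups are kept as separate sets rather than a one-to-one residue map. Histidine (H) and proline (P) are not in any listed group, so under these groups alone they have no conservative partner.

```python
# Eight conservative substitution groups exactly as listed in the text
# (see, e.g., Creighton, Proteins (1984)). One-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """Return True if two different residues share at least one listed group.

    Hypothetical helper for illustration; identical residues are not
    counted as a substitution.
    """
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in [("D", "E"), ("M", "C"), ("M", "L"), ("C", "L"), ("H", "K")]:
        print(pair, is_conservative_substitution(*pair))
```

Note that group membership is not transitive across groups: C and L each pair with M, but C to L is not a listed conservative substitution.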
[0103] Fusion polypeptides of the present disclosure may contain an individual polypeptide domain that is in various N-terminal or C-terminal orientations relative to other individual polypeptide domains present in the fusion polypeptide. Fusion of individual polypeptide domains in fusion polypeptides may also be direct or indirect fusions. Direct fusions of individual polypeptide domains refer to direct fusion of the coding sequences of each respective individual polypeptide domain. In embodiments where the fusion is indirect, a linker domain or other contiguous amino acid sequence may separate the coding sequences of two individual polypeptide domains in a fusion polypeptide.
[0104] Polypeptides of the present disclosure may be detected using antibodies. Techniques for detecting polypeptides using antibodies include, for example, enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence. An antibody provided herein can be a polyclonal antibody or a monoclonal antibody. An antibody having specific binding affinity for a polypeptide provided herein can be generated using methods well known in the art. An antibody provided herein can be attached to a solid support such as a microtiter plate using methods known in the art.
Linkers
[0105] Various linkers may be used in the construction of recombinant polypeptides as described herein. In general, linkers are short peptides that separate the different domains in a multi-domain protein. They may play an important role in fusion proteins, affecting the crosstalk between the different domains, the yield of protein production, and the stability and/or the activity of the fusion proteins. Linkers are generally classified into 2 major categories: flexible or rigid.
Flexible linkers are typically used when the fused domains require a certain degree of movement or interaction, and these linkers are usually composed of small amino acids such as, for example, glycine (G), serine (S) or proline (P).
[0106] The certain degree of movement between domains allowed by flexible linkers is an advantage in some fusion proteins. However, it has been reported that flexible linkers can sometimes reduce protein activity due to an inefficient separation of the two domains. In this case, rigid linkers may be used since they enforce a fixed distance between domains and promote their independent functions. A thorough description of several linkers has been provided in Chen X et al., Advanced Drug Delivery Reviews 65 (2013) 1357–1369.
[0107] Various linkers may be used in, for example, the construction of recombinant polypeptides as described herein. Linkers may be used to separate the coding sequences of a genetic modifier polypeptide and an α-crystalline domain polypeptide. For example, a variety of wiggly/flexible linkers, stiff/rigid linkers, short linkers, and long linkers may be used as described herein. Various linkers as described herein may be used in the construction of recombinant polypeptides as described herein.
[0108] A variety of shorter or longer linker regions are known in the art, for example corresponding to a series of glycine residues, a series of adjacent glycine-serine dipeptides, a series of adjacent glycine-glycine-serine tripeptides, or known linkers from other proteins. A flexible linker may include, for example, the amino acid sequence: SSGPPPGTG (SEQ ID NO: 489) and variants thereof. A rigid linker may include, for example, the amino acid sequence: AEAAAKEAAAKA (SEQ ID NO: 490) and variants thereof. The XTEN linker, SGSETPGTSESATPES (SEQ ID NO: 491) and variants thereof, described in Guilinger et al., 2014 (Nature Biotechnology 32, 577–582), may also be used.
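The distinction between direct and indirect fusions, and the three exemplary linker sequences given above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a construct design from the disclosure: the function name build_fusion, the dictionary keys, and the short placeholder domain strings are all hypothetical, and only the three linker sequences come from the text.

```python
# Exemplary linker sequences quoted in the text (one-letter amino acid codes).
LINKERS = {
    "flexible": "SSGPPPGTG",      # SEQ ID NO: 489
    "rigid": "AEAAAKEAAAKA",      # SEQ ID NO: 490
    "xten": "SGSETPGTSESATPES",   # SEQ ID NO: 491
}


def build_fusion(n_terminal_domain, c_terminal_domain, linker=None):
    """Concatenate two domains in N- to C-terminal order.

    linker=None models a direct fusion; a key of LINKERS models an
    indirect fusion with that linker between the two domains.
    Hypothetical helper for illustration only.
    """
    if linker is None:
        return n_terminal_domain + c_terminal_domain
    if linker not in LINKERS:
        raise ValueError(f"unknown linker: {linker!r}")
    return n_terminal_domain + LINKERS[linker] + c_terminal_domain


if __name__ == "__main__":
    # Placeholder domain fragments, invented for this example; real domains
    # would be full genetic modifier or alpha-crystalline domain sequences.
    domain_a = "MKTAY"
    domain_b = "LVDIG"
    print(build_fusion(domain_a, domain_b))           # direct fusion
    print(build_fusion(domain_a, domain_b, "xten"))   # indirect fusion
    print(build_fusion(domain_b, domain_a, "rigid"))  # swapped orientation
```

Swapping the argument order changes which domain sits at the N-terminus, which corresponds to the alternative orientations described in paragraph [0103].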
Nuclear Localization Signals (NLS)
[0109] Recombinant polypeptides of the present disclosure may contain one or more nuclear localization signals (NLS). Nuclear localization signals may also be referred to as nuclear localization sequences, domains, peptides, or other terms readily apparent to those of skill in the art. A nuclear localization signal is a translocation sequence that, when present in a polypeptide, directs that polypeptide to localize to the nucleus of a eukaryotic cell.
[0110] Various nuclear localization signals may be used in recombinant polypeptides of the present disclosure. For example, one or more SV40-type NLS or one or more REX NLS may be used in recombinant polypeptides. Recombinant polypeptides may also contain two or more tandem copies of a nuclear localization signal. For example, recombinant polypeptides may contain at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten copies, either tandem or not, of a nuclear localization signal.
Tags, Reporters, and Other Features
[0111] Recombinant polypeptides of the present disclosure may contain one or more tags that allow for e.g. purification and/or detection of the recombinant polypeptide. Various tags may be used herein and are well-known to those of skill in the art. Exemplary tags may include HA, GST, FLAG, MBP, etc., and multiple copies of one or more tags may be present in a recombinant polypeptide.
[0112] Recombinant polypeptides of the present disclosure may contain one or more reporters that allow for e.g. visualization and/or detection of the recombinant polypeptide. A reporter polypeptide encodes a protein that may be readily detectable due to its biochemical characteristics such as, for example, enzymatic activity or chemifluorescent features.
Reporter polypeptides may be detected in a number of ways depending on the characteristics of the particular reporter. For example, a reporter polypeptide may be detected by its ability to generate a detectable signal (e.g. fluorescence), by its ability to form a detectable product, etc. Various reporters may be used herein and are well-known to those of skill in the art. Exemplary reporters may include GFP, GUS, mCherry, luciferase, etc., and multiple copies of one or more reporters may be present in a recombinant polypeptide.
[0113] Recombinant polypeptides of the present disclosure may contain one or more polypeptide domains that serve a particular purpose depending on the particular goal/need. Recombinant polypeptides may contain translocation sequences that target the polypeptide to a particular cellular compartment or area. Suitable features will be readily apparent to those of skill in the art.
Genetic Modifier Polypeptides
[0114] Certain aspects of the present disclosure relate to genetic modifier polypeptides that are capable of being targeted to a target nucleic acid. Genetic modifier polypeptides as described herein generally refer to polypeptides that can facilitate, whether directly or indirectly, modification of a feature of a nucleic acid. The nucleic acid may be any type of nucleic acid of any length, including but not limited to DNA or RNA; single- or double-stranded nucleic acids; linear or circular nucleic acids; chromosomal or extra-chromosomal nucleic acids; nuclear, cytoplasmic, or organellar nucleic acids.
Features of a nucleic acid that may be modified include but are not limited to, for example, the genetic sequence of the nucleic acid (such as, for example, addition, deletion, or inversion of one or more nucleic acid residues); the chemical structure of one or more nucleic acid residues (such as, for example, methylation, pseudouridylation, and other types of base modifications); the expression of the nucleic acid (such as, for example, the level of expression, the timing of expression, and/or the location of expression); one or more characteristics of a polypeptide (such as, for example, a histone or other scaffold proteins or a transcription factor) that is bound to or otherwise closely associated with the nucleic acid; and structures including the nucleic acid (such as, for example, hybridization state, secondary structure, tertiary structure, quaternary structure, chromatin, chromosomes). Thus, genetic modifier polypeptides may include, for example, transcriptional repressors, transcriptional activators, methyltransferases, demethylases, nucleases, recombinases, topoisomerases, ligases, polynucleotide kinases, uracil DNA glycosylases, and terminal deoxynucleotidyl transferases. Examples of transcriptional repressor polypeptides may include but are not limited to a PHD1 polypeptide, a PIAL1 polypeptide, a PIAL2 polypeptide, a TRB1 polypeptide, a TRB2 polypeptide, a TRB3 polypeptide, a MSI1 polypeptide, a LHP1 polypeptide, a HD2A polypeptide, a HD2B polypeptide, a HD2C polypeptide, an ELF7 polypeptide, a CPL2 polypeptide, a MBD2 polypeptide, a SUVH7 polypeptide, a SSRP1 polypeptide, a SPT16 polypeptide, a JMJ18 polypeptide, a TRBIP1 polypeptide, a TRBIP2 polypeptide, and an ASF1B polypeptide. Examples of transcriptional activator polypeptides may include but are not limited to activating transcription factors and VP64.
Examples of methyltransferase polypeptides may include but are not limited to an MQ1 polypeptide and a SDG2 histone methyltransferase polypeptide (an exemplary MQ1 polypeptide is set forth in Example 2 herein). Examples of demethylase polypeptides may include but are not limited to a DEMETER polypeptide, a Tet1 polypeptide, a TDG polypeptide, and a ROS1 polypeptide. Examples of DNA methyltransferase polypeptides may include but are not limited to a DRM2 polypeptide, an SssI polypeptide, and a Dnmt3 polypeptide. Examples of nuclease polypeptides may include but are not limited to an endonuclease polypeptide (such as, for example, a Cas9 or CasPhi polypeptide) and an exonuclease polypeptide. Examples of recombinase polypeptides may include but are not limited to a Cre recombinase, a Hin recombinase, a Tre recombinase, and a FLP recombinase. Examples of topoisomerase polypeptides may include but are not limited to a type IA topoisomerase, a type IB topoisomerase, a type IC topoisomerase, a type IIA topoisomerase, and a type IIB topoisomerase. Examples of ligase polypeptides may include but are not limited to a DNA ligase and an RNA ligase. Examples of polynucleotide kinase polypeptides may include but are not limited to a T4 Polynucleotide Kinase (PNK). Other suitable genetic modifier polypeptides for use in the methods and compositions of the present disclosure will be readily apparent to those of skill in the art.
[0115] Recombinant genetic modifier polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in reducing the expression of a target nucleic acid, such as a gene, in a eukaryotic organism (e.g. plants or mammals) as described herein.
In some instances, a recombinant genetic modifier polypeptide may comprise, for example, a Cas protein (or other RNA-guided nuclease), which may be used to target the genetic modifier polypeptide to the target nucleic acid and/or to make the desired genome edit. In some embodiments, the Cas protein is a “dead” or deactivated (i.e., comprising deficient or at least reduced nuclease activity) Cas protein, in which case the Cas protein may be used to target the genetic modifier polypeptide to the target nucleic acid. In other embodiments, the Cas protein may comprise nuclease activity (e.g., nickase or double-stranded break activity), in which case the Cas protein may actively mediate both the targeting to the target nucleic acid and the genetic modification. Thus, genome editing involving a Cas protein as described herein may comprise editing with an active Cas or with a deactivated Cas. Thus, any Cas protein (or other RNA-guided nuclease) sufficient to target to a specific nucleic acid may be used in the methods described herein (e.g., Cas9, Cas12a, or other Cas proteins, and/or “dead” versions thereof).
[0116] In some embodiments, a genetic modifier polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length genetic modifier polypeptide.
In some embodiments, genetic modifier polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide. In some embodiments, genetic modifier polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide. In some embodiments, genetic modifier polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length genetic modifier polypeptide.
[0117] Suitable genetic modifier polypeptides may be identified from any organism, including but not limited to monocot and dicot plants, algae, fungi, animals (including, but not limited to mammals, such as Homo sapiens, and insects, such as Drosophila melanogaster), bacteria, archaea, and protists.
[0118] In some embodiments, a genetic modifier polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a Cas9 polypeptide (e.g. SEQ ID NO: 487), a CasPhi polypeptide (e.g. SEQ ID NO: 488), a TRBIP1 polypeptide (e.g. SEQ ID NO: 1), a MQ1 polypeptide (e.g. SEQ ID NO: 184), a Tet1 polypeptide (e.g.
SEQ ID NO: 272), and a SDG2 histone methyltransferase polypeptide (e.g. SEQ ID NO: 288).
TRBIP1 Polypeptides
[0119] Certain aspects of the present disclosure relate to TRBIP1 polypeptides. Recombinant TRBIP1 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in reducing the expression of a target nucleic acid, such as a gene, in plants.
[0120] TRBIP1 proteins are known in the art. TRB Interacting Protein 1 (TRBIP1; AT4G35510) interacts with TRB proteins. Additionally, TRBIP1 proteins are annotated as PHD finger-like proteins in The Arabidopsis Information Resource (TAIR) database. However, the endogenous function of TRBIP1 proteins has not been elucidated.
[0121] In some embodiments, a TRBIP1 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length TRBIP1 polypeptide. In some embodiments, TRBIP1 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide.
In some embodiments, TRBIP1 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide. In some embodiments, TRBIP1 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length TRBIP1 polypeptide.
[0122] Suitable TRBIP1 polypeptides may be identified from monocot and dicot plants. Examples of suitable TRBIP1 polypeptides may include, for example, those listed in Table 1, homologs thereof, and orthologs thereof.
[0123] Table 1: TRBIP1 Polypeptides
Figure imgf000034_0001
[0124] In some embodiments, a TRBIP1 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of a TRBIP1 polypeptide described herein, such as, for example, a TRBIP1 polypeptide described in Table 1, including, for example, the polypeptide encoded by Arabidopsis thaliana NP_195276.3 (SEQ ID NO: 1).
Non-Genetic Modifier Polypeptides
[0125] Certain aspects of the present disclosure relate to non-genetic modifier polypeptides that are capable of being targeted to a target nucleic acid. Non-genetic modifier polypeptides as described herein generally refer to polypeptides that can be targeted to a nucleic acid but that do not modify the nucleic acid in any of the ways described above for genetic modifier polypeptides. For instance, a non-genetic modifier polypeptide may be used as a visual marker, such as, for instance, a green fluorescent protein (GFP). The nucleic acid may be any type of nucleic acid of any length, including but not limited to DNA or RNA; single- or double-stranded nucleic acids; linear or circular nucleic acids; chromosomal or extra-chromosomal nucleic acids; nuclear, cytoplasmic, or organellar nucleic acids.
α (Alpha)-Crystalline Domain Polypeptides
[0126] Certain aspects of the present disclosure relate to molecular chaperone polypeptides. Molecular chaperone polypeptides are highly conserved across species from bacteria to humans to plants.
One such chaperone family of polypeptides is the α-crystalline domain (ACD) containing polypeptides, which are often associated with small heat shock polypeptides (sHSPs). ACD containing proteins target the polypeptides containing them to perform a variety of functions including regulation of aggregation, oligomeric species formation, holdase functions, and sequestering of proteins into specific compartments (see, e.g., M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology. 427, 1537–1548 (2015); F. McLoughlin, E. Basha, M. E. Fowler, M. Kim, J. Bordowitz, S. Katiyar-Agarwal, E. Vierling, Class I and II Small Heat Shock Proteins Together with HSP101 Protect Protein Translation Factors during Heat Stress. Plant Physiol. 172, 1221–1236 (2016); Z. Liu, S. Zhang, J. Gu, Y. Tong, Y. Li, X. Gui, H. Long, C. Wang, C. Zhao, J. Lu, L. He, Y. Li, Z. Liu, D. Li, C. Liu, Hsp27 chaperones FUS phase separation under the modulation of stress-induced phosphorylation. Nat Struct Mol Biol. 27, 363–372 (2020); and J. M. Webster, A. L. Darling, V. N. Uversky, L. J. Blair, Small Heat Shock Proteins, Big Impact on Protein Aggregation in Neurodegenerative Disease. Front Pharmacol. 10 (2019), doi:10.3389/fphar.2019.01047). α-crystalline domain (ACD) polypeptides, and α-crystalline domain-containing polypeptides, as described herein, are polypeptides that contain an α-crystalline domain.
[0127] ACDs evolved to be efficient, ATP-independent chaperones with the purpose of utilizing oligomerization capacity to maintain the efficacy of cellular processes (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)).
[0128] Certain aspects of the present disclosure relate to small heat shock polypeptides (referred to herein as small HSPs or sHSPs).
sHSPs generally refer to ATP-independent molecular chaperone proteins that contain an α-crystalline domain (ACD) and form ensembles of differently-sized oligomeric species (Haslbeck M, et al. Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 2019 Feb 8;294(6):2121-2132. doi: 10.1074/jbc.REV118.002809. Epub 2018 Oct 31. PMID: 30385502; PMCID: PMC63692950).
[0129] ACDs form the general structure of a β-sandwich composed of anti-parallel 3 and 4 beta sheets (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)). Often, proteins containing ACDs will also contain a variable N-terminus as well as a short C-terminal region. ACDs function to form the primary building block of sHSP oligomers through formation of dimer interfaces between the β6+7 strands (i.e. human sHSPs) and/or the incorporation of the β6 strand directly into the β-sheets of the partner ACD proteins (i.e. plant sHSPs) (E. R. Waters, E. Vierling, Plant small heat shock proteins – evolutionary and functional diversity. New Phytologist. 227, 24–37 (2020)). The N- and C-terminal regions can also allow for further oligomeric formation through direct interactions with the ACD of opposite homodimers or heterodimers (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)). In these ways, ACD proteins of various species allow for the formation of ordered, oligomeric species using at minimum the highly conserved ACD domain, augmented by the more variable N- and C-terminal regions.
[0130] While the general structure of the β-sheets in the α-crystalline domain of ACDs and sHSPs is generally conserved, there is not necessarily exact sequence conservation between diverse species.
Most of the conservation is related to hydrophobic or charged regions of the polypeptides, which does not necessarily produce conservation in exact sequences of amino acids (as described in, for example, Narberhaus F. Alpha-crystallin-type heat shock proteins: socializing minichaperones in the context of a multichaperone network. Microbiol Mol Biol Rev. 2002 Mar;66(1):64-93; table of contents. doi: 10.1128/MMBR.66.1.64-93.2002. PMID: 11875128; PMCID: PMC120782).
[0131] α-crystalline domain polypeptides for use in the methods and compositions of the present disclosure may contain certain amino acid sequences at one or more positions corresponding to positions of the amino acid sequence of ACD15. Such α-crystalline domain polypeptides may contain one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to one or more of the amino acids shown in one or more of the consensus sequences compared to ACD15 shown in FIGS. 30A-30B, one or more of the amino acids shown in one or more of the consensus sequences compared to ACD21 shown in FIG. 30D, and/or one or more of the amino acids shown in one or more of the consensus sequences compared to ACD15 and ACD21 shown in FIGS. 31-33. For example, such α-crystalline domain polypeptides may contain one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to the amino acids: D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), and/or G (127), shown in bold in the ACD15 sequence below (SEQ ID NO: 11):
MNAENNQTTTTHSKVISHVFCTGTAKLGSVGPPIGLVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKLGVRIPTS.
[0132] The α-crystalline domain of the full-length ACD15 polypeptide sequence above has the following sequence (with corresponding bolded amino acids from the full-length sequence reproduced below):
LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL (SEQ ID NO: 12).
Polypeptides that contain this sequence, conservative substitutions of the amino acids thereof, or analogous amino acids corresponding thereof, may also be used in the methods and compositions of the present disclosure.
[0133] In some embodiments, an α-crystalline domain polypeptide of the present disclosure may contain at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of the α-crystalline domain from ACD15 (SEQ ID NO: 12).
[0134] In some embodiments, an α-crystalline domain polypeptide of the present disclosure contains at least 10 consecutive amino acids, at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, or at least 60 consecutive amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, of the α-crystalline domain from ACD15 (SEQ ID NO: 12).
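The residue numbering used above (D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), G (127)) and the placement of the α-crystalline domain within full-length ACD15 can be checked directly against the two sequences as printed. The following is a minimal sketch, assuming 1-based numbering from the initiator methionine; the helper names are hypothetical, and percent_identity is a deliberately simple position-by-position comparison of two equal-length, already-aligned sequences, not the alignment method used in the disclosure (which is not specified in this section).

```python
# Full-length ACD15 (SEQ ID NO: 11) and its alpha-crystalline domain
# (SEQ ID NO: 12), copied from the text with line-wrap spaces removed.
ACD15 = (
    "MNAENNQTTTTHSKVISHVFCTGTAKLGSVGPPIGLVDIGVSEVAYIFRVSLPGIEKN"
    "QDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPR"
    "LFSPNFRSDGIFEVVVVKLGVRIPTS"
)
ACD15_DOMAIN = (
    "LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQ"
    "VQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL"
)

# Start position (1-based) -> residues named in the text.
KEY_RESIDUES = {38: "D", 52: "LPG", 70: "G", 76: "G",
                104: "F", 110: "LP", 127: "G"}


def residues_at(seq, start, length):
    """Return `length` residues beginning at 1-based position `start`."""
    return seq[start - 1:start - 1 + length]


def domain_span(full_length, domain):
    """Return the 1-based inclusive (start, end) of `domain` in `full_length`."""
    index = full_length.find(domain)
    if index < 0:
        raise ValueError("domain not found in full-length sequence")
    return index + 1, index + len(domain)


def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions between two equal-length sequences.

    Simplifying assumption: inputs are already aligned and gap-free.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    for start, expected in KEY_RESIDUES.items():
        found = residues_at(ACD15, start, len(expected))
        print(start, expected, found, found == expected)
    print("domain span:", domain_span(ACD15, ACD15_DOMAIN))
    print("self identity:", percent_identity(ACD15_DOMAIN, ACD15_DOMAIN))
```

For divergent homologs, a gapped pairwise alignment would be needed before computing identity; the equal-length comparison here only illustrates how the percentage thresholds in paragraph [0133] are evaluated once an alignment exists.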
[0135] In some embodiments, an α-crystalline domain polypeptide of the present disclosure contains one or more of the amino acids, conservative substitutions thereof, or analogous amino acids corresponding thereof, corresponding to the amino acids bolded in the α-crystalline domain from ACD15 below:
LVDIGVSEVAYIFRVSLPGIEKNQDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPRLFSPNFRSDGIFEVVVVKL (SEQ ID NO: 12).
[0136] Polypeptides that are homologs of ACD15 or ACD21 may include polypeptides having various amino acid additions, deletions, or substitutions relative to the amino acid sequences of ACD15 or ACD21. In some embodiments, polypeptides that are homologs of ACD15 or ACD21 contain non-conservative changes of certain amino acids relative to ACD15 or ACD21. In some embodiments, polypeptides that are homologs of ACD15 or ACD21 contain conservative changes of certain amino acids relative to ACD15 or ACD21, and thus may be referred to as conservatively modified variants. A conservatively modified variant may include individual substitutions, deletions or additions to a polypeptide sequence which result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure. The following eight groups contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
A modification of an amino acid to produce a chemically similar amino acid may be referred to as an analogous amino acid.
[0137] Polypeptides that are homologs of ACD15 or ACD21 may contain the same amino acid or a conservative amino acid substitution at one or more positions corresponding to positions of the amino acid sequence of ACD15 or ACD21. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to D (38) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at positions corresponding to LPG (52-54) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (70) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (76) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to F (104) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at positions corresponding to LP (110-111) of ACD15. In some embodiments, the homolog contains the same or a conservative amino acid substitution at a position corresponding to G (127) of ACD15. The homolog may contain various combinations (e.g. at least two, at least three, at least four, at least five, or at least six) of one or more of the same amino acid or conservative amino acid substitutions at a position or positions corresponding to D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), and/or G (127) of ACD15. In some embodiments, the homolog contains the same amino acid or conservative amino acid substitutions at a position corresponding to all of D (38), LPG (52-54), G (70), G (76), F (104), LP (110-111), and G (127) of ACD15.
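One way to picture "positions corresponding to" a given ACD15 residue is to walk a pairwise alignment of ACD15 and a candidate homolog and inspect the homolog residue aligned to each named position. The sketch below is an assumption-laden illustration, not the procedure of the disclosure: the alignment is supplied by the caller as two equal-length strings with "-" for gaps, the function names are hypothetical, and the example homolog is invented (ACD15 carrying a D38E and a G70W substitution) purely to show one conservative and one non-conservative outcome.

```python
# ACD15 full-length sequence as given in the text (SEQ ID NO: 11).
ACD15 = (
    "MNAENNQTTTTHSKVISHVFCTGTAKLGSVGPPIGLVDIGVSEVAYIFRVSLPGIEKN"
    "QDKIKCEIQREGRVCIQGVIPEIAIPSDTGCLYRMQVQQLCPPGPFSITFNLPGQVDPR"
    "LFSPNFRSDGIFEVVVVKLGVRIPTS"
)
# 1-based positions named in the text:
# D38, LPG52-54, G70, G76, F104, LP110-111, G127.
KEY_POSITIONS = [38, 52, 53, 54, 70, 76, 104, 110, 111, 127]
# The eight conservative substitution groups listed in the text.
GROUPS = [set(g) for g in ("AG", "DE", "NQ", "RK", "ILMV", "FYW", "ST", "CM")]


def same_or_conservative(ref, other):
    """True if residues are identical or share a listed conservative group."""
    return ref == other or any(ref in g and other in g for g in GROUPS)


def classify_key_positions(aligned_ref, aligned_homolog):
    """Map each key reference position to True/False for the homolog.

    Both inputs are rows of one pairwise alignment ('-' marks a gap).
    A gap in the homolog at a key position counts as False.
    """
    if len(aligned_ref) != len(aligned_homolog):
        raise ValueError("alignment rows must have equal length")
    result = {}
    ref_pos = 0
    for ref_res, hom_res in zip(aligned_ref, aligned_homolog):
        if ref_res == "-":
            continue  # insertion in the homolog; no reference position used
        ref_pos += 1
        if ref_pos in KEY_POSITIONS:
            result[ref_pos] = (hom_res != "-"
                               and same_or_conservative(ref_res, hom_res))
    return result


# Hypothetical homolog, invented for illustration only:
# D38E (conservative) and G70W (non-conservative).
homolog = ACD15[:37] + "E" + ACD15[38:69] + "W" + ACD15[70:]
report = classify_key_positions(ACD15, homolog)

if __name__ == "__main__":
    for position in KEY_POSITIONS:
        print(position, ACD15[position - 1], report[position])
```

Counting the True entries in the report gives the number of named positions that are retained or conservatively substituted, which is the quantity the "at least two, at least three, ..." combinations in paragraph [0137] refer to.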
[0138] Some small HSPs are known to form dynamic oligomeric assemblies (M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology 427, 1537–1548 (2015)). sHSPs are found across all kingdoms of life and generally contain an α-crystalline domain (ACD).
[0139] Examples of α-crystalline domain polypeptides, such as, e.g., sHSP polypeptides, include, but are not limited to, the plant proteins ACD15 (e.g., SEQ ID NO: 11) and ACD21 (e.g., SEQ ID NO: 13) and homologs and orthologs thereof; the plant sHSPs At1g52560 (e.g., SEQ ID NO: 14), At4g27670 (e.g., SEQ ID NO: 15), At5g51440 (e.g., SEQ ID NO: 16), and At4g25200 (e.g., SEQ ID NO: 17) and homologs and orthologs thereof; the human sHSPs HSPB1 (e.g., SEQ ID NO: 18), HSPB2 (e.g., SEQ ID NO: 19), HSPB3 (e.g., SEQ ID NO: 20), HSPB4 (e.g., SEQ ID NO: 21), HSPB5 (e.g., SEQ ID NO: 22), HSPB6 (e.g., SEQ ID NO: 23), HSPB7 (e.g., SEQ ID NO: 24), HSPB8 (e.g., SEQ ID NO: 25), HSPB9 (e.g., SEQ ID NO: 26), HSPB10 (e.g., SEQ ID NO: 27) and homologs and orthologs thereof; the Drosophila melanogaster sHSP HSP22 (e.g., SEQ ID NO: 28); the Saccharomyces cerevisiae sHSP HSP26 (e.g., SEQ ID NO: 29); the Cyanidioschyzon merolae sHSP M1URI8 (e.g., SEQ ID NO: 30); the Chlamydomonas reinhardtii sHSP P12811 (e.g., SEQ ID NO: 31); the Deinococcus radiodurans sHSP HSP17.7 (e.g., SEQ ID NO: 32); and the Saccharolobus solfataricus sHSP Hsp14 (e.g., SEQ ID NO: 33); exemplary sequences of which are provided herein in Tables XA, XB, XC, XD, XE, XF, and XG.
For example, an α-crystalline domain polypeptide of the present disclosure may have an amino acid or DNA sequence (as applicable) with at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one or more of SEQ ID NOs: 11-120, 310, 401-409, 413-414, 416-417, 419-420, 422-423, 425-426, 428-429, 431-432, 447, and 472-476.
[0140] In certain aspects, α-crystalline domain polypeptides may be modified in view of a particular intended application. For example, the Arabidopsis sHSPs At1g52560, At4g27670, At5g51440, and At4g25200 contain organelle targeting peptides that target them to the mitochondria or chloroplasts. Such targeting peptides may be modified and/or removed and replaced with, e.g., nuclear localization signals to localize these polypeptides to the nucleus.
ACD15 Polypeptides
[0141] Certain aspects of the present disclosure relate to ACD15 polypeptides. ACD15 is an α-crystalline domain-containing protein from Arabidopsis. ACD15 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0142] ACD15 polypeptides are known in the art. As described herein, Arabidopsis ACD15 (along with ACD21) drives accumulation of MBD5/6 complex silencing assemblies at methyl-CG sites.
[0143] In some embodiments, an ACD15 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, or 142 or more consecutive amino acids of an endogenous or wild-type full-length ACD15 polypeptide (e.g., SEQ ID NO: 11). In some embodiments, ACD15 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide. In some embodiments, ACD15 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide. In some embodiments, ACD15 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length ACD15 polypeptide.
[0144] Suitable ACD15 polypeptides may be identified from monocot and dicot plants. Examples of suitable ACD15 polypeptides may include, for example, At1g76440 from Arabidopsis (SEQ ID NO: 11), as well as those listed in Table 2 and/or Table 3, and homologs and orthologs thereof.
[0145] Table 2: ACD15 Close Plant Homologs
Figure imgf000041_0001
[0146] Table 3: ACD15 Orthologs
Figure imgf000042_0001
[0147] In some embodiments, an ACD15 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the ACD15 polypeptides described herein, such as, for example, an ACD15 polypeptide described in Table 2 or Table 3, and/or the protein encoded by Arabidopsis thaliana At1g76440 (SEQ ID NO: 11).
ACD21 Polypeptides
[0148] Certain aspects of the present disclosure relate to ACD21 polypeptides, such as, e.g., SEQ ID NO: 13. ACD21 is an α-crystalline domain-containing protein from Arabidopsis. ACD21 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0149] ACD21 polypeptides are known in the art. As described herein, Arabidopsis ACD21 (along with ACD15) drives accumulation of MBD5/6 complex silencing assemblies at methyl-CG sites.
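The percent-identity thresholds recited for ACD15 in paragraph [0147], and repeated for the other polypeptides in this section, can be illustrated with a short Python sketch. The disclosure does not state here how identity is computed, so the definition used below (identical residues divided by aligned columns, for two sequences that have already been aligned and padded with "-") is an assumption, as are the function names.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, gap-padded sequences.

    Assumption: identity = identical residue columns / aligned columns * 100,
    where columns that are gaps in both sequences are ignored. Other
    conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches a recited threshold such as 90 (percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, under this definition two aligned four-residue stretches that differ at one position score 75%, which would satisfy an "at least about 70%" threshold but not an "at least about 80%" threshold.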
[0150] In some embodiments, an ACD21 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, or 205 or more consecutive amino acids of an endogenous or wild-type full-length ACD21 polypeptide (e.g., SEQ ID NO: 13). In some embodiments, ACD21 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide. In some embodiments, ACD21 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide. In some embodiments, ACD21 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length ACD21 polypeptide.
[0151] Suitable ACD21 polypeptides may be identified from monocot and dicot plants. Examples of suitable ACD21 polypeptides may include, for example, At1g54850 from Arabidopsis (SEQ ID NO: 13), as well as those listed in Table 4 and/or Table 5, and homologs and orthologs thereof.
[0152] Table 4: ACD21 Close Plant Homologs
Figure imgf000043_0001
Figure imgf000044_0001
[0153] Table 5: ACD21 Orthologs
Figure imgf000044_0002
[0154] In some embodiments, an ACD21 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the ACD21 polypeptides described herein, such as, for example, an ACD21 polypeptide described in Table 4 or Table 5, and/or the protein encoded by Arabidopsis thaliana At1g54850 (SEQ ID NO: 13).
HSPB1 Polypeptides
[0155] Certain aspects of the present disclosure relate to HSPB1 polypeptides (also referred to as hHSP1 polypeptides). HSPB1 (SEQ ID NO: 18) is a sHSP from Homo sapiens. HSPB1 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0156] In some embodiments, an HSPB1 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB1 polypeptide (e.g., SEQ ID NO: 18). In some embodiments, HSPB1 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide. In some embodiments, HSPB1 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide. In some embodiments, HSPB1 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB1 polypeptide.
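The "at least N consecutive amino acids" language used in paragraph [0156] and in the parallel paragraphs of this section can be read as a longest-shared-stretch test. The following Python sketch is one way to evaluate it; the function names and the toy sequences in the usage note are hypothetical and are not sequences from the disclosure.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of consecutive residues shared by both sequences.

    Classic longest-common-substring dynamic programme, kept to two rows.
    """
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # Extend the run that ended at the previous residue of each sequence.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def contains_consecutive(candidate: str, reference: str, n: int) -> bool:
    """True if candidate contains at least n consecutive residues of reference."""
    return longest_shared_run(candidate, reference) >= n
```

With the toy inputs `"XXLPGQVDYY"` and `"AALPGQVDBB"`, the longest shared stretch is the six residues `LPGQVD`, so a threshold of 5 is met and a threshold of 7 is not.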
[0157] In some embodiments, an HSPB1 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the HSPB1 polypeptides described herein, such as, for example, the HSPB1 protein represented by Uniprot ID P04792, Gene ID No. ENSG00000106211 from H. sapiens (SEQ ID NO: 18).
HSPB3 Polypeptides
[0158] Certain aspects of the present disclosure relate to HSPB3 polypeptides (also referred to as hHSP3 polypeptides). HSPB3 (SEQ ID NO: 20) is a sHSP from Homo sapiens. HSPB3 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0159] In some embodiments, an HSPB3 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB3 polypeptide (e.g., SEQ ID NO: 20). In some embodiments, HSPB3 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide. In some embodiments, HSPB3 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide. In some embodiments, HSPB3 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB3 polypeptide.
[0160] In some embodiments, an HSPB3 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the HSPB3 polypeptides described herein, such as, for example, the HSPB3 protein represented by Uniprot ID Q12988, Gene ID No. ENSG00000169271.3 from H. sapiens (SEQ ID NO: 20).
HSPB5 Polypeptides
[0161] Certain aspects of the present disclosure relate to HSPB5 polypeptides (also referred to as hHSP5 polypeptides). HSPB5 (SEQ ID NO: 22) is a sHSP from Homo sapiens. HSPB5 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0162] In some embodiments, an HSPB5 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB5 polypeptide (e.g., SEQ ID NO: 22). In some embodiments, HSPB5 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide. In some embodiments, HSPB5 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide (e.g., SEQ ID NO: 22). In some embodiments, HSPB5 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB5 polypeptide.
[0163] In some embodiments, an HSPB5 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the HSPB5 polypeptides described herein, such as, for example, the HSPB5 protein represented by Uniprot ID P02511, Gene ID No. ENSG00000263007 from H. sapiens (SEQ ID NO: 22).
HSPB8 Polypeptides
[0164] Certain aspects of the present disclosure relate to HSPB8 polypeptides (also referred to as hHSP8 polypeptides). HSPB8 (SEQ ID NO: 25) is a sHSP from Homo sapiens. HSPB8 polypeptides of the present disclosure may be capable of being targeted to a specific nucleic acid sequence on a target nucleic acid and may be used in modifying the expression of a target nucleic acid, such as a gene.
[0165] In some embodiments, an HSPB8 polypeptide contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length HSPB8 polypeptide (e.g., SEQ ID NO: 25). In some embodiments, HSPB8 polypeptides include sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide. In some embodiments, HSPB8 polypeptides may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide (e.g., SEQ ID NO: 25). In some embodiments, HSPB8 polypeptides may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length HSPB8 polypeptide.
[0166] In some embodiments, an HSPB8 polypeptide of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the HSPB8 polypeptides described herein, such as, for example, the HSPB8 protein represented by Uniprot ID Q9UJY1, Gene ID No. ENSG00000152137 from H. sapiens (SEQ ID NO: 25).
StykC (STKYC) domains
[0167] In some embodiments, one or more α-crystalline domain polypeptides (e.g., sHSPs) are not recombinant. In embodiments involving a non-recombinant α-crystalline domain polypeptide, a StykC (STKYC) domain may be used to mediate co-targeting of a polypeptide of interest and an α-crystalline domain polypeptide. Thus, certain aspects of the present disclosure are related to recombinant polypeptides that contain a heterologous StykC domain.
[0168] In some embodiments, targeting α-crystalline domain polypeptides includes targeting of a polypeptide with a StykC domain (also referred to herein as a “Sticky-C”, “Sticky C”, “StkyC”, “STKYC”, and/or “STC” domain). StykC domains may include, for example, a conserved domain of MBD6 or MBD7 that recruits an α-crystalline domain polypeptide such as, for example, the ACD15 and ACD21 proteins. Accordingly, polypeptides of the present disclosure may contain a StykC domain.
For example, in some embodiments, recruitment of α-crystalline domain polypeptides (such as, for example, ACD15 and ACD21) to a StykC domain that is present in at least one to at least ten copies relative to each dCas9 protein in a SunTag system results in the establishment of a nucleation site for the aggregation of other α-crystalline domain proteins that may or may not be recombinant and are present diffusely around the nucleic acid target.
[0169] In some embodiments, a StykC domain contains at least 20 consecutive amino acids, at least 30 consecutive amino acids, at least 40 consecutive amino acids, at least 50 consecutive amino acids, at least 60 consecutive amino acids, at least 70 consecutive amino acids, at least 80 consecutive amino acids, at least 90 consecutive amino acids, at least 100 consecutive amino acids, at least 120 consecutive amino acids, at least 140 consecutive amino acids, at least 160 consecutive amino acids, at least 180 consecutive amino acids, at least 200 consecutive amino acids, at least 220 consecutive amino acids, at least 240 consecutive amino acids, or 241 or more consecutive amino acids of an endogenous or wild-type full-length StykC domain. In some embodiments, a StykC domain includes sequences with one or more amino acids removed from the consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain. In some embodiments, a StykC domain may include sequences with one or more amino acids replaced/substituted with an amino acid different from an endogenous or wild-type amino acid present at a given amino acid position in a consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain. In some embodiments, StykC domains may include sequences with one or more amino acids added to an otherwise consecutive amino acid sequence of an endogenous or wild-type full-length StykC domain.
[0170] In some embodiments, a StykC domain of the present disclosure has an amino acid sequence with at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% amino acid identity to the amino acid sequence of any one of the StykC domains described herein, such as, for example, the StykC domain sequence provided in SEQ ID NO: 182.
[0171] In some embodiments, the StykC domain may be derived from an MBD6 polypeptide. Exemplary StykC domains from MBD6 polypeptides are illustrated below in Table 6.
[0172] Table 6 – StykC homologies from MBD6 polypeptides. See Table XH for associated SEQ ID NOs.
Figure imgf000051_0001
[0173] In some embodiments, the StykC domain may be derived from an MBD7 polypeptide. Exemplary StykC domains from MBD7 polypeptides are illustrated below in Table 7. [0174] Table 7 – StykC homologies from MBD7 polypeptides. See Table XI for associated SEQ ID NOs.
Figure imgf000051_0002
Figure imgf000052_0001
Co-Targeting
[0175] Certain aspects of the present disclosure relate to co-targeting a target nucleic acid with 1) one or more of a genetic modifier polypeptide (including but not limited to, for example, a TRBIP1 polypeptide and/or an MQ1 polypeptide); and 2) one or more of an α-crystalline domain polypeptide. Certain aspects of the present disclosure relate to co-targeting a target nucleic acid with 1) one or more of a non-genetic modifier polypeptide (including but not limited to, for example, a fluorescent marker polypeptide, such as a GFP); and 2) one or more of an α-crystalline domain polypeptide.
[0176] Co-targeting a target nucleic acid with 1) one or more of a genetic modifier polypeptide (including but not limited to, for example, a TRBIP1 polypeptide and/or an MQ1 polypeptide); and 2) one or more of an α-crystalline domain polypeptide may result in modification of the nucleic acid. For instance, if the genetic modifier polypeptide is a transcriptional repressor polypeptide, the co-targeting may result in reduced expression of the target nucleic acid. As an additional example, if the genetic modifier polypeptide is a DNA methyltransferase polypeptide, the co-targeting may result in increased efficiency of methylation. In some embodiments, more than one type of genetic modifier polypeptide is co-targeted with one or more α-crystalline domain polypeptides. For example, in some embodiments, one or more of a transcriptional repressor polypeptide (e.g., a TRBIP1 polypeptide) and one or more of a DNA methyltransferase polypeptide (e.g., an MQ1 polypeptide) are co-targeted along with one or more α-crystalline domain polypeptides to a target nucleic acid.
[0177] In some embodiments involving, for example, co-targeting with one or more genetic modifier polypeptides, wherein the one or more genetic modifier polypeptides comprise a methylase, the target nucleic acid may experience an increase in DNA methylation of about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 125%, about 150%, about 175%, about 200%, about 250%, or about 300% or more as compared to a corresponding control (e.g., a nucleic acid targeted with only α-crystalline domain polypeptides as described herein). In some embodiments involving, for example, co-targeting with one or more genetic modifier polypeptides, wherein the one or more genetic modifier polypeptides comprise a demethylase, the target nucleic acid may experience a decrease in DNA methylation of about 1%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 125%, about 150%, about 175%, about 200%, about 250%, or about 300% or more as compared to a corresponding control (e.g., a nucleic acid targeted with only α-crystalline domain polypeptides as described herein).
[0178] In embodiments involving, for example, co-targeting with one or more genetic modifier polypeptides, wherein the one or more genetic modifier polypeptides comprise one or more transcriptional repressor polypeptides (e.g., a TRBIP1 polypeptide) and/or a DNA methyltransferase polypeptide (e.g., an MQ1 polypeptide), the target nucleic acid may experience a decrease in transcriptional expression of about 1%, about 2%, about 3%, about 4%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100% as compared to a corresponding control (e.g., a nucleic acid targeted with only α-crystalline domain polypeptides as described herein).
[0179] In some embodiments, the co-targeting results in the formation of a polypeptide aggregate or “cloud” of concentrated polypeptides around the target nucleic acid. In some embodiments, the concentrated polypeptides extend no more than about 1 kb on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend no more than about 500 bp, 400 bp, 300 bp, 200 bp, or 100 bp on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend more than about 1 kb on either side of the target nucleic acid. In some embodiments, the concentrated polypeptides extend more than about 2 kb, 3 kb, 4 kb, 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, 10 kb, or more than 10 kb on either side of the target nucleic acid. In some embodiments, the size of the polypeptide aggregate depends on the amount of α-crystalline domain polypeptides available. In some embodiments, the expression level of the one or more co-targeted α-crystalline domain polypeptides is tuned (increased or decreased) in order to modulate the dimensions of the concentrated polypeptides around the target nucleic acid.
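The "on either side of the target nucleic acid" extents recited in paragraph [0179] amount to a flanking-window test on genomic coordinates. A minimal Python sketch follows; the 0-based, half-open coordinate convention, the function names, and the numeric coordinates in the usage note are all assumptions made for illustration.

```python
def flanking_window(target_start: int, target_end: int, flank_bp: int) -> tuple[int, int]:
    """Return the half-open window [start, end) covering the target plus flank_bp per side.

    Assumption: coordinates are 0-based and half-open; the window start is
    clamped at 0 so it never runs off the beginning of the sequence.
    """
    if target_end <= target_start:
        raise ValueError("target_end must be greater than target_start")
    if flank_bp < 0:
        raise ValueError("flank_bp must be non-negative")
    return max(0, target_start - flank_bp), target_end + flank_bp


def within_window(position: int, target_start: int, target_end: int, flank_bp: int) -> bool:
    """True if position lies no more than flank_bp on either side of the target."""
    start, end = flanking_window(target_start, target_end, flank_bp)
    return start <= position < end
```

For a hypothetical target spanning positions 2000 to 2100, a 1 kb flank gives the window 1000 to 3100, so position 1500 falls inside the window and position 900 falls outside it.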
[0180] Thus, in some embodiments, a genetic modifier polypeptide being targeted “to” the target nucleic acid and/or an α-crystalline domain polypeptide being targeted “to” the target nucleic acid may include the genetic modifier polypeptide and/or the α-crystalline domain polypeptide having activity no more than about 1 kb on either side of the target nucleic acid sequence, including, for example, no more than about 500 bp, 400 bp, 300 bp, 200 bp, 100 bp, 75 bp, 50 bp, 25 bp, 10 bp, 5 bp, 1 bp or fewer on either side of the target nucleic acid sequence. In some embodiments, the activity of the genetic modifier polypeptide is limited to the target nucleic acid sequence itself. Alternatively, in some embodiments, a genetic modifier polypeptide being targeted “to” the target nucleic acid and/or an α-crystalline domain polypeptide being targeted “to” the target nucleic acid may include the genetic modifier polypeptide and/or the α-crystalline domain polypeptide having activity more than about 1 kb on either side of the target nucleic acid sequence, such as, for example, more than about 2 kb, 3 kb, 4 kb, 5 kb, 6 kb, 7 kb, 8 kb, 9 kb, 10 kb, or more than 10 kb on either side of the target nucleic acid.
[0181] The polypeptides described herein can form different clustering patterns when expressed in cells, which can be observed in some instances by, for example, fluorescent microscopy of fluorescently tagged versions of the polypeptides (such as, e.g., fluorescently tagged α-crystalline domain polypeptides targeted to a target nucleic acid), immunogold microscopy against, for example, α-crystalline domain polypeptides, or other methods of observing selective localization of polypeptides of interest within a cell.
[0182] Such nuclear bodies may form (and, in some instances, be visible by microscopy), comprising aggregates of, for example, α-crystalline domain polypeptides targeted to a target nucleic acid as described herein, and in some instances, further comprising additional types of polypeptides, such as, for example, one or more genetic modifier polypeptides. In some embodiments, one or more α-crystalline domain polypeptides that aggregate into a nuclear body are derived from one or more species selected from the group consisting of: Homo sapiens, Arabidopsis thaliana, Chlamydomonas reinhardtii, Saccharolobus solfataricus, Zea mays, Solanum tuberosum, Solanum lycopersicum, Oryza sativa, and Deinococcus radiodurans.
[0183] Such nuclear bodies may take a variety of different forms. For example, in some instances, α-crystalline domain polypeptide-containing aggregates may form foci, which may or may not be clearly visible by microscopy and may vary in number and size per nucleus. For example, in some embodiments, α-crystalline domain polypeptide-containing aggregates may form many nuclear foci, while in other embodiments, α-crystalline domain polypeptide-containing aggregates may form few or no apparent nuclear foci. In some embodiments, the number of foci per nucleus may be relatively consistent across a sample (for example, across a population of cells collected from the same organism at the same time), while in other embodiments, the number of foci per nucleus may vary cell to cell across a given sample. In some embodiments, some or all foci in a nucleus may be relatively large in diameter (e.g., ~0.5-1, 1-2, or >2 microns in diameter); in other embodiments, some or all foci in a nucleus are relatively small in diameter (e.g., less than about 0.5 microns in diameter).
In some embodiments, one or a plurality of foci may have a relatively strong signal compared to the background signal when assessed microscopically (for example, in the case of fluorescently tagged α-crystalline domain polypeptides, a relatively bright signal over background), while in other embodiments, one or a plurality of foci may have a relatively dim signal compared to the background signal, even to the point of not being effectively distinguishable from background. The number of foci per cell may vary from, for example, none to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11-15, 16-20, 21-30, 31-40, 41-50, or more than 50 per cell. In some embodiments, there are about 1, 2, or 3 foci per cell. In some embodiments, there are no observable foci per cell. In some embodiments, a cell or nucleus may demonstrate relatively strong diffuse localization signals (i.e., non-foci), indicating ACD polypeptides expressed at relatively high levels but with relatively low levels of aggregation. In some embodiments, one or more foci are observed in a cell nucleus. In some embodiments, one or more foci are observed outside the nucleus, such as, for example, in an organelle or in the cytoplasm.
[0184] In some embodiments, relatively large foci and/or foci with relatively strong localization signals represent relatively high levels of ACD polypeptide multimerization. In some embodiments, relatively small foci and/or foci with relatively weak localization signals represent relatively low levels of ACD polypeptide multimerization.
[0185] In some embodiments, foci represent aggregation around a particular genetic feature, such as, for example, chromocenters. In some embodiments, foci represent aggregation around a target nucleic acid sequence.
In some embodiments, the size, number, distribution, and/or intensity of foci in a sample may be tuned to, for example, drive more or less multimerization of ACD polypeptides and/or increase, decrease, or otherwise modulate (e.g., concentrate or disperse) targeting of a genome modification polypeptide.
[0186] In some embodiments, different ACD proteins form different localization patterns (e.g., different patterns of nuclear bodies). For example, ACD proteins from different organisms may form different patterns of nuclear bodies. In some embodiments, ACD proteins from Chlamydomonas reinhardtii, Zea mays, Saccharolobus solfataricus, and/or human HSPB1 form distinct foci, with about 1, 2, 3, 4, or 5 foci per nucleus, and lead to high multimerization and localization of the majority of multimerized proteins to a target. In some embodiments, HSPB4, HSPB5, HSPB8, and ACD proteins from Oryza sativa and Deinococcus radiodurans lead to relatively lower levels of multimerization and form, for example, many smaller clusters. In some embodiments, many smaller clusters could be efficient in, for example, scanning a genome for target sites. In some embodiments, having many smaller foci may increase gene editing efficiency. In some embodiments, having few larger foci may increase gene editing efficiency.
[0187] In some embodiments, one or more foci overlap with chromocenters. In some embodiments, one or more foci partially overlap with chromocenters. In some embodiments, one or more foci do not overlap with chromocenters. In some embodiments, partial overlapping of foci and chromocenters indicates interaction between the ACD proteins and chromocenter complexes, such as, for example, interaction with polypeptides in the chromocenter complexes having sequence homology to an ACD protein (such as, for example, ACD15 or ACD21).
In some embodiments, at least partial overlapping of foci and chromocenters indicates potential for targeting and/or increasing editing efficiency of nucleic acid sequences present in heterochromatin.
Targeting Domains
[0188] Certain aspects of the present disclosure relate to recombinant polypeptides that contain a targeting domain and are capable of being targeted to a target nucleic acid. A targeting domain generally refers to a polypeptide or amino acid sequence that is able to facilitate or is involved in facilitating, either directly or indirectly, targeting of a recombinant polypeptide to a target nucleic acid sequence. For example, the targeting domain may directly confer the specific targeting functionality of the genetic modifier polypeptide or α-crystalline domain polypeptide to the target nucleic acid, or the targeting domain may be associated with or interact with another agent that confers the specific targeting functionality of the genetic modifier polypeptide or α-crystalline domain polypeptide to the target nucleic acid. In some embodiments, the targeting domain may associate with a DNA-binding polypeptide that is able to be targeted to a target nucleic acid. Suitable targeting domains for use in the present disclosure are described herein and will be readily apparent to one of skill in the art.
DNA-Binding Domains
[0189] In some embodiments, the targeting domain is or may include a DNA-binding domain or have DNA-binding activity. In some embodiments, this DNA-binding activity is achieved through a heterologous DNA-binding domain (e.g. binds with a sequence affinity other than that of a DNA-binding domain that may be present in the endogenous protein). In some embodiments, recombinant polypeptides of the present disclosure, including, for example, recombinant genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides, contain a DNA-binding domain.
Recombinant polypeptides of the present disclosure may contain one DNA-binding domain or they may contain more than one DNA-binding domain. Heterologous DNA-binding domains may be recombinantly fused to a genetic modifier polypeptide, an α-crystalline domain polypeptide, and/or a non-genetic modifier polypeptide of the present disclosure such that the polypeptide is then targeted to a specific nucleic acid sequence and can facilitate reduced expression and/or silencing of the specific nucleic acid.
[0190] In some embodiments, the DNA-binding domain is a zinc finger domain. A zinc finger domain generally refers to a DNA-binding protein domain that contains zinc fingers, which are small protein structural motifs that can coordinate one or more zinc ions to help stabilize their protein folding. Zinc fingers were first identified as DNA-binding motifs (Miller et al., 1985), and numerous other variations of them have been characterized. Recent progress has been made that allows the engineering of DNA-binding proteins that specifically recognize any desired DNA sequence. For example, it was shown that a three-finger zinc finger protein could be constructed to block the expression of a human oncogene that was transformed into a mouse cell line (Choo and Klug, 1994).
[0191] Zinc fingers can generally be classified into several different structural families and typically function as interaction modules that bind DNA, RNA, proteins, or small molecules. Suitable zinc finger domains of the present disclosure may contain two, three, four, five, six, seven, eight, or nine zinc fingers. Examples of suitable zinc finger domains may include, for example, Cys2His2 (C2H2) zinc finger domains, C-x8-C-x5-C-x3-H (CCCH) zinc finger domains, multi-cysteine zinc finger domains, and zinc binuclear cluster domains.
[0192] In some embodiments, the DNA-binding domain binds a specific nucleic acid sequence.
For example, the DNA-binding domain may bind a sequence that is at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, or a greater number of nucleotides in length.
[0193] In some embodiments, a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide of the present disclosure further contains two N-terminal CCCH zinc finger domains. In some embodiments, the zinc finger domain is an engineered zinc finger array, such as a C2H2 zinc finger array. Engineered arrays of C2H2 zinc fingers can be used to create DNA-binding proteins capable of targeting desired genomic DNA sequences. Methods of engineering zinc finger arrays are well known in the art, and include, for example, combining smaller zinc fingers of known specificity. An exemplary zinc finger is ZF108, which targets the FWA locus of Arabidopsis and whose amino acid sequence is provided in SEQ ID NO: 486.
[0194] In some embodiments, genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides of the present disclosure may contain a DNA-binding domain other than a zinc finger domain.
Examples of such DNA-binding domains may include, for example, TAL (transcription activator-like) effector targeting domains, helix-turn-helix family DNA-binding domains, basic domains, ribbon-helix-helix domains, TBP (TATA-box binding protein) domains, barrel dimer domains, RHD domains (Rel homology domain), BAH (bromo-adjacent homology) domains, SANT domains, Chromodomains, Tudor domains, Bromodomains, PHD domains (plant homeodomain), WD40 domains, and MBD domains (methyl-CpG-binding domain).
[0195] In some embodiments, the DNA-binding domain is a TAL effector targeting domain. TAL effectors generally refer to secreted bacterial proteins, such as those secreted by Xanthomonas or Ralstonia bacteria when infecting various plant species. Generally, TAL effectors are capable of binding promoter sequences in the host plant, and activate the expression of plant genes that aid in bacterial infection. TAL effectors recognize plant DNA sequences through a central repeat targeting domain that contains a variable number of approximately 34 amino acid repeats. Moreover, TAL effector targeting domains can be engineered to target specific DNA sequences. Methods of modifying TAL effector targeting domains are well known in the art, and described in Bogdanove and Voytas, Science. 2011 Sep 30; 333(6051):1843-6.
[0196] Other DNA-binding domains for use in the methods and compositions of the present disclosure will be readily apparent to one of skill in the art, in view of the present disclosure.
RNA-Guided DNA-Binding Proteins and Systems
[0197] In some embodiments, the targeting domain is or may include an RNA-guided DNA-binding protein. For example, the targeting domain may be an RNA-guided DNA-binding protein (e.g. Cas9, Cas12, etc.) and employ a CRISPR-based targeting system to target a recombinant polypeptide to a target nucleic acid.
[0198] CRISPR systems naturally use small base-pairing guide RNAs to target and cleave foreign DNA elements in a sequence-specific manner (Wiedenheft et al., 2012). There are diverse CRISPR systems in different organisms that may be used to target proteins of the present disclosure to a target nucleic acid. One of the simplest systems is the type II CRISPR system from Streptococcus pyogenes. Only a single gene encoding the CAS9 protein and two RNAs, a mature CRISPR RNA (crRNA) and a partially complementary trans-acting RNA (tracrRNA), are necessary and sufficient for RNA-guided silencing of foreign DNAs (Jinek et al., 2012). Maturation of crRNA requires tracrRNA and RNase III (Deltcheva et al., 2011). However, this requirement can be bypassed by using an engineered small guide RNA (gRNA) containing a designed hairpin that mimics the tracrRNA-crRNA complex (Jinek et al., 2012). Base pairing between the gRNA and target DNA normally causes double-strand breaks (DSBs) due to the endonuclease activity of CAS9.
[0199] It is known that the endonuclease domains of the CAS9 protein can be mutated to create a programmable RNA-dependent DNA-binding protein (dCAS9) (Qi et al., 2013). The fact that duplex gRNA-dCAS9 binds target sequences without endonuclease activity has been used to tether regulatory proteins, such as transcriptional activators or repressors, to promoter regions in order to modify gene expression (Gilbert et al., 2013), and CAS9 transcriptional activators have been used for target specificity screening and paired nickases for cooperative genome engineering (Mali et al., 2013, Nature Biotechnology 31:833-838). Thus, dCAS9 may be used as a modular RNA-guided platform to recruit different proteins to DNA in a highly specific manner. One of skill in the art would recognize other RNA-guided DNA-binding protein/RNA complexes that can be used equivalently to CRISPR-CAS9.
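As an illustration of how a guide RNA might be matched to a target sequence for the S. pyogenes system discussed above, the following minimal sketch scans one strand of a DNA string for candidate 20-nucleotide protospacers followed by a 5'-NGG-3' protospacer-adjacent motif (PAM). The PAM requirement and the 20-nucleotide length are general properties of S. pyogenes Cas9 that are not recited in this section, and the function name `find_protospacers` is hypothetical. The sketch does not assess off-target binding or guide efficiency and scans only the strand given.

```python
import re


def find_protospacers(seq, length=20):
    """Return (start, protospacer) tuples for every `length`-nt stretch on the
    given strand that is immediately followed by an NGG PAM.

    Assumes an S. pyogenes Cas9-style PAM; other Cas proteins use other PAMs.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


# Hypothetical input: a 20-nt stretch followed by the PAM "TGG".
example = "A" * 20 + "TGG"
for start, protospacer in find_protospacers(example):
    print(start, protospacer)
```

A complete design workflow would typically also scan the reverse complement strand and screen candidates against the rest of the genome; those steps are omitted here for brevity.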
[0200] Various CAS proteins suitable for use in the methods and compositions of the present disclosure are known in the art and described herein. In some embodiments, the CAS polypeptide may be a Cas9 polypeptide having an amino acid sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% amino acid identity to the amino acid sequence of dCas9 (SEQ ID NO: 487).
[0201] In some embodiments, the CAS polypeptide may be a CasΦ polypeptide (also known as CasPhi and Cas12J) having an amino acid sequence that has at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% amino acid identity to the amino acid sequence of CasPhi (SEQ ID NO: 488).
[0202] Targeting using CRISPR-based systems may be beneficial over other genome targeting techniques in certain instances. For example, one need only change the guide RNAs in order to target fusion proteins to a new genomic location, or even multiple locations simultaneously. In addition, guide RNAs can be extended to include sites for binding to proteins, such as the MS2 protein, which can be fused to proteins of interest.
Variations of CRISPR-based targeting may also be used herein (e.g. a SunTag system) to facilitate targeting of a recombinant polypeptide to a target nucleic acid, as will be readily apparent to one of skill in the art.
[0203] Suitable CRISPR-based targeting systems and variations thereof are well-known in the art and may be used in the embodiments of the present disclosure in view of the guidance provided herein. For example, WO2018/136783 describes a SunTag-based targeting system for use in plants. WO2018/136783 is incorporated herein by reference in its entirety.
[0204] SunTag-based targeting in the context of the present disclosure may involve the recruitment of multiple copies of a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide to a target nucleic acid in plants via CRISPR-based targeting. In certain aspects, this specific targeting involves the use of a system that includes (1) a nuclease-deficient CAS9 (dCAS9) polypeptide that is recombinantly fused to a multimerized epitope, (2) a genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide that is recombinantly fused to an affinity polypeptide, and (3) a guide RNA (gRNA). In this aspect, the dCAS9 portion of the dCAS9-multimerized epitope fusion protein is involved with targeting a target nucleic acid as directed by the guide RNA. The multimerized epitope portion of the dCAS9-multimerized epitope fusion protein is involved with binding to the affinity polypeptide (which is recombinantly fused to a transcriptional repressor).
The affinity polypeptide portion of the genetic modifier polypeptide-, α-crystalline domain polypeptide-, and/or non-genetic modifier polypeptide-affinity polypeptide fusion protein is involved with binding to the multimerized epitope so that the genetic modifier polypeptide, α-crystalline domain polypeptide, and/or non-genetic modifier polypeptide can be in association with dCAS9. The transcriptional repressor portion of the transcriptional repressor-affinity polypeptide fusion protein is involved with repressing transcription of a target nucleic acid once the complex has been targeted to a target nucleic acid via the guide RNA.
[0205] In some embodiments, targeting α-crystalline domain polypeptides includes targeting of a polypeptide with a StykC domain. StykC domains include, for example, a conserved domain of MBD6 that recruits the ACD15 and ACD21 proteins. For example, in some embodiments, recruitment of α-crystalline domain polypeptides (such as, for example, ACD15 and ACD21) to a StykC domain that is present in at least one to at least ten copies relative to each dCas9 protein in a SunTag system results in the establishment of a nucleation site for the aggregation of other α-crystalline domain proteins that may or may not be recombinant and are present diffusely around the nucleic acid target.
[0206] As described above, certain aspects of the present disclosure involve CRISPR-based targeting of a target nucleic acid, which may involve use of a CRISPR-CAS9 targeting system. CRISPR-CAS9 systems may involve the use of a CRISPR RNA (crRNA), a trans-activating CRISPR RNA (tracrRNA), and a CAS9 protein. The crRNA and tracrRNA aid in directing the CAS9 protein to a target nucleic acid sequence, and these RNA molecules can be specifically engineered to target specific nucleic acid sequences. In particular, certain aspects of the present disclosure involve the use of a single guide RNA (gRNA) that reconstitutes the function of the crRNA and the tracrRNA.
Further, certain aspects of the present disclosure involve a CAS9 protein that does not exhibit DNA cleavage activity (dCAS9). As disclosed herein, gRNA molecules may be used to direct a dCAS9 protein to a target nucleic acid sequence.
[0207] Certain aspects of the present disclosure involving SunTag-based targeting relate to recombinant polypeptides that contain an affinity polypeptide. Affinity polypeptides of the present disclosure may bind to one or more epitopes (e.g. a multimerized epitope). In some embodiments, an affinity polypeptide is present in a recombinant polypeptide that contains a transcriptional repressor polypeptide and an affinity polypeptide.
[0208] A variety of affinity polypeptides are known in the art and may be used herein. Generally, the affinity polypeptide should be stable in the conditions present in the intracellular environment of a target cell of interest, such as, for example, a plant cell or an animal cell. Additionally, the affinity polypeptide should specifically bind to its corresponding epitope with minimal cross-reactivity. The affinity polypeptide may be an antibody such as, for example, an scFv. The antibody may be optimized for stability in the plant intracellular environment. When a GCN4 epitope is used in the methods described herein, a suitable affinity polypeptide that is an antibody may contain an anti-GCN4 scFv domain. Other exemplary affinity polypeptides include, for example, proteins with SH2 domains or the domain itself, 14-3-3 proteins, proteins with SH3 domains or the domain itself, the Alpha-Syntrophin PDZ protein interaction domain, the PDZ signal sequence, or proteins from plants which can recognize AGO hook motifs (e.g. AGO4 from Arabidopsis thaliana).
[0209] Certain aspects of the present disclosure involving SunTag-based targeting relate to genetic modifier polypeptides, α-crystalline domain polypeptides, and/or non-genetic modifier polypeptides that contain an epitope or a multimerized epitope. Epitopes of the present disclosure may bind to an affinity polypeptide. In some embodiments, an epitope or multimerized epitope is present in a recombinant polypeptide that contains a dCAS9 polypeptide.
[0210] Epitopes of the present disclosure may be used for recruiting affinity polypeptides (and any polypeptides they may be recombinantly fused to) to a dCAS9 polypeptide. In embodiments where a dCAS9 polypeptide is fused to an epitope or a multimerized epitope, the dCAS9 polypeptide may be fused to one copy of an epitope, multiple copies of an epitope, more than one different epitope, or multiple copies of more than one different epitope as further described herein.
[0211] A variety of epitopes and multimerized epitopes are known in the art and may be used herein. In general, the epitope or multimerized epitope may be any polypeptide sequence that is specifically recognized by an affinity polypeptide of the present disclosure. Exemplary epitopes may include a c-Myc affinity tag, an HA affinity tag, a His affinity tag, an S affinity tag, a methionine-His affinity tag, an RGD-His affinity tag, a FLAG octapeptide, a strep tag or strep tag II, a V5 tag, a VSV-G epitope, and a GCN4 epitope. Other exemplary amino acid sequences that may serve as epitopes and multimerized epitopes include, for example, phosphorylated tyrosines in specific sequence contexts recognized by SH2 domains, characteristic consensus sequences containing phosphoserines recognized by 14-3-3 proteins, proline rich peptide motifs recognized by SH3 domains, the PDZ protein interaction domain or the PDZ signal sequence, and the AGO hook motif from plants.
[0212] Epitopes described herein may also be multimerized. Multimerized epitopes may include at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, or at least 24 or more copies of an epitope.
[0213] Multimerized epitopes may be present as tandem copies of an epitope, or each individual epitope may be separated from another epitope in the multimerized epitope by a linker or other amino acid sequence. Suitable linker regions are known in the art and are described herein. The linker may be configured to allow the binding of affinity polypeptides to adjacent epitopes without, or without substantial, steric hindrance. Linker sequences may also be configured to provide an unstructured or linear region of the polypeptide to which they are recombinantly fused. The linker sequence may comprise e.g. one or more glycines and/or serines. The linker sequences may be e.g. at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 or more amino acids in length.
Recombinant Nucleic Acids
[0214] Certain aspects of the present disclosure relate to recombinant nucleic acids encoding recombinant polypeptides.
[0215] As used herein, the terms “polynucleotide,” “nucleic acid,” and variations thereof shall be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), to any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base, and to other polymers containing non-nucleotidic backbones, provided that the polymers contain nucleobases in a configuration that allows for base pairing and base stacking, as found in DNA and RNA.
Thus, these terms include known types of nucleic acid sequence modifications, for example, substitution of one or more of the naturally occurring nucleotides with an analog, and inter-nucleotide modifications. As used herein, the symbols for nucleotides and polynucleotides are those recommended by the IUPAC-IUB Commission of Biochemical Nomenclature.
[0216] In one aspect, the present disclosure provides recombinant nucleic acids that encode 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and/or 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid.
[0217] Sequences of the polynucleotides of the present disclosure may be prepared by various suitable methods known in the art, including, for example, direct chemical synthesis or cloning. For direct chemical synthesis, formation of a polymer of nucleic acids typically involves sequential addition of 3'-blocked and 5'-blocked nucleotide monomers to the terminal 5'-hydroxyl group of a growing nucleotide chain, wherein each addition is effected by nucleophilic attack of the terminal 5'-hydroxyl group of the growing chain on the 3'-position of the added monomer, which is typically a phosphorus derivative, such as a phosphotriester, phosphoramidite, or the like. Such methodology is known to those of ordinary skill in the art and is described in the pertinent texts and literature (e.g., in Matteucci et al., (1980) Tetrahedron Lett 21:719-722; U.S. Pat. Nos. 4,500,707; 5,436,327; and 5,700,637). In addition, the desired sequences may be isolated from natural sources by splitting DNA using appropriate restriction enzymes, separating the fragments using gel electrophoresis, and thereafter, recovering the desired polynucleotide sequence from the gel via techniques known to those of ordinary skill in the art, such as utilization of polymerase chain reactions (PCR; e.g., U.S. Pat. No. 4,683,195).
[0218] The nucleic acids employed in the methods and compositions described herein may be codon optimized relative to a parental template for expression in a particular host cell. Cells differ in their usage of particular codons, and codon bias corresponds to relative abundance of particular tRNAs in a given cell type. By altering codons in a sequence so that they are tailored to match with the relative abundance of corresponding tRNAs, it is possible to increase expression of a product (e.g. a polypeptide) from a nucleic acid. Similarly, it is possible to decrease expression by deliberately choosing codons corresponding to rare tRNAs. Thus, codon optimization/deoptimization can provide control over nucleic acid expression in a particular cell type (e.g. bacterial cell, plant cell, mammalian cell, etc.). Methods of codon optimizing a nucleic acid for tailored expression in a particular cell type are well-known to those of skill in the art.
Methods of Identifying Sequence Similarity
[0219] Various methods are known to those of skill in the art for identifying similar (e.g. homologs, orthologs, paralogs, etc.) polypeptide and/or polynucleotide sequences, including phylogenetic methods, sequence similarity analysis, and hybridization methods.
[0220] Phylogenetic trees may be created for a gene family by using a program such as CLUSTAL (Thompson et al. Nucleic Acids Res. 22: 4673-4680 (1994); Higgins et al. Methods Enzymol 266: 383-402 (1996)) or MEGA (Tamura et al. Mol. Biol. & Evo. 24:1596-1599 (2007)). Once an initial tree for genes from one species is created, potential orthologous sequences can be placed in the phylogenetic tree and their relationships to genes from the species of interest can be determined. Evolutionary relationships may also be inferred using the Neighbor-Joining method (Saitou and Nei, Mol. Biol. & Evo. 4:406-425 (1987)). Homologous sequences may also be identified by a reciprocal BLAST strategy.
Evolutionary distances may be computed using the Poisson correction method (Zuckerkandl and Pauling, pp. 97-166 in Evolving Genes and Proteins, edited by V. Bryson and H.J. Vogel. Academic Press, New York (1965)).
[0221] In addition, evolutionary information may be used to predict gene function. Functional predictions of genes can be greatly improved by focusing on how genes became similar in sequence (i.e. by evolutionary processes) rather than on the sequence similarity itself (Eisen, Genome Res. 8: 163-167 (1998)). Many specific examples exist in which gene function has been shown to correlate well with gene phylogeny (Eisen, Genome Res. 8: 163-167 (1998)). By using a phylogenetic analysis, one skilled in the art would recognize that the ability to deduce similar functions conferred by closely-related polypeptides is predictable.
[0222] When a group of related sequences is analyzed using a phylogenetic program such as CLUSTAL, closely related sequences typically cluster together or in the same clade (a group of similar genes). Groups of similar genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle, J. Mol. Evol. 25: 351-360 (1987)). Analysis of groups of similar genes with similar function that fall within one clade can yield sub-sequences that are particular to the clade. These sub-sequences, known as consensus sequences, can be used not only to define the sequences within each clade, but also to define the functions of these genes; genes within a clade may contain paralogous sequences, or orthologous sequences that share the same function (see also, for example, Mount, Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., page 543 (2001)).
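For illustration, the Poisson correction referred to in paragraph [0220] is commonly written as d = -ln(1 - p), where p is the proportion of aligned positions at which two amino acid sequences differ. The following minimal sketch computes that distance for a pre-aligned pair of sequences. The function names and the choice to skip any column containing a gap character "-" are assumptions made for this example; phylogenetics packages such as MEGA offer their own gap-handling options.

```python
import math


def p_distance(aligned_a, aligned_b):
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in columns) / len(columns)


def poisson_distance(aligned_a, aligned_b):
    """Poisson-corrected distance d = -ln(1 - p); undefined when p equals 1."""
    p = p_distance(aligned_a, aligned_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every position differs")
    return -math.log(1.0 - p)


# Hypothetical aligned fragments differing at one of four positions (p = 0.25).
print(poisson_distance("MKVL", "MKVA"))
```

The correction accounts for multiple substitutions at the same site, so the corrected distance is always at least as large as the uncorrected proportion p.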
[0223] To find sequences that are homologous to a reference sequence, BLAST nucleotide searches can be performed with the BLASTN program, score=100, wordlength=12, to obtain nucleotide sequences homologous to a nucleotide sequence encoding a protein of the disclosure. BLAST protein searches can be performed with the BLASTX program, score=50, wordlength=3, to obtain amino acid sequences homologous to a protein or polypeptide of the disclosure. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, or PSI-BLAST, the default parameters of the respective programs (e.g., BLASTN for nucleotide sequences, BLASTX for proteins) can be used.
[0224] Methods for the alignment of sequences and for the analysis of similarity and identity of polypeptide and polynucleotide sequences are well-known in the art.
[0225] As used herein, “sequence identity” refers to the percentage of residues that are identical in the same positions in the sequences being analyzed. As used herein, “sequence similarity” refers to the percentage of residues that have similar biophysical/biochemical characteristics in the same positions (e.g. charge, size, hydrophobicity) in the sequences being analyzed.
[0226] Methods of alignment of sequences for comparison are well-known in the art, including manual alignment and computer assisted sequence alignment and analysis. This latter approach is a preferred approach in the present disclosure, due to the increased throughput afforded by computer assisted methods. As noted below, a variety of computer programs for performing sequence alignment are available, or can be produced by one of skill.
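As a simple illustration of the definitions in paragraph [0225], the sketch below computes percent identity and percent similarity for two sequences that have already been aligned. Several details are assumptions made for this example and are not prescribed by the present disclosure: the function name, the use of "-" as the gap character, the choice to count only gap-free columns in the denominator, and the residue groupings in `SIMILAR_GROUPS`, which are one illustrative way to approximate shared charge, size, or hydrophobicity.

```python
# Illustrative residue groupings (an assumption; substitution matrices such as
# BLOSUM62 are a more common basis for scoring similarity in practice).
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic/hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("STNQ"),  # polar uncharged
    set("AG"),    # small
]


def residues_similar(x, y):
    """True if the residues are identical or fall in the same illustrative group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    identical = sum(x == y for x, y in columns)
    similar = sum(residues_similar(x, y) for x, y in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Hypothetical aligned fragments: V and I differ but share a grouping.
identity, similarity = percent_identity_similarity("MKVL", "MKIL")
print(identity, similarity)
```

Other denominators (for example, the full alignment length or the length of the shorter sequence) are also in common use, so reported percentages should state which convention was applied.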
[0227] The determination of percent sequence identity and/or similarity between any two sequences can be accomplished using a mathematical algorithm. Examples of such mathematical algorithms are the algorithm of Myers and Miller, CABIOS 4:11-17 (1988); the local homology algorithm of Smith et al., Adv. Appl. Math. 2:482 (1981); the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); the search-for-similarity-method of Pearson and Lipman, Proc. Natl. Acad. Sci. 85:2444-2448 (1988); the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-2268 (1990), modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993).
[0228] Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity and/or similarity. Such implementations include, for example: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the AlignX program, version 10.3.0 (Invitrogen, Carlsbad, CA) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis., USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. Gene 73:237-244 (1988); Higgins et al. CABIOS 5:151-153 (1989); Corpet et al., Nucleic Acids Res. 16:10881-90 (1988); Huang et al. CABIOS 8:155-65 (1992); and Pearson et al., Meth. Mol. Biol. 24:307-331 (1994). The BLAST programs of Altschul et al. J. Mol. Biol. 215:403-410 (1990) are based on the algorithm of Karlin and Altschul (1990) supra.
[0229] Polynucleotides homologous to a reference sequence can be identified by hybridization to each other under stringent or under highly stringent conditions.
Single-stranded polynucleotides hybridize when they associate based on a variety of well-characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. The stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc. present in both the hybridization and wash solutions and incubations (and number thereof), as described in more detail in references cited below (e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. ("Sambrook") (1989); Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, vol. 152, Academic Press, Inc., San Diego, Calif. ("Berger and Kimmel") (1987); and Anderson and Young, "Quantitative Filter Hybridisation." In: Hames and Higgins, ed., Nucleic Acid Hybridisation, A Practical Approach. Oxford, IRL Press, 73-111 (1985)).
[0230] Encompassed by the disclosure are polynucleotide sequences that are capable of hybridizing to the disclosed polynucleotide sequences and fragments thereof under various conditions of stringency (see, for example, Wahl and Berger, Methods Enzymol. 152:399-407 (1987); and Kimmel, Methods Enzymol. 152:507-511 (1987)). Full length cDNA, homologs, orthologs, and paralogs of polynucleotides of the present disclosure may be identified and isolated using well-known polynucleotide hybridization methods.
[0231] Amino acid and polypeptide sequences of the present disclosure may also be compared to other amino acid or polypeptide sequences based on their three-dimensional structure.
Homologs of polypeptide sequences may be those that have a similar folded structure as compared to a reference polypeptide sequence of the present disclosure. Programs such as AlphaFold or other similar folding algorithms known in the art may be used for such comparisons.
Target Nucleic Acids and Sequences
[0232] Recombinant polypeptides of the present disclosure may be targeted to specific target nucleic acids to, for example, modify the target nucleic acid.
[0233] Certain aspects of the present disclosure relate to target sites on target nucleic acids. A target site generally refers to a location of a target nucleic acid that is targeted by a genetic modifier polypeptide and/or an α-crystalline domain polypeptide of the present disclosure (e.g. a nucleotide sequence of a target nucleic acid that can be bound by a targeting agent, such as e.g. a DNA-binding domain, in a recombinant polypeptide). In some embodiments, the target site may include both the nucleotide sequence targeted as well as at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, or at least 50 nucleotides or more on the 3’ side, the 5’ side, or both the 3’ and 5’ sides of the nucleotide sequence in the target nucleic acid that is targeted. In some embodiments, the target site may contain at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, or at least 200 or more nucleotides.
[0234] In some embodiments, a recombinant polypeptide is targeted to a particular locus. A locus generally refers to a specific position on a chromosome or other nucleic acid molecule. A locus may contain, for example, a polynucleotide that encodes a protein or an RNA.
A locus may also contain, for example, a non-coding RNA, a gene, a promoter, a 5’ untranslated region (UTR), an exon, an intron, a 3’ UTR, or combinations thereof. In some embodiments, a locus may contain a coding region for a gene.
[0235] In some embodiments, a recombinant polypeptide is targeted to a gene. A gene generally refers to a polynucleotide that can produce a functional unit (for example, a protein or a noncoding RNA molecule). A gene may contain a promoter, an enhancer sequence, a leader sequence, a transcriptional start site, a transcriptional stop site, a polyadenylation site, one or more exons, one or more introns, a 5’ UTR, a 3’ UTR, or combinations thereof. A gene sequence may contain a polynucleotide sequence encoding a promoter, an enhancer sequence, a leader sequence, a transcriptional start site, a transcriptional stop site, a polyadenylation site, one or more exons, one or more introns, a 5’ UTR, a 3’ UTR, or combinations thereof.
[0236] The target nucleic acid sequence may be located within the coding region of a target gene or upstream or downstream thereof. Moreover, the target nucleic acid sequence may reside endogenously in a target gene or may be inserted into the gene (e.g., as a heterologous sequence), for example, using techniques such as homologous recombination. For example, a target gene of the present disclosure can be operably linked to a control region, such as a promoter, that contains a sequence that can be recognized by a targeting agent (e.g. a DNA-binding domain) or other factor in association with a targeting agent (e.g. a guide RNA) such that a recombinant polypeptide may be targeted to that sequence.
[0237] The target nucleic acid sequence may be located in a region of chromatin. In some embodiments, the target nucleic acid sequence may be in a region of open chromatin or similar region of DNA that is generally accessible to transcriptional machinery.
Regions of open chromatin may be characterized by nucleosome depletion, nucleosome disruption, accessibility to transcriptional machinery, and/or a transcriptionally active state. Regions of open chromatin will be readily understood and identifiable by one of skill in the art.
[0238] Target genes or nucleic acid regions to be targeted for modification by a genetic modifier polypeptide of the present disclosure will be readily apparent to those of skill in the art depending on the particular application and/or purpose. For example, genes with particular agricultural importance may be targeted for reduced expression according to the methods of the present disclosure. Exemplary genes to be targeted for reduced expression may include, for example, those involved in light perception (e.g. PHYB, etc.), those involved in the circadian clock (e.g. CCA1, LHY, etc.), those involved in flowering time (e.g. CO, FT, etc.), those involved in meristem size (e.g. WUS, CLV3, etc.), those involved in plant architecture (e.g. S, SP, TFL1, SFT, etc.) and genes involved in embryogenesis, chromatin structure, stress response, growth and development, etc. Alternatively or additionally, genes with particular agricultural importance may be targeted for increased expression and/or any other type of modification known in the art according to the methods of the present disclosure.
[0239] In some embodiments, the target nucleic acid is endogenous to the organism in which the expression of one or more genes is to be reduced according to the methods described herein. In some embodiments, the target nucleic acid is a transgene of interest that has been inserted into a plant, an alga, a fungus, an animal (including, but not limited to a mammal and an insect), a bacterium, an archaeon, or a protist. Suitable target nucleic acids will be readily apparent to one of skill in the art depending on the particular need or outcome.
The target nucleic acid sequence may be in e.g. a region of euchromatin (e.g. a highly expressed gene), or the target nucleic acid sequence may be in a region of heterochromatin (e.g. centromere DNA).
[0240] In some embodiments, the target nucleic acid may be in a region of repressive chromatin. Repressive chromatin generally refers to regions of chromatin where transcription is repressed or otherwise generally transcriptionally inactive. Exemplary regions of repressive chromatin include, for example, regions with repressive DNA methylation, compact chromatin, and/or no transcription.
Recombinant Expression
[0241] Recombinant nucleic acids and/or recombinant polypeptides of the present disclosure may be present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells). In some embodiments, recombinant nucleic acids are present in an expression vector and may encode a recombinant polypeptide, and the expression vector may be present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells). In some embodiments, recombinant nucleic acids and/or recombinant polypeptides are present in host cells (e.g. plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells) via direct introduction into the cell (e.g. via RNPs).
[0242] In some embodiments, the genes encoding the recombinant polypeptides in the host cell may be heterologous to the host cell. In certain embodiments, the host cell does not naturally produce one or more polypeptides of the present disclosure, and contains heterologous nucleic acid constructs capable of expressing one or more genes necessary for producing those molecules.
In certain embodiments, the host cell does not naturally produce one or more polypeptides of the present disclosure, and is provided the one or more polypeptides through exogenous delivery of the polypeptides directly to the host cell without the need to express a recombinant nucleic acid encoding the recombinant polypeptide in the host cell.
[0243] Recombinant polypeptides of the present disclosure may be introduced into host cells (e.g., plant, algal, fungal, animal (including, but not limited to mammalian and insect), bacterial, archaeal, and/or protist cells) via any suitable methods known in the art. For example, a recombinant polypeptide can be exogenously added to host cells and the host cells are maintained under conditions such that the recombinant polypeptide is targeted to one or more target nucleic acids to reduce expression of the target nucleic acids in the host cells. Alternatively, a recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be expressed in host cells and the host cells are maintained under conditions such that the recombinant polypeptide is targeted to one or more target nucleic acids to reduce expression of the target nucleic acids in the host cells. Additionally, in some embodiments, a recombinant polypeptide of the present disclosure may be transiently expressed in a host via viral infection of the host, or by introducing a recombinant polypeptide-encoding RNA into a host to facilitate reduced expression of a target nucleic acid of interest. Methods of introducing recombinant proteins via viral infection or via the introduction of RNAs into various host organisms are well known in the art. For example, Tobacco rattle virus (TRV) has been successfully used to introduce zinc finger nucleases in plants to cause genome modification (“Nontransgenic Genome Modification in Plant Cells”, Plant Physiology 154:1079-1087 (2010)).
TRV and other appropriate viruses may be used herein to facilitate editing in plant cells.
[0244] In some embodiments, a recombinant polypeptide and a guide RNA may be exogenously and directly supplied to a host cell as a ribonucleoprotein (RNP) complex. This particular form of delivery is useful for facilitating transgene-free editing in various organisms. Modified guide RNAs which are resistant to nuclease digestion could also be used in this approach. For example, in embodiments in which the host is a plant, transgene-free calli from plant cells provided with an RNP could be used to regenerate whole plants.
[0245] A recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be expressed in a host with any suitable plant expression vector. Typical vectors useful for expression of recombinant nucleic acids in higher plants are well known in the art and include, for example, vectors derived from the tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens (e.g., see Rogers et al., Meth. in Enzymol. (1987) 153:253-277). These vectors are plant integrating vectors in that on transformation, the vectors integrate a portion of vector DNA into the genome of the host plant. Exemplary A. tumefaciens vectors useful herein are plasmids pKYLX6 and pKYLX7 (e.g., see Schardl et al., Gene (1987) 61:1-11; and Berger et al., Proc. Natl. Acad. Sci. USA (1989) 86:8402-8406); and plasmid pBI 101.2 that is available from Clontech Laboratories, Inc. (Palo Alto, CA). Typical vectors useful for expression of recombinant nucleic acids in mammalian cells are known in the art.
[0246] In addition to regulatory domains, recombinant polypeptides of the present disclosure can be expressed as a fusion protein that is coupled to, for example, a maltose binding protein ("MBP"), glutathione S transferase (GST), hexahistidine, c-myc, or the FLAG epitope for ease of purification, monitoring expression, or monitoring cellular and subcellular localization.
[0247] Moreover, a recombinant nucleic acid encoding a recombinant polypeptide of the present disclosure can be modified to improve expression of the recombinant protein in host cells by using codon preference/codon optimization to target preferential expression in host cells. When the recombinant nucleic acid is prepared or altered synthetically, advantage can be taken of known codon preferences of the intended host where the nucleic acid is to be expressed. For example, recombinant nucleic acids of the present disclosure can be modified to account for the specific codon preferences and GC content preferences of monocotyledons and dicotyledons, as these preferences have been shown to differ (Murray et al., Nucl. Acids Res. (1989) 17: 477-498).
[0248] The present disclosure further provides expression vectors encoding recombinant polypeptides of the present disclosure. A nucleic acid sequence coding for the desired recombinant polypeptide of the present disclosure can be used to construct a recombinant expression vector which can be introduced into the desired host cell. A recombinant expression vector will typically contain a nucleic acid encoding a recombinant protein of the present disclosure, operably linked to transcriptional initiation regulatory sequences which will direct the transcription of the nucleic acid in the intended host cell, such as tissues of a transformed plant or plant cell, or transformed mammal or mammalian cell.
[0249] Recombinant nucleic acids e.g.
encoding recombinant polypeptides of the present disclosure may be expressed on multiple expression vectors or they may be expressed on a single expression vector. For example, plant expression vectors may include (1) a cloned gene under the transcriptional control of 5' and 3' regulatory sequences and (2) a dominant selectable marker. Such plant expression vectors may also contain, if desired, a promoter regulatory region (e.g., one conferring inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific/selective expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.
[0250] In some embodiments, expression of a nucleic acid of the present disclosure may be driven (in operable linkage) with a promoter (e.g. a promoter functional in plants or a plant-specific promoter). A promoter generally refers to a DNA sequence that contains an RNA polymerase binding site, transcription start site, and/or TATA box and assists or promotes the transcription and expression of an associated transcribable polynucleotide sequence such as, for example, a gene. For example, a plant promoter, or functional fragment thereof, can be employed to e.g. control the expression of a recombinant nucleic acid of the present disclosure in regenerated plants; alternatively or additionally, a mammalian promoter, or functional fragment thereof, can be employed to, for example, control the expression of a recombinant nucleic acid of the present disclosure in transformed mammalian cells. The selection of the promoter used in expression vectors will determine the spatial and temporal expression pattern of the recombinant nucleic acid in the modified host, e.g., the nucleic acid encoding the recombinant polypeptide of the present disclosure is only expressed in the desired tissue or at a certain time in the host’s development or growth.
Certain promoters will express recombinant nucleic acids in all host tissues and are active under most environmental conditions and states of development or cell differentiation (i.e., constitutive promoters). Other promoters will express recombinant nucleic acids in specific cell types (such as, for example in the context of plant hosts, leaf epidermal cells, mesophyll cells, or root cortex cells; or, for example in the context of mammalian hosts, skin epidermal cells, neurons, or immune cells) or in specific tissues or organs (such as, for example in the context of plant hosts, roots, leaves, or flowers; or, for example in the context of mammals, lung cells, bladder cells, or ovary cells), and the selection will reflect the desired location of accumulation of the gene product. Alternatively, the selected promoter may drive expression of the recombinant nucleic acid under various inducing conditions.
[0251] Examples of suitable constitutive plant promoters may include, for example, the core promoter of the Rsyn7, the core CaMV 35S promoter (Odell et al., Nature (1985) 313:810-812), CaMV 19S (Lawton et al., 1987), rice actin (Wang et al., 1992; U.S. Pat. No. 5,641,876; and McElroy et al., Plant Cell (1985) 2:163-171); ubiquitin (Christensen et al., Plant Mol. Biol. (1989) 12:619-632; and Christensen et al., Plant Mol. Biol. (1992) 18:675-689), pEMU (Last et al., Theor. Appl. Genet. (1991) 81:581-588), MAS (Velten et al., EMBO J. (1984) 3:2723-2730), nos (Ebert et al., 1987), Adh (Walker et al., 1987), the 1'- or 2'-promoter derived from T-DNA of Agrobacterium tumefaciens, the Smas promoter, the cinnamyl alcohol dehydrogenase promoter (U.S. Pat. No. 5,683,439), the Nos promoter, the pEmu promoter, the rubisco promoter, the GRP1-8 promoter, and other transcription initiation regions from various plant genes known to skilled artisans, and constitutive promoters described in, for example, U.S. Pat.
Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; and 5,608,142.
[0252] Examples of suitable constitutive mammalian promoters may include, for example, CMV, SV40, UBC, PGK, EF1A, and CAGG in the context of mammalian systems, and COPIA and ACT5C in the context of Drosophila systems (see, e.g., Qin JY, et al. (2010) Systematic Comparison of Constitutive Promoters and the Doxycycline-Inducible Promoter. PLOS ONE 5(5): e10611. https://doi.org/10.1371/journal.pone.0010611).
[0253] Recombinant nucleic acids of the present disclosure may be expressed using an RNA Polymerase III (Pol III) promoter such as, for example, the U6 promoter or the H1 promoter (eLife 2013 2:e00471). For example, an approach in plants has been described using three different Pol III promoters from three different Arabidopsis U6 genes, and their corresponding gene terminators (BMC Plant Biology 2014 14:327). One skilled in the art would readily understand that many additional Pol III promoters could be utilized to, for example, simultaneously express many guide RNAs targeting many different locations in the genome. The use of different Pol III promoters for each gRNA expression cassette may be desirable to reduce the chances of natural gene silencing that can occur when multiple copies of identical sequences are expressed in hosts.
[0254] Recombinant nucleic acids of the present disclosure may be expressed using an RNA Polymerase II (Pol II) promoter such as, for example, the CmYLCV promoter and the 35S promoter. Use of a Pol II promoter to drive expression of nucleic acids (e.g. guide RNA expression) may provide additional flexibility for controlling the strength/degree of expression and may provide the possibility of tissue-specific expression. One skilled in the art would recognize appropriate Pol II promoters for use in the methods and compositions of the present disclosure.
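As a purely illustrative aid, the multiplexing arrangement described in paragraph [0253], in which each guide RNA expression cassette is driven by a different Pol III promoter so that no promoter sequence is repeated, can be sketched as a simple planning step. The Python sketch below is an assumption-laden example and not a method of the disclosure: the function name and the promoter and guide labels are hypothetical placeholders, and no actual sequences are implied.

```python
def assign_promoters(guides, promoter_pool):
    """Pair each guide RNA label with its own promoter label.

    Each promoter is used at most once, so no two cassettes share a promoter
    sequence. Raises ValueError if the pool contains duplicates or is too
    small for the number of guides.
    """
    if len(set(promoter_pool)) != len(promoter_pool):
        raise ValueError("promoter pool contains duplicate entries")
    if len(guides) > len(promoter_pool):
        raise ValueError("not enough distinct promoters for the guides")
    # zip stops at the shorter list, so surplus promoters are left unused.
    return [
        {"guide": guide, "promoter": promoter}
        for guide, promoter in zip(guides, promoter_pool)
    ]


# Hypothetical labels only: three U6-type promoters and two guide RNAs.
cassettes = assign_promoters(
    ["gRNA_1", "gRNA_2"],
    ["U6_promoter_A", "U6_promoter_B", "U6_promoter_C"],
)
for cassette in cassettes:
    print(cassette["promoter"], "->", cassette["guide"])
```

Failing loudly when the pool is too small, rather than silently reusing a promoter, reflects the stated aim of avoiding repeated identical sequences; whether reuse is acceptable in a given host is a design decision outside the scope of this sketch.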
[0255] Examples of suitable tissue specific promoters in plants may include, for example, the lectin promoter (Vodkin et al., 1983; Lindstrom et al., 1990), the corn alcohol dehydrogenase 1 promoter (Vogel et al., 1989; Dennis et al., 1984), the corn light harvesting complex promoter (Simpson, 1986; Bansal et al., 1992), the corn heat shock protein promoter (Odell et al., Nature (1985) 313:810-812; Rochester et al., 1986), the pea small subunit RuBP carboxylase promoter (Poulsen et al., 1986; Cashmore et al., 1983), the Ti plasmid mannopine synthase promoter (Langridge et al., 1989), the Ti plasmid nopaline synthase promoter (Langridge et al., 1989), the petunia chalcone isomerase promoter (Van Tunen et al., 1988), the bean glycine rich protein 1 promoter (Keller et al., 1989), the truncated CaMV 35S promoter (Odell et al., Nature (1985) 313:810-812), the potato patatin promoter (Wenzler et al., 1989), the root cell promoter (Conkling et al., 1990), the maize zein promoter (Reina et al., 1990; Kriz et al., 1987; Wandelt and Feix, 1989; Langridge and Feix, 1983), the globulin-1 promoter (Belanger and Kriz, 1991), the α-tubulin promoter, the cab promoter (Sullivan et al., 1989), the PEPCase promoter (Hudspeth & Grula, 1989), the R gene complex-associated promoters (Chandler et al., 1989), and the chalcone synthase promoters (Franken et al., 1991).
[0256] Examples of suitable tissue specific promoters in mammals may include, for example, cytokeratin 18 and 19 for epithelial cell-specificity; the tissue kallikrein promoter for ductal cell-specificity in salivary glands; and the amylase 1C and aquaporin-5 (AQP5) promoters for acinar cell-specificity (Zheng C, Baum BJ. Evaluation of promoters for use in tissue-specific gene delivery. Methods Mol Biol. 2008;434:205-19. doi: 10.1007/978-1-60327-248-3_13. PMID: 18470647; PMCID: PMC2685069).
[0257] Alternatively, the promoter can direct expression of a recombinant nucleic acid of the present disclosure in a specific tissue or may be otherwise under more precise environmental or developmental control. Such promoters are referred to here as “inducible” promoters. Environmental conditions that may affect transcription by inducible promoters include, for example, pathogen attack, anaerobic conditions, or the presence of light. Examples of inducible plant promoters include, for example, the AdhI promoter which is inducible by hypoxia or cold stress, the Hsp70 promoter which is inducible by heat stress, and the PPDK promoter which is inducible by light. Examples of promoters under developmental control include, for example, promoters that initiate transcription only, or preferentially, in certain tissues, such as, in plants, leaves, roots, fruit, seeds, or flowers. An exemplary promoter is the anther specific promoter 5126 (U.S. Pat. Nos. 5,689,049 and 5,689,051). Examples of inducible mammalian systems include, for example, Tet operator (TetO)-based systems, cumate-controlled operator systems, and rapamycin-induced interaction between FKBP12 (FK506 binding protein 12) and mTOR-based systems (Kallunki T, et al. How to Choose the Right Inducible Gene Expression System for Mammalian Studies? Cells. 2019 Jul 30;8(8):796. doi: 10.3390/cells8080796. PMID: 31366153; PMCID: PMC6721553). The operation of a promoter may also vary depending on its location in the genome. Thus, an inducible promoter may become fully or partially constitutive in certain locations.
[0258] Moreover, any combination of a constitutive or inducible promoter, and a non-tissue specific or tissue specific promoter may be used to control the expression of various recombinant polypeptides of the present disclosure.
[0259] The recombinant nucleic acids of the present disclosure and/or a vector housing a recombinant nucleic acid of the present disclosure may also contain a regulatory sequence that serves as a 3’ terminator sequence. A terminator sequence generally refers to a nucleic acid sequence that marks the end of a gene or transcribable nucleic acid during transcription. One of skill in the art would readily recognize a variety of terminators that may be used in the recombinant nucleic acids of the present disclosure. For example, a recombinant nucleic acid of the present disclosure may contain a 3’ NOS terminator. In some embodiments, recombinant nucleic acids of the present disclosure contain a transcriptional termination site. Transcription termination sites may include, for example, OCS terminators, rbcS-E9 terminators, NOS terminators, HSP18.2 terminators, and poly-T terminators.
[0260] Recombinant nucleic acids of the present disclosure may include one or more introns. Introns may be included in e.g. recombinant nucleic acids being expressed on a vector in a host cell. The inclusion of one or more introns in a recombinant nucleic acid to be expressed may be particularly helpful to increase expression in plant cells.
[0261] Recombinant nucleic acids of the present disclosure may also contain selectable markers. A selectable marker can be used to assist in the selection of transformed cells or tissue due to the presence of a selection agent, such as an antibiotic or herbicide, where the selectable marker gene provides tolerance or resistance to the selection agent. Thus, the selection agent can bias or favor the survival, development, growth, proliferation, etc., of transformed cells expressing the selectable marker gene.
Selectable marker genes may include, for example, those conferring tolerance or resistance to antibiotics, such as kanamycin and paromomycin (nptII), hygromycin B (aph IV), streptomycin or spectinomycin (aadA) and gentamycin (aac3 and aacC4), or those conferring tolerance or resistance to herbicides such as glufosinate (bar or pat), dicamba (DMO) and glyphosate (aroA or Cp4-EPSPS). Selectable marker genes which provide an ability to visually screen for transformants may also be used such as, for example, luciferase or green fluorescent protein (GFP), or a gene expressing a beta glucuronidase or uidA gene (GUS) for which various chromogenic substrates are known. In some embodiments, a nucleic acid molecule provided herein contains a selectable marker gene selected from the group consisting of nptII, aph IV, aadA, aac3, aacC4, bar, pat, DMO, EPSPS, aroA, luciferase, GFP, and GUS.
Eukaryotes and Eukaryotic Cells
[0262] Certain aspects of the present disclosure relate to eukaryotes and eukaryotic cells that contain recombinant polypeptides that are targeted to one or more target nucleic acids in the host/host cell in order to reduce expression of the target nucleic acid.
[0263] As used herein, eukaryotes and eukaryotic cells refer to any of various species in the domain of Eukaryota, in which the cells contain a nucleus, including, for example, plant, algal, fungal, and animal (including, but not limited to mammalian and insect) species.
[0264] As used herein, a “plant” refers to any of various photosynthetic, eukaryotic multicellular organisms of the kingdom Plantae, characteristically producing embryos, containing chloroplasts, having cellulose cell walls and lacking locomotion. As used herein, a “plant” includes any plant or part of a plant at any stage of development, including seeds, suspension cultures, plant cells, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, microspores, and progeny thereof.
Also included are cuttings, and cell or tissue cultures. As used in conjunction with the present disclosure, plant tissue includes, for example, whole plants, plant cells, plant organs, e.g., leaves, stems, roots, meristems, plant seeds, protoplasts, callus, cell cultures, and any groups of plant cells organized into structural and/or functional units.
[0265] Various eukaryotic cells may be used in the present disclosure so long as they remain viable after being transformed or otherwise modified to express recombinant nucleic acids or house recombinant polypeptides. Preferably, the eukaryotic cell is not adversely affected by the transduction of the necessary nucleic acid sequences, the subsequent expression of the proteins or the resulting intermediates.
[0266] As disclosed herein, a broad range of species may be modified to incorporate recombinant polypeptides and/or polynucleotides of the present disclosure. Suitable plants that may be modified include both monocotyledonous (monocot) plants and dicotyledonous (dicot) plants. Suitable animals that may be modified include, for example, mammalian cells and insect cells.
[0267] Examples of suitable plants may include, for example, species of the Family Gramineae, including Sorghum bicolor and Zea mays; species of the genera: Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, Lolium, Oryza, Avena, Hordeum, Secale, and Triticum. Examples of suitable animal systems may include, for example, human cells and Drosophila cells.
[0268] In some embodiments, plant cells may include, for example, those from corn (Zea mays), canola (Brassica napus, Brassica rapa ssp.), Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), duckweed (Lemna), soybean (Glycine max), tobacco (Nicotiana tabacum), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia spp.), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), oats, barley, vegetables, ornamentals, and conifers.
[0269] Examples of suitable vegetable plants may include, for example, tomatoes (Lycopersicon esculentum), lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus), cantaloupe (C. cantalupensis), and musk melon (C. melo).
[0270] Examples of suitable ornamental plants may include, for example, azalea (Rhododendron spp.), hydrangea (Hydrangea macrophylla), hibiscus (Hibiscus rosa-sinensis), roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida), carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum.
[0271] Examples of suitable conifer plants may include, for example, loblolly pine (Pinus taeda), slash pine (Pinus elliottii), ponderosa pine (Pinus ponderosa), lodgepole pine (Pinus contorta), Monterey pine (Pinus radiata), Douglas-fir (Pseudotsuga menziesii), Western hemlock (Tsuga canadensis), Sitka spruce (Picea glauca), redwood (Sequoia sempervirens), silver fir (Abies amabilis), balsam fir (Abies balsamea), Western red cedar (Thuja plicata), and Alaska yellow-cedar (Chamaecyparis nootkatensis).
[0272] Examples of suitable leguminous plants may include, for example, guar, locust bean, fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava bean, lentils, chickpea, peanuts (Arachis sp.), crown vetch (Vicia sp.), hairy vetch, adzuki bean, lupine (Lupinus sp.), trifolium, common bean (Phaseolus sp.), field bean (Pisum sp.), clover (Melilotus sp.), Lotus, trefoil, lens, and false indigo.
[0273] Examples of suitable forage and turf grass may include, for example, alfalfa (Medicago sp.), orchard grass, tall fescue, perennial ryegrass, creeping bent grass, and redtop.
[0274] Examples of suitable crop plants and model plants may include, for example, Arabidopsis, corn, rice, alfalfa, sunflower, canola, soybean, cotton, peanut, sorghum, wheat, tobacco, and lemna.
[0275] Examples of suitable animal cells include, for example, mammalian cells (such as, for instance, human cells), insect cells, and/or stem cells, such as, for example, iPSCs. In some embodiments, iPSCs are reprogrammed using the methods described herein.
[0276] The eukaryotes and eukaryotic cells of the present disclosure may be genetically modified in that recombinant nucleic acids have been introduced into the eukaryotes and eukaryotic cells, and as such the genetically modified eukaryotes and/or eukaryotic cells do not occur in nature. A suitable host of the present disclosure is e.g. one capable of expressing one or more nucleic acid constructs encoding one or more recombinant proteins. [0277] As used herein, the terms “transgenic” and “genetically modified” are used interchangeably and refer to a eukaryote and/or eukaryotic cell that contains within its genome a recombinant nucleic acid. Generally, the recombinant nucleic acid is stably integrated within the genome such that the polynucleotide is passed on to successive generations. However, in certain embodiments, the recombinant nucleic acid is transiently expressed in the eukaryote and/or eukaryotic cell. The recombinant nucleic acid may be integrated into the genome alone or as part of a recombinant expression cassette. “Transgenic” is used herein to include any cell, cell line, callus, tissue, or whole or part of an organism, the genotype of which has been altered by the presence of exogenous nucleic acid including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic. [0278] Plant transformation protocols as well as protocols for introducing recombinant nucleic acids of the present disclosure into plants may vary depending on the type of plant or plant cell, e.g., monocot or dicot, targeted for transformation. Suitable methods of introducing recombinant nucleic acids of the present disclosure into plant cells and subsequent insertion into the plant genome include, for example, microinjection (Crossway et al., Biotechniques (1986) 4:320-334), electroporation (Riggs et al., Proc. Natl. Acad Sci. USA (1986) 83:5602-5606), Agrobacterium-mediated transformation (U.S. Pat. No. 
5,563,055), direct gene transfer (Paszkowski et al., EMBO J. (1984) 3:2717-2722), and ballistic particle acceleration (U.S. Pat. No. 4,945,050; Tomes et al. (1995). "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment," in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); and McCabe et al., Biotechnology (1988) 6:923-926).
[0279] Additionally, recombinant polypeptides of the present disclosure can be targeted to a specific organelle within a eukaryotic cell. Targeting can be achieved by providing the recombinant protein with an appropriate targeting peptide sequence. Examples of such targeting peptides include, for example, secretory signal peptides (for secretion or cell wall or membrane targeting), plastid transit peptides, chloroplast transit peptides, mitochondrial target peptides, vacuole targeting peptides, nuclear targeting peptides, and the like (e.g., see Reiss et al., Mol. Gen. Genet. (1987) 209(1):116-121; Settles and Martienssen, Trends Cell Biol (1998) 12:494-501; Scott et al., J Biol Chem (2000) 10:1074; and Luque and Correas, J Cell Sci (2000) 113:2485-2495).
[0280] Modified eukaryotes and eukaryotic cells may be grown in accordance with conventional methods. For example, modified plants may be grown in accordance with conventional methods (e.g., see McCormick et al., Plant Cell Reports (1986) 81-84). These plants may then be grown, and pollinated with either the same transformed strain or different strains, with the resulting hybrid having the desired phenotypic characteristic. Two or more generations may be grown to ensure that the subject phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure the desired phenotype or other property has been achieved.
[0281] The present disclosure also provides plants derived from plants having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure. A plant having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure may be crossed with itself or with another plant to produce an F1 plant. In some embodiments, one or more of the resulting F1 plants may also have modified expression of a target nucleic acid. Accordingly, in some embodiments, provided are progeny plants that are the progeny (either directly or indirectly) of plants having modified expression of a targeted nucleic acid as a consequence of the methods of the present disclosure. These progeny plants may also have modified expression of a target nucleic acid. Progeny plants may also have an altered or modified phenotype as compared to a corresponding control plant.
[0282] Further provided are methods of screening plants derived from plants having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure. In some embodiments, the derived plants (e.g. F1 or F2 plants resulting from or derived from crossing the plant having modified expression of a target nucleic acid as a consequence of the methods of the present disclosure with another plant) can be selected from a population of derived plants. For example, provided are methods of selecting one or more of the derived plants that (i) lack recombinant nucleic acids, and (ii) have modified expression of a target nucleic acid. Because the modified expression of the target nucleic acid may be heritable, progeny plants as described herein do not necessarily need to contain a recombinant polypeptide in order to maintain the modified expression of the target nucleic acid.
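As an illustration only, the two selection criteria of paragraph [0282], namely (i) absence of recombinant nucleic acids and (ii) modified expression of the target nucleic acid, can be sketched as a simple filter over genotyped and expression-profiled plants. The record fields, the plant identifiers, the example values, and the 50% cutoff below are all hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass
class DerivedPlant:
    plant_id: str                       # hypothetical label for an F1/F2 plant
    has_recombinant_nucleic_acid: bool  # e.g. a PCR genotyping call for the transgene
    relative_expression: float          # target expression relative to control (control = 1.0)


def select_plants(plants, min_percent_change=50.0):
    """Keep plants that (i) lack the recombinant nucleic acid and
    (ii) differ from the control by at least min_percent_change (assumed cutoff)."""
    selected = []
    for plant in plants:
        percent_change = abs(plant.relative_expression - 1.0) * 100.0
        if not plant.has_recombinant_nucleic_acid and percent_change >= min_percent_change:
            selected.append(plant)
    return selected


# Hypothetical population: only F2-02 is transgene-free AND strongly altered.
population = [
    DerivedPlant("F2-01", True, 0.10),
    DerivedPlant("F2-02", False, 0.20),
    DerivedPlant("F2-03", False, 0.95),
]
selected_ids = [p.plant_id for p in select_plants(population)]
print(selected_ids)  # ['F2-02']
```

In practice the genotyping call and the expression measurement would come from assays such as PCR and qRT-PCR; the filter only encodes the logical conjunction of the two criteria.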
Methods of Modifying a Target Nucleic Acid
[0283] Growing and/or cultivation conditions sufficient for the recombinant polypeptides and/or polynucleotides of the present disclosure to be expressed and/or maintained in the eukaryote/eukaryotic cell and to be targeted to and to modify one or more target nucleic acids of the present disclosure are well known in the art and include any suitable growing conditions disclosed herein. For example, typically the cell is grown under conditions sufficient to express a recombinant polypeptide of the present disclosure, and for the expressed recombinant polypeptides to be localized to the nucleus in order to be targeted to and modify the target nucleic acids (if those target nucleic acids are present in the nucleus). Alternatively, nucleic acids present outside the nucleus, such as, for example, in the cytoplasm or in an organelle, may be targeted for modification. Generally, the conditions sufficient for the expression of the recombinant polypeptide (if being encoded from a recombinant nucleic acid) will depend on the promoter used to control the expression of the recombinant polypeptide. For example, if an inducible promoter is utilized, expression of the recombinant polypeptide in a host will require that the host is grown or cultivated in the presence of the inducer.
Growth Conditions
[0284] As noted above, growing conditions sufficient for the recombinant polypeptides of the present disclosure to be expressed and/or maintained in the eukaryote/eukaryotic cells and to be targeted to one or more target nucleic acids to modify one or more target nucleic acids may vary depending on a number of factors (e.g. species, use of inducible promoter, etc.).
Suitable growing conditions may include, for example, ambient environmental conditions, standard laboratory conditions, standard greenhouse conditions, growth in long days under standard environmental conditions (e.g., 16 hours of light, 8 hours of dark), growth in 12 hour light : 12 hour dark day/night cycles, etc.
[0285] Various time frames may be used to observe modification of a target nucleic acid according to the methods of the present disclosure. Eukaryotes and/or eukaryotic cells may be observed/assayed for modified expression of a target nucleic acid after, for example, about 30 minutes, about 45 minutes, about 1 hour, about 2.5 hours, about 5 hours, about 7.5 hours, about 10 hours, about 15 hours, about 20 hours, about 1 day, about 5 days, about 10 days, about 15 days, about 20 days, about 25 days, about 30 days, about 35 days, about 40 days, about 45 days, about 50 days, or about 55 days or more after being cultivated/grown in conditions sufficient for a recombinant polypeptide to facilitate modification of a target nucleic acid.
Modified Expression of a Target Nucleic Acid
[0286] A target nucleic acid of the present disclosure may have its expression modified as compared to a corresponding control nucleic acid.
A target nucleic acid of the present disclosure in a eukaryote/eukaryotic cell housing recombinant polypeptides of the present disclosure may have its expression decreased/downregulated by at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% as compared to a corresponding control. A target nucleic acid of the present disclosure in a eukaryote/eukaryotic cell housing recombinant polypeptides of the present disclosure may have its expression increased/upregulated by at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% as compared to a corresponding control.
[0287] Various controls will be readily apparent to one of skill in the art. For example, a control may be a corresponding eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cell).
[0288] A target nucleic acid may have its expression modified (e.g.
increased or decreased) at least about 1-fold, at least about 2-fold, at least about 3-fold, at least about 4-fold, at least about 5-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, at least about 25-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 150-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, at least about 500-fold, at least about 600-fold, at least about 700-fold, at least about 800-fold, at least about 900-fold, at least about 1,000-fold, at least about 1,250-fold, at least about 1,500-fold, at least about 1,750-fold, at least about 2,000-fold, at least about 2,500-fold, at least about 3,000-fold, at least about 3,500-fold, at least about 4,000-fold, at least about 4,500-fold, at least about 5,000-fold, at least about 5,500-fold, at least about 6,000-fold, at least about 6,500-fold, at least about 7,000-fold, at least about 7,500-fold, at least about 8,000-fold, at least about 8,500-fold, at least about 9,000-fold, at least about 9,500-fold, at least about 10,000-fold, at least about 12,000-fold, at least about 14,000-fold, at least about 16,000-fold, at least about 18,000-fold, or at least about 20,000-fold or more as compared to a corresponding control nucleic acid. As stated above, various controls will be readily apparent to one of skill in the art. For example, a control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain a nucleic acid encoding a recombinant polypeptide of the present disclosure.
[0289] Comparisons in the present disclosure may also be in reference to corresponding control eukaryotes/eukaryotic cells. Various control eukaryotes/eukaryotic cells will be readily apparent to one of skill in the art. For example, a control plant or plant cell may be a plant or plant cell that does not contain a recombinant polypeptide (e.g.
a wild-type plant) of the present disclosure.
[0290] Methods of probing the expression level of a nucleic acid are well-known to those of skill in the art. For example, qRT-PCR analysis may be used to determine the expression level of a population of nucleic acids isolated from a nucleic acid-containing sample (e.g. plants, animals, plant tissues, animal tissues, animal cells, or plant cells).
[0291] In some embodiments, recombinant polypeptides of the present disclosure may facilitate an epigenetic change or other chromatin modification at the target nucleic acid that does not involve a change to the actual nucleic acid nucleotide sequence. Such epigenetic changes and/or chromatin modifications at the target nucleic acid may include, for example, increased DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g. H3K9, H3K14, H3K27, and H4K16 deacetylation). Target nucleic acids of the present disclosure may exhibit one or more of increased: DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g. H3K9, H3K14, H3K27, and H4K16 deacetylation) at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% higher as compared to a corresponding control nucleic acid. Target nucleic acids of the present disclosure may exhibit one or more of decreased: DNA methylation, H3K27me3 deposition, H3K4me3 removal/demethylation, and histone deacetylation (e.g.
H3K9, H3K14, H3K27, and H4K16 deacetylation) at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% reduced as compared to a corresponding control nucleic acid. Various controls will be readily apparent to one of skill in the art. For example, a control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cells). [0292] In some embodiments, recombinant polypeptides of the present disclosure may interfere with transcription of the target nucleic acid. Such interference may include, e.g. interference with RNA Polymerase II transcription elongation and RNA Polymerase II Serine 5 (Ser-5) dephosphorylation. 
Target nucleic acids of the present disclosure may exhibit one or more of interference with RNA Polymerase II transcription elongation and RNA Polymerase II Serine 5 (Ser-5) dephosphorylation at a level or frequency that is at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% higher as compared to a corresponding control nucleic acid. Various controls will be readily apparent to one of skill in the art. For example, a control nucleic acid may be a corresponding nucleic acid from a eukaryote or eukaryotic cell that does not contain recombinant polypeptides of the present disclosure (e.g. wild-type plant or plant cell).
Additional Exemplary Embodiments
[0293] In some embodiments, the method of modifying a target nucleic acid in a eukaryotic cell includes a) providing a cell comprising 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) one or more polypeptides comprising an α-crystallin domain (ACD), such as, for example, one or more small heat shock polypeptides (sHSPs) capable of being targeted to the target nucleic acid; and b) maintaining the cell under conditions whereby the genetic modifier polypeptide and the small HSP are targeted to the target nucleic acid, thereby modifying the target nucleic acid.
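By way of a non-limiting illustration, the percent and fold comparisons to a control recited in the preceding paragraphs on modified expression can be computed from qRT-PCR data as sketched below. The sketch assumes the widely used 2^-ΔΔCt calculation with a reference gene and roughly 100% primer efficiency; that choice of method, the function names, and the Ct values are assumptions made for illustration and are not specified by the present disclosure.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Target expression in the sample relative to the control (control = 1.0),
    using the 2^-ddCt calculation (assumes ~100% amplification efficiency)."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


def percent_change(relative):
    """Signed percent change versus control: negative means downregulated."""
    return (relative - 1.0) * 100.0


def fold_change(relative):
    """Magnitude of the change expressed as a fold value >= 1."""
    return relative if relative >= 1.0 else 1.0 / relative


# Hypothetical Ct values: the target amplifies 2 cycles later in the modified sample.
rel = relative_expression(ct_target_sample=26.0, ct_ref_sample=18.0,
                          ct_target_control=24.0, ct_ref_control=18.0)
print(rel)                  # 0.25
print(percent_change(rel))  # -75.0
print(fold_change(rel))     # 4.0
```

Under these assumed values the target would be described as downregulated by about 75%, or about 4-fold, relative to the control.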
[0294] Further, in some embodiments, provided herein are methods of targeting or aggregating polypeptides of interest to a target nucleic acid in a eukaryotic cell, including a) providing a cell comprising 1) a polypeptide of interest capable of being targeted to the target nucleic acid, and 2) one or more polypeptides comprising an α-crystallin domain (ACD), such as, for example, one or more small heat shock polypeptides (sHSPs) capable of being targeted to the target nucleic acid; and b) maintaining the cell under conditions whereby the polypeptide of interest and the small HSP are targeted to the target nucleic acid.
[0295] In some embodiments, the polypeptide of interest comprises a transcription factor, a transcriptional repressor polypeptide, and/or a visualizable marker protein. In some embodiments, the visualizable marker protein is a fluorescent protein, such as, for example, a GFP.
[0296] In some embodiments, the method involves targeting a polypeptide of interest, such as a genetic modifier polypeptide, to a target nucleic acid in a eukaryotic cell, thereby modifying the target nucleic acid. In some embodiments, the genetic modifier polypeptide may include, for example, a sequence specific endonuclease, a demethylation enzyme (such as, for example, TET1, or LSD1), a methyltransferase (such as, for example, TRBIP1-MQ1 or a Dnmt3 protein), a component of a methylation binding complex (such as, for example, MBD5 and/or MBD6), and/or a sequence specific recombinase. Exemplary sequence-specific recombinases include, for example, a CRISPR protein (such as, for example, a Cas protein), a TALEN protein, and a zinc finger nuclease (ZFN) protein. In embodiments involving CRISPR systems, the method may further entail, for example, providing a gRNA that targets a Cas protein to the target nucleic acid.
[0297] Modifying a target nucleic acid may include various different mechanisms, such as, for example, epigenetic editing, genome editing, RNA editing (including, for example, A-to-I and C-to-U editing; see, e.g., https://www.frontiersin.org/articles/10.3389/fendo.2018.00762/full), targeted recombination, regulation of transcription, or modifications of any other process that occurs at specific regions of chromatin, or any combinations thereof.
[0298] α-crystalline domain polypeptides (e.g. sHSPs) from any species may be used for the methods described herein. For example, the small HSP may be a plant small HSP or an animal small HSP. Alternatively, the sHSP may be a bacterial small HSP, a fungal small HSP, a protist small HSP, or an archaeal small HSP. Alternatively, the sHSP may be a modified version of any natural sHSP, such as, for example, a recombinant and/or chimeric sHSP.
[0299] In some embodiments, the methods described herein may entail providing more than one type of α-crystalline domain polypeptide (for example, ACD15 and ACD21) capable of being targeted to the target nucleic acid together or independently. In some embodiments, the methods described herein may entail providing more than one type of α-crystalline domain polypeptide, in which one or more types are targeted to the target nucleic acid, and one or more types are expressed diffusely. In some embodiments, the different types of α-crystalline domain polypeptides are from the same species. In some embodiments, the different types of α-crystalline domain polypeptides are from different species. In some embodiments, the different types of α-crystalline domain polypeptides are non-naturally occurring, such as modifications of naturally occurring α-crystalline domain polypeptides.
[0300] In some embodiments, more than one type of genetic modifier polypeptide capable of being targeted to the target nucleic acid is provided, sequentially or simultaneously.
[0301] In some embodiments, one or more genetic modifier polypeptides are tethered to one or more of the α-crystalline domain polypeptides. In some embodiments, additional, non-tethered α-crystalline domain polypeptides are co-expressed with α-crystalline domain polypeptides that are tethered to a genetic modifier polypeptide.
[0302] In some embodiments, the genetic modifier polypeptide and/or the α-crystalline domain polypeptide comprises a StkyC domain. In some embodiments, the genetic modifier polypeptide and/or the α-crystalline domain polypeptide does not comprise a StkyC domain.
[0303] In some embodiments, the methods and compositions of the present disclosure that involve targeting a genetic modifier polypeptide and an α-crystalline domain polypeptide to a target nucleic acid may be used for making improved transcription factors for reprogramming of human stem cells, or making improved transcription factors for reprogramming plant cells to more easily regenerate plants from tissue culture or other cells.
[0304] In some embodiments, a dCas9 is tethered to a location in the genome that is attached to ACDs in a SunTag system, and then another CRISPR of another type (e.g., CasPhi) is sent to an adjacent location that is also fused with the same ACD so that it becomes concentrated there. This may, for example, result in higher frequency of editing. In some embodiments, a SunTag-ACD forms a condensate, such that, for example, anything fused to the ACD (e.g., a nuclease and/or one or more other peptides, nucleic acids, or other domains or attachments of interest) will then concentrate along with the ACD condensate. For example, in some embodiments, the ACD could be attached to a gRNA and thus concentrated there, or the ACD could be attached to a protein that is fused to a piece of DNA that could be used as a repair template.
[0305] In some embodiments, a CRISPR system comprising a Cas polypeptide and a guide RNA (gRNA) is targeted to a target nucleic acid using the constructs provided herein with a non-truncated (e.g., about 20 bp long) gRNA that provides “normal” DNA cleavage for a corresponding CRISPR system using the same Cas polypeptide. In some embodiments, a CRISPR system is targeted to a target nucleic acid using the constructs provided herein with a truncated (e.g., less than about 20 bp long, such as, for example, up to 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 bp long) gRNA that provides reduced DNA cleavage compared to a corresponding non-truncated gRNA in a CRISPR system using the same Cas polypeptide. In some embodiments, the truncated gRNA is 14 bp long. In some embodiments, the truncated gRNA is 15 bp long. In some embodiments, both truncated and non-truncated gRNAs are co-targeted within the same cell. In some such embodiments, both the co-targeted truncated and non-truncated gRNAs bind to the same type of Cas polypeptide (e.g., Cas9 or CasΦ). In some embodiments, each of the co-targeted truncated and non-truncated gRNAs bind to different types of Cas polypeptides.
[0306] Also provided herein for use in the methods and compositions of the present disclosure are amino acid and/or nucleotide sequences (as applicable) having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one or more of SEQ ID NOs: 1–492.
Kits
[0307] Certain aspects of the present disclosure relate to an article of manufacture or kit comprising a polynucleotide, vector, cell, and/or composition described herein.
In some embodiments, the kit further comprises a package insert comprising instructions for the use of the polynucleotide, vector, cell, and/or composition. In some embodiments, the article of manufacture or kit further comprises one or more buffers, e.g., for storing, transferring, or otherwise using the polynucleotide, vector, cell, and/or composition. In some embodiments, the kit further comprises one or more containers for storing the polynucleotide, vector, cell, and/or composition.
[0308] The foregoing written description is considered to be sufficient to enable one skilled in the art to practice the present disclosure. The following Examples are offered for illustrative purposes only, and are not intended to limit the scope of the present disclosure in any way. Indeed, various modifications of the present disclosure in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and fall within the scope of the appended claims.
EXAMPLES
[0309] The following examples are offered to illustrate provided embodiments and are not intended to limit the scope of the present disclosure.
Example 1: ACD15, ACD21, and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements
Summary
[0310] The Examples provided herein illustrate experiments on recruitment of small heat shock proteins to various genomic loci (for example, with a dead Cas9). Some small heat shock proteins are known to form dynamic oligomeric assemblies (M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology. 427, 1537–1548 (2015)).
The experiments presented demonstrate that recruitment of small heat shock proteins formed nucleation centers that recruited a large number of small heat shock proteins, concentrating them into a nuclear body tethered to the genomic site, and sequestering them away from other locations in the nucleus.
[0311] This Example demonstrates recruitment of small heat shock proteins using a SunTag system, in which dCas9 was fused to 10 copies of a protein epitope repeat sequence. A single chain antibody was then separately expressed, in which the single chain antibody bound to the epitope and was fused to 1) GFP, which allowed its visualization, and 2) the StkyC domain of the MBD6 protein, which is known to recruit the small heat shock proteins ACD15 and ACD21. As detailed herein, this indeed caused nuclear bodies to occur at the genomic sites.
Abstract
[0312] DNA methylation mediates silencing of transposable elements (TEs) and genes in part via recruitment of the Arabidopsis MBD5/6 complex, which contains the methyl-CpG-binding domain (MBD) proteins MBD5 and MBD6, and the J-domain containing protein SILENZIO (SLN). Here we characterize two additional complex members: α-crystalline domain containing proteins ACD15 and ACD21. We show that they are necessary for gene silencing, bridge SLN to the complex, and promote higher order multimerization of MBD5/6 complexes within heterochromatin. These complexes are also highly dynamic, with the mobility of complex components regulated by the activity of SLN. Using a dCas9 system, we demonstrate that tethering the ACDs to an ectopic site outside of heterochromatin can drive massive accumulation of MBD5/6 complexes into large nuclear bodies. These results demonstrate that ACD15 and ACD21 are critical components of gene silencing complexes that act to drive the formation of higher order, dynamic assemblies.
One-Sentence Summary
[0313] Arabidopsis ACD21 and ACD15 drive accumulation of MBD5/6 complex silencing assemblies at methyl-CG sites and recruit SLN to maintain protein mobility in these assemblages.
Materials and Methods
Plant materials and growth conditions
[0314] All plants used in this study were in the Columbia-0 ecotype (Col-0) and were grown on soil in a greenhouse under long-day conditions (16h light / 8h dark). Plants grown for microscopy were plated on 1/2MS plates in growth rooms at room temperature (~25°C), with 16h of light and 8h of dark.
[0315] The following mutant lines were previously described: mbd5 mbd6 T-DNA double mutant composed of mbd5 T-DNA line SAILseq_750_A09.1 and mbd6 T-DNA line SALK_043927 (29); mbd5 mbd6 double mutant composed of mbd5 CRISPR/Cas9-generated indel and mbd6 T-DNA mutation SALK_043927 (29); sln (SALK_090484) (29), fwa rdr6-15 (41), lil-1 (30). Novel mutants and transgenic lines were generated as described below.
Generation of CRISPR lines
[0316] CRISPR/Cas9 mutants for ACD15.5 and ACD21.4 were generated with the pYAO::hSpCas9 system (44). We designed two guide RNAs per gene with the goal of generating large deletions, one of them targeting the beginning of the coding region and another one targeting the end of the gene (FIG. 1H). We were not able to obtain large deletions at these loci, but we found small indels causing frameshifts (FIG. 1H). The guide RNAs were cloned sequentially in the AtU6-26-sgRNA cassette by overlapping PCR. The PCR product was cloned into the SpeI site of the pYAO::hSpCas9 destination plasmid by In-Fusion (Takara, 639650). The procedure was repeated four times (two guides for each gene). The final vector was electroporated into AGL0 agrobacteria and transformed into Col0 or sln mutant plants (SALK_090484).
T1 plants were selected on ½ MS agar plates with hygromycin B and were genotyped by PCR and by Sanger sequencing of PCR-amplified genomic regions surrounding each guide RNA. The lines containing the desired mutations were propagated to identify null segregants for the Cas9 transgene, and to obtain homozygous mutations. Experiments were performed in the T4 generation.
Generation of transgenic lines
[0317] The transgenic lines expressing FLAG-tagged constructs used for IP-MS and ChIP-seq were generated as follows. Genomic DNA was cloned into pENTR/D-TOPO vectors (Thermo Fisher), including endogenous promoters and introns, until the last base before the STOP codon. The MBD5 gene was cloned starting from 1094 bp before the TSS, MBD6 from 294 bp before the TSS, SLN from 2351 bp before the TSS, ACD15.5 from 644 bp before the TSS, and ACD21.4 from 266 bp before the TSS. The genes were then transferred via a Gateway LR Clonase II reaction (Invitrogen, 11791020) into a pEG302-based binary destination vector including a C-terminal 3xFLAG epitope tag. The final vectors were electroporated into AGL0 agrobacteria that were used for plant transformation by agrobacterium-mediated floral dipping. T1 transgenic plants were selected with hygromycin B on ½ MS agar medium or with Basta (glufosinate) on soil. IP-MS and ChIP-seq experiments were done in the T2 or T3 generation.
[0318] Transgenic plants expressing fluorescently tagged proteins were created using the pGWB553 (https://www.addgene.org/74883/), pGWB540 (https://www.addgene.org/74874/), and pGWB543 (https://www.addgene.org/74877/) vectors. Specifically, ACD15, ACD21, and SLN promoters and coding sequences were PCR amplified from genomic DNA (as explained above) and cloned into pENTR vectors. These coding sequences were then inserted into final destination vectors using Gateway LR Clonase II Enzyme mix (catalog number: 11791020, ThermoFisher).
These final destination vectors were then electroporated into AGL0 and transformed into Col0, mbd5 mbd6 (SALK_043927), sln (SALK_090484), acd15 acd21, acd21, acd15, and acd15 acd21 sln (SALK_090484) mutant plants. Positive T1 plants were selected on ½ MS agar plates with hygromycin B and confirmed by western blots using fluorescent protein specific antibodies.
[0319] Transgenic plants expressing SunTagStkyC were created from a previously published SunTagTET1 plasmid using the StkyC domain sequence (amino acids 173-225) of MBD6 (45). The SunTagStkyC was targeted using two guides (Guide 4 (ACGGAAAGATGTATGGGCTT; SEQ ID NO: 152) and Guide 17 (AAAACTAGGCCATCCATGGA; SEQ ID NO: 153)), which were cloned as previously described (46). This plasmid was electroporated into AGL0 and transformed into Col0, mbd5 mbd6 (SALK_043927), acd15 acd21, acd21, acd15, sln (SALK_090484), and fwa rdr6-15 (41). Positive selection of transgenic plants was done on ½ MS agar plates with hygromycin B after 5 days in the dark at 4°C, 8 hours in the light at room temperature, and another 5 days in the dark at room temperature.
Immunoprecipitation-Mass Spectrometry
[0320] IP-MS experiments were performed as previously described (29). Briefly, 8 to 10 g of inflorescences were used for each sample. Frozen tissue was ground with a tissue lyser and resuspended in IP buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 5 mM EDTA, 20% glycerol, 0.1% Tergitol, 0.5 mM DTT, and cOmplete EDTA-free Protease Inhibitor Cocktail [Roche]). Samples were filtered with miracloth, disrupted with a Dounce homogenizer, and centrifuged for 10 min at 4°C at 20,000 g. Supernatants were incubated with 200 μL of M2 magnetic FLAG-beads (SIGMA, M8823) for 2 hours rotating at 4°C. The beads were washed 5 times in IP buffer and eluted with 250 μg/mL 3X-FLAG peptides in TE. The eluted protein complexes were precipitated overnight with 20% trichloroacetic acid (TCA).
Digestion and Desalting
[0321] The protein pellets were resuspended with 100 μl digestion buffer (8M Urea, 0.1M Tris-HCl pH 8.5). Then the samples were reduced and alkylated via sequential 20-minute incubations with 5 mM TCEP and 10 mM iodoacetamide at room temperature in the dark while being mixed at 1200 rpm in an Eppendorf thermomixer. 20 μl of carboxylate-modified magnetic beads (CMMB, also widely known as SP3 (47)) was added to each sample. Ethanol was added to a concentration of 50% to induce protein binding to CMMB. CMMB were washed 3 times with 80% ethanol and then resuspended with 50 μl 50 mM TEAB.
[0322] The protein was digested overnight with 0.1 μg LysC (Promega) and 0.8 μg trypsin (Thermo Scientific, 90057) at 37 °C. Following digestion, 1.2 ml of 100% acetonitrile was added to each sample to increase the final acetonitrile concentration to over 95% to induce peptide binding to CMMB. CMMB were then washed 3 times with 100% acetonitrile and the peptide was eluted with 65 μl of 2% DMSO. Eluted peptide samples were dried by vacuum centrifugation and reconstituted in 5% formic acid before analysis by LC-MS/MS.
LC-MS Acquisition and Analysis
[0323] Peptide samples were separated on a 75 μm ID, 25 cm C18 column packed with 1.9 μm C18 particles (Dr. Maisch GmbH) using a 140-minute gradient of increasing acetonitrile concentration, and injected into a Thermo Orbitrap-Fusion Lumos Tribrid mass spectrometer. MS/MS spectra were acquired using Data Dependent Acquisition (DDA) mode.
[0324] MS/MS database searching was performed using MaxQuant (1.6.10.43) (48) against the Arabidopsis thaliana reference proteome TAIR (Araport11 release).
Chromatin Immunoprecipitation-sequencing (ChIP-seq)
[0325] The anti-FLAG ChIP-seq experiments were performed as previously described (29).
The RFP ChIP-seq experiments (FIGS.3A-3I) were done with the following variations: 1) After sonication and two rounds of centrifugation, 50 μl of ChromoTek RFP-Trap Magnetic beads (Proteintech, Cat No. rtma) were added to each sample for overnight incubation. 2) For elution, 250 μl elution buffer (SDS 1%, NaHCO3 0.1 M) was added, and samples were shaken for 15 min at room temperature. This step was repeated twice to reach 500 μl of final elution volume. 480 μl of eluate was combined with 20 μl of 5M NaCl and incubated in a thermomixer overnight at 65°C and 400 rpm for reverse crosslinking. The following steps were performed as previously described (29).
[0326] ChIP-seq libraries were prepared with the Ovation Ultra Low System V2 1-16 kit (NuGEN, 0344NB-A01) following the manufacturer’s instructions, with 15 cycles of PCR. Final libraries were sequenced with the Illumina NovaSeq 6000 System.
ChIP-seq analysis
[0327] Raw reads were filtered based on quality score and trimmed to remove Illumina adapters using Trim Galore (Babraham Institute). Filtered reads were mapped to the Arabidopsis reference genome (TAIR10) with Bowtie2 (49) with default parameters. PCR duplicates were removed using MarkDuplicates.jar (picard-tools suite, Broad Institute). Genome browser tracks for visualization purposes were generated using deeptools (v 3.0.2) bamCoverage (46) with the options --normalizeUsing RPKM and --binSize 10. To obtain tracks normalized over the no-FLAG control, we used deeptools bamCompare (50) with the “log2” option.
[0328] The analysis of correlation between ChIP-seq data and mCG density was performed as previously described (29), by calculating the sum of CG methylation percentages in 400 bp bins. The data were plotted using the R package ggplot with the option geom_smooth.
[0329] ChIP-seq peaks were called with MACS2 (v 2.1.0) (51) using an FDR cutoff of 0.01.
The FLAG and RFP associated hyperchippable regions, defined as peaks called in the anti-FLAG Col0 or anti-RFP Col0 controls, were subtracted from the peak sets of each sample. The peaks of individual replicates for ACD15 and ACD21 were merged with homer mergePeaks using the option -d given (52). Overlap analysis of different ChIP-seq peak sets was performed with homer mergePeaks using the options -d given and -venn (52).
RT-qPCR
[0330] RNA samples for RT-qPCR experiments were purified using the Direct-zol RNA miniprep kit (catalog number: R2052, Zymo Research) from unopened flower bud tissue or leaf tissue used in Figure 5. cDNA samples were prepared using Superscript IV mastermix (catalog number: 11760500, Invitrogen) from ~400 ng of RNA and qPCR was performed using Bio-Rad SYBR Green mastermix (catalog number: 1708882, Bio-Rad). Each qPCR experiment contained 2 technical replicates for each gene (either FWA or the IPP2 housekeeping control). qPCR results were analyzed using Bio-Rad CFX Maestro software. FWA expression was normalized to expression of the reference gene IPP2, and to the control samples as indicated in each plot (i.e., mbd5 mbd6 mutants or fwa rdr6-15 mutant) using the ΔΔCq method. The data were graphed using GraphPad Prism software. Statistical analysis was done as described in the figure legends. List of primers used for RT-qPCR:
Figure imgf000095_0001
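The ΔΔCq normalization described for FWA and the IPP2 reference gene can be sketched as follows. This is a minimal illustration and not the CFX Maestro implementation: it assumes perfect doubling of product per cycle (100% primer efficiency), and the function names and Cq values are hypothetical placeholders rather than measured data.

```python
from statistics import mean


def delta_delta_cq(target_sample, reference_sample, target_control, reference_control):
    """Return the ΔΔCq value from lists of technical-replicate Cq values."""
    # ΔCq: target gene (FWA) relative to the reference gene (IPP2) in each line.
    delta_sample = mean(target_sample) - mean(reference_sample)
    delta_control = mean(target_control) - mean(reference_control)
    # ΔΔCq: test line relative to the control line.
    return delta_sample - delta_control


def relative_expression(ddcq):
    """Fold change versus the control, assuming the product doubles every cycle."""
    return 2.0 ** (-ddcq)


# Hypothetical Cq values, two technical replicates each (illustration only).
ddcq = delta_delta_cq(
    target_sample=[25.1, 24.9],      # FWA in the test line
    reference_sample=[20.0, 20.0],   # IPP2 in the test line
    target_control=[22.0, 22.0],     # FWA in the control line
    reference_control=[20.0, 20.0],  # IPP2 in the control line
)
fold_change = relative_expression(ddcq)
```

A positive ΔΔCq corresponds to lower FWA expression in the test line than in the control, so the fold change falls below 1.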
RNA-Sequencing
[0331] RNA-sequencing was performed on mature pollen samples isolated as previously described (53), with 6 biological replicates per genotype, grown and processed in 2 batches (3 replicates each). Briefly, 700-1000 μL of open flowers were harvested in 2-mL protein low bind tubes (Eppendorf). 700 μL of Galbraith buffer (45 mM MgCl2, 30 mM C6H5Na3O7·2H2O [trisodium citrate dihydrate], 20 mM MOPS, 0.1% [v/v] Triton X-100, pH 7) supplemented with 70 mM 2-mercaptoethanol were added to the tube, and the flowers were vortexed for 3 min at max speed in the cold room, to release the pollen from the anthers. The extraction procedure was repeated two times, and the two aliquots of pollen in solution were combined. The suspension was filtered with an 80 μm nylon mesh into a new 1.5 mL tube, and then spun down for 5 minutes at 500 g. The supernatant was carefully removed and the pollen was flash frozen with a metal bead. Frozen samples were disrupted with a tissue grinder and RNA extraction was performed with the Zymo Direct-zol RNA MiniPrep kit (Zymo Research), with in-column DNase digestion. ~500 ng of RNA were used as input for library preparation using the TruSeq Stranded mRNA Library Prep Kit (Illumina), according to the manufacturer’s instructions. The final libraries were sequenced with the Illumina NovaSeq 6000 System.
RNA-Sequencing analysis
[0332] RNA-sequencing reads were filtered based on quality score and trimmed to remove Illumina adapters using Trim Galore (Babraham Institute). The filtered reads were mapped to the Arabidopsis reference genome (TAIR10) using STAR (54), allowing 5% of mismatches (--outFilterMismatchNoverReadLmax 0.05) and unique mapping (--outFilterMultimapNmax 1). MarkDuplicates from the Picard Tools suite was used to remove PCR duplicates.
Coverage tracks for visualization in the genome browser were generated using deeptools 3.0.2 bamCoverage with the options --normalizeUsing RPKM and --binSize 10 (50).
[0333] To obtain gene counts, we used a set of reference pollen transcriptome annotations that we previously generated (53) and that are available from GitHub at https://github.com/clp90/mbd56_pollen. We used HTSeq (55) with the option --mode=union to obtain counts for all transcripts (genes, TEs, and other undefined non-coding transcripts). The HTSeq gene counts were used to perform the differential gene expression analysis using the R package DESeq2 (56) with a cutoff for significance of adjusted p-value <0.05 and |log2FC|>1. Figures were generated using the R packages ggplot and UpSetR.
[0334] To determine the promoter CG methylation levels at each transcript (FIG.1D), we first identified promoters as a 600 bp region surrounding the TSS. Then we calculated average CG methylation percentages at promoters using bedtools map (57) with the option “mean”. Our previously published Col0 flower buds BS-seq dataset was used for this analysis and for the representative genome browser tracks: GSM5026060 and GSM5026061 (combined replicates) (29).
Amino Acid Alignment
[0335] Amino acid alignments of MBD5, MBD6, and MBD7 were performed using the Clustal Omega multiple sequence alignment tool (https://www.ebi.ac.uk/Tools/msa/clustalo/). Amino acid sequences of MBD5 (Accession No. Q9SNC0), MBD6 (Accession No. Q9LTJ1), and MBD7 (Accession No. Q9FJF4) were obtained from the UniProt protein database. The alignment was run with default settings.
AlphaFold Multimer Protein Structure Prediction
[0336] Protein structure predictions were run with the AlphaFold Colab notebook (AlphaFold.ipynb, https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb#scrollTo=XUo6foMQxwS2) (33). The standard workflow was followed, and the “run_relax” option was disabled.
The .pdb output files were visualized with PyMOL (DeLano Scientific, LLC).
Leaf Counting
[0337] Leaf counting was performed as described previously (29), where total numbers of rosette and cauline leaves were counted in the T1 generation of plants grown side-by-side under the same conditions.
Confocal Microscopy
[0338] All confocal microscopy experiments were performed using the LSM 980 confocal microscope. Unless otherwise stated, all experiments were performed using a 40x magnification water objective lens. For all experiments using multiple fluorescent tags, we manually gated the excitation and emission spectrum to limit any cross reactivity of the samples.
[0339] Live plant samples were prepared as follows: 2-week-old seedlings were grown on ½ MS plates at room temperature (~25°C) and then transferred using forceps onto 1 mm thick glass slides (FisherScientific, Cat No. 12-550-08) containing de-ionized water (room temperature). Seedlings were oriented such that root tips were on the middle of the slide while leaves were extending from the top of slides. #1.5 coverslips (FisherScientific, Cat No. 12-544-EP) were placed gently on top of the plant, so as not to destroy or stress the seedling. Usually, 1-4 plants were placed on one slide for imaging.
FRAP Experiment and Analysis
[0340] FRAP experiments were performed on an LSM 980 using a 40x magnification water objective lens. Images of a region of interest were obtained as a “snap” in order to circle a region of interest to be bleached. Then an experiment was run such that 5 images were taken followed by a bleaching event using 100% laser power at the excitation wavelength, dependent on the fluorescent protein being imaged, for 300 iterations. Signal was then tracked post bleaching for the indicated amount of time. FRAP analysis was performed using EasyFRAP online analysis software (https://easyfrap.vmnet.upatras.gr/).
Briefly, three regions of interest were measured for each FRAP replicate: the bleached region (1), the specific nucleus region containing the bleached foci (2), and a random region containing no signal in the root of the plant (3). The signals of these three regions across time were added into the Excel template provided by EasyFRAP and uploaded for analysis along with other replicate files (N=25 for each FRAP experiment). Plotted FRAP curves represent full normalization of the data to account for any variations in bleaching depth among samples. FRAP data starting from the bleaching event are plotted using GraphPad Prism software with 95% confidence intervals calculated from normalized FRAP data of FRAP experiment replicates. One-phase association non-linear regressions were fitted to estimate and statistically compare the maximum plateau and t1/2 for each FRAP experiment.
Quantification of foci counts, volume, and nuclear distributions
[0341] All foci counts, volumes, and nuclear distribution plots were quantified using ImageJ image analysis software. Foci counts and volume measurements were obtained using the 3D object counter from 50-slice z-stacks of root meristems across multiple plant lines using thresholding through ImageJ software. Nuclear distributions were obtained using the plot profile feature across a fixed line length in ImageJ after converting images to RGB format. Intensities from nuclear distribution plots were then normalized to the maximum intensity within each replicate to normalize the data distribution. Foci counts, volumes, and nuclear distribution intensity values were all plotted using GraphPad Prism software and statistical analysis was performed using GraphPad Prism software as mentioned in figure legends.
Main Text
[0342] Eukaryotic organisms must properly localize macromolecules within cells to maintain homeostasis.
Membrane bound organelles serve this purpose, but recent discoveries have revealed the existence of membrane-less organelles or compartments (1–3). Often referred to as biological condensates, liquid-liquid phase separated (LLPS) condensates, or supramolecular assemblies, these compartments such as stress granules, p-granules, heterochromatin, and the nucleolus concentrate proteins and nucleic acids to facilitate specific and efficient processes (4–7). If not properly controlled, the accumulation of these protein assemblies can lead to aggregates with detrimental impacts on cellular homeostasis and disease, yet how cells regulate these assemblies remains unclear (3, 8, 9). Molecular chaperones, such as heat shock proteins (HSPs), serve highly conserved roles to regulate the solubility, folding, and aggregation of proteins within cells, making them potential candidates for the regulation of biological condensates (10–14). Small HSPs (sHSPs) use their conserved α-crystalline domains (ACD) to form dimers which then create large and dynamic oligomeric assemblies that act as a first line of defense against protein aggregation via a “holdase” activity (14). sHSPs further recruit J-domain containing proteins (JDPs) which act as co-chaperones for HSP70 proteins to maintain protein homeostasis (14–17). Both sHSPs and JDP/HSP70 pairs have been shown to associate with and regulate disease related cellular condensates across species (18–21).
[0343] In Arabidopsis thaliana, pericentromeric heterochromatin is organized into compartments called chromocenters that are chromatin dense regions containing most of the DNA methylated and constitutively silenced TEs and genes, as well as heterochromatic proteins such as DNA methylation binding complexes (13, 22–26).
Previous work has shown that multiple Arabidopsis DNA methylation binding complexes silence or promote expression of genes through recruitment of molecular chaperones with unknown functions (27–29). For example, MBD5 and MBD6 redundantly silence a subset of TEs and promoter-methylated genes via recruitment of SLN, a JDP. MBD5/6 also interact with two ACD containing proteins called ACD15.5/RDS2 and ACD21.4/RDS1, hereafter referred to as ACD15 and ACD21 (29–31). While ACD15 and ACD21 have been implicated in silencing of a transgene reporter, their specific chromatin functions remain unknown (31). Here we demonstrate that ACD15 and ACD21 are necessary and sufficient for the accumulation of high density MBD6 at methylated CG sites to silence genes and TEs, while also bridging SLN to MBD5/6 to maintain the high mobility of all complex components. We further demonstrate that MBD5/6 complex assemblies can be formed at discrete foci outside of chromocenters, in an ACD15 and ACD21 dependent manner, to cause gene silencing.
ACD15 and ACD21 colocalize with MBD5 and MBD6 genome-wide and are essential for silencing
[0344] We previously observed that MBD5, MBD6 and SLN pulled down two ACD containing proteins named ACD15 and ACD21 (29). To investigate their binding patterns on chromatin, we performed Chromatin Immunoprecipitation sequencing (ChIP-seq) of FLAG-tagged ACD15 and ACD21. We observed that all five proteins colocalized genome-wide, and none of them appeared to have truly unique ChIP-seq peaks, suggesting that they could be recruited to DNA together as a complex (FIG.1A, FIG.1B, FIG.1G). Furthermore, ACD15 and ACD21 showed a non-linear correlation with meCG density similar to MBD6 and SLN (29), suggesting that MBD5/6 complex members all accumulate preferentially at high density meCG sites (FIG.1C).
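The meCG density used for this correlation is described in the Methods as the sum of CG methylation percentages in 400 bp bins. A minimal sketch of that binning step is given here, assuming 0-based CG positions on a single chromosome and per-site methylation percentages already extracted from the BS-seq data; the function name and the input layout are hypothetical and are not taken from the published pipeline.

```python
from collections import defaultdict

BIN_SIZE = 400  # bp, as stated in the Methods


def mcg_density_by_bin(cg_sites, bin_size=BIN_SIZE):
    """Sum CG methylation percentages per fixed-width bin.

    cg_sites: iterable of (position, methylation_percent) pairs for one
    chromosome, with 0-based positions (an assumption of this sketch).
    Returns a dict mapping bin index -> summed methylation percentage.
    """
    totals = defaultdict(float)
    for position, percent in cg_sites:
        totals[position // bin_size] += percent
    return dict(totals)


# Hypothetical CG sites (illustration only): two fall in bin 0, one in bin 1.
example_sites = [(10, 100.0), (399, 50.0), (400, 80.0)]
density = mcg_density_by_bin(example_sites)
```

Summing rather than averaging means that a bin scores highly only when it contains many CG sites that are also heavily methylated, which is the sense of "high density" used in the text.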
[0345] To test whether ACD15 and ACD21 are required for silencing we generated acd15 and acd21 single mutants, an acd15 acd21 double mutant, and an acd15 acd21 sln triple mutant via CRISPR/Cas9 (FIG.1H). RNA-seq analysis revealed that all mutants showed very similar transcriptional derepression patterns at DNA methylated genes and TEs as compared to mbd5 mbd6 and sln mutants (FIGS.1D-1F, FIGS.1I-1J). This includes the FWA gene which was previously shown to be silenced by the MBD5/6 complex (FIG.1F) (29). These results demonstrate that ACD15 and ACD21 are critical components of the MBD5/6 complex required for silencing.
ACD15 and ACD21 bridge SLN to MBD5 and MBD6
[0346] To determine the specific organization of the MBD5/6 complex, we performed IP-MS experiments using FLAG-tagged transgenic lines for each complex component in different mutant backgrounds (FIG.2A). In the wild type Col0 background, ACD15 and ACD21 pulled down each other, MBD5, MBD6, SLN, and the same HSP70 proteins that were found to interact with the MBD5/6 complex previously (29) (FIG.2A). MBD5 and MBD6 pulled down peptides of ACD15 and ACD21 in the absence of SLN, while SLN did not pull down MBD5 and MBD6 in the absence of ACD15 and ACD21, suggesting that ACD15 and ACD21 bridge the interaction between MBD5/6 and SLN (FIGS.2A-2B). Consistent with this model, ACD15 and ACD21 pulled down MBD5 and MBD6 in the sln mutant background, and SLN pulled down ACD15 and ACD21 in the mbd5 mbd6 mutant background (FIG.2A). ACD15 also pulled down MBD5 and MBD6 but not SLN in the absence of ACD21, while ACD21 did not pull down MBD5 and MBD6 in the absence of ACD15 (FIG.2A). These results suggest that the MBD5/6 complex is organized such that MBD5 or MBD6 interact with ACD15, ACD15 interacts with ACD21, and ACD21 interacts with SLN (29) (FIG.2B).
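The reasoning behind this linear model can be illustrated with a toy connectivity check: if the complex is a chain MBD5/6 - ACD15 - ACD21 - SLN, a bait should co-purify a prey only while every protein between them is present. The sketch below is an illustration of that logic under simplifying assumptions, not an analysis of the IP-MS data. MBD5 and MBD6 are collapsed into a single node, HSP70 and stoichiometry are ignored, and the names `CONTACTS` and `co_purifies` are hypothetical.

```python
from collections import deque

# Assumed direct contacts in the proposed chain: MBD5/6 - ACD15 - ACD21 - SLN.
CONTACTS = {
    "MBD5/6": {"ACD15"},
    "ACD15": {"MBD5/6", "ACD21"},
    "ACD21": {"ACD15", "SLN"},
    "SLN": {"ACD21"},
}


def co_purifies(bait, prey, absent=frozenset()):
    """Return True if prey stays connected to bait once `absent` proteins are removed."""
    if bait in absent or prey in absent:
        return False
    seen = {bait}
    queue = deque([bait])
    # Breadth-first search over the remaining contacts.
    while queue:
        protein = queue.popleft()
        if protein == prey:
            return True
        for partner in CONTACTS[protein]:
            if partner not in absent and partner not in seen:
                seen.add(partner)
                queue.append(partner)
    return False


# Predictions of the chain model for the mutant backgrounds described above.
sln_reaches_mbd_without_acds = co_purifies("SLN", "MBD5/6", absent={"ACD15", "ACD21"})
acd15_reaches_mbd_without_acd21 = co_purifies("ACD15", "MBD5/6", absent={"ACD21"})
acd21_reaches_mbd_without_acd15 = co_purifies("ACD21", "MBD5/6", absent={"ACD15"})
```

Under these assumptions the chain reproduces the reported pattern: SLN loses MBD5/6 without the ACDs, ACD15 keeps MBD5/6 without ACD21, and ACD21 loses MBD5/6 without ACD15.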
[0347] To further study the organization and localization of MBD5/6 complex components we used live confocal imaging of root tips to determine the cellular localization of fluorescent-protein-tagged ACD15, ACD21, SLN, and MBD6. In wild-type plants, ACD15, ACD21, and SLN all showed clear nuclear localization which correlated strongly with nuclear MBD6 (FIGS.2C-2E and FIGS.2G-2I). ACD21, ACD15, and SLN all showed an increase in cytosolic signal in mbd5 mbd6 mutant plants which was rescued by coexpressing MBD6, demonstrating that all members of the complex require genetically redundant MBD5 or MBD6 for proper nuclear localization (FIGS.2C-2E). The reduction of nuclear localization of SLN is also consistent with previous ChIP-seq experiments showing loss of chromatin bound SLN in the absence of MBD5 and MBD6 (29). ACD15 maintained nuclear localization and correlation with MBD6 in acd15 acd21 and sln mutant plants whereas ACD21 lost nuclear localization and correlation with MBD6 in acd15 and acd15 acd21 mutants, but not in the sln mutant (FIGS.2C-2D, FIGS.2J-2M). Finally, SLN nuclear localization and correlation with MBD6 decreased in acd15, acd21, and acd15 acd21 mutant plants (FIG.2E, FIGS.2N-2O). These results demonstrate that ACD21 requires ACD15 for proper nuclear localization, while SLN requires both ACD15 and ACD21, consistent with the complex organization model suggested by IP-MS experiments (FIG.2B).
[0348] We used the protein folding algorithm AlphaFold Multimer to predict protein-protein interactions within the MBD5/6 complex (32, 33). AlphaFold Multimer confidently predicted that ACD15 interacts with MBD6 (or MBD5), that ACD15 interacts with ACD21, and that ACD21 interacts with SLN, all consistent with our experimental data from IP-MS and confocal microscopy (FIG.2F).
When given two copies of each member of the complex (MBD6, ACD15, ACD21, and SLN), AlphaFold Multimer also confidently predicted that ACD15 and ACD21 form a dimer of two heterodimers in the middle of the structure, suggesting that the MBD5/6 complex likely contains at least two copies of each protein (FIGS.2P-2R). This is consistent with previous results showing dimer formation by other ACD containing sHSPs (34). Given the genetic redundancy of MBD5 and MBD6, the complex would be predicted to contain a minimum of two MBD5s, two MBD6s, or one MBD5 plus one MBD6 (FIG.2R). In line with this prediction, we found that MBD5 and MBD6 pull down each other in IP-MS data in wild-type, but not in the acd15 acd21 double mutant background, indicating that ACD15/ACD21 facilitate interaction between two MBD5/6 proteins (FIG.2A, FIG.2R).
ACD15, ACD21, and SLN regulate heterochromatic localization, accumulation, and dynamics of the MBD5/6 complex
[0349] Given the known role of molecular chaperones in the regulation of protein complexes and aggregates (35), we hypothesized that ACD15, ACD21, and SLN may regulate the dynamics of MBD5/6 nuclear complexes. To test this, we measured the nuclear localization and mobility of MBD6 in root cells using live-cell fluorescence confocal microscopy. In wild-type and mbd5 mbd6 mutant plants, MBD6 formed foci, which colocalized with ACD15, ACD21, and SLN foci (FIGS.3A-3B, FIG.3J). MBD6 foci also overlapped with DAPI-staining chromocenters, as previously shown when MBD6 was overexpressed in leaf cells (FIG.3K) (36). To measure the mobility of MBD6 protein we used fluorescence recovery after photobleaching (FRAP) experiments (37). FRAP in wild-type plants revealed that MBD6 moves rapidly within nuclei with a FRAP recovery half time (t1/2) of ~3.60 seconds back into chromocenters after bleaching (FIGS.3C-3D, FIG.3M).
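The Methods state that t1/2 values were estimated from one-phase association fits. The relation between the fitted rate constant and the half time can be sketched as follows, assuming the standard one-phase association form Y = Y0 + (Plateau - Y0)(1 - exp(-K t)), for which t1/2 = ln(2)/K. The function names are hypothetical, no fitting is performed here, and the normalized intensities (Y0 = 0, plateau = 1) are illustrative rather than measured.

```python
import math


def one_phase_association(t, y0, plateau, k):
    """Signal at time t after bleaching for a one-phase association curve."""
    return y0 + (plateau - y0) * (1.0 - math.exp(-k * t))


def half_time(k):
    """Time needed to recover half of the span between y0 and the plateau."""
    return math.log(2) / k


def rate_constant(t_half):
    """Rate constant K corresponding to a given recovery half time."""
    return math.log(2) / t_half


# Rate constant implied by the ~3.60 s half time reported for MBD6 in wild type.
k_wild_type = rate_constant(3.60)

# With illustrative normalized values (y0 = 0, plateau = 1), the curve reaches
# half of its span at t = t1/2.
signal_at_half_time = one_phase_association(3.60, y0=0.0, plateau=1.0, k=k_wild_type)
```

In this form a lower plateau corresponds to incomplete recovery (a larger immobile fraction) and a smaller K to slower exchange, the two quantities compared between genotypes in the FRAP experiments.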
[0350] We next tested whether MBD6 nuclear distribution or mobility was altered in sln mutants. Although MBD6 formed a similar number of nuclear foci in sln compared to wild-type plants, these foci showed somewhat reduced fluorescence intensity, suggesting that MBD6 was accumulating less efficiently within heterochromatin (FIGS.3A, 3E, 3F). FRAP of MBD6 in sln mutant plants revealed a dramatic reduction in mobility and a lack of full recovery of signal post bleaching (FIGS.3C-3D and 3M). Similar FRAP experiments on ACD15 and ACD21 nuclear foci showed that both were highly mobile in wild-type (t1/2 of 3.63 and 4.30 seconds respectively), but were much less mobile and failed to recover full signal in sln mutant plants (FIGS.3M-3Q), and also showed decreased fluorescence intensity of foci in sln compared to wild-type (FIGS.3R-3S). SLN thus regulates the mobility, and to a lesser extent the accumulation, of the MBD5/6 complex.
[0351] Given the IP-MS, microscopy, and structure prediction results showing that ACD15 and ACD21 bridge the interaction between MBD6 and SLN, we expected acd15 acd21 mutants to alter the FRAP mobility of MBD6 in a manner similar to sln mutants. However, we found that the number of MBD6 foci was dramatically lower in acd15 acd21 mutant plants compared to wild-type plants, with only occasional MBD6 foci observed (FIGS.3A, 3B, 3E). Instead, MBD6 nuclear signal in acd15 acd21 mutant plants was more diffusely distributed across nuclei compared to either wild-type or sln plants (FIG.3A). A decreased number of MBD6 foci and a lack of overlap of these foci with DAPI stained chromocenters was also observed in acd15, acd21, and acd15 acd21 sln mutant plants (FIGS.3A, 3L). Thus, ACD15 and ACD21 are required for MBD6 to efficiently concentrate into nuclear foci.
This effect was specific to ACD15 and ACD21 since loss of IDM3 (LIL), an ACD protein in the MBD7 complex (28), did not affect the MBD6 nuclear foci (FIGS.3T-3U).
[0352] We performed ChIP-seq on MBD6 in acd15 acd21 mutant plants to quantify the impact on MBD6 chromatin localization. In wild-type plants, MBD6 localized to previously published MBD6 peaks (29) and showed a non-linear correlation with meCG density, displaying strong enrichment at highly dense methylated regions (FIGS.3G-3I). However, MBD6 chromatin enrichment in acd15 acd21 mutant plants, although not abolished, was decreased dramatically and showed much less preference for binding to high density meCG sites (FIGS.3G-3I). Thus, while ACD15/ACD21 are not necessary for MBD6 to bind meCG sites, they are needed for high accumulation of MBD6 at high density meCG sites, which is consistent with the decrease of observable MBD6 foci in acd mutants (FIGS.3A-3B).
[0353] Taken together, these results demonstrate that ACD15 and ACD21 are required for high level accumulation of MBD5/6 complexes in chromocenters and at high density meCG sites, while SLN regulates the mobility of these complexes to maintain dynamic recycling of proteins.
The StkyC domain of MBD6 is required for gene silencing and recruits ACD15 to the complex
[0354] The AlphaFold-predicted structure of MBD6 reveals two structured domains, the MBD and a C-terminal domain of unknown function, as well as two intrinsically disordered regions (IDRs) (FIGS.4A, 4I). The C-terminal folded domain shares amino acid similarity with the C-terminus of two related MBD proteins, MBD5 and MBD7 (FIG.4J). This region of MBD7 has been termed the StkyC domain, and is the proposed binding site for the ACD containing IDM3 protein, which belongs to the same family as ACD15 and ACD21 (28).
This suggests that the StkyC of MBD6 would interact with ACD15, and indeed this interaction is confidently predicted by AlphaFold Multimer (FIGS.2F, 2P-2Q, 4I).
[0355] To experimentally determine what domains of MBD6 are necessary for gene silencing and chaperone interactions we first truncated the N-terminus (MBD6NΔ (leaving amino acids 66-224)) or the C-terminus of MBD6 (MBD6CΔ (leaving amino acids 1-146)) (fig. S4C). To test if these mutants are functional for silencing, we performed RT-qPCR of the FWA gene, a target of the MBD5/6 complex (FIG.1F) (29), in mbd5 mbd6 mutant plants expressing full-length or truncated MBD6 alleles. FWA derepression in mbd5 mbd6 plants was rescued by full length MBD6-RFP or MBD6NΔ-RFP, but not by MBD6CΔ-RFP, showing that the middle IDR and/or the StkyC domain are required for MBD6 function (FIG.4L). MBD6CΔ also showed a dramatic reduction in nuclear foci compared to full length MBD6, a phenotype similar to that observed in acd15 acd21 mutants and consistent with loss of the ACD15 binding site (FIGS.4B-4C, 4M).
[0356] To test if the StkyC domain was critical, we added back the StkyC domain (amino acids 167-224) to MBD6CΔ (MBD6CΔ+StkyC). MBD6CΔ+StkyC was able to rescue MBD6 nuclear foci counts, and complemented the derepression of FWA in the mbd5 mbd6 mutant (FIGS.4B-4E). Importantly, MBD6CΔ+StkyC expressed in acd15 acd21 mutant plants formed very few nuclear foci, similar to the low number of MBD6CΔ foci in wild-type plants, demonstrating that foci localization rescue by the StkyC domain requires ACD15 and ACD21 (FIG.4D).
[0357] To determine if the StkyC domain is responsible for localizing ACD15 and ACD21 to the MBD5/6 complex, we performed fluorescent protein colocalization experiments by co-expressing ACD21-CFP or ACD15-YFP with MBD6-RFP, MBD6CΔ-RFP, or MBD6CΔ+StkyC RFP in mbd5 mbd6 mutants.
ACD15 and ACD21 both strongly correlated with full length MBD6 (Pearson correlation coefficient (r) of 0.96 and 0.86 respectively) and overlapped well with MBD6 signal across root nuclei, whereas ACD15 and ACD21 showed much weaker correlations with MBD6CΔ (r = 0.67 and 0.46, respectively) and lost overlap with MBD6CΔ nuclear signal (FIGS.4F-4G, 4N-4O). ACD15 and ACD21 also showed visibly higher cytosolic signal and lower nuclear signal when co-expressed with MBD6CΔ in mbd5 mbd6 (FIGS.4F-4G and 4N-4O). The addition of the StkyC domain (MBD6CΔ+StkyC) restored the correlation of ACD15 and ACD21 with MBD6 (r = 0.94 and 0.84 respectively), restored the overlap of ACD15 and ACD21 with MBD6 nuclear signal, and reversed the cytosolic localization of ACD15 and ACD21 (FIGS.4F-4G and 3O-3P).
[0358] To further test if ACD15 is needed for ACD21 to associate with MBD6CΔ+StkyC, we colocalized ACD15 and ACD21 with MBD6CΔ+StkyC in wild-type or acd15 acd21 double mutant plants (FIG.4H). ACD21 showed a reduced correlation with MBD6CΔ+StkyC in acd15 acd21 plants compared to wild type (0.48 vs 0.79), a reduction of colocalization with MBD6 across nuclei, and a visible increase in ACD21 cytosolic localization, suggesting that ACD21 requires ACD15 to associate properly with MBD6CΔ+StkyC (FIGS.4I and 4P-4Q). On the other hand, ACD15 correlated strongly with MBD6CΔ+StkyC in acd15 acd21 plants (r = 0.92), maintained strong nuclear signal, and directly overlapped with nuclear MBD6, demonstrating that ACD15 does not require ACD21 for proper localization with MBD6CΔ+StkyC (FIGS.4H and 4R). These experiments demonstrate that the StkyC domain of MBD6 is required for the function of the MBD5/6 complex, is needed for proper localization of ACD15 and ACD21, and mediates the accumulation of MBD6 at heterochromatic foci through ACD15 and ACD21.
ACD15 and ACD21 can mediate functional and targeted gene silencing foci
[0359] Some ACD-containing sHSP proteins are known to form dynamic oligomeric assemblies as part of their function in maintaining protein homeostasis (16), which could explain how ACD15/ACD21 drive high levels of MBD5/6 complex accumulation at meCG-dense heterochromatin. To further explore this concept, we created a system to target MBD5/6 complexes to a discrete genomic location outside of pericentromeric heterochromatin. We utilized the SunTag system (38), composed of a dead Cas9 protein (dCas9) fused to ten single-chain variable fragment (scFv) binding sites, targeted to the promoter of the euchromatic FWA gene (39). To nucleate MBD5/6 foci at the dCas9 binding site, we fused the scFv to the StkyC domain of MBD6 and to GFP, to visualize the nuclear distribution of the fusion proteins (FIG. 5A).
[0360] If ACD15 and ACD21 drive higher order multimerization of MBD5/6 complexes, we would expect to observe discrete GFP foci in nuclei representing the dCas9 binding sites, as well as other foci corresponding to chromocenters, since the scFv-GFP-StkyC fusion would likely be recruited into multimerized MBD5/6 complexes at heterochromatin sites (FIG. 5A). Indeed, we observed an average of 6.4 GFP foci per nucleus in SunTagStkyC-expressing wild-type plants (FIGS. 5B, 5C), some of which overlapped with DAPI-staining chromocenters and others that did not (FIG. 5H). We also transformed SunTagStkyC into the mbd5 mbd6 mutant, which would be predicted to eliminate recruitment of the scFv-GFP-StkyC fusion protein into chromocenters by eliminating the meCG-bound endogenous MBD5/6 complexes. As predicted, we now observed an average of only two foci per nucleus (FIGS. 5B, 5C), likely corresponding to the FWA alleles on the two homologous chromosomes. Consistent with these foci representing dCas9 bound to euchromatic FWA (39), these foci did not overlap with DAPI-staining chromocenters (FIG. 5I).
Notably, the volume of SunTagStkyC foci was increased in mbd5 mbd6 (FIG. 5D), with the vast majority of nuclear GFP signal accumulating at the two nuclear bodies (FIG. 5B), suggesting that excess scFv-GFP-StkyC fusion protein shifted from heterochromatic regions to the dCas9 binding sites.
[0361] We also expressed the SunTagStkyC system in acd15 acd21 mutants to determine whether ACD15 and ACD21 are required for foci formation. Indeed, SunTagStkyC now displayed only diffuse nucleoplasmic GFP signal, lacking detectable foci (FIGS. 5B, 5C). This pattern was similar to that of control plants expressing a SunTag-TET1 system (40), in which the scFv was fused to GFP and the human TET1 protein, suggesting that the GFP foci observed in SunTagStkyC are not a general property or artifact of the SunTag system (FIG. 5J). We also introduced the SunTagStkyC system into the sln genetic background and observed GFP foci counts and localization similar to those in wild-type plants, with around 6.1 foci per nucleus (FIGS. 5B, 5C). These results demonstrate that ACD15/21 are necessary and sufficient to drive high-level accumulation of MBD5/6 complexes at discrete foci.
[0362] We next tested whether the foci formed by the SunTagStkyC system are capable of gene silencing. The FWA gene is normally methylated and silent in wild-type plants. However, stably unmethylated and expressed fwa epigenetic alleles exist that cause a late-flowering phenotype (41, 42). This allowed us to test whether the SunTagStkyC system could silence FWA by introducing the system into the fwa epigenetic background. Indeed, we found a significant suppression of FWA expression compared to fwa control plants (FIG. 5E). fwa plants expressing SunTagStkyC also flowered earlier on average, showing a decrease in the number of leaves produced before flowering compared to fwa mutant plants (FIGS. 5F-5G).
FWA expression correlated strongly and positively with leaf counts in fwa SunTagStkyC plants, as expected (FIG. 5K).
[0363] Lastly, we tested whether the SunTagStkyC system could complement the FWA derepression phenotype of MBD5/6 complex mutants (FIG. 1E) (29). Interestingly, SunTagStkyC was able to silence FWA in mbd5 mbd6 mutant plants, demonstrating that the tethering function of MBD6 could be largely replaced by targeting with the StkyC domain, and that silencing can occur without the methyl-binding proteins (FIG. 5L). Surprisingly, SunTagStkyC could also partially complement FWA derepression in the sln mutant background, whereas it could not complement FWA derepression in the acd15 acd21 mutant background (FIGS. 5M-5N). These results demonstrate that the SunTagStkyC system maintains some gene silencing capability without SLN, suggesting that ACD15 and ACD21 alone possess silencing ability.
Concluding remarks
[0364] Our results provide evidence for distinct mechanistic roles for ACD15, ACD21, and SLN in the formation and regulation of the meCG-specific MBD5/6 silencing complex (FIG. 6). ACD15 and ACD21 function both to drive the formation of higher order MBD5/6 complex assemblies and to bridge SLN to the complex. In contrast, the main role of SLN appears to be regulation of the dynamics of protein mobility within these complex assemblies. The activities of both ACD15/21 and SLN are clearly required for proper silencing function of the complex. The accumulation of multiple MBD5/6 proteins into higher order complex assemblies can explain why these complexes preferentially localize to high-density meCG sites in the genome, likely via cooperative binding to closely spaced meCG sites.
[0365] ACD domain-containing small HSPs are found in all eukaryotic lineages and are best known for their role in regulating the aggregation of proteins (14, 15, 17, 34, 43).
In the MBD5/6 complex, however, the oligomerization capacities of ACD15 and ACD21 are specifically co-opted to control complex multimerization and silencing function. It seems likely that ACD proteins in other systems may also play important roles outside of general protein homeostasis.
References
1. S. F. Banani, H. O. Lee, A. A. Hyman, M. K. Rosen, Biomolecular condensates: organizers of cellular biochemistry. Nat Rev Mol Cell Biol. 18, 285–298 (2017).
2. A. A. Hyman, C. A. Weber, F. Jülicher, Liquid-Liquid Phase Separation in Biology. Annual Review of Cell and Developmental Biology. 30, 39–58 (2014).
3. Y. Shin, C. P. Brangwynne, Liquid phase condensation in cell physiology and disease. Science. 357 (2017), doi:10.1126/science.aaf4382.
4. M. Feric, N. Vaidya, T. S. Harmon, D. M. Mitrea, L. Zhu, T. M. Richardson, R. W. Kriwacki, R. V. Pappu, C. P. Brangwynne, Coexisting Liquid Phases Underlie Nucleolar Subcompartments. Cell. 165, 1686–1697 (2016).
5. D. L. J. Lafontaine, J. A. Riback, R. Bascetin, C. P. Brangwynne, The nucleolus as a multiphase liquid condensate. Nat Rev Mol Cell Biol. 22, 165–182 (2021).
6. A. Boija, I. A. Klein, B. R. Sabari, A. Dall’Agnese, E. L. Coffey, A. V. Zamudio, C. H. Li, K. Shrinivas, J. C. Manteiga, N. M. Hannett, B. J. Abraham, L. K. Afeyan, Y. E. Guo, J. K. Rimel, C. B. Fant, J. Schuijers, T. I. Lee, D. J. Taatjes, R. A. Young, Transcription Factors Activate Genes through the Phase-Separation Capacity of Their Activation Domains. Cell. 175, 1842-1855.e16 (2018).
7. J. E. Henninger, O. Oksuz, K. Shrinivas, I. Sagi, G. LeRoy, M. M. Zheng, J. O. Andrews, A. V. Zamudio, C. Lazaris, N. M. Hannett, T. I. Lee, P. A. Sharp, I. I. Cissé, A. K. Chakraborty, R. A. Young, RNA-Mediated Feedback Control of Transcriptional Condensates. Cell. 184, 207-225.e24 (2021).
8. A. Zbinden, M. Pérez-Berlanga, P. De Rossi, M. Polymenidou, Phase Separation and Neurodegenerative Diseases: A Disturbance in the Force.
Developmental Cell. 55, 45–68 (2020).
9. M. A. Mensah, H. Niskanen, A. P. Magalhaes, S. Basu, M. Kircher, H. L. Sczakiel, A. M. V. Reiter, J. Elsner, P. Meinecke, S. Biskup, B. H. Y. Chung, G. Dombrowsky, C. Eckmann-Scholz, M. P. Hitz, A. Hoischen, P.-M. Holterhus, W. Hülsemann, K. Kahrizi, V. M. Kalscheuer, A. Kan, M. Krumbiegel, I. Kurth, J. Leubner, A. C. Longardt, J. D. Moritz, H. Najmabadi, K. Skipalova, L. Snijders Blok, A. Tzschach, E. Wiedersberg, M. Zenker, C. Garcia-Cabau, R. Buschow, X. Salvatella, M. L. Kraushar, S. Mundlos, A. Caliebe, M. Spielmann, D. Horn, D. Hnisz, Aberrant phase separation and nucleolar dysfunction in rare genetic diseases. Nature. 614, 564–571 (2023).
10. M. H. Al-Whaibi, Plant heat-shock proteins: A mini review. Journal of King Saud University - Science. 23, 139–150 (2011).
11. F. Hennessy, W. S. Nicoll, R. Zimmermann, M. E. Cheetham, G. L. Blatch, Not all J domains are created equal: Implications for the specificity of Hsp40–Hsp70 interactions. Protein Sci. 14, 1697–1709 (2005).
12. R. Imamoglu, D. Balchin, M. Hayer-Hartl, F. U. Hartl, Bacterial Hsp70 resolves misfolded states and accelerates productive folding of a multi-domain protein. Nature Communications. 11, 365 (2020).
13. H. H. Kampinga, C. Andreasson, A. Barducci, M. E. Cheetham, D. Cyr, C. Emanuelsson, P. Genevaux, J. E. Gestwicki, P. Goloubinoff, J. Huerta-Cepas, J. Kirstein, K. Liberek, M. P. Mayer, K. Nagata, N. B. Nillegoda, P. Pulido, C. Ramos, P. De los Rios, S. Rospert, R. Rosenzweig, C. Sahi, M. Taipale, B. Tomiczek, R. Ushioda, J. C. Young, R. Zimmermann, A. Zylicz, M. Zylicz, E. A. Craig, J. Marszalek, Function, evolution, and structure of J-domain proteins. Cell Stress and Chaperones. 24, 7–15 (2019).
14. M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019).
15. J. M. Webster, A. L. Darling, V. N. Uversky, L. J.
Blair, Small Heat Shock Proteins, Big Impact on Protein Aggregation in Neurodegenerative Disease. Front Pharmacol. 10 (2019), doi:10.3389/fphar.2019.01047.
16. M. Haslbeck, E. Vierling, A First Line of Stress Defense: Small Heat Shock Proteins and Their Function in Protein Homeostasis. Journal of Molecular Biology. 427, 1537–1548 (2015).
17. W. C. Boelens, Structural aspects of the human small heat shock proteins related to their functional activities. Cell Stress and Chaperones. 25, 581–591 (2020).
18. H. Yoo, J. A. M. Bard, E. Pilipenko, D. A. Drummond, “Chaperones directly and efficiently disperse stress-triggered biomolecular condensates” (2021), p. 2021.05.13.444070, doi:10.1101/2021.05.13.444070.
19. H. Yu, S. Lu, K. Gasior, D. Singh, S. Vazquez-Sanchez, O. Tapia, D. Toprani, M. S. Beccari, J. R. Yates, S. D. Cruz, J. M. Newby, M. Lafarga, A. S. Gladfelter, E. Villa, D. W. Cleveland, HSP70 chaperones RNA-free TDP-43 into anisotropic intranuclear liquid spherical shells. Science (2020), doi:10.1126/science.abb4309.
20. Z. Liu, S. Zhang, J. Gu, Y. Tong, Y. Li, X. Gui, H. Long, C. Wang, C. Zhao, J. Lu, L. He, Y. Li, Z. Liu, D. Li, C. Liu, Hsp27 chaperones FUS phase separation under the modulation of stress-induced phosphorylation. Nat Struct Mol Biol. 27, 363–372 (2020).
21. E. W. J. Wallace, J. L. Kear-Scott, E. V. Pilipenko, M. H. Schwartz, P. R. Laskowski, A. E. Rojek, C. D. Katanski, J. A. Riback, M. F. Dion, A. M. Franks, E. M. Airoldi, T. Pan, B. A. Budnik, D. A. Drummond, Reversible, Specific, Active Aggregates of Endogenous Proteins Assemble upon Heat Stress. Cell. 162, 1286–1298 (2015).
22. E. Chytilova, J. Macas, E. Sliwinska, S. M. Rafelski, G. M. Lambert, D. W. Galbraith, Nuclear Dynamics in Arabidopsis thaliana. Mol Biol Cell. 11, 2733–2741 (2000).
23. R. J. Emenecker, A. S. Holehouse, L. C. Strader, Annual Review of Plant Biology, in press, doi:10.1146/annurev-arplant-081720-015238.
24. S. Zhao, L. Cheng, Y. Gao, B. Zhang, X. Zheng, L. Wang, P. Li, Q.
Sun, H. Li, Plant HP1 protein ADCP1 links multivalent H3K9 methylation readout to heterochromatin formation. Cell Res. 29, 54–66 (2019).
25. G. Grafi, A. Zemach, L. Pitto, Methyl-CpG-binding domain (MBD) proteins in plants. Biochim. Biophys. Acta. 1769, 287–294 (2007).
26. S. Feng, Z. Zhong, M. Wang, S. E. Jacobsen, Efficient and accurate determination of genome-wide DNA methylation patterns in Arabidopsis thaliana with enzymatic methyl sequencing. Epigenetics & Chromatin. 13, 42 (2020).
27. C. J. Harris, M. Scheibe, S. P. Wongpalee, W. Liu, E. M. Cornett, R. M. Vaughan, X. Li, W. Chen, Y. Xue, Z. Zhong, L. Yen, W. D. Barshop, S. Rayatpisheh, J. Gallego-Bartolome, M. Groth, Z. Wang, J. A. Wohlschlegel, J. Du, S. B. Rothbart, F. Butter, S. E. Jacobsen, A DNA methylation reader complex that enhances gene transcription. Science. 362, 1182–1186 (2018).
28. Z. Lang, M. Lei, X. Wang, K. Tang, D. Miki, H. Zhang, S. K. Mangrauthia, W. Liu, W. Nie, G. Ma, J. Yan, C.-G. Duan, C.-C. Hsu, C. Wang, W. A. Tao, Z. Gong, J.-K. Zhu, The Methyl-CpG-Binding Protein MBD7 Facilitates Active DNA Demethylation to Limit DNA Hyper-Methylation and Transcriptional Gene Silencing. Molecular Cell. 57, 971–983 (2015).
29. L. Ichino, B. A. Boone, L. Strauskulage, C. J. Harris, G. Kaur, M. A. Gladstone, M. Tan, S. Feng, Y. Jami-Alahmadi, S. H. Duttke, J. A. Wohlschlegel, X. Cheng, S. Redding, S. E. Jacobsen, MBD5 and MBD6 couple DNA methylation to gene silencing through the J-domain protein SILENZIO. Science (2021), doi:10.1126/science.abg6130.
30. D. Li, A. M. S. Palanca, S. Y. Won, L. Gao, Y. Feng, A. A. Vashisht, L. Liu, Y. Zhao, X. Liu, X. Wu, S. Li, B. Le, Y. J. Kim, G. Yang, S. Li, J. Liu, J. A. Wohlschlegel, H. Guo, B. Mo, X. Chen, J. A. Law, The MBD7 complex promotes expression of methylated transgenes without significantly altering their methylation status. eLife. 6, e19893 (2017).
31. Z. Feng, X. Zhan, J. Pang, X. Liu, H. Zhang, Z. Lang, J.-K.
Zhu, Genetic analysis implicates a molecular chaperone complex in regulating epigenetic silencing of methylated genomic regions. Journal of Integrative Plant Biology. 63, 1451–1461 (2021).
32. J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, A. Bridgland, C. Meyer, S. A. A. Kohl, A. J. Ballard, A. Cowie, B. Romera-Paredes, S. Nikolov, R. Jain, J. Adler, T. Back, S. Petersen, D. Reiman, E. Clancy, M. Zielinski, M. Steinegger, M. Pacholska, T. Berghammer, S. Bodenstein, D. Silver, O. Vinyals, A. W. Senior, K. Kavukcuoglu, P. Kohli, D. Hassabis, Highly accurate protein structure prediction with AlphaFold. Nature. 596, 583–589 (2021).
33. R. Evans, M. O’Neill, A. Pritzel, N. Antropova, A. Senior, T. Green, A. Žídek, R. Bates, S. Blackwell, J. Yim, O. Ronneberger, S. Bodenstein, M. Zielinski, A. Bridgland, A. Potapenko, A. Cowie, K. Tunyasuvunakool, R. Jain, E. Clancy, P. Kohli, J. Jumper, D. Hassabis, Protein complex prediction with AlphaFold-Multimer (2022), p. 2021.10.04.463034, doi:10.1101/2021.10.04.463034.
34. G. K. A. Hochberg, D. A. Shepherd, E. G. Marklund, I. Santhanagoplan, M. T. Degiacomi, A. Laganowsky, T. M. Allison, E. Basha, M. T. Marty, M. R. Galpin, W. B. Struwe, A. J. Baldwin, E. Vierling, J. L. P. Benesch, Structural principles that enable oligomeric small heat-shock protein paralogs to evolve distinct functions. Science. 359, 930–935 (2018).
35. R. Rosenzweig, N. B. Nillegoda, M. P. Mayer, B. Bukau, The Hsp70 chaperone network. Nature Reviews Molecular Cell Biology. 20, 665–680 (2019).
36. A. Zemach, Y. Li, B. Wayburn, H. Ben-Meir, V. Kiss, Y. Avivi, V. Kalchenko, S. E. Jacobsen, G. Grafi, DDM1 binds Arabidopsis methyl-CpG binding domain proteins and affects their subnuclear localization. Plant Cell. 17, 1549–1558 (2005).
37. N. N. Giakoumakis, M. A. Rapsomaniki, Z. Lygerou, Analysis of Protein Kinetics Using Fluorescence Recovery After Photobleaching (FRAP). Methods Mol.
Biol. 1563, 243–267 (2017).
38. J. Gardiner, B. Ghoshal, M. Wang, S. E. Jacobsen, CRISPR–Cas-mediated transcriptional control and epi-mutagenesis. Plant Physiology. 188, 1811–1824 (2022).
39. W. J. J. Soppe, Z. Jasencakova, A. Houben, T. Kakutani, A. Meister, M. S. Huang, S. E. Jacobsen, I. Schubert, P. F. Fransz, DNA methylation controls histone H3 lysine 9 methylation and heterochromatin assembly in Arabidopsis. The EMBO Journal. 21, 6549–6559 (2002).
40. B. Ghoshal, B. Vong, C. L. Picard, S. Feng, J. M. Tam, S. E. Jacobsen, A viral guide RNA delivery system for CRISPR-based transcriptional activation and heritable targeted DNA demethylation in Arabidopsis thaliana. PLOS Genetics. 16, e1008983 (2020).
41. J. Gallego-Bartolomé, W. Liu, P. H. Kuo, S. Feng, B. Ghoshal, J. Gardiner, J. M.-C. Zhao, S. Y. Park, J. Chory, S. E. Jacobsen, Co-targeting RNA Polymerases IV and V Promotes Efficient De Novo DNA Methylation in Arabidopsis. Cell. 176, 1068-1082.e19 (2019).
42. M. Wang, Z. Zhong, J. Gallego-Bartolomé, Z. Li, S. Feng, H. Y. Kuo, R. L. Kan, H. Lam, J. C. Richey, L. Tang, J. Zhou, M. Liu, Y. Jami-Alahmadi, J. Wohlschlegel, S. E. Jacobsen, A gene silencing screen uncovers diverse tools for targeted gene repression in Arabidopsis. Nat. Plants. 9, 460–472 (2023).
43. E. R. Waters, E. Vierling, Plant small heat shock proteins – evolutionary and functional diversity. New Phytologist. 227, 24–37 (2020).
44. L. Yan, S. Wei, Y. Wu, R. Hu, H. Li, W. Yang, Q. Xie, High-Efficiency Genome Editing in Arabidopsis Using YAO Promoter-Driven CRISPR/Cas9 System. Molecular Plant. 8, 1820–1823 (2015).
45. C. Zhao, D. J. Segal, S. E. Jacobsen, Targeted DNA demethylation of the Arabidopsis genome using the human TET1 catalytic domain. Proceedings of the National Academy of Sciences. 115, E2125–E2134 (2018).
46. B. Ghoshal, J. Gardiner, "CRISPR-dCas9-Based Targeted Manipulation of DNA Methylation in Plants" in CRISPR-Cas Methods: Volume 2, M. T.
Islam, K. A. Molla, Eds. (Springer US, New York, NY, 2021; https://doi.org/10.1007/978-1-0716-1657-4_5), Springer Protocols Handbooks, pp. 57–71.
47. C. S. Hughes, S. Foehr, D. A. Garfield, E. E. Furlong, L. M. Steinmetz, J. Krijgsveld, Ultrasensitive proteome analysis using paramagnetic bead technology. Molecular Systems Biology. 10, 757 (2014).
48. J. Cox, M. Mann, MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 26, 1367–1372 (2008).
49. B. Langmead, S. L. Salzberg, Fast gapped-read alignment with Bowtie 2. Nat Methods. 9, 357–359 (2012).
50. F. Ramírez, D. P. Ryan, B. Grüning, V. Bhardwaj, F. Kilpert, A. S. Richter, S. Heyne, F. Dündar, T. Manke, deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Research. 44, W160–W165 (2016).
51. Y. Zhang, T. Liu, C. A. Meyer, J. Eeckhoute, D. S. Johnson, B. E. Bernstein, C. Nusbaum, R. M. Myers, M. Brown, W. Li, X. S. Liu, Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
52. S. Heinz, C. Benner, N. Spann, E. Bertolino, Y. C. Lin, P. Laslo, J. X. Cheng, C. Murre, H. Singh, C. K. Glass, Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 38, 576–589 (2010).
53. L. Ichino, C. L. Picard, J. Yun, M. Chotai, S. Wang, E. K. Lin, R. K. Papareddy, Y. Xue, S. E. Jacobsen, Single-nucleus RNA-seq reveals that MBD5, MBD6, and SILENZIO maintain silencing in the vegetative cell of developing pollen. Cell Reports. 41, 111699 (2022).
54. A. Dobin, C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, T. R. Gingeras, STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 29, 15–21 (2013).
55. S. Anders, P. T. Pyl, W. Huber, HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 31, 166–169 (2015).
56. M. I. Love, W. Huber, S.
Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 15, 550 (2014).
57. A. R. Quinlan, I. M. Hall, BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 841–842 (2010).
Example 2: Co-targeting DNA methylation by fusing TRBIP1 and MQ1 in the SunTag system.
Summary
[0366] This Example demonstrates that co-targeting TRBIP1 and MQ1 synergistically caused silencing and DNA methylation at the target locus, compared with MQ1 targeting alone in the SunTag system. The experiments described in this Example involved constructing a CRISPR-dCas9-based SunTag plasmid, using a direct fusion of TRBIP1 and MQ1 as a single effector protein (SunTag-TRBIP1-MQ1). TRBIP1 was identified by TRB protein immunoprecipitation and mass spectrometry (IP-MS), and can be used to silence target genes through H3K4me3 demethylation and H3K27me3 deposition. MQ1 is a bacterial DNA methyltransferase carrying the Q147L point mutation, which can be used to target DNA methylation in the SunTag system (Ghoshal et al., 2021). The plasmid map and modules are shown in FIGS. 7A-7B and the sequences of each module are shown in Table 2A.
Materials and Methods
Construct Design
[0367] To construct SunTag-TRBIP1-MQ1, the original SunTag-MQ1 plasmid from Ghoshal et al., 2021 was digested with BsiWI (ThermoFisher). Arabidopsis cDNA was used as a template to amplify the TRBIP1 CDS fragment, using oligo 26987 (SEQ ID NO: 158) and oligo 27079 (SEQ ID NO: 159). SunTag-MQ1 (Ghoshal et al., 2021) was used as a template to amplify the MQ1 PCR fragment, using oligo 26987 (SEQ ID NO: 160) and oligo 26988 (SEQ ID NO: 161). The TRBIP1-MQ1 fragment was amplified using the TRBIP1 and MQ1 PCR fragments as templates and oligos 26987 and 26988 as primers. The SV40 sequence was thus synthesized as part of the oligos. The TRBIP1-MQ1 PCR fragment was cloned into the SunTag vector by In-Fusion cloning (Takara).
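The fusion step in paragraph [0367], in which the TRBIP1 and MQ1 PCR fragments together serve as template for the outer primers, depends on the two fragments sharing an identical stretch at their junction. The following Python sketch illustrates that joining logic only. The function name, the 15-nt minimum overlap, and the toy sequences are assumptions made for illustration; they are not the oligos or sequences of SEQ ID NOs: 158-161, and the sketch treats both fragments as the same strand for simplicity.

```python
def fuse_by_overlap(upstream: str, downstream: str, min_overlap: int = 15) -> str:
    """Join two fragments that share an identical end overlap.

    Mimics the product of overlap-extension PCR: the 3' end of `upstream`
    must match the 5' end of `downstream` over at least `min_overlap` nt.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(len(upstream), len(downstream))
    # Search from the longest candidate overlap down to the minimum.
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError(f"no shared overlap of at least {min_overlap} nt")


# Hypothetical placeholder sequences (NOT from the patent's sequence listing).
JUNCTION = "GGTGGAGGTTCTGGAGGT"            # made-up 18-nt shared junction
fragment_a = "ATGGCTAAGC" + JUNCTION       # stands in for the first PCR product
fragment_b = JUNCTION + "ATGTCTAAAGTTTAA"  # stands in for the second PCR product

fused = fuse_by_overlap(fragment_a, fragment_b)
print(fused)
```

In this sketch the shared junction appears once in the fused product, which is the property that lets a sequence encoded in the primers (such as the SV40 sequence mentioned above) end up between the two coding regions.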
[0368] As the specific position of the scFv, GFP, TRBIP1 and MQ1 in the fusion protein may impact function, different constructs were prepared such that TRBIP1 and MQ1 are oriented either N-terminal or C-terminal to the GFP protein in the fusion protein.
[0369] To change the target sequence present in the different gRNAs, the protocol described in Ghoshal et al., 2021 was followed using the plasmid SunTag-MQ1. As an example, to generate the gRNA FWA-17, which targets the sequence “AAAACTAGGCCATCCATGGA” (SEQ ID NO: 162) in the FWA promoter, two consecutive PCRs were performed using the plasmid SunTag-MQ1 as a template, with oligo 27141 (SEQ ID NO: 163) and oligo 27296 (SEQ ID NO: 164) for PCR1, and oligo 27297 (SEQ ID NO: 165) and oligo 27142 (SEQ ID NO: 166) for PCR2. The overlapping PCR was conducted using the PCR1 and PCR2 products as templates and oligos 27141 and 27142 as primers. The plasmid was digested with KpnI and MauBI, column purified, and used in an In-Fusion reaction (Takara) together with the overlapping PCR fragment.
[0370] To target the FWA locus, various alternative gRNA sequences were tested, as presented in Table 2A.
Table 2A: gRNA Molecules Targeting the FWA Promoter
Figure imgf000112_0001
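The FWA-17 target sequence quoted in paragraph [0369] can be sanity-checked before it is built into overlapping primers. The Python sketch below checks length and alphabet and reports the GC fraction and the reverse complement. The 20-nt length requirement and all helper names are assumptions made for illustration; the source does not state design criteria for the gRNAs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_protospacer(seq: str, expected_length: int = 20) -> None:
    """Raise ValueError if the sequence is not plain DNA of the expected length."""
    seq = seq.upper()
    if len(seq) != expected_length:
        raise ValueError(f"expected {expected_length} nt, got {len(seq)}")
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")


FWA_17 = "AAAACTAGGCCATCCATGGA"  # target sequence quoted in the text

check_protospacer(FWA_17)  # expected_length=20 is an assumed default
print(len(FWA_17), gc_fraction(FWA_17), reverse_complement(FWA_17))
```

The reverse complement is the form that would appear in the primer for the opposite strand of an overlapping pair.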
[0371] The sequence of each module in the SunTag-TRBIP1-MQ1 plasmid is listed in Table 2B.
Transformation of fwa rdr6 Plants
[0372] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis fwa rdr6 plants were transformed using floral dip methods well-known in the art.
Flowering Time Measurements
[0373] Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporated the T-DNA into the Arabidopsis genome, which confers resistance to hygromycin. Among the hygromycin-resistant transgenic plants, flowering time was measured and compared to early-flowering wild-type Col0 and late-flowering fwa rdr6 plants. Flowering time was measured by counting the total number of leaves (rosette and cauline) of each individual plant.
Data Analysis
[0374] Plants transformed with the fusion constructs described above were evaluated for phenotypic differences, as compared to corresponding control plants (e.g., fwa rdr6, SunTag-MQ1, and SunTag-TRBIP1-MQ1), that were suggestive of successful fusion protein targeting to the locus of interest and subsequent silencing at the locus. Other analyses included measuring the expression level of the targeted locus in the transformed plants, measuring the degree of DNA methylation at the targeted locus in the transformed plants, and other assays well-known to those of skill in the art.
Table 2B: Parameters and Sequences for Fusion Construct Modules
Figure imgf000113_0001
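Flowering time in this Example is a per-plant count of rosette plus cauline leaves, compared between transgenic lines and the Col0 and fwa rdr6 controls. A minimal Python sketch of that bookkeeping follows. The function name and every number in it are invented placeholders for illustration only; they are not measurements from this Example, and the source does not specify a statistical test.

```python
from statistics import mean


def mean_total_leaves(plants):
    """Mean of (rosette + cauline) leaf counts over a list of plants.

    `plants` is a list of (rosette_leaves, cauline_leaves) tuples, one per plant.
    """
    if not plants:
        raise ValueError("at least one plant is required")
    return mean(rosette + cauline for rosette, cauline in plants)


# Placeholder counts, invented for illustration (NOT data from the patent).
groups = {
    "Col0 control": [(10, 2), (12, 2)],
    "fwa rdr6 control": [(28, 4), (30, 4)],
    "transgenic T1": [(14, 3), (16, 3)],
}

for name, plants in groups.items():
    print(name, mean_total_leaves(plants))
```

A transgenic line whose mean lies closer to the early-flowering control than to the late-flowering control would be read, under this scheme, as consistent with silencing of FWA.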
Results
[0375] The results shown in FIGS. 7A-7B and FIGS. 8A-8E demonstrate that co-targeting TRBIP1 and MQ1 synergistically caused silencing and DNA methylation at the FWA target locus, compared with MQ1 targeting alone in the SunTag system. However, SunTag-TRBIP1-MQ1 also triggered strong CG DNA hypermethylation across the plant genome.
Example 3: Replacing the UBQ10 promoter with 20 different Arabidopsis promoters with weaker transcription activity resulted in reduced CG DNA hypermethylation in SunTag-TRBIP1-MQ1 transgenic lines
Summary
[0376] This Example describes experiments demonstrating a method of reducing the CG DNA hypermethylation in SunTag-TRBIP1-MQ1 by using weaker promoters to drive the expression of TRBIP1-MQ1. However, the UBQ10 promoter was still maintained to drive the expression of dCas9-GCN4.
Materials and Methods
Construct Design
[0377] This Example describes experiments replacing the UBQ10 promoter with 20 different Arabidopsis promoters of differing activity to drive the expression of the scFv antibody, GFP, TRBIP1 and MQ1 expression cassette (scFv-GFP-TRBIP1-MQ1). The plasmid map and modules are shown in FIGS. 9A and 9B, and the sequences of each module are shown in Table 2B.
[0378] This Example describes experiments replacing the UBQ10 promoter with the APX1 promoter to drive the expression of scFv-GFP-TRBIP1-MQ1. SunTag-TRBIP1-MQ1 was digested with NruI and AleI and column purified. Three fragments were amplified by PCR. Fragment 1 (TBS insulator) was amplified using plasmid SunTag-TRBIP1-MQ1 as template and oligo 28056 (SEQ ID NO: 191) and oligo 28057 (SEQ ID NO: 192) as primers. Fragment 2 (APX1 promoter) was amplified using Arabidopsis genomic DNA as template and oligo 28058 (SEQ ID NO: 193) and oligo 28059 (SEQ ID NO: 194) as primers.
Fragment 3 (scFv) was amplified using the plasmid SunTag-TRBIP1-MQ1 as a template and oligo 28098 (SEQ ID NO: 195) and oligo 28061 (SEQ ID NO: 196) as primers. These fragments were gel purified and cloned into the NruI/AleI-digested plasmid by In-Fusion cloning. By this method, PacI digestion sites were introduced at both ends of the APX1 promoter through the synthesized oligos, which can be used for the further construction of the remaining plasmids with the other 19 promoters.
[0379] This Example also describes construction of plasmids with the other 19 promoters. SunTag-TRBIP1-MQ1-ProAPX1 was digested with PacI and purified using a Qiagen column. The 19 promoters were amplified using Arabidopsis genomic DNA as template and the corresponding oligos listed in Table 4A as primers. The 19 PCR products were gel purified and cloned into the PacI-digested SunTag-TRBIP1-MQ1 by In-Fusion cloning (Takara).
[0380] Table 4A:
Figure imgf000115_0001
[0381] Below are the DNA sequences of all 20 promoters:
SEQ ID NO: 235: Promoter 1 AT4G32020.
SEQ ID NO: 236: Promoter 2 AT1G19770.
SEQ ID NO: 237: Promoter 3 AT5G11770 900bp.
SEQ ID NO: 238: Promoter 4 AT1G57720.
SEQ ID NO: 239: Promoter 5 AT1G06570.
SEQ ID NO: 240: Promoter 6 AT3G16100.
SEQ ID NO: 241: Promoter 7 AT4G28220.
SEQ ID NO: 242: Promoter 8 AT2G48020.
SEQ ID NO: 243: Promoter 9 AT3G50410.
SEQ ID NO: 244: Promoter 10 AT1G16640.
SEQ ID NO: 245: Promoter 11 AT1G79400.
SEQ ID NO: 246: Promoter 12 AT2G28860.
SEQ ID NO: 247: Promoter 13 AT1G07890 APX1.
SEQ ID NO: 248: Promoter 14 AT2G45190.
SEQ ID NO: 249: Promoter 15 AT5G24860.
SEQ ID NO: 250: Promoter 16 AT4G18960.
SEQ ID NO: 251: Promoter 17 AT1G55480 MET1.
SEQ ID NO: 252: Promoter 18 AT2G33830 DRM2.
SEQ ID NO: 253: Promoter 19 AT4G19020 CMT2.
SEQ ID NO: 254: Promoter 20 AT1G69770 CMT3.
Transformation of fwa rdr6 Plants
[0382] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis fwa rdr6 plants were transformed using floral dip methods well-known in the art.
Flowering Time Measurements
[0383] Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporated the T-DNA into the Arabidopsis genome, which confers resistance to hygromycin. Among the hygromycin-resistant transgenic plants, flowering time was measured and compared to early-flowering wild-type Col0 and late-flowering fwa rdr6 plants. Flowering time was measured by counting the total number of leaves (rosette and cauline) of each individual plant.
Data Analysis
[0384] Plants transformed with the fusion constructs described above were evaluated for phenotypic differences, as compared to corresponding control plants (e.g., fwa rdr6), suggestive of successful fusion protein targeting to the locus of interest and subsequent silencing at the locus.
The phenotype evaluated varied depending on the locus targeted. Other analyses included measuring the expression level of the targeted locus in the transformed plants, measuring the degree of DNA methylation at the targeted locus in the transformed plants, or other assays well-known to those of skill in the art.
Results
[0385] FIGS. 9A-9B, 10, 11, 12, and 13 demonstrate that using weaker promoters still maintained DNA methylation at the target locus, while notably reducing the genome-wide CG DNA hypermethylation. However, the CG DNA hypermethylation throughout the nuclear genome and also in the chloroplast genome was still not completely removed, which is addressed in Example 4 below.
Example 4: Removing the CG DNA hypermethylation in SunTag-TRBIP1-MQ1 and SunTag-MQ1 transgenic lines by using the StkyC domain
[0386] This Example describes construction of fusion constructs containing StkyC directly fused to MQ1 and to TRBIP1-MQ1, as a single effector protein, with the aim of further removing the genome-wide CG DNA hypermethylation caused by SunTag-TRBIP1-MQ1. StkyC is a conserved domain of MBD6 that recruits the ACD15 and ACD21 proteins. Without wishing to be bound by theory, it is believed that the recruitment of the ACD proteins to the StkyC domain, which was present in at least ten copies per dCas9 protein, provided a nucleation site for the aggregation of other ACD15 and ACD21 proteins, as well as other StkyC-TRBIP1-MQ1 fusion proteins that would otherwise be present diffusely throughout the nucleus. Without wishing to be bound by theory, it is believed that this concentrated all of the fusion proteins at the site of action, preventing off-target activity. The plasmid map and modules of SunTag-StykC-TRBIP1-MQ1 are shown in FIG. 14A and FIG. 14B, and the sequences of each module are shown in Table 2B.
[0387] To test whether epigenetic regulators as described herein may be targeted to a target nucleic acid using the SunTag system, a series of different fusion constructs were prepared. As the specific position of the epigenetic regulator in the fusion protein may impact function, different constructs were prepared such that the epigenetic regulator was oriented either N-terminal or C-terminal to the GFP protein in the fusion protein. To reduce the CG DNA hypermethylation of SunTag-MQ1-TRBIP1, different peptides, such as StkyC, were fused with MQ1-TRBIP1 to evaluate whether CG DNA hypermethylation could be removed while still maintaining strong CG DNA methylation at the target locus.
Materials and Methods
Cloning of Fusion Proteins and gRNA-fwa
[0388] Structures of the fusion constructs used in the SunTag-TRBIP1-MQ1 system are presented in FIG. 14A and FIG. 14B. In these figures, different regions of the construct are labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, and the sequences are described in Table 2B.
Construct Design
[0389] To construct SunTag-StykC-TRBIP1-MQ1, the original SunTag-TRBIP1-MQ1 plasmid was digested with BsiWI (ThermoFisher). SunTag-TRBIP1-MQ1 was used as a template to amplify TRBIP1-MQ1, using oligo 23069 (SEQ ID NO: 255) and oligo 27102 (SEQ ID NO: 256). The StykC-XTEN linker sequence was ordered from IDT (see Table 2B) and, together with the TRBIP1-MQ1 PCR fragment, was cloned into the SunTag vector by In-Fusion cloning (Takara). The UBQ10 promoter was used to drive the expression of scFv-GFP-TRBIP1-MQ1.
[0390] To change the target sequence present in the different gRNAs, the protocol described in Ghoshal et al., 2021 was followed.
As an example, to generate the gRNA FWA-17 that targets the sequence “AAAACTAGGCCATCCATGGA” (SEQ ID NO: 257) in the FWA promoter, two consecutive PCRs were performed using the plasmid SunTag-TRBIP1-MQ1 as a template, with oligo 27141 (SEQ ID NO: 258) and oligo 27296 (SEQ ID NO: 259) as primers for PCR1, and oligo 27297 (SEQ ID NO: 260) and oligo 27142 (SEQ ID NO: 261) as primers for PCR2. The overlapping PCR was conducted using the PCR1 and PCR2 products as templates and oligos 27141 and 27142 as primers. The plasmid was digested with KpnI and MauBI and, after column purification, was used in an In-Fusion reaction (Takara) together with the overlapping PCR fragment.
[0391] Alternatively, a tRNA-gRNA expression cassette (Xie, X et al, 2015, Proc Natl Acad Sci U S A. 2015 Mar 17;112(11):3570-5) was used to deliver multiple gRNAs simultaneously with high expression levels. Due to the repetitive nature of these modules, gene synthesis, instead of traditional cloning, was used to generate the cassettes.
[0392] To target the FWA locus, various alternative gRNA sequences were tested, as presented in Table 2B.
[0393] Various other loci in the genome were also targeted to demonstrate the ability of the fusion protein to target a locus of interest. Exemplary loci that were targeted include GA1, FLC, and RITA. A series of different gRNA molecules were designed that target these loci. The crRNA portions of these gRNAs are presented below in Table 2B. The gRNA was a fusion of the crRNA and tracrRNA.
Transformation of fwa rdr6 Plants
[0394] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis fwa rdr6 plants were transformed using floral dip methods well-known in the art.
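As an illustration of the gRNA retargeting step in paragraph [0390], the following minimal sketch checks that a candidate target sequence is a 20-nucleotide DNA string and reports its reverse complement and GC fraction, which is the kind of sanity check one might run before ordering overlap-PCR primers. Only the FWA-17 target sequence is taken from the text; the function names and the 20-nucleotide length default are assumptions made for illustration, and the sketch does not reproduce the actual oligos (SEQ ID NOs: 258-261).

```python
# Illustrative sketch only: helper names are hypothetical and not part of the
# disclosed protocol. The FWA-17 target sequence is quoted from paragraph [0390].

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_protospacer(seq: str, length: int = 20) -> dict:
    """Validate a candidate target and summarize basic properties.

    The default length of 20 nt is an assumption matching the FWA-17 example.
    """
    seq = seq.upper()
    if len(seq) != length:
        raise ValueError(f"expected {length} nt, got {len(seq)}")
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
    return {
        "sequence": seq,
        "reverse_complement": reverse_complement(seq),
        "gc_fraction": gc_fraction,
    }


FWA_17 = "AAAACTAGGCCATCCATGGA"  # SEQ ID NO: 257 as quoted in the text
info = check_protospacer(FWA_17)
print(info)
```

Both orientations are reported because overlap-extension primer pairs typically carry the new target on opposite strands; whether oligos 27296 and 27297 are arranged this way is not stated in the text.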
Flowering Time Measurements
[0395] Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporated the T-DNA into the Arabidopsis genome, which conferred resistance to hygromycin. Among the hygromycin-resistant transgenic plants, flowering time was measured and compared to early-flowering wild-type Col-0 and late-flowering fwa rdr6 plants. Flowering time was measured by counting the total number of leaves (rosette and cauline) of each individual plant.
Data Analysis
[0396] Plants transformed with the fusion constructs described above were evaluated for phenotypic differences as compared to corresponding control plants (e.g., fwa rdr6, SunTag-StkyC-MQ1 and SunTag-StkyC-TRBIP1-MQ1) for evidence suggestive of successful fusion protein targeting to the locus of interest and subsequent silencing at the locus. The phenotype that was evaluated varied depending on the locus targeted. Other analyses performed included measuring the expression level of the targeted locus in the transformed plants, measuring the degree of DNA methylation at the targeted locus in the transformed plants, and other assays well-known to those of skill in the art.
Results
[0397] The results presented in FIGS. 15-17 demonstrate that adding the StkyC domain to SunTag-TRBIP1-MQ1 increased the specificity and significantly reduced CG DNA hypermethylation. Remarkably, we found that the off-target DNA methylation throughout the genome was completely eliminated by addition of the StkyC domain, but on-target methylation and silencing of FWA were still effective. Even more remarkably, the addition of the StkyC domain completely eliminated the ectopic CG hypermethylation of the chloroplast genome.
This shows that the oligomerization properties of the ACD proteins could not only sequester the fusion proteins away from off-target sites in the nuclear genome, but could also sequester the fusion proteins away from the chloroplasts and into the nucleus.
Example 5: ACD15 and ACD21 chaperone system for use to increase specificity of TDG-TET1 mediated DNA demethylation in plants
Summary
[0398] This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag targeting system combined with plant ACD15-ACD21 mediated recruitment using the plant StkyC domain of MBD6 to increase the specificity of the TDG-TET1 DNA demethylase enzymes for their intended genomic targets. These constructs will be used to target TDG-TET1 to a specific locus of the genome using dCas9 targeting and to decrease off-target demethylation through accumulation of excess TDG-TET1 at the target site through ACD15/ACD21 oligomerization recruited by the StkyCMBD6 domain.
[0399] To demonstrate the efficacy of this technology, we will create this dCas9 system, which will be targeted to the promoter of the CACTA gene in heterochromatin. It is expected that this technology will cause the formation of foci in nuclei of cells corresponding to the dCas9 binding sites. To track the formation of nuclear bodies in cells formed by the accumulation of protein through ACD15/ACD21, a GFP tag will be added to the dCas9 targeting system.
Materials and Methods
Cloning of Fusion Proteins and gRNA-fwa
[0400] Exemplary structures of these fusion constructs to be used in the CRISPR-Cas9 system are presented in FIGS. 18A-18C. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 5A.
[0401] Table 5A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000120_0001
Figure imgf000121_0001
Exemplary Construct Design
[0402] To construct the dCas9 system, a SunTag plasmid will be used that contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to CACTA. Coding sequences for the StkyCMBD6 domain will be amplified from genomic DNA, while TDG-TET1 will be amplified from an existing SunTag plasmid, and will be cloned into a separate SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the TDG-StkyCMBD6-TET1 directly after the GFP and before the HA tag. Features of SunTag-TDG-StkyCMBD6-TET1 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4_OCS, gRNA-CACTA sequences, single chain variable fragment (scFV)_GFP_NLS_1xHA, StkyCMBD6 Domain, TDG, TET1, and the XTEN Linker.
[0403] All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0404] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
Microscopy Experiments
[0405] Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear bodies. ACD-containing proteins are known to oligomerize, and therefore GFP foci are expected to form in nuclei of cells. A SunTag system expressing the scFV-GFP-TDG-TET1 lacking the StkyCMBD6 will be used as a control in all experiments.
Whole Genome Bisulfite Experiment
[0406] To determine the methylation status of the CACTA locus, whole genome bisulfite sequencing experiments will be performed. This experiment will be performed using both the SunTag-TDG-StkyCMBD6-TET1 technology and the SunTag-TDG-TET1-only control, which lacks the StkyCMBD6.
Data Analysis
[0407] Multiple seedlings of each SunTag-TDG-StkyCMBD6-TET1 will be imaged using confocal microscopy to determine expression of GFP. If GFP signal is concentrated in nuclei of the cells, then the construct will be assumed to properly localize. Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-TDG-StkyCMBD6-TET1 construct. ImageJ software will be used to analyze the images and quantify the foci across multiple cells.
[0408] It is expected that ACD15/ACD21 will oligomerize in nuclei of plants. This will lead to the formation of GFP foci representing accumulation of scFV-GFP-TDG-StkyCMBD6-TET1 targeted to the promoter of CACTA.
[0409] Multiple plant lines will be used for the whole genome bisulfite experiments. Leaf tissue will be harvested for these experiments. Downstream analysis will be performed according to previous protocols.
[0410] This technology will demonstrate the functional use of ACD15 and ACD21 to specifically accumulate TDG-TET1 at a targeted locus through a dCas9 targeting system and reduce off-target demethylation events. By comparing constructs with and without the StkyCMBD6, we expect that the SunTag-TDG-StkyCMBD6-TET1 construct will show lower levels of demethylation at non-target sites. In addition to the StkyCMBD6 domain described in this example, it may also be beneficial to fuse the StkyC domain of MBD7 to TDG-TET1 (StkyCMBD7). It may also be beneficial to fuse the human small heat shock proteins HSPB1, HSPB3, or HSPB5 to TDG-TET1. It may also be beneficial to fuse small heat shock proteins from other organisms with TDG-TET1. All of these fusions are predicted to make targeted demethylation more specific by concentrating TDG-TET1 activity at the genomic site of action. In addition, fusing the StkyC domains or the human small heat shock proteins to other proteins of interest may allow them to also be targeted in a more specific manner.
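The comparison anticipated in paragraphs [0409]-[0410], demethylation at the target locus versus at non-target sites, can be sketched as follows. This is a minimal illustration, assuming per-cytosine bisulfite read counts are already available as (methylated reads, total reads) pairs and assuming that a region is summarized by a coverage-weighted methylation level; the function names and all counts are hypothetical toy values, not data or protocols from this disclosure.

```python
# Illustrative sketch only: toy counts, hypothetical names. A region's
# methylation level is taken here as (sum of methylated reads) / (sum of
# total reads) over its cytosines, which is one common summary choice.
from typing import Iterable, Tuple

Site = Tuple[int, int]  # (methylated reads, total reads) at one cytosine


def weighted_methylation(sites: Iterable[Site]) -> float:
    """Coverage-weighted methylation level of a region, between 0 and 1."""
    methylated = 0
    total = 0
    for m, t in sites:
        if m < 0 or t < m:
            raise ValueError(f"invalid counts: methylated={m}, total={t}")
        methylated += m
        total += t
    if total == 0:
        raise ValueError("region has no read coverage")
    return methylated / total


def methylation_change(sample: Iterable[Site], control: Iterable[Site]) -> float:
    """Sample minus control; negative values indicate demethylation."""
    return weighted_methylation(sample) - weighted_methylation(control)


# Toy data: an untransformed control and a transformed line, each with an
# on-target region and a non-target region.
control_target = [(18, 20), (9, 10), (27, 30)]
sample_target = [(2, 20), (1, 10), (3, 30)]
control_nontarget = [(16, 20), (8, 10)]
sample_nontarget = [(15, 20), (9, 10)]

on_target_change = methylation_change(sample_target, control_target)
non_target_change = methylation_change(sample_nontarget, control_nontarget)
print(f"on-target change:  {on_target_change:+.2f}")
print(f"non-target change: {non_target_change:+.2f}")
```

Under this framing, a more specific construct would be one that shows a large negative on-target change together with a non-target change close to zero when the lines with and without StkyCMBD6 are compared.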
Sequences
[0411] SEQ ID NO: 262: UBQ10 promoter (DNA sequence). SEQ ID NO: 263: dCas9_1xHA_3xNLS_10xGCN4_OCS (DNA sequence). SEQ ID NO: 264: dCas9_1xHA_3xNLS_10xGCN4_OCS (protein sequence). SEQ ID NO: 265: gRNA-FWA sequences (gRNA8 and scaffold). SEQ ID NO: 266: gRNA-FWA sequences (gRNA17 and scaffold). SEQ ID NO: 267: single chain variable fragment (scFV)_GFP_NLS_1xHA (DNA sequence). SEQ ID NO: 268: single chain variable fragment (scFV)_GFP_NLS_1xHA (protein sequence). SEQ ID NO: 269: StkyCMBD6 (DNA sequence). SEQ ID NO: 270: StkyCMBD6 (protein sequence). SEQ ID NO: 271: TET1 (DNA sequence). SEQ ID NO: 272: TET1 (protein sequence). SEQ ID NO: 273: TDG (DNA sequence). SEQ ID NO: 274: TDG (protein sequence). SEQ ID NO: 275: XTEN (DNA sequence). SEQ ID NO: 276: SunTag-TDG-StkyCMBD6-TET1 Plasmid Sequence. SEQ ID NO: 277: SunTag-TDG-TET1 Plasmid Sequence.
Example 6: ACD15 and ACD21 chaperone system for use to increase specificity of SDG2 histone methyltransferase
Summary
[0412] This Example describes planned experiments involving dCas9-epitope tail + antibody-GFP-SDG2-StkyC constructs, with control constructs lacking StkyC. Described herein are exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag system combined with plant ACD15-ACD21 recruitment through the MBD6 StkyC domain (StkyCMBD6) to increase the specificity of the SDG2 histone methyltransferase enzymes. These constructs will be used to target SDG2 to a specific locus of the genome using dCas9 targeting and to decrease off-target histone methylation through accumulation of excess SDG2 at the target site through ACD15/ACD21 oligomerization.
[0413] To demonstrate the efficacy of this technology, we will create this dCas9 system, which will be targeted to the promoter of the FWA gene. It is expected that this technology will form foci in the nuclei of cells corresponding to the dCas9 target sites.
To track the formation of nuclear bodies in cells formed by the accumulation of protein through ACD15/ACD21, a GFP tag will be added to the dCas9 targeting system.
Materials and Methods
Cloning of Fusion Proteins and gRNA-fwa
[0414] Exemplary structures of these fusion constructs to be used in the CRISPR-Cas9 system are presented in FIGS. 19A-19D. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 2A.
[0415] Table 2A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000124_0001
Exemplary Construct Design
[0416] To construct the dCas9 system, the current SunTag plasmid will be used, which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of the StkyCMBD6 domain and SDG2 will be amplified from genomic DNA and will be cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the SDG2-StkyCMBD6 coding sequence directly after the GFP and before the HA tag. Features of SunTag-SDG2-StkyCMBD6 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, StkyCMBD6 Domain, and SDG2.
[0417] All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0418] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
Microscopy Experiments
[0419] Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear phenotypes. ACD-containing proteins are known to oligomerize, and therefore GFP foci are expected to form in nuclei of cells. A SunTag system expressing the scFV-GFP-SDG2 without StkyCMBD6 will be used as a control in all experiments.
ChIP-Seq
[0420] To determine the specificity of the technology, ChIP-Seq experiments will be performed using antibodies that recognize H3K4me3. This experiment will be performed using both the SunTag-SDG2-StkyCMBD6 technology and the SunTag-SDG2 control.
Data Analysis
[0421] Multiple seedlings of each SunTag-SDG2-StkyCMBD6 will be imaged using confocal microscopy to determine expression of GFP. If GFP signal is concentrated in nuclei of the cells, then the construct will be assumed to properly localize without any misfolding. Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-SDG2-StkyCMBD6 construct. ImageJ software will be used to analyze the images and quantify the foci across multiple cells. H3K4me3 will be measured by ChIP-seq, and gene expression of the target will be measured by RT-PCR and RNA-seq.
Expected Results
[0422] It is expected that ACD15/ACD21 will oligomerize in nuclei of plants. This will lead to the formation of GFP foci representing accumulation of scFV-GFP-SDG2-StkyCMBD6 targeted to the promoter of FWA. It is expected that SunTag-GFP-SDG2 alone will contain no foci in nuclei of plant cells as measured by microscopy.
[0423] Multiple plant lines will be used for the ChIP-Seq experiments to control for biological variability. Leaf tissue will be harvested for the experiments. Downstream analysis will be performed according to previous protocols.
[0424] This technology will demonstrate the functional use of the SunTag-StkyCMBD6 targeting system to specifically accumulate SDG2 at a targeted locus through a dCas9 system and reduce off-target histone methylation events. By comparing to a control construct that does not contain the StkyC domain, we expect to see enhanced specificity of targeting of H3K4 trimethylation to the FWA locus. In addition to the StkyCMBD6 domain described in this example, it may also be beneficial to fuse the StkyC domain of MBD7 (StkyCMBD7) to SDG2. It may also be beneficial to fuse the human heat shock proteins HSPB1, HSPB3, or HSPB5 to SDG2.
All of these fusions are predicted to make SDG2 targeting more specific by concentrating SDG2 activity at the genomic site of action. In addition, fusing the StkyC domains or the human small heat shock proteins to other proteins of interest may allow them to also be targeted in a more specific manner.
Sequences
[0425] SEQ ID NO: 278: UBQ10 promoter DNA sequence. SEQ ID NO: 279: dCas9_1xHA_3xNLS_10xGCN4 DNA sequence. SEQ ID NO: 280: dCas9_1xHA_3xNLS_10xGCN4 protein sequence. SEQ ID NO: 281: gRNA-FWA sequences; gRNA4 and scaffold. SEQ ID NO: 282: gRNA-FWA sequences; gRNA17 and scaffold. SEQ ID NO: 283: single chain variable fragment (scFV)_GFP_NLS_1xHA; DNA sequence. SEQ ID NO: 284: single chain variable fragment (scFV)_GFP_NLS_1xHA; protein sequence. SEQ ID NO: 285: StkyCMBD6 DNA sequence. SEQ ID NO: 286: StkyCMBD6 protein sequence. SEQ ID NO: 287: SDG2 DNA sequence. SEQ ID NO: 288: SDG2 protein sequence. SEQ ID NO: 289: SunTag-SDG2-StkyCMBD6 Plasmid Sequence. SEQ ID NO: 290: SunTag-SDG2 Plasmid Sequence.
Example 7: dCas9 directed accumulation of protein through ACD proteins of the MBD7 complex using the StkyC domain of MBD7
Summary
[0426] This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag system combined with the plant StkyC domain of MBD7 (StkyCMBD7). These constructs will be used to accumulate the dCas9 targeting system at a specific locus of the genome using the StkyCMBD7 to recruit the oligomeric ACD proteins IDM3 and IDM2 of the MBD7 complex.
[0427] To demonstrate the efficacy of this technology, we will create a dCas9 system, which will be targeted to the promoter of the FWA gene. It is expected that this technology will form foci in the nuclei of cells. To track the formation of nuclear bodies in cells formed by the accumulation of protein through IDM3/IDM2, a GFP tag will be added to the dCas9 targeting system.
Materials and Methods
Cloning of Fusion Proteins and gRNA-fwa
[0428] Exemplary structures of these fusion constructs to be used in the CRISPR-Cas9 system are presented in FIGS. 20A-20B. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 7A.
Table 7A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000126_0001
Figure imgf000127_0001
Exemplary Construct Design
[0429] To construct the dCas9 system, the current SunTag plasmid will be used, which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of the StkyCMBD7 domain will be amplified from genomic DNA and will be cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the StkyCMBD7 domain directly after the GFP and before the HA tag. Features of SunTag-StkyCMBD7 include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, and StkyCMBD7 Domain.
[0430] All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0431] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants and idm3 mutant plants will be transformed using floral dip methods.
Microscopy Experiments
[0432] Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear phenotypes. ACD-containing proteins are known to oligomerize, and therefore GFP foci are expected to form in nuclei of cells. A SunTag system expressing the scFV-GFP-StkyCMBD7 in plants lacking IDM3 will be used as a control in all experiments.
Data Analysis
[0433] Multiple seedlings of each SunTag-StkyCMBD7 will be imaged using confocal microscopy to determine expression of GFP. If GFP signal is concentrated in nuclei of the cells, then the construct will be assumed to properly localize without any misfolding. Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-StkyCMBD7 construct.
ImageJ software will be used to analyze the images and quantify the foci across multiple cells.
Expected Results
[0434] It is expected that IDM3/IDM2 will oligomerize once bound to the dCas9 in the nuclei of plants. This will lead to the formation of GFP foci representing accumulation of scFV-GFP-StkyCMBD7 targeted to the promoter of FWA. It is expected that the SunTag-StkyCMBD7 nuclear signal in idm3 mutant plants will be diffuse, containing no obvious foci.
[0435] This example will demonstrate the functional use of the SunTag-StkyCMBD7 targeting system to specifically accumulate at a targeted locus through a dCas9 system. Because the ACD proteins normally associated with the StkyC domain of MBD7 are present in chromocenters, we also expect to observe GFP signal in the chromocenters in the SunTag-StkyCMBD7 system. It is predicted that in an mbd7 mutant, GFP signal from the StkyCMBD7 system will only be localized at the dCas9 binding sites. It is anticipated that, in plants carrying mutations that eliminate the MBD7-associated IDM2 and IDM3 ACD proteins, GFP signal from the StkyCMBD7 system will be localized diffusely throughout the nucleus. By fusing the StkyCMBD7 domain to other proteins, they can also be targeted into the nuclear bodies present at the target locus, in a similar manner as we have shown for the StkyC domain of MBD6.
Sequences
[0436] SEQ ID NO: 291: UBQ10 promoter DNA sequence. SEQ ID NO: 292: dCas9_1xHA_3xNLS_10xGCN4, DNA sequence. SEQ ID NO: 293: dCas9_1xHA_3xNLS_10xGCN4, protein sequence. SEQ ID NO: 294: gRNA-FWA sequences: gRNA4 and scaffold. SEQ ID NO: 295: gRNA-FWA sequences: gRNA17 and scaffold. SEQ ID NO: 296: single chain variable fragment (scFV)_GFP_NLS_1xHA, DNA sequence. SEQ ID NO: 297: single chain variable fragment (scFV)_GFP_NLS_1xHA, protein sequence. SEQ ID NO: 298: StkyCMBD7; DNA sequence. SEQ ID NO: 299: StkyCMBD7; protein sequence. SEQ ID NO: 300: SunTag-StkyCMBD7 Plasmid Sequence.
Example 8: Human small heat shock proteins cause accumulation of MBD6 at chromocenters in plants
Summary
[0437] This example describes experiments in which we created fusions of human small heat shock proteins HSPB1, HSPB3, HSPB5, and HSPB8 to the plant methyl-CpG-binding domain (MBD) protein 6 (MBD6), while also deleting the MBD6 StkyC domain that is known to be required for interaction with ACD15 and ACD21, and subsequent localization of MBD6 at Arabidopsis chromocenters. These constructs were used to demonstrate that human α-crystalline domain containing proteins (i.e., human sHSPs) can functionally replace the accumulation function of plant α-crystalline domain proteins ACD15 and ACD21 and provide evidence for the use of α-crystalline domain containing proteins from other organisms in plants to accumulate proteins in a targeted manner.
[0438] To test whether human sHSPs can replace the function of the MBD5/6 complex-specific α-crystalline domain containing proteins ACD15 and ACD21, multiple fusion constructs were prepared. Based on knowledge of the organization of the MBD5/6 complex, the coding sequences for human sHSPs HSPB1, HSPB3, HSPB5, and HSPB8 were added to the C-terminus of the MBD6 protein in place of the StkyC domain of MBD6 (amino acids 168-225). The promoter of MBD6 as well as the other regions of the protein, including the MBD of MBD6, were left intact. These fusion constructs also contain a C-terminal RFP tag in order to observe their cellular localization using fluorescence confocal microscopy. To avoid any splicing issues, only the cDNAs of human sHSPs HSPB1, HSPB3, HSPB5, and HSPB8 were used in the creation of these fusion constructs.
Materials and Methods
Cloning of Fusion Proteins
[0439] Exemplary structures of the fusion constructs used are presented in FIGS. 21A-21B, 22A-22B, 23A-23B, and 24A-24B.
In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, as described below in Table 8A. Table 8A: Parameters for Fusion Construct Modules
Figure imgf000129_0001
Figure imgf000129_0002
Exemplary Construct Design
[0440] To construct the MBD6-human sHSP fusion proteins, a PENTR_D (Invitrogen) plasmid that contains the genomic DNA of MBD6 without the C-terminal StkyC domain (promoter and coding sequence) was cloned into a separate PENTR_D (Invitrogen) vector along with the cDNA of human HSPB1, HSPB3, HSPB5, and HSPB8 using an In-Fusion reaction (Takara). These final PENTR_D constructs containing the MBD6 promoter, MBD6 coding sequence, and human sHSPs were then cloned into the final pGWB553 destination vector (Addgene) containing the C-terminal RFP and Nos terminator using a Gateway ligation kit (ThermoFisher). Features of these MBD6-human sHSP constructs include the MBD6 promoter, the MBD6StkyCΔ coding sequence, HSPB1 sequence, HSPB3 coding sequence, HSPB5 coding sequence, HSPB8 coding sequence, mRFP sequence, and Nos terminator.
[0441] All the different modules were amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0442] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins. Arabidopsis mbd5 mbd6 plants were transformed using floral dip methods.
Microscopy Experiments
[0443] Progeny of transformed plants (T1s) were planted and screened for hygromycin-resistant plants that incorporated the T-DNA into the Arabidopsis genome, which conferred resistance to hygromycin. Seedlings were then imaged using an LSM980 confocal microscope. Z-stacks of root meristem tissue were imaged across multiple plant lines to confirm RFP signal and acquire data on protein localization.
Data Analysis
[0444] Image analysis was performed using ImageJ image analysis software. Using the 3D-projection application of ImageJ, reconstructions of root meristems were created using Z-stack data from microscopy experiments. These images allow for the direct comparison of MBD6-HSP phenotypes.
Further, using the 3D objects counter application of ImageJ, the number of foci was quantified across multiple plant lines for direct comparison to wild-type MBD6.
[0445] Chromocenters are located on the nuclear periphery of plants, and it has been shown that MBD6 localizes strongly to chromocenters when analyzed using microscopy. Therefore, the expectation based on Applicant’s discovery described herein was that MBD6-human sHSPs would result in large nuclear foci similar to those of wild-type MBD6, which correspond to the chromocenters.
Results
[0446] To determine the impact of human sHSPs on the localization of MBD6, seedling root meristems were imaged across multiple plant lines. Root meristem tissue is known to contain a high density of nuclei, providing a large amount of nuclear data in a region. This tissue is also known to express all members of the MBD5/6 complex, allowing for directly testing whether human sHSPs can impact localization of MBD6 and whether this phenotype is independent of ACD15 and ACD21.
[0447] Analysis of MBD6HSPB1, MBD6HSPB3, MBD6HSPB5, and MBD6HSPB8 RFP protein across multiple plant lines revealed clear nuclear localization of all of these fusion proteins. This is consistent with correct folding of the MBD6 fusion proteins, allowing them to properly localize to the nuclei of cells.
[0448] Comparison of the four fusion proteins in mbd5 mbd6 mutant plant backgrounds revealed clear nuclear foci across a multitude of cells for the MBD6HSPB1, MBD6HSPB3, and MBD6HSPB5 RFP fusion constructs (FIG. 25). The formation of nuclear foci demonstrated that these human sHSP fusions localized similarly to wild-type MBD6, causing accumulation at chromocenters. Since the coding sequence of the human sHSPs replaced the StkyC domain, the site for interaction with plant α-crystalline domain containing proteins ACD15 and ACD21, these results strongly suggest that the human sHSPs have replaced the functions of ACD15 and ACD21.
[0449] The MBD6HSPB8 RFP construct did not demonstrate clear nuclear foci across cells, but instead resulted in a diffuse RFP signal throughout nuclei (FIG. 25, far right panel). This phenotype is similar to the wild-type MBD6 localization without plant sHSPs ACD15 and ACD21. Therefore, these data suggest that MBD6HSPB8 RFP was not able to functionally replace the α-crystalline domain proteins in the MBD5/6 complex. It is known from the literature that HSPB1, HSPB3, and HSPB5 form oligomers, while HSPB8 can only form dimers (B. Tedesco et al., Insights on Human Small Heat Shock Proteins and Their Alterations in Diseases. Front Mol Biosci 9, 842149 (2022)), which is consistent with HSPB1, HSPB3, and HSPB5, but not HSPB8, being able to complement the function of the StkyC domain in MBD6. It is possible that MBD6HSPB8 RFP could be expressed with another human sHSP to form oligomers, consistent with the functions of some α-crystalline domain proteins that work together with other α-crystalline domain partner proteins (M. Haslbeck, S. Weinkauf, J. Buchner, Small heat shock proteins: Simplicity meets complexity. J Biol Chem. 294, 2121–2132 (2019)).
[0450] These results show that human sHSPs HSPB1, HSPB3, and HSPB5 can replace the function of ACD15 and ACD21, leading to the targeted accumulation of MBD6 at chromocenters. These results also suggest that HSPB1, HSPB3, and HSPB5 will be similarly able to cause accumulation of proteins at the dCas9 sites in the SunTag systems, replacing the function of the StkyC domain of MBD6, as detailed in the next example. Fusion of HSPB1, HSPB3, and HSPB5 should therefore be useful in concentrating other proteins at a genomic site of interest. It is also anticipated that small heat shock proteins from many other organisms throughout all kingdoms of life may be similarly useful in targeting and concentrating proteins of interest at a genomic site of interest.
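The distinction drawn in the Results between punctate nuclear foci and diffuse signal can be illustrated with a simplified foci-counting sketch. This is not the ImageJ 3D objects counter used in paragraph [0444]; it is a two-dimensional analogue, assuming a fixed intensity threshold, 4-connectivity, and a minimum object size, all of which are illustrative choices. The function name, parameter values, and pixel intensities are hypothetical toy values.

```python
# Illustrative sketch only: a 2D stand-in for object counting, with
# hypothetical names and toy intensities. Pixels at or above `threshold`
# are grouped into 4-connected objects; objects of at least `min_pixels`
# pixels are counted as foci.
from collections import deque
from typing import List, Sequence


def count_foci(image: Sequence[Sequence[float]], threshold: float,
               min_pixels: int = 2) -> int:
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen: List[List[bool]] = [[False] * cols for _ in range(rows)]
    foci = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill over one connected bright object.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                foci += 1
    return foci


# Toy "nuclei": one with two bright puncta, one with uniform dim signal.
punctate = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 8, 8],
    [0, 0, 0, 0, 8, 8],
    [0, 0, 0, 0, 0, 0],
]
diffuse = [[3] * 6 for _ in range(5)]

print(count_foci(punctate, threshold=5))
print(count_foci(diffuse, threshold=5))
```

In this toy setting, the punctate image yields two objects above threshold while the diffuse image yields none, which mirrors the qualitative contrast reported between the HSPB1, HSPB3, and HSPB5 fusions and the HSPB8 fusion.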
Sequences
[0451] SEQ ID NO: 301: MBD6 promoter, DNA sequence. SEQ ID NO: 302: MBD6StkyCΔ coding DNA sequence. SEQ ID NO: 303: MBD6StkyCΔ protein sequence. SEQ ID NO: 304: HSPB1 DNA sequence. SEQ ID NO: 305: HSPB1 protein sequence. SEQ ID NO: 306: HSPB3 DNA sequence. SEQ ID NO: 307: HSPB3 protein sequence. SEQ ID NO: 308: HSPB5 DNA sequence. SEQ ID NO: 309: HSPB5 protein sequence. SEQ ID NO: 310: HSPB8 DNA sequence. SEQ ID NO: 311: HSPB8 protein sequence. SEQ ID NO: 312: mRFP DNA sequence. SEQ ID NO: 313: mRFP protein sequence. SEQ ID NO: 314: Nos terminator DNA sequence. SEQ ID NO: 315: MBD6-HSPB8 Plasmid Sequence. SEQ ID NO: 316: MBD6-HSPB5 Plasmid Sequence. SEQ ID NO: 317: MBD6-HSPB3 Plasmid Sequence. SEQ ID NO: 318: MBD6-HSPB1 Plasmid Sequence.
Example 9: Human small heat shock protein target accumulation using CRISPR-Cas9 system
Summary
[0452] This Example describes exemplary experimental guidelines for constructing a genome targeting system utilizing a dCas9 SunTag system and human α-crystalline domain containing small heat shock proteins (sHSPs). These constructs may be used to target a protein of interest to a specific locus of the genome using dCas9-specific targeting and oligomerization through the α-crystalline domain of the human sHSPs. Essentially, the StkyC domain (which normally recruits the plant α-crystalline domain containing proteins) of the existing system will be replaced by different human sHSPs.
[0453] To demonstrate the efficacy of this technology, we will create this dCas9 system using multiple human sHSPs, which will be targeted to the promoter of the unmethylated FWA gene in the fwa epiallele background. To track the formation of nuclear bodies in cells formed by the accumulation of protein through human sHSPs, a GFP tag will be added to the dCas9 targeting system.
Materials and Methods Cloning of Fusion Proteins and gRNA-fwa [0454] Exemplary structures of these fusion constructs to be used in the CRISPR-Cas9 system are presented in FIGS.26A-26B, 27A-27B, 28A-28B, and 29A-29B. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 9A. Table 9A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000133_0001
Exemplary Construct Design
[0455] To construct the dCas9 system, the current SunTag plasmid will be used, which contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of human HSPB1, HSPB3, HSPB5, and HSPB8 will be amplified from cDNA and cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the human sHSPs directly after the GFP and before the HA tag. Features of SunTag_sHSPs include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, the gRNA backbone including the tracrRNA and the gRNA terminator, single chain variable fragment (scFV)_GFP_NLS_1xHA, HSPB1, HSPB3, HSPB5, and HSPB8.
[0456] All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of fwa-4 Plants
[0457] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
Microscopy Experiments
[0458] Root meristems of seedlings selected for hygromycin resistance will be analyzed using an LSM980 confocal microscope. The GFP reporter will allow for observation of cellular localization and nuclear phenotypes. sHSPs are known to oligomerize using α-crystalline domains, and therefore GFP foci are expected to form in the nuclei of cells. A SunTag system expressing the scFV-GFP without any human sHSPs will be used as a control in all experiments.
Data Analysis
[0459] Multiple seedlings of each SunTag-sHSPs construct will be imaged using confocal microscopy to determine expression of GFP. If the GFP signal is concentrated in the nuclei of the cells, then the construct will be assumed to localize properly without any misfolding.
Z-stacks of root meristems will be obtained across multiple plant lines to acquire images across many cells for each SunTag-sHSPs construct. ImageJ software will be used to analyze the images and quantify the foci across multiple cells.
Expected Results
[0460] It is expected that the human sHSPs will oligomerize with other human sHSPs in the nuclei of plants. This will lead to the formation of GFP foci representing accumulation of GFP-scFV-sHSP targeted to the promoter of FWA. Given the results in the previous example, in which the human sHSPs were used to replace the StkyC domain of MBD6, we expect that HSPB1, HSPB3, and HSPB5 will cause localization of the fusion proteins to the dCas9 sites, while HSPB8 will fail to cause this aggregation and will lead to diffuse localization of the fusion protein throughout the nucleus.
[0461] This technology will demonstrate the functional use of human sHSPs to specifically accumulate fusion proteins at a targeted locus utilizing a dCas9 system. Other proteins fused to the human sHSPs would be anticipated to be similarly targeted into the nuclear bodies present at the target locus. The human sHSPs should therefore be useful for concentrating a variety of proteins at a genomic site of interest. It is also anticipated that small heat shock proteins from many other organisms throughout all kingdoms of life may be similarly useful in targeting and concentrating proteins of interest to a genomic site of interest.
Sequences
[0462] SEQ ID NO: 319: UBQ10 promoter, DNA sequence. SEQ ID NO: 320: dCas9_1xHA_3xNLS_10xGCN4, DNA sequence. SEQ ID NO: 321: dCas9_1xHA_3xNLS_10xGCN4, protein sequence. SEQ ID NO: 322: gRNA-FWA sequences, gRNA4 and scaffold. SEQ ID NO: 323: gRNA-FWA sequences, gRNA17 and scaffold. SEQ ID NO: 324: single chain variable fragment (scFV)_GFP_NLS_1xHA, DNA sequence. SEQ ID NO: 325: single chain variable fragment (scFV)_GFP_NLS_1xHA, protein sequence.
SEQ ID NO: 326: HSPB1, DNA sequence. SEQ ID NO: 327: HSPB1, protein sequence. SEQ ID NO: 328: HSPB3, DNA sequence. SEQ ID NO: 329: HSPB3, protein sequence. SEQ ID NO: 330: HSPB5, DNA sequence. SEQ ID NO: 331: HSPB5, protein sequence. SEQ ID NO: 332: HSPB8, DNA sequence. SEQ ID NO: 333: HSPB8, protein sequence. SEQ ID NO: 334: SunTag-HSPB8 Plasmid Sequence. SEQ ID NO: 335: SunTag-HSPB5 Plasmid Sequence. SEQ ID NO: 336: SunTag-HSPB1 Plasmid Sequence. SEQ ID NO: 337: SunTag-HSPB3 Plasmid Sequence.
Example 10: Targeted hyperaccumulation of Zinc Finger binding domains using small heat shock protein fusion
Summary
[0463] This example describes experiments in which we created a fusion of a zinc finger (ZF) domain to the StkyC domain of MBD6 (StkyCMBD6). These constructs were used to demonstrate that the StkyCMBD6 domain can cause hyperaccumulation of fusion proteins at the zinc finger binding site, even though only a single StkyCMBD6 domain has been added to the fusion.
[0464] To test whether StkyCMBD6 can lead to the hyperaccumulation of ZF domains at their binding sites, multiple fusion constructs were prepared. The StkyC domain of MBD6 was added to the C-terminus of the ZF domain (amino acids 168-225). This fusion construct contains a C-terminal 3x Flag for western blots and immunoprecipitation experiments and an RFP tag to observe cellular localization using fluorescence confocal microscopy. A control construct lacking the StkyCMBD6 was also prepared.
Materials and Methods
Cloning of Fusion Proteins
[0465] Structures of the fusion constructs used are presented in FIGS.34A-34C. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. The different modules of these constructs are presented below in Table 10A.
Table 10A: Parameters for Fusion Construct Modules
Figure imgf000136_0001
Construct Design
[0466] To construct the ZF-StkyCMBD6 fusion proteins, the coding sequence of UBQ10-6xZF-2xSV40 NLS was cloned, both with and without the C-terminal StkyCMBD6 domain, into PENTR_D (Invitrogen) plasmids using an InFusion reaction (Takara). These final PENTR_D constructs containing the UBQ10 promoter, 6xZF-2xSV40 NLS, and StkyCMBD6 were then cloned into the final pGWB553 destination vector containing the C-terminal RFP and Nos terminator using the Gateway ligation kit (ThermoFisher). Features of these constructs include a UBQ10 promoter, 6xZF-3xFlag-2xSV40NLS, StkyCMBD6, RFP, and a Nos terminator.
[0467] All the different modules were amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0468] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins. Arabidopsis wild type (Col0) and mbd5 mbd6 plants were transformed using floral dip methods. Progeny of transformed plants (T1s) were planted and screened for hygromycin resistance, which is conferred by incorporation of the T-DNA into the Arabidopsis genome.
Chromatin Immunoprecipitation Experiments (ChIP-Seq)
[0469] Seedlings expressing either ZF alone or ZF-StkyCMBD6, in both wild type and mbd5 mbd6 mutant plants, were grown on plates for ~2 weeks and then harvested. ChIP-Seq experiments were performed using anti-Flag tag beads.
Results
[0470] ChIP-Seq analysis revealed that ZF-StkyCMBD6 not only localized to zinc finger binding sites but also had increased peak intensity on average compared to the ZF alone controls in both wild type and mbd5 mbd6 plants. This shows that the addition of the StkyCMBD6 domain causes hyperaccumulation of the ZF binding fusions at the target sites.
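The comparison of average peak intensity described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the analysis pipeline used in this Example: it assumes peaks have already been called and that each peak carries a single signal value, and the function names, tuple layout, and toy numbers are hypothetical.

```python
from statistics import mean


def mean_peak_signal(peaks):
    """Average the signal over a list of (start, end, signal) peaks."""
    return mean(signal for _start, _end, signal in peaks)


def signal_fold_change(test_peaks, control_peaks):
    """Ratio of mean peak signal in the test sample to the control sample."""
    return mean_peak_signal(test_peaks) / mean_peak_signal(control_peaks)


# Toy peak tables (hypothetical values, not experimental data).
zf_alone = [(1000, 1200, 10.0), (5000, 5300, 20.0)]
zf_stkyc = [(1000, 1200, 30.0), (5000, 5300, 60.0)]

# A value above 1 indicates higher average peak intensity for ZF-StkyCMBD6.
print(signal_fold_change(zf_stkyc, zf_alone))
```

In practice the two samples would need to be normalized for sequencing depth before their signal values are compared; that step is omitted here.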
[0471] Comparison of ZF-StkyCMBD6 peaks showed that ZF-StkyCMBD6 binding in mbd5 mbd6 mutant plants consistently resulted in broader peaks with increased peak intensity (FIG. 35). This is consistent with the increased protein accumulation seen with the SunTagStkyC construct in the mbd5 mbd6 mutant plants. This is likely because in mbd5 mbd6 mutant plants the natural binding sites for ACD15 and ACD21 are eliminated, increasing the pool of α-crystalline domain-containing proteins ACD15 and ACD21 available to oligomerize with ZF-StkyCMBD6 fusions, thereby recruiting additional ZF-StkyCMBD6 fusions in a broader domain that can interact with chromatin. These results suggest that the amount of protein targeted to a genomic locus can be tuned by modifying the level of α-crystalline domain proteins available for oligomerization. These results also suggest that oligomerization via α-crystalline domain proteins could be used to enhance the activity of transcription factors by bringing more transcriptional activation to target sites. For example, α-crystalline domain protein-enhanced transcription factors could be used to make developmental regulators more potent, such as in stem cell reprogramming for human health or for morphoregulator factors used in plant regeneration processes.
Sequences
[0472] SEQ ID NO: 338: UBQ10 promoter (DNA sequence). SEQ ID NO: 339: StkyCMBD6 (DNA sequence). SEQ ID NO: 340: StkyCMBD6 (protein sequence). SEQ ID NO: 341: 6xZF-3xFlag-2xSV40 NLS (DNA sequence). SEQ ID NO: 342: 6xZF-3xFlag-2xSV40 NLS (protein sequence). SEQ ID NO: 343: mRFP sequence (DNA sequence). SEQ ID NO: 344: mRFP sequence (protein sequence). SEQ ID NO: 345: Nos terminator (DNA sequence). SEQ ID NO: 346: ZF-RFP Alone Plasmid Sequence. SEQ ID NO: 347: ZF-StkyCMBD6 RFP Plasmid Sequence.
Example 11: ACD15 and ACD21 chaperone system for use to increase editing efficiency of CasΦ.
Summary
[0473] This Example describes exemplary experimental guidelines for constructing a CasΦ (also known as CasPhi and Cas12J) nuclease combined with plant ACD15-ACD21 mediated accumulation technology, using the plant StkyC domain of MBD6 to increase accumulation of CasΦ at the targeted locus and therefore increase editing efficiency. To demonstrate the efficacy of this technology, we will create this CasΦ-StkyCMBD6 system, which will be targeted to the promoter of the FWA gene.
Materials and Methods
Cloning of Fusion Proteins and gRNA-FWA
[0474] Exemplary structures of these fusion constructs to be used in the CasΦ-StkyCMBD6 system are presented in FIGS.36A-36B. In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 11A.
Table 11A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000138_0001
Exemplary Construct Design
[0475] To construct the CasΦ-StkyCMBD6 system, a current CasΦ plasmid lacking a StkyCMBD6 domain will be used, which contains the CasΦ as well as the guide RNA targeting the CasΦ to FWA. Coding sequences for the StkyCMBD6 domain will be amplified from genomic DNA and cloned into a separate CasΦ plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the StkyCMBD6 directly after the CasΦ. Features of CasΦ-StkyCMBD6 include a UBQ10 promoter, CasΦ, gRNA-FWA sequence, StkyCMBD6 domain, U6 promoter, Terminator-PolyT, and RbcS-E9t. All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of protoplasts
[0476] Protoplasts will be created following current protocols. Plasmids will be directly transformed into protoplasts made from wild type (Col0) plants.
Amplicon Sequence
[0477] To determine if the promoter of FWA was successfully edited, we will perform whole genome sequencing on protoplasts transfected with the CasΦ-StkyCMBD6.
Data Analysis
[0478] It is expected that α-crystalline domain proteins ACD15/ACD21 will oligomerize in the nuclei of plants. This will lead to the accumulation of CasΦ-StkyCMBD6 at the site of interest, increasing the opportunity for an editing event to occur. CasΦ will create double strand breaks, which will need to be repaired through nonhomologous end joining (NHEJ), leading to insertion and deletion mutations at the guide 4 site of the FWA promoter.
[0479] It is expected that CasΦ-StkyCMBD6 will demonstrate a higher editing frequency compared to CasΦ alone due to the increased accumulation of CasΦ-StkyCMBD6. CasΦ currently shows extremely low editing efficiency in protoplasts from wild type plants, in part because the FWA gene is methylated and the DNA is relatively inaccessible.
It is anticipated that the addition of the StkyC domain will allow CasΦ to gain more frequent access to its target sites.
[0480] This technology will demonstrate the functional use of ACD15 and ACD21 to specifically accumulate CasΦ-StkyCMBD6 at the editing locus. This technology will further demonstrate the use of targeted accumulation of a genome editing enzyme as a mechanism for increasing editing efficiency, without needing extensive optimization of the nuclease enzyme itself.
Sequences
[0481] SEQ ID NO: 348: UBQ10 promoter (DNA sequence). SEQ ID NO: 349: gRNA-FWA sequences (gRNA4 and scaffold). SEQ ID NO: 350: StkyCMBD6 (DNA sequence). SEQ ID NO: 351: StkyCMBD6 (protein sequence). SEQ ID NO: 352: U6 Promoter, DNA. SEQ ID NO: 353: CasΦ, DNA. Terminator-PolyT, DNA: tttttttt. SEQ ID NO: 355: RbcS-E9t, DNA. SEQ ID NO: 356: CasΦ-StkyCMBD6 Plasmid DNA Sequence. SEQ ID NO: 357: CasΦ alone Plasmid Sequence.
Example 12: ACD15 and ACD21 chaperone system for use to increase editing efficiency of CasΦ in HEK293 Cells
Summary
[0482] This Example describes exemplary experimental guidelines for constructing a CasΦ (Cas12J) nuclease combined with plant ACD15-ACD21 mediated accumulation technology to increase accumulation of CasΦ at the targeted locus and therefore increase editing efficiency.
[0483] To demonstrate the efficacy of this technology, we will create this CasΦ-ACD15-ACD21 system, which will be targeted to a GFP gene inserted into the genome of the HEK293 cells and expressed using an EF1alpha promoter (P. Pausch, B. Al-Shayeb, E. Bisom-Rapp, C. A. Tsuchida, Z. Li, B. F. Cress, G. J. Knott, S. E. Jacobsen, J. F. Banfield, J. A. Doudna, CRISPR-CasΦ from huge phages is a hypercompact genome editor. Science 369, 333–337 (2020)).
Materials and Methods
Cloning of Fusion Proteins and gRNA-GFP
[0484] Exemplary structures of these fusion constructs to be used in the CasΦ-ACD15-ACD21 system are presented in FIGS.37A-37D and FIGS.38A-38D.
In these figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures will also be prepared, as described below in Table 12A. [0485] Table 12A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000140_0001
Figure imgf000141_0001
Exemplary Construct Design
[0486] To construct the CasΦ-ACD15-ACD21 system, a current CasΦ plasmid lacking ACD15-ACD21 will be used, which contains the CasΦ as well as the guide RNA(s) targeting the CasΦ to GFP. Coding sequences for the ACD15-ACD21 will be amplified from cDNA and cloned into a separate CasΦ plasmid after cutting the plasmid with the appropriate restriction enzymes. This will add the ACD15-ACD21 directly after the CasΦ with an XTEN linker in between. Features of CasΦ-ACD15-ACD21 include a Chicken β-actin Promoter, CasΦ, gRNA-GFP sequences, ACD15, ACD21, U6 promoter, Terminator-PolyT, XTEN Linker, and bGH poly(A) signal.
[0487] All the different modules will be amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transfection of HEK293 Cells
[0488] CasΦ-ACD15-ACD21 and CasΦ alone constructs will be transfected into HEK293 cells expressing GFP following previous protocols. Both CasΦ-ACD15-ACD21 and CasΦ without any gRNA will be transfected to control for background editing signal.
Flow Cytometry
[0489] To determine if the CasΦ-ACD15-ACD21 has increased editing efficiency compared to CasΦ alone, HEK293 cells will be sorted using flow cytometry based on GFP fluorescence. This will provide percentages of cells with and without GFP signal, which can be directly compared. GFP-negative cells indicate editing.
Data Analysis
[0490] It is expected that α-crystalline domain proteins ACD15/ACD21 will oligomerize CasΦ fusion constructs in the nuclei of HEK293 cells. This will lead to the accumulation of CasΦ-ACD15-ACD21 at the GFP cut sites, increasing the opportunity for an editing event to occur. CasΦ will create double strand breaks, which will need to be repaired through nonhomologous end joining (NHEJ), leading to insertion and deletion mutations at the GFP gene.
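The flow cytometry readout described above reduces to a comparison of GFP-negative percentages, with the no-gRNA transfection serving as the background control. The following is a minimal sketch of that arithmetic, not a prescribed analysis: the function names, the simple background subtraction, and the cell counts are hypothetical and are included only to illustrate the calculation.

```python
def gfp_negative_percent(gfp_negative_cells, total_cells):
    """Percentage of counted cells that lack GFP signal."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * gfp_negative_cells / total_cells


def background_corrected_editing(sample, no_grna_control):
    """Subtract the no-gRNA background from a sample's GFP-negative percentage.

    Each argument is a (gfp_negative_cells, total_cells) tuple. Subtracting
    the control is an assumed, simplified correction for background signal.
    """
    return gfp_negative_percent(*sample) - gfp_negative_percent(*no_grna_control)


# Toy counts (hypothetical, not experimental data).
fusion_with_grna = (3000, 10000)     # nuclease fused to ACD15-ACD21, with gRNA
fusion_without_grna = (200, 10000)   # same fusion, no gRNA (background control)

print(background_corrected_editing(fusion_with_grna, fusion_without_grna))
```

The same calculation applied to the unfused nuclease and its own no-gRNA control would give the value against which the fusion construct is compared.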
[0491] It is expected that CasΦ-ACD15-ACD21 will demonstrate a higher editing percentage compared to CasΦ alone due to the increased accumulation of CasΦ-ACD15-ACD21. It is therefore anticipated that this will result in a higher proportion of GFP-negative cells when cells are transfected with the ACD15-ACD21-containing fusions compared to cells transfected with CasΦ that is not fused with ACD15-ACD21.
[0492] This example will demonstrate the functional use of ACD15 and ACD21 to specifically accumulate CasΦ at an editing locus in human cells. This technology will further demonstrate the use of targeted accumulation of genome editing enzymes as a mechanism for increasing editing efficiency without needing extensive optimization of the nuclease enzyme itself.
Example 13a: Targeting gRNAs and template DNAs for enhanced genome editing
[0493] This example outlines proposed future experiments for targeting gRNAs and template DNAs for enhanced genome editing.
[0494] The assembly of Cas nuclease proteins with their gRNA sequences is a key limiting biochemical step in genome editing (2). For this reason, researchers commonly achieve higher editing efficiencies when Cas protein is preassembled with its gRNA in a high-concentration in vitro reaction to form ribonucleoprotein (RNP), after which the RNPs are delivered to cells for editing (2). We propose to target the hyperaccumulation of gRNAs to a genomic site of interest by fusing the gRNA with the MS2 RNA binding sequence (3), and then expressing the MS2 RNA binding protein fused to StkyC domains, or to sHSPs that have been shown to be effective. In this way the gRNA should be compartmentalized along with the Cas protein at chromatin sites, increasing the effective concentration and stimulating RNP formation in vivo. In addition to testing this with CasΦ, we also plan to test this with Casλ.
It was previously found that Casλ shows high editing in protoplasts when delivered as RNPs, but very low editing when the Cas protein and gRNA are expressed from plasmids (4). It is assumed that this difference is due to poor RNP formation in vivo. If this is the case, we expect to see a dramatic improvement in editing by compartmentalizing both Casλ and its gRNA at genomic sites.
[0495] An important genome editing technique involves delivering a DNA template together with the gRNA and Cas protein to induce either insertion of the DNA into the genomic cut site, or homologous recombination to create precise sequence replacements (5). For example, oligonucleotides can be inserted into CRISPR/gRNA cut sites in protoplasts, or in plants by particle bombardment, but the efficiency is very low (5, 6). We propose to increase this efficiency by increasing the concentration of oligonucleotides at the target site. To do this we will covalently attach oligonucleotides in vitro to the HUH endonuclease protein (6-8), which has been purified as a fusion protein with StkyC domains or sHSPs that have been shown to be effective. The HUH-DNA complex will then be delivered along with the other reagents for genome editing to protoplasts and tested for DNA insertion efficiency. If we encounter difficulties with the HUH system, we will use the SNAP-tag system, in which O6-benzylguanine-labeled DNA oligonucleotides are covalently linked in vitro to SNAP fusion proteins (9, 10) prior to delivery to protoplasts. Insertion frequencies will be measured by amplicon sequencing.
[0496] Homologous recombination yielding precise sequence replacement is very inefficient relative to DNA insertion, which is driven by NHEJ (non-homologous end joining) (5).
If the approaches described in the previous paragraph are successful, future experiments could involve using the sHSPs to target homologous recombination machinery components (11-17) to stimulate precise replacement of DNA sequences. Other types of editing systems could also be targeted with sHSPs, such as base editors (18) and prime editors (19).
1. E. R. Waters, E. Vierling, Plant small heat shock proteins - evolutionary and functional diversity. New Phytol 227, 24-37 (2020).
2. M. A. DeWitt, J. E. Corn, D. Carroll, Genome editing via delivery of Cas9 ribonucleoprotein. Methods 121-122, 9-15 (2017).
3. C. Jiang et al., Multiplexed Gene Engineering Based on dCas9 and gRNA-tRNA Array Encoded on Single Transcript. Int J Mol Sci 24, (2023).
4. B. Al-Shayeb et al., Diverse virus-encoded CRISPR-Cas systems include streamlined genome editors. Cell 185, 4574-4586 e4516 (2022).
5. C. Schmidt, M. Pacher, H. Puchta, DNA Break Repair in Plants and Its Application for Genome Engineering. Methods Mol Biol 1864, 237-266 (2019).
6. E. D. Nagy et al., Site-directed integration of exogenous DNA into the soybean genome by LbCas12a fused to a plant viral HUH endonuclease. Plant J 111, 905-916 (2022).
7. E. J. Aird, K. N. Lovendahl, A. St Martin, R. S. Harris, W. R. Gordon, Increasing Cas9-mediated homology-directed repair efficiency through covalent tethering of DNA repair template. Commun Biol 1, 54 (2018).
8. K. N. Lovendahl, A. N. Hayward, W. R. Gordon, Sequence-Directed Covalent Protein-DNA Linkages in a Single Step Using HUH-Tags. J Am Chem Soc 139, 7030-7035 (2017).
9. N. Savic et al., Covalent linkage of the DNA repair template to the CRISPR-Cas9 nuclease enhances homology-directed repair. Elife 7, (2018).
10. N. Savic et al., In vitro Generation of CRISPR-Cas9 Complexes with Covalently Bound Repair Templates for Genome Editing in Mammalian Cells. Bio Protoc 9, (2019).
11. M. Charpentier et al., CtIP fusion to Cas9 enhances transgene integration by homology-dependent repair. Nat Commun 9, 1133 (2018).
12. N. T. Tran et al., Enhancement of Precise Gene Editing by the Association of Cas9 With Homologous Recombination Factors. Frontiers in genetics 10, 365 (2019).
13. A. Carusillo et al., A novel Cas9 fusion protein promotes targeted genome editing with reduced mutational burden in primary human cells. Nucleic Acids Res 51, 4660-4673 (2023).
14. H. Shaked, C. Melamed-Bessudo, A. A. Levy, High-frequency gene targeting in Arabidopsis plants expressing the yeast RAD54 gene. Proc Natl Acad Sci U S A 102, 12265-12269 (2005).
15. G. Shalev, Y. Sitrit, N. Avivi-Ragolski, C. Lichtenstein, A. A. Levy, Stimulation of homologous recombination in plants by expression of the bacterial resolvase ruvC. Proc Natl Acad Sci U S A 96, 7398-7402 (1999).
16. A. Barakate, E. Keir, H. Oakey, C. Halpin, Stimulation of homologous recombination in plants expressing heterologous recombinases. BMC Plant Biol 20, 336 (2020).
17. Z. Ali et al., Fusion of the Cas9 endonuclease and the VirD2 relaxase facilitates homology-directed repair for precise genome engineering in rice. Commun Biol 3, 44 (2020).
18. S. J. Tekel, N. Brookhouser, K. Standage-Beier, X. Wang, D. A. Brafman, Cytosine and adenosine base editing in human pluripotent stem cells using transient reporters for editing enrichment. Nat Protoc 16, 3596-3624 (2021).
19. K. Godbout, J. P. Tremblay, Prime Editing for Human Gene Therapy: Where Are We Now? Cells 12, (2023).
[0497] Sequences for Examples 12-13a: SEQ ID NO: 358: Chicken β-actin Promoter (DNA sequence). SEQ ID NO: 359: U6 Promoter (DNA sequence). SEQ ID NO: 360: CasΦ (DNA sequence). SEQ ID NO: 361: CasΦ (protein sequence). SEQ ID NO: 362: ACD15 DNA sequence. SEQ ID NO: 363: ACD15 protein sequence. SEQ ID NO: 364: ACD21 DNA sequence. SEQ ID NO: 13: ACD21 protein sequence. SEQ ID NO: 365: XTEN Linker DNA sequence. Terminator-PolyT DNA: Tttttttt.
SEQ ID NO: 367: bGH poly(A) signal, DNA. SEQ ID NO: 368: gRNA9 DNA sequence. SEQ ID NO: 369: gRNA6 DNA sequence. SEQ ID NO: 370: gRNA8 DNA. SEQ ID NO: 371: No gRNA DNA sequence. SEQ ID NO: 372: CRISPR Repeat DNA Sequence. SEQ ID NO: 373: CasΦ-No gRNA Plasmid DNA Sequence. SEQ ID NO: 374: CasΦ-gRNA9 DNA sequence. SEQ ID NO: 375: CasΦ-gRNA6 DNA sequence. SEQ ID NO: 376: CasΦ-gRNA8 DNA sequence. SEQ ID NO: 377: CasΦ-ACD15-ACD21 no gRNA DNA sequence. SEQ ID NO: 378: CasΦ-ACD15-ACD21-gRNA9 DNA sequence. SEQ ID NO: 379: CasΦ-ACD15-ACD21-gRNA6 DNA sequence. SEQ ID NO: 380: CasΦ-ACD15-ACD21-gRNA8 DNA sequence.
Example 13b: ACD15 and ACD21 chaperone system used to increase editing efficiency of Cas9
Summary
[0498] This Example shows that addition of the StkyC domain can increase the efficiency of genome editing. Specifically, this example describes experiments wherein ACD15- and ACD21-accumulation technology was used to increase the editing efficiency of the Cas9 nuclease. This was achieved using the plant StkyC domain of MBD6 (StkyCMBD6) to cause multimerization and accumulation of Cas9 at the target locus and therefore increase editing efficiency.
[0499] To demonstrate the efficacy of this technology, multiple variations of Cas9-StkyCMBD6 were created and targeted to the promoter of the FWA gene using either guide 4 (g4) or guide 17 (g17).
Materials and Methods
Cloning of Fusion Proteins and gRNA-FWA
[0500] Structures of the fusion constructs used are presented in FIGS.42A-45C. In these Figures, each region represents a respective module of the construct. Fusion constructs contained modules as described below in Table 2A.
Table 2A: Parameters for Fusion Construct Modules
Figure imgf000145_0001
Construct Design
[0501] Three different constructs were created using the StkyCMBD6: Cas9-XTEN-StkyCMBD6 (FIGS.42A-42C), Cas9-SunTag-1xGCN4 (FIGS.43A-43B), and Cas9-SunTag-4xGCN4 (FIGS.44A-44C). All constructs were created using Golden Gate Assembly to assemble each expression cassette into a binary backbone as described in Cermak et al. (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/). To construct the Cas9-XTEN-StkyCMBD6 expression cassette, a current Cas9 sequence lacking the XTEN-StkyCMBD6 domain was ligated with coding sequences for the XTEN-StkyCMBD6 domain, amplified from plasmids containing those sequences, after cutting the cassettes and final plasmid with the appropriate restriction enzymes. This added the XTEN-StkyCMBD6 directly after the Cas9. To make the Cas9-1xGCN4-StkyCMBD6 and the Cas9-4xGCN4-StkyCMBD6 cassettes, the Cas9-1xGCN4 and 4xGCN4 constructs were amplified from current SunTag plasmids and cloned into plasmids for the Golden Gate Assembly, which were cut with the necessary restriction enzymes. Features of these constructs include a UBQ10 promoter, Cas9, gRNA-FWA sequence, StkyCMBD6 domain, CaMV 35S promoters, GCN4 sites, and scFv-GFP-StkyC.
Transformation of protoplasts
[0502] Protoplasts were created following current protocols. Plasmids were directly transformed into protoplasts made from wild type (Col0) plants.
Amplicon Sequencing
[0503] To determine if the promoter of FWA was successfully edited, amplicon sequencing was performed on protoplasts transfected with the Cas9 constructs.
Data Analysis
[0504] The hypothesis was that ACD15/ACD21 would oligomerize in the nuclei of plants and lead to the accumulation of the Cas9 constructs at the site of interest, presenting increased opportunities for an editing event to occur.
Cas9 was expected to create double strand breaks, which would need to be repaired through nonhomologous end joining (NHEJ), leading to insertion and deletion mutations at the guide 4 or guide 17 site of the FWA promoter.
[0505] Accordingly, Cas9 constructs containing StkyC domains were designed with the goal of demonstrating a higher editing rate compared to Cas9 alone due to the increased accumulation of Cas9-StkyCMBD6. Cas9 alone is known to show low editing efficiency in protoplasts, in part because the FWA gene is methylated and the DNA is relatively inaccessible (FIGS.41A-41C). Thus, the constructs described in this Example were designed such that the addition of the StkyC domain would allow Cas9 to gain more frequent access to its target sites.
[0506] These constructs demonstrate the functional use of ACD15 and ACD21 to specifically accumulate Cas9 at the editing locus. This technology further demonstrated the use of targeted accumulation of a genome editing enzyme as a mechanism for increasing editing efficiency, without needing extensive optimization of the nuclease enzyme itself.
Results
[0507] All Cas9 constructs containing the StkyCMBD6 domain that were transformed into protoplasts from the wild type (Col-0) background resulted in increased editing at guide 4 and/or guide 17 (FIGS.41B-41C). These results also suggested that adding more copies of the StkyC domain (1x vs 4x) may lead to a greater increase in editing efficiency.
[0508] These results demonstrated the use of the StkyCMBD6 domain to increase the editing efficiency of Cas9. These editing results also occurred at sites of very low chromatin accessibility, demonstrating the potential of this work to aid in localizing nucleases to regions that are often difficult to edit (FIG.41A).
Sequences for Example 13b:
[0509] SEQ ID NO: 381: UBQ10 promoter (DNA sequence). SEQ ID NO: 382: gRNA-FWA: gRNA4 and scaffold (DNA sequence).
SEQ ID NO: 383: gRNA-FWA: StkyCMBD6 (DNA sequence). SEQ ID NO: 384: gRNA-FWA: StkyCMBD6 (protein sequence). SEQ ID NO: 385: CaMV 35S promoter (DNA sequence). SEQ ID NO: 386: Cas9 (DNA sequence). SEQ ID NO: 387: Cas9-XTEN-StkyCMBD6 Plasmid Sequence (guide 4; DNA sequence). SEQ ID NO: 388: Cas9-XTEN-StkyCMBD6 Plasmid Sequence (guide 17). SEQ ID NO: 389: Cas9 Plasmid Sequence (Guide 4; DNA sequence). SEQ ID NO: 390: Cas9 Plasmid Sequence (Guide 17; DNA sequence). SEQ ID NO: 391: Cas9-SunTag-1xGCN4 (Guide 4; DNA sequence). SEQ ID NO: 392: Cas9-SunTag-4xGCN4 Plasmid Sequence (Guide 4; DNA sequence). SEQ ID NO: 393: Cas9-SunTag-4xGCN4 Plasmid Sequence (Guide 17; DNA sequence).
Example 14: Targeted protein accumulation of CRISPR-Cas9 system using α-crystalline domain proteins
Summary
[0510] This Example describes a number of additional ACD proteins from different organisms and shows that many, but not all, can multimerize to a high degree like the ACD15/21 proteins. Specifically, this Example describes experiments in which a genome targeting system was constructed utilizing a dCas9 SunTag system and α-crystalline domain (ACD)-containing proteins. These constructs were used to demonstrate the specific accumulation of a protein of interest at a specific locus of the genome using dCas9-specific targeting and oligomerization through ACD proteins. In this system, the StkyC domain (which normally recruits the plant ACD proteins) of the previously described SunTag system (in which ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036. doi: 10.1126/sciadv.adi9036) was replaced by different ACD proteins to facilitate locus-specific accumulation of protein constructs.
[0511] To demonstrate the efficacy of this technology and screen the diverse range of ACD proteins, dCas9 systems were created using multiple ACD proteins, which were targeted to the promoter of the FWA gene in wild-type Arabidopsis thaliana plants (Col-0). Candidate ACD proteins were discovered through sequence homology analysis of the existing Arabidopsis thaliana ACD proteins (ACD21 and ACD15), which were shown to cause higher-order multimerization of the MBD5/6 complex (ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036. doi: 10.1126/sciadv.adi9036) and are capable of causing hyperaccumulation of the SunTag-StkyC system discussed previously (ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements; Science Advances, 9(46):eadi9036. doi: 10.1126/sciadv.adi9036). To track the formation of nuclear bodies in live cells, formed by the accumulation of protein through ACD proteins, a GFP tag was included within the dCas9 targeting system. Most of these previously uncharacterized ACD proteins could function similarly to ACD15 and ACD21 and form discrete nuclear bodies, indicating that they can also multimerize.
Materials and Methods
Cloning of Fusion Proteins and gRNA-FWA
[0512] Structures of the fusion constructs used in the CRISPR-Cas9 system are presented in FIGS. 46A-56C. In FIGS. 46A-56C, different regions of each construct are labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in FIGS. 46A-56C were also prepared as described below in Table 2A.
Table 2A: Parameters for Fusion Construct Modules
Figure imgf000148_0001
Construct Design
[0513] To construct the dCas9 system, a SunTag plasmid was used that contains the dCas9-1xHA-3xNLS-10xGCN4 as well as the two guide RNAs targeting the dCas9 to FWA. Coding sequences of the ACD proteins from module 3 were amplified from cDNA, or from gene blocks ordered with the cDNA sequences, and were cloned into the SunTag plasmid after cutting the plasmid with the appropriate restriction enzymes. This process added the ACD proteins directly after the GFP and before the HA tag. Features of SunTag_ACD proteins include a UBQ10 promoter, dCas9_1xHA_3xNLS_10xGCN4, gRNA-FWA sequences, single chain variable fragment (scFV)_GFP_NLS_1xHA, HSPB1, HSPB4, HSPB5, HSPB8, Chlamydomonas reinhardtii ACD, Saccharolobus solfataricus ACD, Oryza sativa ACD, Solanum tuberosum ACD, Solanum lycopersicum ACD, Deinococcus radiodurans ACD, and Zea mays ACD. All the different modules were amplified by PCR using specific oligos and cloned into a binary plasmid using InFusion (Clontech).
Transformation of Plants
[0514] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Wild-type (Col-0) Arabidopsis thaliana plants were transformed using floral dip methods.
Microscopy Experiments
[0515] The root meristem regions of hygromycin resistance-positive seedlings were analyzed using an LSM980 confocal microscope. The GFP reporter allowed for observations of cellular localization and nuclear morphologies. Oligomerization of ACD proteins was assessed by screening for GFP foci in nuclei of transformed cells. The SunTag_HSPB8 construct, known to be unable to form oligomeric assemblies and shown in Example 8 herein to be unable to create visible nuclear foci when fused with Arabidopsis thaliana MBD6 protein (FIGS. 25A-25B), was used as a negative control.
Data Analysis
[0516] Multiple seedlings of each SunTag-ACD protein were imaged using confocal microscopy to determine expression of GFP.
If GFP signal was concentrated in nuclei of the cells, this was interpreted as the construct having properly localized without any misfolding. Z-stacks of root meristems were obtained across multiple plant lines to acquire images across many cells for each SunTag-ACD protein construct. ImageJ software was used to analyze the images and quantify the foci across multiple cells.
[0517] The hypothesis was that the ACD proteins would oligomerize in nuclei of plants, with potential variability in the specificity or strength of these interactions. Since the newly tested ACD proteins were previously largely uncharacterized and share sequence similarity with ACD15 and ACD21, the hypothesis was that there was potential for interaction between these new ACDs and the endogenous Arabidopsis thaliana ACD proteins, which would result in chromocenter localization and therefore more than two foci per nucleus. The accumulation of SunTag-ACD proteins was thus assessed by formation of GFP foci representing accumulation of dCas9-GFP-scFv-ACD protein targeted to the promoter of FWA. Given the results in Example 8 herein, in which some human ACD proteins functionally replaced the StkyC domain of MBD6, we expected that HSPB1, HSPB5, and HSPB4 would cause localization of the fusion proteins to the dCas9 sites, while HSPB8 would fail to cause this aggregation and lead to diffuse localization of the fusion protein throughout the nucleus. These human ACD proteins were therefore chosen as positive and negative controls to compare to the novel ACD proteins from a diverse range of species.
[0518] This example demonstrates the functional use of different ACD proteins to specifically accumulate at a targeted locus utilizing a dCas9 system. Other proteins fused to the ACD proteins would be anticipated to be similarly targeted into the nuclear bodies present at the target locus.
The ACD protein should therefore be useful for concentrating a variety of proteins at a genomic site of interest. It is also anticipated that ACD proteins from many other organisms throughout all kingdoms of life may be similarly useful in targeting and concentrating proteins of interest at a genomic locus.
Results
[0519] To determine the impact of ACD proteins on the localization of the SunTag targeting system, seedling root meristems were imaged across multiple plant lines.
[0520] Microscopic analysis of seedlings demonstrated clear foci formation for all ACD proteins tested, with the exceptions of HSPB8 (FIG. 52B), Oryza sativa ACD (FIG. 53B), and Deinococcus radiodurans ACD (FIG. 54B). The morphology of these foci varied; some ACD proteins (HSPB1 (FIG. 46B), HSPB4 (FIG. 47B), HSPB5 (FIG. 48B), Chlamydomonas (FIG. 49B), Saccharolobus (FIG. 50B), and Zea mays (FIG. 51B)) created few, distinct GFP foci similar to the previously characterized SunTag-StkyC construct (ACD15, ACD21 and SLN regulate accumulation and mobility of MBD6 to silence genes and transposable elements. Science Advances, 9(46):eadi9036. doi: 10.1126/sciadv.adi9036), while two ACD proteins created many foci in the nuclei (Solanum tuberosum (FIG. 55B) and Solanum lycopersicum (FIG. 56B)), some of which likely represent chromocenters, since some of them overlapped with DAPI staining.
[0521] ACD proteins from Chlamydomonas reinhardtii, Zea mays, and Saccharolobus solfataricus, along with human HSPB1, created the most consistent accumulation of SunTag GFP into distinct foci, very often only two foci per nucleus, consistent with strong multimerization driving the majority of these ACD proteins into the nuclear bodies corresponding to the two dCas9 binding sites. The use of these ACDs may be desirable when very high levels of multimerization are desired, for example, to drive the majority of ACD fusion proteins to dCas9 sites.
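The foci patterns described above (no visible foci, few distinct foci, or many foci per nucleus) can be tallied once per-nucleus foci counts have been exported from the image analysis. The following is a minimal Python sketch of such a tally, not the analysis actually used in this Example. The bin boundaries, the function names, and the example counts are all illustrative assumptions; the only element taken from the text is that two foci per nucleus is the pattern expected for the two dCas9 binding sites.

```python
from collections import Counter

# Bin labels are hypothetical; they mirror the three patterns described in the text.
BINS = ("diffuse", "few_distinct", "many")


def classify_nucleus(n_foci: int) -> str:
    """Assign one nucleus to a bin by its GFP foci count (thresholds are assumptions)."""
    if n_foci < 0:
        raise ValueError("foci count cannot be negative")
    if n_foci == 0:
        return "diffuse"        # nuclear GFP signal but no visible foci
    if n_foci <= 2:
        return "few_distinct"   # at most the two expected dCas9 binding sites
    return "many"               # more foci than target sites, e.g. chromocenters


def summarize_foci(foci_counts):
    """Return the fraction of nuclei in each bin for one construct."""
    if not foci_counts:
        raise ValueError("no nuclei counted")
    tally = Counter(classify_nucleus(n) for n in foci_counts)
    total = len(foci_counts)
    return {label: tally.get(label, 0) / total for label in BINS}


# Hypothetical per-nucleus counts for one construct (not data from this Example).
example_counts = [2, 2, 1, 2, 0, 2, 5, 2]
summary = summarize_foci(example_counts)
```

Comparing such per-construct fractions across plant lines would give a simple numerical counterpart to the qualitative descriptions of foci morphology.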
[0522] On the other hand, HSPB4 and HSPB5 created very clear nuclear bodies, but they were less intense. We also observed background GFP signal throughout the nucleoplasm, as well as numerous smaller nuclear bodies. This suggests that HSPB4 and HSPB5 induced a lower level of multimerization. These ACD proteins may be ideal in situations where some, but less, multimerization is desirable, for example, in active Cas genome editing where it would be desirable to have many smaller multimerized clusters for optimal genome scanning and editing.
[0523] Three of the tested ACD proteins (HSPB8, Oryza sativa, and Deinococcus radiodurans) did not form clear visible nuclear foci but still showed a clear nuclear GFP signal (FIGS. 52B, 53B, and 54B, respectively). These results suggest that some ACDs (e.g., those shown in FIGS. 52B, 53B, and 54B) do not multimerize sufficiently to cause the formation of visible nuclear bodies. These ACDs may multimerize to some extent, however, and may be useful when lower levels of multimerization are desirable, for example in active Cas genome editing where it would be desirable to have many smaller multimerized clusters for optimal genome scanning and editing.
[0524] The Solanum tuberosum and Solanum lycopersicum ACD proteins resulted in foci, some of which overlap with chromocenters (DAPI-stained bodies) and some of which do not, suggesting possible interaction with Arabidopsis thaliana chromocenter complexes. Due to the high sequence homology with ACD15 and ACD21, the most likely interaction partners were the endogenous MBD5/6 complex components. These ACD proteins would be useful when localization to both Cas9 binding sites and chromocenters is desirable, for example for genome editing or dCas9 binding of sites present in heterochromatin.
[0525] The results of this initial screen demonstrated the use of ACD proteins from a diverse range of organisms to multimerize and cause the accumulation of fusion proteins through the SunTag targeting system. These results further suggest conservation in the ability of ACD proteins of different organisms to multimerize and accumulate in the genome and suggest that many ACD proteins from many organisms can be used for this purpose.
Sequences for Example 14:
[0526] SEQ ID NO: 394: UBQ10 promoter (DNA sequence). SEQ ID NO: 395: dCas9_1xHA_3xNLS_10xGCN4 (DNA sequence). SEQ ID NO: 396: dCas9_1xHA_3xNLS_10xGCN4 (protein sequence). SEQ ID NO: 397: gRNA-FWA sequences: gRNA4 and scaffold. SEQ ID NO: 398: gRNA-FWA sequences: gRNA17 and scaffold. SEQ ID NO: 399: single chain variable fragment (scFV)_GFP_NLS_1xHA (DNA sequence). SEQ ID NO: 400: single chain variable fragment (scFV)_GFP_NLS_1xHA (protein sequence). SEQ ID NO: 401: HSPB1 (DNA sequence). SEQ ID NO: 402: HSPB1 (protein sequence). SEQ ID NO: 403: HSPB5 (DNA sequence). SEQ ID NO: 404: HSPB5 (protein sequence). SEQ ID NO: 405: HSPB4 (DNA sequence). SEQ ID NO: 406: HSPB4 (protein sequence). SEQ ID NO: 407: SunTag HSPB4 Plasmid Sequence. SEQ ID NO: 408: HSPB8 (DNA sequence). SEQ ID NO: 409: HSPB8 (protein sequence). SEQ ID NO: 410: SunTag-HSPB8 Plasmid Sequence. SEQ ID NO: 411: SunTag-HSPB5 Plasmid Sequence. SEQ ID NO: 412: SunTag-HSPB1 Plasmid Sequence. SEQ ID NO: 413: Chlamydomonas reinhardtii ACD protein (DNA sequence). SEQ ID NO: 414: Chlamydomonas reinhardtii ACD protein (protein sequence). SEQ ID NO: 415: SunTag-Chlamydomonas reinhardtii ACD Plasmid Sequence. SEQ ID NO: 416: Saccharolobus solfataricus ACD Protein (DNA sequence). SEQ ID NO: 417: Saccharolobus solfataricus ACD Protein (protein sequence). SEQ ID NO: 418: SunTag-Saccharolobus solfataricus ACD Plasmid Sequence. SEQ ID NO: 419: Oryza sativa ACD Protein Sequence (DNA sequence).
SEQ ID NO: 420: Oryza sativa ACD Protein Sequence (protein sequence). SEQ ID NO: 421: SunTag-Oryza sativa ACD Plasmid Sequence. SEQ ID NO: 422: Solanum tuberosum ACD Protein (DNA sequence). SEQ ID NO: 423: Solanum tuberosum ACD Protein (protein sequence). SEQ ID NO: 424: SunTag-Solanum tuberosum ACD Plasmid Sequence. SEQ ID NO: 425: Solanum lycopersicum ACD Protein Sequence (DNA sequence). SEQ ID NO: 426: Solanum lycopersicum ACD Protein Sequence (protein sequence). SEQ ID NO: 427: SunTag-Solanum lycopersicum ACD Plasmid Sequence. SEQ ID NO: 428: Deinococcus radiodurans ACD Protein Sequence (DNA sequence). SEQ ID NO: 429: Deinococcus radiodurans ACD Protein Sequence (protein sequence). SEQ ID NO: 430: SunTag-Deinococcus radiodurans ACD Plasmid Sequence. SEQ ID NO: 431: Zea mays ACD Protein Sequence (DNA sequence). SEQ ID NO: 432: Zea mays ACD Protein Sequence (protein sequence). SEQ ID NO: 433: SunTag-Zea mays ACD Plasmid Sequence.
Example 15: Use of MBD6-human small heat shock protein chimeric proteins to silence FWA
Summary
[0527] This Example describes experiments that demonstrated that three mammalian ACD proteins could functionally substitute for the silencing function of ACD15/21. This Example is related to Example 8 herein. In this Example, experiments were conducted in which the unmethylated epiallele fwa was silenced in the mbd5 mbd6 mutant background using chimeric proteins comprising MBD6 in which the StkyC domain was replaced with human small heat shock protein coding sequences of HSPB1, HSPB3, HSPB5, and HSPB8. These constructs were described supra in Example 8, which demonstrated that MBD6HSPB1, MBD6HSPB3, and MBD6HSPB5 were able to accumulate at chromocenters in plant cells, while MBD6HSPB8 was not. HSPB1, HSPB3, HSPB5, and HSPB8 are all ACD-containing proteins, but only HSPB1, HSPB3, and HSPB5 are able to multimerize to a high degree.
[0528] In this example, we further tested the transgenic plants described in Example 8 herein to determine if the different human sHSPs could functionally complement the gene silencing function of the StkyC domain of MBD6. These experiments were designed and conducted to test a hypothesis based on the data in Example 8: since some human sHSPs were able to functionally mimic the accumulation and functions of the endogenous ACD proteins (ACD15 and ACD21) of the MBD5/6 complex, those MBD6-sHSP chimeric proteins might also be able to silence MBD6 targets, such as the FWA gene. To demonstrate this gene silencing function, we utilized the MBD6HSPB1, MBD6HSPB3, MBD6HSPB5, and MBD6HSPB8 constructs with C-terminal RFP tags in mbd5 mbd6 mutant plants, as described in Example 8, and measured the expression level of FWA compared to control plants. We found that, indeed, there was a significant decrease in FWA expression in MBD6HSPB1, MBD6HSPB3, and MBD6HSPB5, but not in MBD6HSPB8, which correlated well with the ability of these different MBD6 chimeric proteins to oligomerize (FIG. 57).
Materials and Methods
RT-qPCR to determine FWA expression
[0529] To determine the expression of FWA, we performed RT-qPCR on cDNA prepared from RNA of flower bud tissue. Briefly, RNA was extracted from flower bud tissue using the Zymo RNA extraction kit (Cat# R2052) and cDNA was synthesized using Superscript IV enzyme mix from Invitrogen (Cat# 18090010). RT-qPCR was then performed using a Bio-Rad qPCR machine.
Data Analysis
[0530] FWA expression was determined by normalizing FWA expression across multiple plant lines and technical replicates relative to the expression of a control gene, IPP2, and was compared to FWA expression in the mbd5 mbd6 mutant plants, which was centered to a value of 1.
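The normalization just described (target gene relative to a control gene, then scaled so that the mbd5 mbd6 reference equals 1) can be sketched in Python as follows. This is an illustration under stated assumptions, not the calculation actually performed: the text does not specify the formula, so the common 2^-ΔCt form with approximately 100% primer efficiency is assumed, and the function names and Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of the target (e.g. FWA) relative to the control gene (e.g. IPP2).

    Assumes the 2^-dCt model, i.e. roughly 100% amplification efficiency
    for both primer pairs (an assumption, not stated in the text).
    """
    return 2.0 ** -(ct_target - ct_reference)


def normalize_to_control(sample_values, control_values):
    """Scale values so that the mean of the control group equals 1."""
    control_mean = mean(control_values)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return [value / control_mean for value in sample_values]


# Hypothetical Ct values (not data from this Example).
control = [relative_expression(24.0, 20.0), relative_expression(24.0, 20.0)]
sample = [relative_expression(26.0, 20.0)]

normalized_sample = normalize_to_control(sample, control)
normalized_control = normalize_to_control(control, control)
```

With these made-up inputs the sample has a Ct difference two cycles larger than the control, so its normalized expression is one quarter of the control value.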
The normalized data were then plotted using GraphPad Prism, and the results were statistically compared using a one-way ANOVA with corrections for multiple comparisons.
Results
[0531] Consistent with the chromocenter accumulation shown in Example 8, MBD6HSPB1, MBD6HSPB3, and MBD6HSPB5 were able to significantly decrease FWA expression relative to mbd5 mbd6 controls, but MBD6HSPB8 was not able to silence FWA (FIG. 57).
[0532] These results demonstrated that human sHSPs could both functionally mimic the accumulation of the MBD5/6 complex, which is normally driven by ACD15 and ACD21, and functionally silence a gene that is normally silenced by the MBD5/6 complex. Therefore, human ACD proteins are functional for protein accumulation in plants. In this way, the HSPB1, HSPB3, and HSPB5 ACD proteins were able to substitute for the function of the Arabidopsis ACD15/ACD21 proteins.
Example 16: Leveraging ACD accumulation technology, through the StkyC domain of MBD6, to increase genome editing efficiency in stable transgenic Arabidopsis thaliana plants
Summary
[0533] This example describes experiments wherein ACD15 and ACD21 accumulation technology was used to increase the editing efficiency of the Cas9 nuclease. This was achieved using the StkyC domain of MBD6 (StkyC) to cause the accumulation of Cas9 at the target locus and therefore increase editing efficiency. This example is related to Example 13b.
[0534] Here, we demonstrate the use of the StkyC domain for increased Cas9 genome editing in stable transgenic wild-type (Col-0) Arabidopsis thaliana plants. We used the Cas9, Cas9-Suntag-StkyC-1xGCN4 (FIGS. 43A-43B) and Cas9-Suntag-StkyC-4xGCN4 (FIGS. 44C-44D) constructs to target the promoter region of FWA with guide RNA 4 (gRNA 4), as previously described in Example 13b.
Materials and Methods
Creation of Transgenic Plants
[0535] Transgenic plants were created using the standard floral dip protocol. Following floral dip, seeds were harvested and screened for transgenic plants using hygromycin selection. Transgenic plants were then transferred from hygromycin plates to soil and allowed to grow at room temperature.
Sample Preparation
[0536] Roughly three weeks after seedlings were transplanted from hygromycin selection plates to soil, tissue samples were collected from the leaves of each plant. Genomic DNA was then extracted from the leaf tissue and next-generation amplicon sequencing was performed to quantify editing efficiencies.
Data Analysis
[0537] Based on the results described in the above Examples, we hypothesized that the StkyC domain would generally lead to an increase in genomic editing in plants. Without wishing to be bound by theory, we believe this occurs because the StkyC domain of MBD6 accumulated the dead Cas9 nuclease targeting system, the SunTag system, in root cells as demonstrated in Example 1 above.
Results
[0538] Analysis of the amplicon sequencing revealed a significant increase in editing for both Cas9-Suntag-StkyC-1xGCN4 and Cas9-Suntag-StkyC-4xGCN4 constructs, relative to Cas9 (FIG. 58). We observed a 3.3-fold and 3.7-fold increase in editing for the Cas9-Suntag-StkyC-4xGCN4 and Cas9-Suntag-StkyC-1xGCN4 constructs, respectively (FIG. 58). These results indicate that the StkyC domain can be used to increase Cas9 editing efficiency. We expect that this technology will be transferable to other genome editing nucleases, such as Cas12 or TnpB, as well as to other plant species.
Example 17: Using alpha crystalline domain (ACD) proteins from various organisms to increase genome editing efficiency through ACD accumulation technology.
Summary
[0539] This example describes experiments wherein novel ACDs were used to increase the editing efficiency of the Cas9 nuclease. This example is related to Examples 13 and 14 above.
[0540] In Example 14, we demonstrated that novel ACDs from various organisms could behave similarly to the StkyC domain of MBD6, forming dCas9 SunTag GFP foci at the promoter of FWA. Based on these results, we predicted that accumulating an enzyme, such as a nuclease, through ACD accumulation technology would increase its enzymatic activity. Therefore, we moved from a dead Cas9 to an active Cas9 SunTag system and tested the impact of ACD accumulation technology on the editing efficiency of Cas9.
[0541] Here, we tested five of these previously tested ACDs for improved Cas9 editing capabilities: Chlamydomonas reinhardtii ACD protein (SEQ ID NO: 413), Saccharolobus solfataricus ACD protein (SEQ ID NO: 416), human HSPB4 ACD protein (SEQ ID NO: 405), human HSPB5 ACD protein (SEQ ID NO: 403), and human HSPB8 ACD protein (SEQ ID NO: 310). The StkyC sequence from the Cas9-Suntag-StkyC-4xGCN4 vector (Example 13b) was replaced with the full-length sequence of the specific ACD protein to create the Cas9-Suntag-ACD-4xGCN4 plasmid vectors.
Materials and Methods
Protoplast Transfection and Sequencing
[0542] Each vector was transfected into Arabidopsis thaliana wild-type (Col-0) mesophyll protoplast cells. Protoplast cells were incubated at 26°C for 48 hours. At 48 hours post-transfection, protoplasts were harvested for genomic DNA extraction and targeted mutagenesis analysis using next-generation amplicon sequencing.
Data Analysis
[0543] Based on the results described in the above Examples, we hypothesized that the ACD proteins from other organisms would generally increase the editing efficiency of Cas9 constructs in protoplasts. Sequencing results were analyzed for the number of insertions and deletions created at the target site.
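The quantities involved in this analysis, a percentage of edited reads per sample and a fold change relative to the Cas9-only control, can be illustrated with a short Python sketch. It assumes that read counts (reads carrying an insertion or deletion at the target site, and total aligned reads) have already been obtained from an amplicon sequencing pipeline. The function names and the read counts are hypothetical and are not taken from the experiments described here.

```python
def indel_percentage(indel_reads: int, total_reads: int) -> float:
    """Percentage of aligned amplicon reads carrying an indel at the target site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return 100.0 * indel_reads / total_reads


def fold_change(treatment_pct: float, control_pct: float) -> float:
    """Editing efficiency of a fusion construct relative to the control construct."""
    if control_pct <= 0:
        raise ValueError("control editing must be positive to compute a fold change")
    return treatment_pct / control_pct


# Hypothetical read counts (not data from this Example).
cas9_only_pct = indel_percentage(indel_reads=200, total_reads=10_000)
acd_fusion_pct = indel_percentage(indel_reads=500, total_reads=10_000)
fold = fold_change(acd_fusion_pct, cas9_only_pct)
```

Expressing editing as a percentage of reads makes samples with different sequencing depths comparable, which is what allows a fold change between constructs to be computed.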
This was plotted as a percentage to allow for statistical comparison among experiments.
Results
[0544] Two separate protoplast experiments were performed. In the first experiment, we observed increased editing for all ACD proteins tested, except for Chlamydomonas reinhardtii, relative to Cas9 (FIG. 59A). In experiment 2, we observed a similar trend in ACD-mediated editing efficiencies as in experiment 1, where the same four ACD proteins showed increased editing relative to Cas9 (FIG. 59B). In both experiments, HSPB8 and Saccharolobus solfataricus ACD proteins showed the greatest increase in editing efficiency, ranging from 1.5-fold to 2.4-fold improvements.
[0545] These data indicate that ACD accumulation technology can be used as a novel strategy to increase the genome editing efficiency of nucleases. We anticipate these results will translate to additional ACD proteins, including others not listed herein, and will be applicable to a variety of other genome editing nucleases, such as Cas12 and TnpB, across a wide range of plants, animals, and other eukaryotic organisms.
Example 18: Saccharolobus solfataricus ACD protein SunTag (SunTagSacc) system for locus-specific accumulation in human cells.
Summary
[0546] This example describes experiments wherein ACD accumulation technology was used to accumulate the SunTag system in human embryonic kidney cells (HEK293T). This was achieved using the full-length coding sequence of the ACD-containing small heat shock protein (sHSP) of the thermophilic archaeon Saccharolobus solfataricus to concentrate the transcriptional activator VP64, which is attached to GFP, after binding the dCas9 SunTag targeting system.
[0547] To demonstrate the efficacy of this technology, we created the SunTagSacc-VP64 system, which was targeted to the promoter of the NLRC5 gene using a single guide RNA (gRNA).
The SunTag system was otherwise the same as used previously to characterize the SunTag-StkyC accumulation system (Example 1).
Materials and Methods
Cloning of Fusion Proteins and gRNA-NLRC5
[0548] Exemplary structures of the constructs used in the SunTag-VP64 control and the SunTagSacc-VP64 system are presented in FIGS. 60A-60B. In these Figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these figures were also prepared, as described below in Table 18A.
Table 18A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000158_0001
Exemplary Construct Design
[0549] The construct that was created using the Saccharolobus solfataricus ACD protein was the SunTagSacc-VP64 (FIG. 60B). To add the Saccharolobus solfataricus ACD coding sequence to the SunTag system to create the SunTagSacc-VP64, the SunTag plasmid was cut by restriction enzyme digest to make a linear plasmid. A gene block containing the Saccharolobus solfataricus ACD coding sequence with homology to the cut site was ordered and ligated into the cut site. To add the NLRC5 gRNA, the SunTag plasmid was cut using a restriction enzyme and the gRNA sequence was ligated into the cut restriction site. To allow for insertion of the construct into the HEK293T cell genomes, a PiggyBac Transposase (FIG. 60C) was co-transfected with the SunTag constructs. Features of these constructs include a CMV Enhancer with Chicken β Promoter and Chimeric Intron (SEQ ID NO: 434), dCas9 with GCN4 (SEQ ID NO: 435), U6 promoter+NLRC5 gRNA+scaffold RNA (SEQ ID NO: 438), P2A-scFV-sfGFP-VP64-Saccharolobus solfataricus ACD (SEQ ID NO: 437), P2A-scFV-sfGFP-VP64 (SEQ ID NO: 436), CMV Enhancer with CMV Promoter (SEQ ID NO: 439), and Synthetic PiggyBac Transposase with SV40 PolyA (SEQ ID NO: 440).
Transfection of human cell line HEK293T
[0550] Standard transfection of HEK293T cells was performed using lipofectamine with P3000 to package the SunTag plasmids into lipid nanoparticles. HEK293T cells were incubated with the plasmid-lipofectamine mix. Plasmids were integrated into cells through the transposase expressed along with the PiggyBac Transposase system (FIG. 60C).
[0551] As a negative control, a SunTag plasmid with only VP64, which also targets the NLRC5 promoter, was used. To locate nuclei, the expected localization site for the SunTag system, cells were also stained using 4',6-diamidino-2-phenylindole (DAPI).
Microscopy Experiment
[0552] HEK293T cells were collected and imaged using confocal microscopy, wherein both GFP and DAPI were measured across multiple cells.
Data Analysis
[0553] Based on the results described in the above Examples, we hypothesized that Saccharolobus solfataricus ACD protein would oligomerize in nuclei of human cells. We assume that this oligomerization led to the accumulation of the SunTag construct, through oligomerization of bound Saccharolobus solfataricus ACD protein, at the promoter of the NLRC5 gene, which then led to the creation of GFP foci in nuclei. The negative control, SunTag-VP64, was not expected to oligomerize due to the lack of ACD protein, and was therefore expected not to form foci and to show diffuse nucleoplasmic staining.
[0554] To determine if foci accumulate, cells were imaged with confocal microscopy. This technology demonstrates the functional use of ACD accumulation technology to specifically accumulate proteins in human cells. This broadens the functionality of the ACD accumulation technology and shows that it works in cells other than plant cells. We expect that ACDs could be used to accumulate proteins in any eukaryotic cell or prokaryotic cell.
Results
[0555] Microscopic analysis of HEK293T cells revealed diffuse GFP signal for the SunTag-VP64 negative control, which was localized primarily to the nuclei of cells but also included cytoplasmic GFP signal (FIG. 61A). As expected, HEK293T cells expressing the SunTagSacc-VP64 construct demonstrated clear GFP foci in nuclei (FIG. 61B).
[0556] Foci formation only in HEK293T cells containing the ACD-VP64 construct demonstrates the ability of ACD accumulation technology to function not only in plants, as demonstrated in Example 14, but also in mammals (human cells). It is further predicted that this ACD accumulation technology will work in the majority of organisms.
Additional Sequences
[0557] SEQ ID NO: 441: Synthetic PiggyBac Transposase with SV40 PolyA (DNA). SEQ ID NO: 442: SunTag-VP64 Plasmid Sequence. SEQ ID NO: 443: SunTagSacc-VP64 Plasmid Sequence. SEQ ID NO: 444: PiggyBac Transposase Plasmid Sequence.
Example 19: Chlamydomonas reinhardtii ACD protein SunTag (SunTagChlamy) system for locus-specific labeling through microscopy
Summary
[0558] This example describes experiments wherein Chlamydomonas reinhardtii ACD protein accumulation technology was used to accumulate the SunTag system with the goal of locating a specific locus within nuclei by microscopy. This was achieved using the full-length coding sequence of the ACD-containing small heat shock protein (sHSP) of Chlamydomonas reinhardtii to concentrate GFP after binding the dCas9 SunTag targeting system at specific loci. Many previous Examples herein showed that SunTag-ACD systems produced punctate nuclear GFP signal at the FWA locus, generally two per nucleus, corresponding to the two copies of the FWA gene on the two homologous chromosomes, indicating that the GFP bodies marked the location of the FWA locus within nuclei. This important information can be used in many research projects and is difficult to obtain by other known methods. Theoretically, by simply changing the gRNA, one can visualize any locus and determine its nuclear location. As an additional experiment, we targeted the SunTagChlamy system to a different chromosomal coordinate to confirm that we observed two nuclear bodies per nucleus.
[0559] To demonstrate the efficacy of this technology, we created the SunTagChlamy system, which was targeted to a small-interfering RNA producing (siren) locus using a single guide RNA. The SunTag system was otherwise the same as was described previously to characterize the SunTag-MBD6 StkyC accumulation system (see Example 1).
Materials and Methods
Cloning of Fusion Proteins and gRNA-siren
[0560] Exemplary structures of the construct used in the SunTagChlamy system are presented in FIG. 62A. In this figure, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in this figure were prepared as described below in Table 19A.
Table 19A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000161_0002
Exemplary Construct Design [0561] The construct that was created using the Chlamydomonas reinhardtii ACD protein was the SunTagChlamy (FIG.62A). All constructs were created using golden gate to assemble
Figure imgf000161_0001
(https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/). To add the Chlamydomonas reinhardtii ACD coding sequence to the SunTag system to create the SunTagChlamy, a single restriction site in the SunTag plasmid was cut by restriction enzyme digest to make a linear plasmid. A gene block containing the Chlamydomonas reinhardtii ACD coding sequence with homology to the cut site was ordered and ligated into the cut site. To add the siren gRNA, the SunTag plasmid was cut by restriction enzyme digest, and a gene block with homology to the cut site was ordered and ligated into the plasmid. Features of these constructs include a UBQ10 promoter (SEQ ID NO: 445), Cas9 (SEQ ID NO: 448), gRNA-siren sequence (SEQ ID NO: 447), Chlamydomonas reinhardtii ACD gene (Example 14), and scFv-HA-GFP (SEQ ID NO: 492).
Transformation of plants
[0562] Agrobacterium AGL0 cells were transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild-type (Col-0) plants were transformed using floral dip methods.
Microscopy Experiment
[0563] The root meristem regions of hygromycin-resistant seedlings were analyzed using an LSM980 confocal microscope. The GFP reporter allowed for observations of cellular localization and nuclear phenotypes. ACD proteins are known to oligomerize, and therefore GFP foci were expected to form in nuclei of cells.
Data Analysis
[0564] Multiple seedlings expressing SunTagChlamy were imaged using confocal microscopy to determine localization of GFP. If GFP signal was concentrated in nuclei of the cells, then the construct was assumed to have properly localized without any misfolding. Z-stacks of root meristems were obtained across multiple plant lines to acquire images across many cells for each SunTagChlamy line. ImageJ software was used to analyze the images and quantify the foci across multiple cells.
[0565] Based on the data provided in the Examples above, we hypothesized that Chlamydomonas reinhardtii ACD protein would oligomerize in nuclei of plants, which we assume leads to the accumulation of the SunTag constructs, through oligomerization of bound Chlamydomonas reinhardtii ACD protein, at the site of interest (siren locus). This was expected to lead to the creation of ~2 GFP foci in plant nuclei.
[0566] This technology demonstrates the functional use of Chlamydomonas reinhardtii ACD protein to specifically accumulate dCas9 at a specific locus and the use of targeted accumulation of Chlamydomonas reinhardtii ACD protein as a mechanism for locating genes through microscopy.
Results
[0567] To determine the impact of Chlamydomonas reinhardtii ACD protein on the localization of the SunTag targeting system, seedling root meristems were imaged across multiple plant lines. Root meristem tissue contains a high density of nuclei, providing a large amount of nuclear data in a single region.
[0568] Microscopic analysis of seedlings demonstrated clear foci formation for the SunTagChlamy construct (FIG. 62B). Importantly, the SunTagChlamy formed mainly 2 foci, as expected. This result demonstrates that the SunTagChlamy system can target the siren locus to visualize and track this specific locus through microscopy.
Additional Sequences:
[0569] SEQ ID NO: 445: UBQ10 promoter. SEQ ID NO: 446: Chlamydomonas reinhardtii ACD protein. SEQ ID NO: 447: gRNA-siren sequence, DNA (5'-3'). SEQ ID NO: 448: Cas9 DNA. SEQ ID NO: 449: SunTagChlamy Plasmid Sequence (siren loci), DNA.
Example 20: Co-targeting vCasΦ with dead Cas9 through ACD accumulation technology for improved genome editing efficiency of CasΦ nuclease
Summary
[0570] This Example describes experiments designed with the goal of increasing vCasΦ genome editing in plants.
Using plants expressing the previously described SunTagStkyC ACD-mediated accumulation technology (Example 1), we propose to concentrate vCasΦ at its target site at the promoter of FWA through ACD-mediated multimerization. The SunTagStkyC system has already been shown to localize to the previously characterized guide RNA 17 (gRNA17) and guide RNA 4 (gRNA4) sites at the FWA promoter (FIG. 63A), forming nuclear bodies that indicate a high concentration of the components at the genomic site. vCasΦ, on the other hand, will be localized to a guide RNA site downstream of those sites, called guide RNA 1 (gRNA1).
[0571] Since the SunTagStkyC accumulation technology accumulates dCas9 at FWA, it will create an optimal scaffold to localize other proteins that similarly interact with the ACDs. vCasΦ will be expressed with a C-terminal fusion of the MBD6 StkyC domain (StkyC) joined by an XTEN linker (vCasΦ-XTEN-StkyC). As described in Example 1, the StkyC domain mediates the accumulation of proteins through ACD15 and ACD21 interactions.
[0572] We therefore anticipate that SunTagStkyC will also incorporate vCasΦ-XTEN-StkyC into the accumulation at the FWA promoter. This accumulation would concentrate vCasΦ-XTEN-StkyC near its target site, providing more opportunities to edit the nearby gRNA1 position. vCasΦ-XTEN-StkyC is therefore predicted to edit more efficiently than vCasΦ alone.
[0573] To test this technology, protoplasts will be created from transgenic plants that already express the SunTagStkyC construct from Example 1. vCasΦ and vCasΦ-XTEN-StkyC will be transfected into wild-type (Col-0) protoplasts and wild-type protoplasts expressing the SunTagStkyC system. The protoplasts will then be collected, and next-generation sequencing (NGS) will be performed to measure editing efficiencies.
In addition, transgenic plants will be created with these same constructs, and we again expect that, in the presence of SunTagStkyC, vCasΦ-XTEN-StkyC will show a higher editing efficiency than vCasΦ.
[0574] We further anticipate that this same methodology should work for other CRISPR nucleases and, in fact, for any protein that one wishes to concentrate at a particular genomic locus using a SunTag scaffold to accumulate proteins through any ACD-mediated accumulation technology.
Materials and Methods
Cloning of Fusion Proteins and gRNA-FWA
[0575] Exemplary structures of the fusion constructs to be used in the vCasΦ-XTEN-StkyC system are presented in FIGS. 63B-C. In these Figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these Figures will also be prepared, as described below in Table 20.
Table 20: Exemplary Parameters for Fusion Construct Modules
Figure imgf000164_0002
Exemplary Construct Design
[0576] The construct created using the StkyC domain was vCasΦ-XTEN-StkyC (FIG. 63C). All constructs were created using Golden Gate cloning to assemble each expression cassette into a binary backbone using Cermak et al.
Figure imgf000164_0001
(https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/). To construct vCasΦ-XTEN-StkyC, the vCasΦ construct (FIG. 63B) containing gRNA1 was cut with a restriction enzyme downstream of vCasΦ. A gene block containing the XTEN linker and StkyC, with homology to the cut site, was ordered and ligated at the cut site. This adds XTEN-StkyC directly after vCasΦ. Features of these constructs include a UBQ10 promoter (SEQ ID NO: 450), vCasΦ (SEQ ID NO: 452), XTEN-StkyC (SEQ ID NO: 453), gRNA1 sequence (SEQ ID NO: 451), RbcS-E9t (SEQ ID NO: 455), and U6 promoter (SEQ ID NO: 454).
Transformation of protoplasts
[0577] Protoplasts were created following current protocols. Plasmids were directly transformed into protoplasts made from wild-type (Col-0) plants.
Sequencing
[0578] To determine whether the promoter of FWA was successfully edited, we will perform amplicon sequencing on protoplasts transfected with the vCasΦ constructs.
Data Analysis and expected results
[0579] We hypothesize that ACD15/ACD21 will oligomerize in the nuclei of plant cells. This will lead to the accumulation of the vCasΦ-XTEN-StkyC constructs at the promoter of FWA, where the SunTagStkyC scaffold will be accumulated. This will increase the opportunity for an editing event to occur, and vCasΦ-XTEN-StkyC is therefore expected to show increased editing compared to the vCasΦ control.
[0580] This technology will demonstrate the functional use of ACD accumulation technology to increase editing by vCasΦ through a unique, ACD-mediated mechanism. This technology will further demonstrate the use of targeted accumulation of a genome editing enzyme as a mechanism for increasing editing efficiency, without needing extensive optimization of the nuclease enzyme itself.
Additional Sequences:
[0581] SEQ ID NO: 456: vCasΦ Plasmid Sequence, DNA. SEQ ID NO: 457: vCasΦ-XTEN-StkyC, DNA.
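The comparison of editing efficiencies from amplicon sequencing described in this Example could be tabulated along the following lines. This is a minimal sketch under stated assumptions: reads are assumed to have been aligned to the target amplicon already, with one CIGAR string available per read, and any read containing an insertion or deletion is counted as edited. The function names and the example CIGAR strings are hypothetical, and a real analysis would typically restrict the count to indels overlapping a window around the expected cut site.

```python
import re

# One or more <length><operation> pairs, using the standard CIGAR operation letters.
VALID_CIGAR = re.compile(r"(?:\d+[MIDNSHP=X])+")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """Return True if the alignment contains an insertion (I) or deletion (D)."""
    if not VALID_CIGAR.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return any(op in "ID" for _, op in CIGAR_OP.findall(cigar))


def indel_frequency(cigars):
    """Fraction of aligned reads that carry at least one indel."""
    if not cigars:
        raise ValueError("at least one read is required")
    edited = sum(1 for cigar in cigars if has_indel(cigar))
    return edited / len(cigars)


# Hypothetical reads for two samples (illustrative values, not measured data).
nuclease_alone = ["150M"] * 9 + ["70M2D78M"]
nuclease_fusion = ["150M"] * 6 + ["70M3D77M", "72M1I77M", "68M5D77M", "70M2D78M"]

freq_alone = indel_frequency(nuclease_alone)
freq_fusion = indel_frequency(nuclease_fusion)
print(f"nuclease alone:  {freq_alone:.2f}")
print(f"nuclease fusion: {freq_fusion:.2f}")
```

Reporting the indel fraction per sample keeps the nuclease-only and StkyC-fusion conditions directly comparable, provided the read depth is similar between them.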
Example 21: Truncated guide RNA (gRNA) co-targeting through ACD accumulation technology for improved genome editing efficiency.
Summary
[0582] The CRISPR-Cas9 genome editing system relies on a guide RNA (gRNA) to direct the Cas9 protein to the genomic DNA target site. The standard gRNA length for optimal CRISPR-Cas9 genome editing is 20 base pairs (bp). When the guide RNA is shorter than 20 bp (for example 14-15 bp, truncated at the 3’ end), DNA cleavage is hindered while Cas9 is still able to bind genomic DNA (Pan et al. 2022; Kiani et al. 2015).
[0583] This example describes experimental guidelines to improve editing efficiency using ACD accumulation technology (ACD proteins) in combination with editing (20 bp) and non-editing (14 bp truncated) Cas9 gRNAs. The following experiments will be performed in wild-type (Col-0) Arabidopsis thaliana protoplast cells and stable transgenic plants. We hypothesize that simultaneously expressing a 20 bp gRNA (for editing) and a 14 bp gRNA (for binding) targeting nearby genomic locations would be an effective strategy to localize more Cas9 to the site being targeted for editing. In this example, the Cas9 fusion proteins would accumulate at the site corresponding to the truncated gRNA4 because of ACD-mediated multimerization. This would increase the concentration of Cas9 fusion proteins in the local vicinity of the gRNA17 site, stimulating higher genome editing activity at the gRNA17 site.
[0584] To test this, we propose a system where the FWA guide 4 site is targeted with a 14 bp truncated guide, while the FWA guide 17 site is simultaneously targeted with a 20 bp gRNA (FIG. 64A). We anticipate that Cas9-SunTag-ACD-4xGCN4 at the FWA guide 4 site will bind the genomic DNA and act as an anchor to attract more Cas9-SunTag-ACD-4xGCN4 to the nearby FWA guide 17 target site via ACD-mediated accumulation.
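The distinction drawn above between full-length editing guides and truncated binding-only guides can be expressed as a simple length check. The sketch below is illustrative only: the function name and the length thresholds are assumptions based on the 20 bp and 14-15 bp figures stated in the Summary, and the two spacer sequences are the FWA guide sequences listed in the construct design for this Example.

```python
def classify_spacer(spacer, full_length=20, max_binding_only=15):
    """Classify a Cas9 spacer by length.

    The thresholds are assumptions taken from the lengths discussed in the
    text (20 for editing guides, 14-15 for truncated binding-only guides).
    """
    spacer = spacer.upper()
    if not spacer or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be a non-empty DNA sequence of A, C, G, T")
    if len(spacer) >= full_length:
        return "editing"
    if len(spacer) <= max_binding_only:
        return "binding-only"
    return "intermediate"


# Guide sequences as listed in the construct design for this Example.
guides = {
    "FWA Guide 17": "AAAACTAGGCCATCCATGGA",
    "FWA Guide 4 Truncated": "GACGGAAAGATGTAT",
}

for name, spacer in guides.items():
    print(name, len(spacer), classify_spacer(spacer))
```

A check like this is mainly useful as a guard when assembling guide lists, so that a guide intended only as an anchor is not accidentally supplied at full length.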
[0585] In this example we propose the use of the Cas9-SunTag-HSPB5-4xGCN4-Truncated construct (FIG. 64B); however, various other configurations may be explored, such as Cas9 fusions to other ACD proteins (see Example 14) and SunTag chains of various sizes with varying numbers of GCN4 repeats. Direct fusions of ACDs to Cas nucleases are likely to give similar results. We propose the use of HSPB5 (see Example 14 for HSPB5 information) because it has been demonstrated to improve editing by Cas9-SunTag-4xGCN4 (see Example 17), and because it has been shown to strongly multimerize in the SunTag-ACD system, forming nuclear bodies corresponding to the dCas9 binding sites (see FIGS. 48A-48B). However, we expect this approach to be applicable to fusions with other multimerizing ACD proteins and other genome editing nucleases such as Cas12s and TnpBs, as well as in many other organisms such as other plants, animals, fungi, or prokaryotes.
Materials and Methods
Cloning of Fusion Proteins and gRNA
[0586] An exemplary structure of the construct to be used in the Cas9-SunTag-HSPB5-4xGCN4-Truncated system is presented in FIG. 64B. In this Figure, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in this Figure will also be prepared, as described below in Table 21.
Table 21: Exemplary Parameters for Fusion Construct Modules
Figure imgf000167_0001
Exemplary Construct Design
[0587] The construct to be created using HSPB5 is the Cas9-SunTag-HSPB5-4xGCN4-Truncated construct (FIG. 64B). All constructs were created using Golden Gate cloning to assemble each expression cassette into a binary backbone using Cermak et al. (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/). The previously used Cas9-SunTag-HSPB5-4xGCN4 construct (Example 13) was used to clone Cas9-SunTag-HSPB5-4xGCN4-Truncated. Features of these constructs include a UBQ10 promoter (SEQ ID NO: 458), Cas9-SunTag-4xGCN4 (SEQ ID NO: 460), 35S promoter (SEQ ID NO: 459), HSP Terminator (SEQ ID NO: 461), AtU6 promoter (SEQ ID NO: 462), FWA Guide 17 (SEQ ID NO: 463: AAAACTAGGCCATCCATGGA), FWA Guide 4 Truncated (SEQ ID NO: 464: GACGGAAAGATGTAT), scFV-sfGFP-HSPB5 (SEQ ID NO: 465), and Rbcs E9 terminator (SEQ ID NO: 466).
Transformation of plants
[0588] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis wild type (Col-0) plants will be transformed using floral dip methods.
Plant selection
[0589] Plants will be selected on appropriate selection plates. Selection-positive plants will then be transferred to soil to grow under normal greenhouse conditions.
Protoplast preparation
[0590] Protoplasts will also be created from wild-type (Col-0) plants while transgenic plants are growing. These protoplasts will be prepared following standard protocols.
Next Generation Sequencing
[0591] Plant tissue will then be collected, and next-generation sequencing will be performed to determine the rates of insertion and deletion at the guide site. As a negative control, we will also include Cas9-SunTag-4xGCN4 in the experiment (see Example 13) along with Cas9-4xGCN4.
Results
[0592] We expect that Cas9-SunTag-HSPB5-4xGCN4-Truncated will lead to an increase in the amount of editing in both protoplasts and stable plant lines relative to controls.
We will use two control plasmids for these experiments to ensure that the increase in editing is due to the HSPB5 ACD protein. The first control will be a version of the Cas9-SunTag-HSPB5-4xGCN4-Truncated plasmid without HSPB5 attached to the scFV-GFP. The second control will be the Cas9-SunTag-HSPB5-4xGCN4-Truncated plasmid without the truncated FWAg4 gRNA, carrying only the FWAg17 gRNA used for editing.
Sequences and References
[0593] SEQ ID NO: 467: Cas9-SunTag-HSPB5-4xGCN4-Truncated Plasmid Sequence. SEQ ID NO: 468: Cas9-SunTag-HSPB5 Without 4xGCN4-Truncated Plasmid Sequence.
Kiani, Samira, Alejandro Chavez, Marcelle Tuttle, Richard N. Hall, Raj Chari, Dmitry Ter-Ovanesyan, Jason Qian, et al. 2015. “Cas9 gRNA Engineering for Genome Editing, Activation and Repression.” Nature Methods 12 (11): 1051–54.
Pan, Changtian, Gen Li, Aimee A. Malzahn, Yanhao Cheng, Benjamin Leyson, Simon Sretenovic, Filiz Gurel, Gary D. Coleman, and Yiping Qi. 2022. “Boosting Plant Genome Editing with a Versatile CRISPR-Combo System.” Nature Plants 8 (5): 513–25.
Zhang, Xiuren, Rossana Henriques, Shih-Shun Lin, Qi-Wen Niu, and Nam-Hai Chua. 2006. “Agrobacterium-Mediated Transformation of Arabidopsis Thaliana Using the Floral Dip Method.” Nature Protocols 1 (2): 641–46.
Example 22: Targeting DNA methylation by TRBIP1-MQ1 DNA methylation system using ACD mediated multimerization technology.
Summary
[0594] This Example describes experimental guidelines to improve the genomic targeting of DNA methylation in plants through efficient localization of DNA methylation enzymes using ACD-mediated multimerization technology. This example expands on the results found in Example 2 herein, wherein ACD-mediated multimerization technology, mediated by the StkyC domain of MBD6, was able to induce very specific targeting of DNA methylation.
This example details plans to utilize different ACD proteins from different organisms to similarly induce very specific genomic targeting of DNA methylation.
[0595] This example combines the SunTag-TRBIP1-MQ1 DNA methylation targeting technology from Example 2 with the ACD proteins described in Example 14 to target DNA methylation specifically and reduce off-target methylation (SunTag-ACD-TRBIP1-MQ1). Following the experimental protocol in Example 2, SunTag-ACD-TRBIP1-MQ1 will be targeted to the unmethylated promoter of FWA in the fwa epiallele in the rdr-6 background.
[0596] Due to the demonstrated ability of ACD proteins tested in the SunTag system to induce the formation of concentrated nuclear bodies in plant cells (Example 14), we anticipate that the SunTag-ACD-TRBIP1-MQ1 system will similarly concentrate at the target site and induce very specific methylation. It is therefore anticipated that the SunTag-ACD-TRBIP1-MQ1 systems will accumulate at the site of interest, the FWA promoter, and target DNA methylation very specifically and efficiently.
Materials and Methods
Cloning of Fusion Proteins and gRNA
[0597] Exemplary structures of the constructs to be used in the SunTag-ACD-TRBIP1-MQ1 system are presented in FIGS. 65A-65E. In these Figures, different regions of the construct are numerically labeled, with each region representing a respective module of the construct. Fusion constructs containing different variants of the modules presented in these Figures will also be prepared, as described below in Table 22A.
Table 22A: Exemplary Parameters for Fusion Construct Modules
Figure imgf000170_0001
Exemplary Construct Design
[0598] Constructs will be created for SunTag-ACD-TRBIP1-MQ1 systems for each of five different ACD proteins (FIGS. 65A-65E). All constructs will be created using Golden Gate cloning to assemble each expression cassette into a binary backbone using Cermak et al. (https://pubmed[dot]ncbi[dot]nlm[dot]nih[dot]gov/28522548/). The previously used SunTag-TRBIP1-MQ1 construct (Example 2) will be the template to create the SunTag-ACD-TRBIP1-MQ1 constructs. SunTag-TRBIP1-MQ1 will be cut by restriction enzyme digest to make the plasmid linear. The ACD proteins will then be ordered as gene blocks with homology to the cut site and ligated into the linear SunTag-TRBIP1-MQ1 plasmid. The features of these constructs include a UBQ10 promoter (SEQ ID NO: 469), Cas9-10xGCN4 (SEQ ID NO: 470), scFV-sfGFP (SEQ ID NO: 471), Chlamydomonas reinhardtii ACD protein (SEQ ID NO: 472), Zea mays ACD protein (SEQ ID NO: 473), HSPB1 ACD protein (SEQ ID NO: 474), HSPB4 ACD protein (SEQ ID NO: 475), HSPB8 ACD protein (SEQ ID NO: 476), fwa Guide 4 (SEQ ID NO: 478), fwa Guide 17 (SEQ ID NO: 477), XTEN Linker-TRBIP1-MQ1 (SEQ ID NO: 479), and Nos Terminator (SEQ ID NO: 480). Exemplary plasmid sequences are provided in SEQ ID NO: 481 (SunTag-Chlamy-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 482 (SunTag-Zea Mays-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 483 (SunTag-HSPB1-TRBIP1-MQ1 Plasmid Sequence), SEQ ID NO: 484 (SunTag-HSPB4-TRBIP1-MQ1 Plasmid Sequence), and SEQ ID NO: 485 (SunTag-HSPB8-TRBIP1-MQ1 Plasmid Sequence).
Transformation of plants
[0599] Agrobacterium AGL0 cells will be transformed with the final binary vector containing the fusion proteins and the gRNA. Arabidopsis fwa rdr-6 (Col-0) plants will be transformed using floral dip methods.
Plant selection
[0600] Plants will be selected on appropriate selection plates. Selection-positive plants will then be transferred to soil to grow under normal greenhouse conditions.
Flowering Time Assay
[0601] One way to determine the impact of the ACD proteins on the efficacy of the SunTag-TRBIP1-MQ1 DNA methylation system is to follow the silencing of FWA. FWA is usually methylated, and the plants have a normal flowering time. However, when DNA methylation is absent, as in fwa epiallele mutant plants, the plants have a late flowering phenotype. Addition of SunTag-ACD-TRBIP1-MQ1 to the plants will silence FWA and induce earlier flowering, which is measured by how many leaves are produced prior to flowering. To measure this, true leaves will be counted at the time when bolting begins. Therefore, if the SunTag-ACD-TRBIP1-MQ1 constructs efficiently methylate the promoter of fwa, transformed plants will demonstrate an early flowering phenotype and have fewer leaves. As controls, wild-type (Col-0), fwa rdr-6, and fwa rdr-6 plants transformed with SunTag-TRBIP1-MQ1 with no ACD will be grown along with SunTag-ACD-TRBIP1-MQ1 plants.
Bisulfite Sequencing
[0602] To quantify the amount of DNA methylation added at the promoter of fwa and across the entire genome, whole genome bisulfite sequencing will be performed similarly to the methods used in Example 2 herein.
Results
[0603] We expect that SunTag-ACD-TRBIP1-MQ1 constructs will result in efficient and specific methylation of the promoter of fwa. This will lead to clear on-target methylation in the bisulfite sequencing experiment and will result in a decrease in the number of leaves at bolting. This will be in stark contrast to the fwa rdr-6 plants expressing SunTag-TRBIP1-MQ1, which will also flower earlier but will have extensive off-target effects, showing methylation on all of the chromosomes and in the chloroplast genome as measured by whole genome bisulfite sequencing.
We also predict that SunTag-HSPB8-TRBIP1-MQ1, which contains HSPB8, a protein that does not form foci in plants, will perform similarly to SunTag-TRBIP1-MQ1 without any ACD protein, and will thus serve as another negative control.
[0604] We propose that this experimental technique will be useful for targeting many enzymes in plants, animals, and other organisms. We anticipate that many different ACD proteins can be used in similar targeting strategies to make any genomic targeting more effective and more specific.
Example 23: ADDITIONAL SEQUENCE INFORMATION
[0605] The amino acid sequence of ACD15 (tr|Q9S842|Q9S842_ARATH AT1G76440 protein OS=Arabidopsis thaliana OX=3702 GN=F14G6.4 PE=1 SV=1) is (ACD15.5 – 143 amino acids, At1g76440): SEQ ID NO: 11.
[0606] The amino acid sequence of ACD21 (>sp|Q84K79|IDM2L_ARATH Alpha-crystallin domain-containing protein 22.3 OS=Arabidopsis thaliana OX=3702 GN=ACD22.3 PE=1 SV=1) is (ACD21.4 – 206 amino acids, At1g54850): SEQ ID NO: 13.
[0607] Table XA: ACD15 Close Plant Homologs
Figure imgf000172_0001
Figure imgf000173_0001
[0608] Table XB: ACD15 Orthologs
Figure imgf000173_0002
[0609] An alignment of the ACD15 ACD domain with full length proteins of ACD orthologs from other plant species and a phylogeny of the species included in the alignment are provided in FIGS. 30A-30C.
[0610] Table XC: ACD21 Close Plant Homologs
Figure imgf000174_0001
[0611] An alignment of the ACD21 ACD domain with full length proteins of ACD orthologs from other plant species and a phylogeny of the species included in the alignment are provided in FIGS.30D-30E. [0612] Table XD: ACD21 Orthologs.
Figure imgf000174_0002
Figure imgf000175_0001
[0613] Table XE: ACD15 and ACD21 ACD domains compared to broader plant species
Figure imgf000175_0002
[0614] Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XE are presented in FIG. 31. Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XF are presented in FIG. 32. Alignments of the ACD15 and ACD21 ACD domains compared to the sequences shown in Table XG are presented in FIG. 33.
[0615] Table XF: Protein coding sequences of H. sapiens α-Crystalline Domain containing small heat shock proteins HSPB1-10.
Figure imgf000176_0001
[0616] Table XG: Protein coding sequences of α-Crystalline Domain containing small heat shock proteins from the following species, which represent all kingdoms of life: HSPB1 (Homo sapiens) (a mammal), HSP22 (Drosophila melanogaster) (an insect), HSP26 (Saccharomyces cerevisiae) (a fungus), M1URI8 (Cyanidioschyzon merolae) (a red alga), P12811 (Chlamydomonas reinhardtii) (a green alga), Q9RTR5 (Deinococcus radiodurans) (a bacterium), and D0KNS6 (Saccharolobus solfataricus) (an archaebacterium).
Figure imgf000176_0002
[0617] Additional exemplary protein sequences: NP_195276.3 PHD finger-like protein [Arabidopsis thaliana]: SEQ ID NO: 111; XP_003517132.1 uncharacterized protein LOC100794366 [Glycine max]: SEQ ID NO: 112; XP_004966523.1 uncharacterized protein LOC101783772 [Setaria italica]: SEQ ID NO: 113; XP_021839485.1 uncharacterized protein LOC110779264 [Spinacia oleracea]: SEQ ID NO: 114; KAG0516341.1 hypothetical protein BDA96_10G353700 [Sorghum bicolor]: SEQ ID NO: 115; XP_011650244.1 uncharacterized protein LOC101214022 [Cucumis sativus]: SEQ ID NO: 116; XP_002277317.1 PREDICTED: uncharacterized protein LOC100242269 [Vitis vinifera]: SEQ ID NO: 117; XP_016726278.1 uncharacterized protein LOC107937813 [Gossypium hirsutum]: SEQ ID NO: 118; XP_006467090.1 uncharacterized protein LOC102608953 [Citrus sinensis]: SEQ ID NO: 119; XP_002306433.1 uncharacterized protein LOC7480464 [Populus trichocarpa]: SEQ ID NO: 120.
[0618] Table XH: Amino acid sequences of MBD6 StykC domain homologs
Figure imgf000177_0001
Figure imgf000178_0001
[0619] Table XI: Amino acid sequences of MBD7 StykC domain homologs
Figure imgf000178_0002

Claims

CLAIMS
What is claimed is:
1. A method of modifying a target nucleic acid in a eukaryotic cell, the method comprising: a) providing a eukaryotic cell comprising: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide; and b) maintaining the eukaryotic cell under conditions whereby the genetic modifier polypeptide and the α-crystalline domain polypeptide are targeted to the target nucleic acid, thereby modifying the target nucleic acid.
2. The method of claim 1, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is encoded on a recombinant nucleic acid.
3. The method of claim 1, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide comprises a heterologous targeting domain which facilitates targeting of the polypeptide to the target nucleic acid.
4. The method of claim 3, wherein the heterologous targeting domain is a DNA-binding domain.
5. The method of claim 1, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is targeted to the target nucleic acid via a SunTag-based targeting system involving an RNA-guided DNA-endonuclease polypeptide and a guide RNA.
6. The method of claim 1, wherein the genetic modifier polypeptide comprises a heterologous StykC (STKYC) domain.
7. The method of any one of claims 1-6, wherein at least two different α-crystalline domain polypeptides are targeted to the target nucleic acid.
8. The method of any one of claims 1-7, wherein the genetic modifier polypeptide comprises a DNA methyltransferase polypeptide.
9. The method of claim 8, wherein the DNA methyltransferase polypeptide is a TRBIP1 polypeptide having at least 80% amino acid identity to the polypeptide encoded by Arabidopsis thaliana NP_195276.3 (SEQ ID NO: 1).
10. The method of any one of claims 1-9, wherein the α-crystalline domain polypeptide comprises an amino acid sequence having at least 80% amino acid identity to ACD15 or ACD21 from Arabidopsis thaliana (SEQ ID NO: 11 or SEQ ID NO: 13, respectively).
11. The method of any one of claims 1-10, wherein the α-crystalline domain polypeptide is selected from the group consisting of: ACD15, ACD21, HSPB1, HSPB3, and HSPB5.
12. The method of any one of claims 1-11, wherein modification of the target nucleic acid confers a change in expression and/or a change in the target nucleotide sequence of the target nucleic acid as compared to a corresponding control.
13. The method of any one of claims 1-12, wherein the incidence of modification of a non-target nucleic acid is reduced as compared to a corresponding control.
14. The method of any one of claims 1-13, wherein the eukaryotic cell is a plant cell or a mammalian cell.
15. The method of claim 14, wherein the eukaryotic cell is a plant cell and the method further comprises regenerating a whole plant from said plant cell.
16. The method of claim 15, the method further comprising: (c) crossing the plant with a modified target nucleic acid to a second plant to produce one or more F1 plants.
17. The method of claim 16, the method further comprising: (d) selecting from the one or more F1 plants an F1 plant that (i) lacks a recombinant genetic modifier polypeptide and/or a recombinant α-crystalline domain polypeptide, and (ii) has the modified target nucleic acid.
18. A recombinant nucleic acid encoding at least one of 1) a genetic modifier polypeptide capable of being targeted to a target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to a target nucleic acid.
19. An expression vector comprising the recombinant nucleic acid of claim 18.
20. A plant cell comprising: 1) a genetic modifier polypeptide capable of being targeted to the target nucleic acid, and 2) an α-crystalline domain polypeptide capable of being targeted to the target nucleic acid, wherein at least one of the genetic modifier polypeptide or the α-crystalline domain polypeptide is a recombinant polypeptide, wherein the plant cell comprises a modified nucleic acid as compared to a corresponding control nucleic acid.
PCT/US2024/042723 2023-08-16 2024-08-16 Alpha-crystalline domain proteins and their use in genome modification Pending WO2025038948A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202363533017P 2023-08-16 2023-08-16
US63/533,017 2023-08-16
US202463553478P 2024-02-14 2024-02-14
US63/553,478 2024-02-14

Publications (2)

Publication Number Publication Date
WO2025038948A2 true WO2025038948A2 (en) 2025-02-20
WO2025038948A3 WO2025038948A3 (en) 2025-04-17

Family

ID=94632713

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/042723 Pending WO2025038948A2 (en) 2023-08-16 2024-08-16 Alpha-crystalline domain proteins and their use in genome modification

Country Status (1)

Country Link
WO (1) WO2025038948A2 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RS59199B1 (en) * 2012-05-25 2019-10-31 Univ California Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription
EP3397760A2 (en) * 2015-12-30 2018-11-07 Avectas Limited Vector-free delivery of gene editing proteins and compositions to cells and tissues
EP3728588A4 (en) * 2017-12-22 2022-03-09 The Broad Institute, Inc. CAS12A SYSTEMS, METHODS AND COMPOSITIONS FOR TARGETED RNA BASE EDITING
EP4051792A4 (en) * 2019-10-28 2024-03-20 Targetgene Biotechnologies Ltd. Pam-reduced and pam-abolished cas derivatives compositions and uses thereof in genetic modulation

Also Published As

Publication number Publication date
WO2025038948A3 (en) 2025-04-17


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24855004

Country of ref document: EP

Kind code of ref document: A2