WO2025224715A1 - Methods for improving precise genome modification and reducing unwanted mutations by crispr-cas editing - Google Patents
Methods for improving precise genome modification and reducing unwanted mutations by crispr-cas editing
Info
- Publication number
- WO2025224715A1 (PCT/IB2025/054391)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cell
- sequence
- polq
- crispr
- dna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
- C12N15/1138—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against receptors or cell surface proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
Definitions
- The disclosed invention is generally in the field of gene editing and specifically in the area of CRISPR/Cas-mediated gene editing.
- CRISPR-Cas9 can introduce double-strand breaks (DSBs) at a specific genomic locus that shares sequence complementarity with the CRISPR guide RNA (gRNA).
- The DSBs can be repaired through different cellular mechanisms, including the classical non-homologous end joining (C-NHEJ, hereafter referred to as NHEJ), microhomology-mediated end joining (MMEJ, also called alternative NHEJ), homologous recombination (HR), and single-stranded annealing (SSA) pathways.
- NHEJ often generates small insertions and deletions (indels) [1] and is believed to be the dominant repair pathway for DSBs induced by CRISPR-Cas9 [2].
- MMEJ relies on small homologies for DNA repair, while SSA requires longer ones.
- HR is an error-free DNA repair mechanism that requires a homologous template.
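Because MMEJ is characterized by short homologies flanking the repaired break, the microhomology at a deletion junction can be measured directly from the reference sequence. The following is a minimal sketch under stated assumptions, not the analysis used in the disclosure: the function name `microhomology_length`, the 0-based half-open deletion coordinates, and the toy sequence are all hypothetical.

```python
def microhomology_length(ref: str, del_start: int, del_end: int) -> int:
    """Return the microhomology length (bp) at a deletion junction.

    The deletion removes ref[del_start:del_end] (0-based, half-open).
    Microhomology is counted as the number of positions by which the
    deletion could be shifted left or right without changing the
    resulting sequence.
    """
    if not 0 <= del_start < del_end <= len(ref):
        raise ValueError("deletion coordinates out of range")

    # Rightward: start of the deleted segment matches the bases just after it.
    right = 0
    while (del_end + right < len(ref)
           and del_start + right < del_end
           and ref[del_start + right] == ref[del_end + right]):
        right += 1

    # Leftward: bases just before the deletion match the end of the deleted segment.
    left = 0
    while (del_start - left - 1 >= 0
           and del_end - left - 1 >= del_start
           and ref[del_start - left - 1] == ref[del_end - left - 1]):
        left += 1

    return left + right


# Toy reference (hypothetical): two "ATG" repeats flank a CCCCCC spacer.
toy_ref = "GGTAT" + "ATG" + "CCCCCC" + "ATG" + "TTCAG"
mh = microhomology_length(toy_ref, 8, 17)   # deletes the spacer plus one ATG copy
no_mh = microhomology_length("AAAACCCCGGGG", 4, 8)
```

Both equivalent representations of the toy deletion (`[5:14]` and `[8:17]`) give the same length, which is why the function scans in both directions.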
- The large deletion (LD) issue can have significant implications for the application of the otherwise versatile genome editing tool CRISPR-Cas9.
- Compositions and methods for increasing the homology-directed repair (HDR) efficiency following CRISPR/Cas-mediated gene editing result in up to a two-fold increase in HDR efficiency and up to a 50% reduction in the frequency of large deletions.
- The methods are based, in some forms, on the discovery that MMEJ is the major repair pathway mediating CRISPR-induced on-target large deletions.
- One embodiment for reducing large deletions includes inhibiting the level or activity of DNA polymerase theta (PolQ), the main player in the MMEJ repair pathway.
- PolQ levels are inhibited by treating cells with a PolQ inhibitor such as Novobiocin (NVB) or ART558, for example about 24 hours before and after electroporation.
- Another embodiment includes delivering recombinant replication protein A (RPA) proteins together with Cas9/sgRNA RNP and single-stranded oligodeoxynucleotide (ssODN) to cells subjected to CRISPR/Cas gene editing, to avoid annealing of the single-stranded DNA resected after CRISPR-induced DSBs.
- The RPA proteins can protect the donor ssDNA from degradation in the cells and can activate the HDR repair pathway, and the inhibition of MMEJ may switch the DNA repair pathway from MMEJ to HDR.
- The disclosed strategy reduces CRISPR-induced on-target large deletions by up to 50% and can increase HDR efficiency by up to two-fold.
- The methods are also based, in some forms, on the discovery that modifying the nucleic acids involved in CRISPR/Cas gene editing can dramatically improve CRISPR/Cas gene editing efficiency.
- The ssODN or sgRNA is labelled with a fluorophore at its 5’ and/or 3’ end.
- FIG. 1A Left: analysis of microhomology (MH) frequency at Cas9-induced breakpoint junctions in two published datasets; right: schematic of how MMEJ could lead to CRISPR-induced LDs (created with BioRender.com). NH: no homology.
- FIG. 1B shows LD events detected in Cas9-edited human pluripotent cell lines (SEQ ID NO:127-135). Boxes indicate MH sequences.
- FIG. 1C Left: schematic of the strategy to analyze CRISPR-induced LDs in the CD9 locus (created with BioRender.com); right: MH frequency in deletions > 310 bp quantified from long-read sequencing data. **** p < 0.0001, Fisher’s exact test.
- FIG. 2A is a schematic of the roles of four key genes in the MMEJ pathway (created with BioRender.com).
- FIG. 2B is a schematic of the workflow for the knockdown experiments.
- FIG. 2E is a schematic of IDMseq analysis of LDs.
- FIG. 2F Frequency of LD (> 30 bp) quantified by IDMseq. The numerator indicates the LD event number, and the denominator indicates the total event number detected by IDMseq. **** p < 0.0001, * p < 0.05, Fisher's exact test.
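The legends report LD frequencies as event counts (LD events over total events) compared with Fisher's exact test. As an illustration of how such a comparison works, here is a minimal sketch of a two-sided Fisher's exact test using only the Python standard library. The function name and the example counts are hypothetical and are not data from the disclosure; a statistics package would normally be used instead.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are conditions (e.g. control, treated); columns are outcomes
    (e.g. LD events, non-LD events). The p-value sums the hypergeometric
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Probability of observing x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical counts: 120 LD events in 1000 control reads vs 60 in 1000 treated reads.
control_ld, control_total = 120, 1000
treated_ld, treated_total = 60, 1000
p_value = fisher_exact_two_sided(
    control_ld, control_total - control_ld,
    treated_ld, treated_total - treated_ld,
)
```

Note that the table takes LD and non-LD counts, so the denominator reported in the legends (total events) has to be converted by subtracting the numerator.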
- FIG. 3B Top: schematics of the inducible RPA constructs; bottom: the mRNA level of RPA genes after doxycycline treatment for two days.
- FIG. 3C is a schematic of the workflow of the RPA overexpression experiments.
- FIG. 4A is a schematic of mutant GFP correction by CRISPR-mediated HDR (left) and strategies to improve HDR efficiency (right) (created with BioRender.com). The green color indicates the restoration of green fluorescence.
- FIG. 4C is a schematic of the ddPCR probe-based assay design for detecting CRISPR-mediated precise mutation via HDR and representative 2D plots of ddPCR events from positive (an EPOR G6002 mutant cell line), mock (non-edited line) and control (edited in EPOR locus) H1 ESC samples (data are shown in FIG. 4D).
- Probe 2 is designed to specifically recognize the installed point mutation but not the wild-type sequence.
- Probe 1 is designed to recognize both mutant and wild-type sequences.
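With this two-probe design, the fraction of precisely edited alleles can be estimated as the ratio of Probe 2 signal (edited allele only) to Probe 1 signal (all alleles). The sketch below assumes the standard Poisson correction used in droplet digital PCR, where the mean copies per droplet is estimated from the fraction of negative droplets. The function names and droplet counts are hypothetical; the disclosure does not specify this calculation.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean target copies per droplet.

    Uses lambda = ln(total / negative droplets), which is undefined when
    every droplet is positive, so that case is rejected.
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return math.log(total / (total - positive))


def hdr_fraction(probe2_positive: int, probe1_positive: int, total: int) -> float:
    """Estimated fraction of alleles carrying the installed mutation.

    probe2_positive: droplets positive for the mutation-specific probe.
    probe1_positive: droplets positive for the probe that binds both alleles.
    """
    reference = copies_per_droplet(probe1_positive, total)
    if reference == 0:
        raise ValueError("no reference-probe signal detected")
    return copies_per_droplet(probe2_positive, total) / reference


# Hypothetical droplet counts for one well.
estimate = hdr_fraction(probe2_positive=500, probe1_positive=5000, total=20000)
```

The Poisson step matters mostly at high droplet occupancy; at low occupancy the estimate approaches the simple ratio of positive droplet counts.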
- The treatments followed the strategy illustrated in panel A using 25 pM NVB or 2.5 pmol RPA, respectively.
- FIG. 5A is an agarose gel electrophoresis of long-range PCR products of SH2B3- and H1.3-edited cell lines; KO#1 (under SH2B3) and KO#5, 7, 8 and 11 (under H1.3) indicate LD clones.
- FIG. 5B is a schematic of the strategy to analyze CRISPR-induced LDs in the PIGA locus.
- FIG. 5C Representative Integrative Genomics Viewer (IGV) tracks and coverage of long-read sequencing data on the PIGA locus from PIGA FLAER positive and negative sorted populations. The dashed arrow indicates the position of the PIGA FLAER positive LD proximal end on the PIGA locus.
- FIG. 5D MH frequency in deletions > 30 bp of PIGA intrl_l sgRNA quantified from long-read sequencing data. **** p < 0.0001, Fisher’s exact test.
- FIG. 5F Plot of the LD frequency and the distance between the sgRNA and its nearest exon.
- FIG. 5G Plot of the LD frequency and the distance between the sgRNA and its nearest exon.
- FIG. 6A Flow cytometry analysis of PIGA expression. The number in the gate indicates the percentage of the PIGA FLAER negative population.
- FIG. 6B Representative Western blotting analysis for RPA1 and POLQ expression. The grey value was quantified using ImageJ and normalized by the control siRNA (siCtrl) treated sample.
- FIG. 6D Schematic of the strategy and workflow for cell cycle synchronization by nocodazole and LD analysis by FACS (created with BioRender.com).
- FIG. 6E Schematic of the strategy and workflow for cell cycle synchronization by nocodazole and LD analysis by FACS (created with BioRender.com).
- FIG. 6K LD size distribution analysis for RPA and POLQ knockdown samples.
- FIG. 6L MH frequency in LDs of RPA and POLQ knockdown samples. The numerator indicates the MH > 2 bp event number, and the denominator indicates the LD event number detected by IDMseq. ns: not significant.
- FIG. 6M LD size distribution analysis for NVB treated and RPA overexpression samples.
- FIG. 6N LD size distribution analysis for NVB treated and RPA overexpression samples.
- FIG. 7D Top: the location of LAMP2 intronic gRNAs, the numbers indicate the distances between sgRNA cutting sites and the nearest exons; bottom: example flow cytometry analysis of LAMP2 expression.
- FIG. 7H
- FIG. 71 Top: the location of the WAS intronic gRNA; bottom: frequency of LD (> 30 bp) quantified by ONT long-read sequencing. The numerator indicates the LD event number, and the denominator indicates the total event number detected by nanopore reads. **** p < 0.0001, Fisher’s exact test.
- FIG. 71 Top: the location of the HBB intronic gRNA; bottom: frequency of LD (> 30 bp) quantified by ONT long-read sequencing. The numerator indicates the LD event number, and the denominator indicates the total event number detected by nanopore reads. **** p < 0.0001, Fisher’s exact test.
- FIG. 8A Representative Coomassie-stained SDS-PAGE images of the RPA protein complex including the RPA1, RPA2 and RPA3 subunits, kDa: kilodalton.
- FIG. 8B Flow cytometry analysis of GFP expression.
- FIG. 8D Delivery efficiency of Cy3-ssODN mixed with recombinant RPA.
- FIG. 8E Flow cytometry analysis of human primary peripheral blood erythroid progenitor cell surface markers.
- FIG. 9D Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “GTG” to “GAG” via HDR. Right: frequency of HBB correction quantified by ddPCR. The numerator indicates the HBB mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. *P < 0.05, ****P < 0.0001, ns: not significant, two-sided Fisher’s exact test.
- FIG. 9E Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “GTG” to “GAG” via HDR.
- FIG. 9F Left: schematic of mutant GLP1R correction (NM_002062.3:c.402+3delG, c.396A>G) via HDR. Right: frequency of GLP1R correction quantified by ddPCR. The numerator indicates the GLP1R mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. ****P < 0.0001, two-sided Fisher’s exact test.
- FIG. 9G Left: schematic of mutant GLP1R correction (NM_002062.3:c.402+3delG, c.396A>G) via HDR. Right: frequency of GLP1R correction quantified by ddPCR. The numerator indicates the GLP1R mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. ****P < 0.0001, two-sided Fisher’s exact test.
- FIG. 10A Schematic of in situ mutant GFP correction in iCas9 iPSC-derived human heart organoid (hHO).
- FIG. 10E Left: violin plot of the formed blastoid diameter. Data are presented as the mean ± SEM. A two-tailed t-test was used; ns: not significant. Right: frequency of blastoids with cavitated structures. Data are presented as the mean ± SEM of four independent experiments.
- FIG. 10F Top: schematic of in vitro human blastoid attachment assay.
- Right: detection of stimulated CGP of edited blastoids using a commercial pregnancy test kit.
- Bottom: representative maximum projection of immunofluorescence images for the attached blastoids; magenta: GATA4, yellow: Oct4, cyanine: GATA3. Scale bar, 100 μm.
- Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “CTG” to “CAG” in human blastoids.
- FIG. 11A Schematic of GFP correction followed by bulk RNA-seq and western blot workflow.
- FIG. 11B Volcano plot comparing gene expression fold changes between 5’-Cy5-ssODN (left) and unmodified ssODN conditions.
- FIG. 11C Gene Ontology (GO) enrichment analysis of differentially upregulated genes in the 5’-Cy5-ssODN condition compared to the unmodified ssODN condition.
- FIG. 11D Heatmap of the expression levels of genes involved in DNA repair and chromatin remodeling pathways. Gene expression is represented by a gradient color scale, with red indicating higher expression and blue representing lower expression.
- FIG. 11E
- FIG. 12A The radius of gyration (Rg) as a function of simulation time for 5’Cy5-ssODN90 and unmodified ssODN90.
- FIG. 12B Hydrogen bond number analysis as a function of simulation time for 5’Cy5-ssODN90 and unmodified ssODN90.
- FIG. 12C Calculated RDF plots of various cyanines around the C fragment in their corresponding GFPD10 systems.
- FIG. 12D Calculated binding energy of unmodified and various cyanine modified Cy-G:C systems in GFP sequence.
- The 5Cy5, 5Cy5.5, and 5Cy3 fragments are the three rightmost bars, with 5Cy5 at the far right and 5Cy5.5 in the middle.
DETAILED DESCRIPTION OF THE INVENTION
- Although the genome editing efficiency of CRISPR/Cas9 is relatively high, which mainly refers to gene knockout (KO), the HDR efficiency is still considered low.
- The high frequency of unintended on-target large deletions or other complex genomic rearrangements induced by CRISPR/Cas9 is a potential risk for both scientific research and clinical application.
- Genetic diseases, particularly point-mutation diseases, can be treated by using CRISPR/Cas9 to knock in the correct sequences and fix the mutations or deletions via the HDR repair pathway.
- The relatively low HDR efficiency and the high frequency of on-target large deletions of CRISPR/Cas9 constrain its applications.
- The term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that includes coding sequences necessary for the production of a polypeptide, RNA (e.g., including, but not limited to, mRNA, tRNA and rRNA) or precursor.
- The polypeptide, RNA, or precursor can be encoded by a full-length coding sequence or by any portion thereof.
- The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA.
- A genomic form or clone of a gene may contain the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.”
- Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation.
- the term “inhibit” or other forms of the word such as “inhibiting” or “inhibition” means to hinder or restrain a particular characteristic, for example, to reduce, decrease or prevent, either partially or entirely. It is understood that this is typically in relation to some standard or expected value, in other words it is relative, but that it is not always necessary for the standard or relative value to be referred to.
- “inhibits POLQ” means hindering or restraining the activity of the protein relative to a standard or a control.
- “Inhibits POLQ” can also mean to hinder or restrain the synthesis or expression of the protein, or mRNA encoding the protein, relative to a standard or control.
- “Mammal” includes both humans and non-humans, including but not limited to humans, non-human primates, canines, felines, murines, bovines, equines, and porcines.
- a “vector” is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment.
- the vectors described herein can be expression vectors.
- an “expression vector” is a vector that includes one or more expression control sequences.
- an “expression control sequence” is a DNA sequence that controls and regulates the transcription and/or translation of another DNA sequence.
- treating includes alleviating the symptoms associated with a specific disorder or condition and/or preventing or eliminating the symptoms.
- operably linked refers to a juxtaposition wherein the components are configured so as to perform their usual function.
- control sequences or promoters operably linked to a coding sequence are capable of effecting the expression of the coding sequence
- An organelle localization sequence operably linked to a protein will direct the linked protein to be localized at the specific organelle.
- “Transformed” and “transfected” encompass the introduction of a nucleic acid (e.g. a vector) into a cell by a number of techniques known in the art.
- “Effective amount” and “therapeutically effective amount,” used interchangeably, as applied to the nanoparticles, therapeutic agents, and pharmaceutical compositions described herein, mean the quantity necessary to render the desired therapeutic result.
- An effective amount is a level effective to treat, cure, or alleviate the symptoms of a disease for which the composition and/or therapeutic agent, or pharmaceutical composition, is/are being administered.
- Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, also specifically contemplated and considered disclosed is the range from the one particular value and/or to the other particular value unless the context specifically indicates otherwise. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another, specifically contemplated embodiment that should be considered disclosed unless the context specifically indicates otherwise. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint unless the context specifically indicates otherwise.
- The compositions, in some forms, include (A) agents for reducing POLQ level/activity in a cell; (B) agents for increasing RPA levels/activity in a cell; and (C) fluorophore-modified nucleic acids used in gene editing.
- Agents for reducing POLQ level/activity include agents that inhibit POLQ expression or activity in a cell, including functional nucleic acids such as siRNA and shRNA, and small molecules such as Novobiocin.
- Agents for increasing RPA in cells include nucleic acids encoding RPA.
- Fluorophore-modified nucleic acids used in gene editing include, for example, fluorophore-labelled ssODN and/or sgRNA.
- Compositions for reducing PolQ activity, such as small molecule inhibitors, siRNA, and shRNA, are useful in the disclosed methods.
- The agent used to reduce PolQ activity is a small molecule.
- A preferred small molecule is Novobiocin.
- Other PolQ inhibitors can be used, including, but not limited to, ART4215 (a potent and selective inhibitor of deoxyribonucleic acid (DNA) polymerase (pol) theta) (https://clinicaltrials.gov/ct2/show/NCT04991480) and ART558 (Zatreanu, D., Robinson, H.M.R., Alkhatib, O. et al. Polθ inhibitors elicit BRCA-gene synthetic lethality and target PARP inhibitor resistance. Nat Commun 12, 3636 (2021). https://doi.org/10.1038/s41467-021-23463-8).
- the inhibitor can be a functional nucleic acid.
- Protein expression and/or activity of a desired protein can be inhibited using a functional nucleic acid (herein, inhibiting NA), or vector encoding the same, which reduces expression of the desired protein i.e., POLQ.
- Functional nucleic acids refer to those nucleic acids whose functions are beyond the conventional genetic roles of nucleic acids.
- functional nucleic acid molecules can be divided into the following non-limiting categories: antisense molecules, siRNA, miRNA, aptamers, ribozymes, triplex forming molecules, RNAi, external guide sequences, and other gene editing compositions.
- the functional nucleic acid molecules can act as effectors, inhibitors, modulators, and stimulators of a specific activity possessed by a target molecule, or the functional nucleic acid molecules can possess a de novo activity independent of any other molecules.
- Functional nucleic acid molecules can interact with any macromolecule, such as DNA, RNA, polypeptides, or carbohydrate chains.
- functional nucleic acids can interact with the mRNA or the genomic DNA of a target polypeptide or they can interact with the polypeptide itself.
- functional nucleic acids are designed to interact with other nucleic acids based on sequence homology between the target molecule and the functional nucleic acid molecule.
- the specific recognition between the functional nucleic acid molecule and the target molecule is not based on sequence homology between the functional nucleic acid molecule and the target molecule, but rather is based on the formation of tertiary structure that allows specific recognition to take place.
- compositions can include one or more functional nucleic acids designed to reduce expression of the POLQ gene, or a gene product thereof.
- the functional nucleic acid or polypeptide can be designed to target and reduce or inhibit expression or translation of POLQ; or to reduce or inhibit expression, reduce activity, or increase degradation of POLQ protein.
- the composition includes a vector suitable for in vivo expression of the functional nucleic acid.
- A functional nucleic acid or polypeptide is designed to target a segment of the nucleic acid sequence encoding POLQ, or the complement thereof, or a genomic sequence corresponding therewith, or variants thereof having a nucleic acid sequence at least 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to a sequence encoding POLQ.
- A functional nucleic acid or polypeptide is designed to target a segment of the nucleic acid encoding the amino acid sequence of POLQ, or the complement thereof, or variants thereof having a nucleic acid sequence 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to a nucleic acid encoding the amino acid sequence of POLQ.
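The identity thresholds listed above can be checked mechanically once two sequences are aligned. The sketch below is only an illustration under simplifying assumptions: it compares two pre-aligned, equal-length, ungapped sequences, and the function names and example sequences are hypothetical. Real comparisons would first require an alignment step, which is not shown.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumes equal length and no gap characters; case is ignored.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 65.0) -> bool:
    """True if the two sequences are at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 10-nt example with a single mismatch at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

The default of 65% corresponds to the lowest value in the list; any of the other listed values can be passed as `threshold`.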
- The functional nucleic acid hybridizes to the nucleic acid encoding POLQ, or a complement thereof, for example, under stringent conditions. In some embodiments, the functional nucleic acid hybridizes to a nucleic acid sequence that encodes the amino acid sequence of POLQ, or a complement thereof, for example, under stringent conditions.
i. RNA Interference
- RNAi: RNA interference
- dsRNA: double-stranded RNA
- Dicer: enzyme that produces double-stranded small interfering RNAs (siRNAs)
- RISC: RNAi-induced silencing complex
- Short interfering RNA (siRNA) is a double-stranded RNA that can induce sequence-specific post-transcriptional gene silencing, thereby decreasing or even inhibiting gene expression.
- An siRNA triggers the specific degradation of homologous RNA molecules, such as mRNAs, within the region of sequence identity between the siRNA and the target RNA.
- WO 02/44321 discloses siRNAs capable of sequence-specific degradation of target mRNAs when base-paired with 3’ overhanging ends, herein incorporated by reference for the method of making these siRNAs.
- Sequence-specific gene silencing can be achieved in mammalian cells using synthetic, short double-stranded RNAs that mimic the siRNAs produced by the enzyme Dicer (Elbashir, et al. (2001) Nature, 411:494-498) (Ui-Tei, et al. (2000) FEBS Lett 479:79-82).
- siRNA can be chemically or in vitro-synthesized or can be the result of short double-stranded hairpin-like RNAs (shRNAs) that are processed into siRNAs inside the cell.
- Synthetic siRNAs are generally designed using algorithms and a conventional DNA/RNA synthesizer.
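As a rough illustration of what such a design algorithm does, the sketch below slides a fixed-length window along a target transcript, keeps windows whose GC content falls in a chosen range, and derives the antisense (guide) strand as the reverse complement. The 19-nt window, the 30-52% GC range, and all names are assumptions chosen for illustration; they are not taken from the disclosure, and practical design tools apply many additional rules.

```python
# RNA complement table used to build the antisense (guide) strand.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def sirna_candidates(target, length=19, gc_min=0.30, gc_max=0.52):
    """List candidate siRNA target sites in an mRNA (or cDNA) sequence.

    Returns tuples of (start index, sense strand, antisense strand, GC fraction),
    all written 5'->3'. Thresholds are illustrative assumptions.
    """
    rna = target.upper().replace("T", "U")  # accept cDNA input
    candidates = []
    for start in range(len(rna) - length + 1):
        sense = rna[start:start + length]
        gc = (sense.count("G") + sense.count("C")) / length
        if gc_min <= gc <= gc_max:
            # Complement each base, then reverse to write the strand 5'->3'.
            antisense = sense.translate(RNA_COMPLEMENT)[::-1]
            candidates.append((start, sense, antisense, round(gc, 2)))
    return candidates


# Hypothetical toy input: a repeating 4-nt motif scanned with 4-nt windows.
toy_hits = sirna_candidates("ACGU" * 5, length=4, gc_min=0.5, gc_max=0.5)
```

The 3’ overhangs mentioned above are not modeled here; they would be appended to each strand after the core duplex is chosen.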
- siRNA can also be synthesized in vitro using kits such as Ambion’s SILENCER® siRNA Construction Kit.
- The production of siRNA from a vector is more commonly done through the transcription of short hairpin RNAs (shRNAs).
- Kits for the production of vectors comprising shRNA are available, such as, for example, Imgenex’s GENESUPPRESSOR™ Construction Kits and Invitrogen’s BLOCK-IT™ inducible RNAi plasmid and lentivirus vectors.
- PolQ expression can be reduced using antisense molecules.
- Antisense molecules are designed to interact with a target nucleic acid molecule through either canonical or non-canonical base pairing. The interaction of the antisense molecule and the target molecule is designed to promote the destruction of the target molecule through, for example, RNAse H mediated RNA-DNA hybrid degradation. Alternatively, the antisense molecule is designed to interrupt a processing function that normally would take place on the target molecule, such as transcription or replication. Antisense molecules can be designed based on the sequence of the target molecule. There are numerous methods for optimization of antisense efficiency by finding the most accessible regions of the target molecule. Exemplary methods include in vitro selection experiments and DNA modification studies using DMS and DEPC. It is preferred that antisense molecules bind the target molecule with a dissociation constant (Kd) less than or equal to 10^-6, 10^-8, 10^-10, or 10^-12.
- An “antisense” nucleic acid sequence can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the TRF2.
- Antisense nucleic acid sequences and delivery methods are well known in the art (Goodchild, Curr. Opin. Mol. Ther., 6(2): 120-128 (2004); Clawson, et al., Gene Ther., 11(17): 1331-1341 (2004)).
- the antisense nucleic acid can be complementary to an entire coding strand of a target sequence, or to only a portion thereof.
- An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
- An antisense nucleic acid can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art.
- An antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
- The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).
- AONs/ASOs include an alpha-anomeric nucleic acid.
- An alpha-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual beta-units, the strands run parallel to each other (Gaultier et al., Nucleic Acids Res. 15:6625-6641 (1987)).
- The antisense nucleic acid molecule can also comprise a 2’-O-methylribonucleotide (Inoue et al. Nucleic Acids Res. 15:6131-6148 (1987)) or a chimeric RNA-DNA analogue (Inoue et al. FEBS Lett., 215:327-330 (1987)).
- Triplex forming functional nucleic acid molecules are molecules that can interact with either double-stranded or single-stranded nucleic acid.
- When triplex molecules interact with a target region, a structure called a triplex is formed in which there are three strands of DNA forming a complex dependent on both Watson-Crick and Hoogsteen base-pairing.
- Triplex molecules are preferred because they can bind target regions with high affinity and specificity. It is preferred that the triplex forming molecules bind the target molecule with a Kd less than 10^-6, 10^-8, 10^-10, or 10^-12.
- PolQ RNA expression can be reduced using external guide sequences.
- External guide sequences (EGSs) are molecules that bind a target nucleic acid molecule forming a complex, which is recognized by RNAse P, which then cleaves the target molecule.
- EGSs can be designed to specifically target an RNA molecule of choice.
- RNAse P aids in processing transfer RNA (tRNA) within a cell.
- Bacterial RNAse P can be recruited to cleave virtually any RNA sequence by using an EGS that causes the target RNA:EGS complex to mimic the natural tRNA substrate.
- EGS/RNAse P-directed cleavage of RNA can be utilized to cleave desired targets within eukaryotic cells. Representative examples of how to make and use EGS molecules to facilitate cleavage of a variety of different target molecules are known in the art.
- the functional nucleic acids can be aptamers.
- Aptamers are molecules that interact with a target molecule, preferably in a specific way.
- aptamers are small nucleic acids ranging from 15-50 bases in length that fold into defined secondary and tertiary structures, such as stem-loops or G-quartets.
- Aptamers can bind small molecules, such as ATP and theophylline, as well as large molecules, such as reverse transcriptase and thrombin.
- Aptamers can bind very tightly, with Kd values for the target molecule of less than 10^-12 M. It is preferred that the aptamers bind the target molecule with a Kd less than 10^-6, 10^-8, 10^-10, or 10^-12.
- Aptamers can bind the target molecule with a very high degree of specificity.
- aptamers have been isolated that have greater than a 10,000 fold difference in binding affinities between the target molecule and another molecule that differ at only a single position on the molecule. It is preferred that the aptamer have a Kd with the target molecule at least 10, 100, 1000, 10,000, or 100,000 fold lower than the Kd with a background binding molecule. It is preferred when doing the comparison for a molecule such as a polypeptide, that the background molecule be a different polypeptide.
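The fold-difference criterion above is a ratio of two dissociation constants, which can be sketched in a few lines. This is a minimal illustration only: the function names and the default threshold are hypothetical, and both Kd values are assumed to be in the same units.

```python
def fold_selectivity(kd_target: float, kd_background: float) -> float:
    """Return how many fold lower the target Kd is than the background Kd.

    Both values must use the same units (assumed molar here).
    """
    if kd_target <= 0 or kd_background <= 0:
        raise ValueError("Kd values must be positive")
    return kd_background / kd_target


def meets_selectivity(kd_target: float, kd_background: float,
                      min_fold: float = 10.0) -> bool:
    """True if the target Kd is at least `min_fold` lower than the background Kd."""
    return fold_selectivity(kd_target, kd_background) >= min_fold
```

A lower Kd means tighter binding, so a larger ratio corresponds to higher specificity for the target over the background molecule.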
- the functional nucleic acids can be ribozymes.
- Ribozymes are nucleic acid molecules that are capable of catalyzing a chemical reaction, either intramolecularly or intermolecularly. It is preferred that the ribozymes catalyze intermolecular reactions.
- ribozymes also include those that are not found in natural systems but have been engineered to catalyze specific reactions de novo.
- ribozymes cleave RNA or DNA substrates, and more preferably cleave RNA substrates. Ribozymes typically cleave nucleic acid substrates through recognition and binding of the target substrate with subsequent cleavage. This recognition is often based mostly on canonical or non-canonical base pair interactions. This property makes ribozymes particularly good candidates for target-specific cleavage of nucleic acids because recognition of the target substrate is based on the target substrate's sequence.
- the functional nucleic acids are gene editing compositions.
- Gene editing compositions can include nucleic acids that encode an element or elements that induce a single or a double strand break in the target cell’s genome, and optionally a polynucleotide.
- the compositions can be used, for example, to reduce or otherwise modify expression of POLQ.
- i. Strand Break Inducing Elements
- CRISPR/Cas
- the element that induces a single or a double strand break in the target cell’s genome is a CRISPR/Cas system.
- CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats
- the prokaryotic CRISPR/Cas system has been adapted for gene editing (silencing, enhancing, or changing specific genes) in eukaryotes (see, for example, Cong, Science, 15:339(6121):819-823 (2013) and Jinek, et al., Science, 337(6096):816-21 (2012)).
- the organism's genome can be cut and modified at any desired location.
- Methods of preparing compositions for use in genome editing using the CRISPR/Cas systems are described in detail in WO 2013/176772 and WO 2014/018423, which are specifically incorporated by reference herein in their entireties.
- Double strand breaks can be repaired by the cell in one of two ways: non-homologous end joining (NHEJ) and homology-directed repair (HDR) (discussed further below).
- in non-homologous end joining, the double-strand breaks are repaired by direct ligation of the break ends to one another. As such, no new nucleic acid material is inserted into the site, although some nucleic acid material may be lost, resulting in a deletion.
- in homology-directed repair, a donor polynucleotide with homology to the cleaved target DNA sequence is used as a template for repair of the cleaved target DNA sequence, resulting in the transfer of genetic information from a donor polynucleotide to the target DNA.
- the genome editing composition includes a donor polynucleotide.
- the modifications of the target DNA due to NHEJ and/or homology-directed repair can be used to induce gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc.
- cleavage of DNA by the genome editing composition can be used to delete nucleic acid material from a target DNA sequence by cleaving the target DNA sequence and allowing the cell to repair the sequence in the absence of an exogenously provided donor polynucleotide.
- CRISPR system refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus.
- One or more tracr mate sequences operably linked to a guide sequence can also be referred to as pre-crRNA (pre-CRISPR RNA) before processing or crRNA after processing by a nuclease.
- a tracrRNA and crRNA are linked and form a chimeric crRNA-tracrRNA hybrid where a mature crRNA is fused to a partial tracrRNA via a synthetic stem loop to mimic the natural crRNA:tracrRNA duplex, as described in Cong, Science, 15:339(6121):819-823 (2013) and Jinek, et al., Science, 337(6096):816-21 (2012).
- a single fused crRNA-tracrRNA construct can also be referred to as a guide RNA or gRNA (or single-guide RNA (sgRNA)).
- the crRNA portion can be identified as the ‘target sequence’ and the tracrRNA is often referred to as the ‘scaffold’.
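The relationship between the target-specific portion and the scaffold can be illustrated with a short sketch that assembles a single-guide RNA as a string. It is illustrative only: the function name is hypothetical, and the scaffold in the usage note is a truncated placeholder, not a sequence taken from this document.

```python
def build_sgrna(target_dna: str, scaffold_rna: str) -> str:
    """Concatenate an RNA copy of the target-specific sequence with a scaffold.

    target_dna   : DNA-alphabet target-specific sequence (about 20 nt in practice)
    scaffold_rna : RNA-alphabet scaffold (tracrRNA-derived); supplied by the caller
    """
    target = target_dna.upper()
    scaffold = scaffold_rna.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target_dna must contain only A, C, G, T")
    if set(scaffold) - set("ACGU"):
        raise ValueError("scaffold_rna must contain only A, C, G, U")
    # The guide carries the same sequence as the protospacer, written as RNA.
    guide = target.replace("T", "U")
    return guide + scaffold
```

For example, `build_sgrna("ACGT" * 5, "GUUUUAGA")` returns the 20-nt guide followed by the placeholder scaffold; a real construct would use a full, validated scaffold sequence.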
- one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system. In some embodiments, one or more elements of a CRISPR system is derived from a particular organism including an endogenous CRISPR system, such as Streptococcus pyogenes.
- a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system).
- target sequence refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex.
- a target sequence can be any polynucleotide, such as DNA or RNA polynucleotides.
- a target sequence is located in the nucleus or cytoplasm of a cell.
- each protospacer is associated with a protospacer adjacent motif (PAM) whose recognition is specific to individual CRISPR systems.
- the PAM is the nucleotide sequence NGG.
- the PAM is the nucleotide sequence NNAGAAW.
- the tracrRNA duplex directs Cas to the DNA target consisting of the protospacer and the requisite PAM via heteroduplex formation between the spacer region of the crRNA and the protospacer DNA.
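The pairing of a protospacer with an adjacent PAM can be sketched as a pattern search. The sketch below is a minimal illustration, assuming a 20-nt protospacer immediately 5' of the PAM and scanning only the strand supplied; the function name and the default spacer length are assumptions. N and W are expanded using standard IUPAC nucleotide codes (N = any base, W = A or T).

```python
import re

# IUPAC expansions needed for the two PAMs named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_protospacers(dna: str, pam: str = "NGG", spacer_len: int = 20):
    """Return (start, protospacer, pam_sequence) tuples found on the given strand.

    `start` is the 0-based index of the first protospacer base.
    """
    dna = dna.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

Calling `find_protospacers(seq, pam="NNAGAAW")` applies the second PAM mentioned above. Scanning the reverse complement as well would be needed to cover both strands.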
- a CRISPR complex including a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins
- formation of a CRISPR complex results in cleavage of one or both strands in or near (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence.
- All or a portion of the tracr sequence may also form part of a CRISPR complex, such as by hybridization to all or a portion of a tracr mate sequence that is operably linked to the guide sequence.
- one or more vectors driving expression of one or more elements of a CRISPR system are introduced into a target cell such that expression of the elements of the CRISPR system direct formation of a CRISPR complex at one or more target sites.
- a Cas enzyme, a guide sequence linked to a tracr-mate sequence, and a tracr sequence could each be operably linked to separate regulatory elements on separate vectors.
- two or more of the elements expressed from the same or different regulatory elements may be combined in a single vector, with one or more additional vectors providing any components of the CRISPR system not included in the first vector.
- CRISPR system elements that are combined in a single vector may be arranged in any suitable orientation, such as one element located 5' with respect to (“upstream” of) or 3' with respect to (“downstream” of) a second element.
- the coding sequence of one element can be located on the same or opposite strand of the coding sequence of a second element, and oriented in the same or opposite direction.
- a single promoter drives expression of a transcript encoding a CRISPR enzyme and one or more of the guide sequence, tracr mate sequence (optionally operably linked to the guide sequence), and a tracr sequence embedded within one or more intron sequences (e.g., each in a different intron, two or more in at least one intron, or all in a single intron).
- the CRISPR enzyme, guide sequence, tracr mate sequence, and tracr sequence are operably linked to and expressed from the same promoter.
- a vector includes one or more insertion sites, such as a restriction endonuclease recognition sequence (also referred to as a “cloning site”).
- one or more insertion sites (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more insertion sites) are located upstream and/or downstream of one or more sequence elements of one or more vectors.
- a vector includes an insertion site upstream of a tracr mate sequence, and optionally downstream of a regulatory element operably linked to the tracr mate sequence, such that following insertion of a guide sequence into the insertion site and upon expression the guide sequence directs sequence-specific binding of a CRISPR complex to a target sequence in a eukaryotic cell.
- a vector includes two or more insertion sites, each insertion site being located between two tracr mate sequences so as to allow insertion of a guide sequence at each site.
- the two or more guide sequences can include two or more copies of a single guide sequence, two or more different guide sequences, or combinations of these.
- a single expression construct may be used to target CRISPR activity to multiple different, corresponding target sequences within a cell.
- a single vector can include about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 guide sequences. In some embodiments, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 such guide-sequence-containing vectors may be provided, and optionally delivered to a cell.
- a vector includes a regulatory element operably linked to an enzyme-coding sequence encoding a CRISPR enzyme, such as a Cas protein.
- Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, homo
- the unmodified CRISPR enzyme has DNA cleavage activity, such as Cas9.
- the CRISPR enzyme directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence. In some embodiments, the CRISPR enzyme directs cleavage of one or both strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence.
- a vector encodes a CRISPR enzyme that is mutated with respect to a corresponding wild-type enzyme such that the mutated CRISPR enzyme lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence.
- an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand).
- Other examples of mutations that render Cas9 a nickase include, without limitation, H840A, N854A, and N863A.
- two or more catalytic domains of Cas9 can be mutated to produce a mutated Cas9 substantially lacking all DNA cleavage activity.
- a D10A mutation is combined with one or more of H840A, N854A, or N863A mutations to produce a Cas9 enzyme substantially lacking all DNA cleavage activity.
- a CRISPR enzyme is considered to substantially lack all DNA cleavage activity when the DNA cleavage activity of the mutated enzyme is less than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or lower with respect to its non-mutated form.
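The mutation rules and the activity threshold described above can be summarized in a small sketch. It encodes only what the text states for S. pyogenes Cas9 (D10A alone or one of H840A, N854A, N863A alone gives a nickase; D10A combined with one of the others substantially removes cleavage). The function names, labels, and default threshold are hypothetical.

```python
RUVC_I_MUTATIONS = {"D10A"}
OTHER_NICKASE_MUTATIONS = {"H840A", "N854A", "N863A"}


def classify_cas9(mutations) -> str:
    """Classify a Cas9 variant from its set of point mutations."""
    muts = set(mutations)
    has_d10a = bool(muts & RUVC_I_MUTATIONS)
    has_other = bool(muts & OTHER_NICKASE_MUTATIONS)
    if has_d10a and has_other:
        return "cleavage-deficient"   # substantially lacks all DNA cleavage activity
    if has_d10a or has_other:
        return "nickase"              # cleaves a single strand
    return "nuclease"                 # cleaves both strands


def substantially_lacks_cleavage(mutant_activity: float,
                                 wild_type_activity: float,
                                 threshold: float = 0.25) -> bool:
    """True if mutant activity is below `threshold` (a fraction) of wild type."""
    if wild_type_activity <= 0:
        raise ValueError("wild_type_activity must be positive")
    return mutant_activity / wild_type_activity < threshold
```

The threshold is a parameter because the text lists several alternative cutoffs (25% down to 0.01% or lower); combinations not described in the text, such as two of the non-D10A mutations together, are reported here as "nickase" only by default.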
- an enzyme coding sequence encoding a CRISPR enzyme is codon optimized for expression in particular cells, such as eukaryotic cells.
- the eukaryotic cells can be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate.
- codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
- Codon bias: differences in codon usage between organisms
- mRNA: messenger RNA
- tRNA: transfer RNA
- the predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database”, and these tables can be adapted in a number of ways. See Nakamura, Y., et al., Nucl. Acids Res., 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, for example Gene Forge (Aptagen; Jacobus, PA), are also available.
- one or more codons in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
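Codon optimization as defined above, replacing codons with host-preferred synonyms while keeping the amino acid sequence, can be sketched as follows. This is a minimal illustration: the preferred-codon mapping passed in is a hypothetical input that a user would derive from a codon usage table, and the function names are assumptions. The standard genetic code is built in the conventional TCAG order.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonym, if one is supplied.

    `preferred` maps a one-letter amino acid to its preferred codon.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TO_AA[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)
```

With a hypothetical mapping `{"L": "CTG", "K": "AAG"}`, the input `ATGTTAAAATAA` becomes `ATGCTGAAGTAA`, and `translate` gives the same protein for both, which is the invariant the definition requires.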
- a vector encodes a CRISPR enzyme including one or more nuclear localization sequences (NLSs).
- each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies.
- an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus.
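The "near the terminus" criterion can be expressed as a simple distance check. The sketch below assumes 1-based inclusive residue positions and counts distance as the number of residues between the terminal residue and the nearest NLS residue; that counting convention, the function names, and the default cutoff are all assumptions for illustration.

```python
def nls_terminal_distances(protein_length: int, nls_start: int, nls_end: int) -> dict:
    """Distance (in residues) from the nearest NLS residue to each terminus."""
    if not (1 <= nls_start <= nls_end <= protein_length):
        raise ValueError("NLS coordinates must lie within the protein")
    return {"N": nls_start - 1, "C": protein_length - nls_end}


def nls_is_near_terminus(protein_length: int, nls_start: int, nls_end: int,
                         max_distance: int = 50) -> bool:
    """True if the NLS lies within `max_distance` residues of either terminus."""
    distances = nls_terminal_distances(protein_length, nls_start, nls_end)
    return min(distances.values()) <= max_distance
```

The cutoff is left as a parameter because the text lists a range of acceptable distances.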
- the one or more NLSs are of sufficient strength to drive accumulation of the CRISPR enzyme in a detectable amount in the nucleus of a eukaryotic cell.
- strength of nuclear localization activity may derive from the number of NLSs in the CRISPR enzyme, the particular NLS(s) used, or a combination of these factors.
- Detection of accumulation in the nucleus may be performed by any suitable technique.
- a detectable marker may be fused to the CRISPR enzyme, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI).
- Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of CRISPR complex formation (e.g., assay for DNA cleavage or mutation at the target sequence, or assay for altered gene expression activity affected by CRISPR complex formation and/or CRISPR enzyme activity), as compared to a control not exposed to the CRISPR enzyme or complex, or exposed to a CRISPR enzyme lacking the one or more NLSs.
- one or more of the elements of CRISPR system are under the control of an inducible promoter, which can include inducible Cas, such as Cas9.
- CRISPR system utilized in the methods disclosed herein can be encoded within a vector system which can include one or more vectors which can include a first regulatory element operably linked to a CRISPR/Cas system chimeric RNA (chiRNA) polynucleotide sequence, wherein the polynucleotide sequence includes (a) a guide sequence capable of hybridizing to a target sequence in a eukaryotic cell, (b) a tracr mate sequence, and (c) a tracr sequence; and a second regulatory element operably linked to an enzyme-coding sequence encoding a CRISPR enzyme which can optionally include at least one or more nuclear localization sequences.
- Elements (a), (b) and (c) can be arranged in a 5' to 3' orientation, wherein components I and II are located on the same or different vectors of the system, wherein when transcribed, the tracr mate sequence hybridizes to the tracr sequence and the guide sequence directs sequence-specific binding of a CRISPR complex to the target sequence, and wherein the CRISPR complex can include the CRISPR enzyme complexed with (1) the guide sequence that is hybridized to the target sequence, and (2) the tracr mate sequence that is hybridized to the tracr sequence, wherein the enzyme coding sequence encoding the CRISPR enzyme further encodes a heterologous functional domain.
- one or more of the vectors also encodes a suitable Cas enzyme, for example, Cas9.
- the different genetic elements can be under the control of the same or different promoters.
- RNA expression plasmid contains the target sequence (about 20 nucleotides), a form of the tracrRNA sequence (the scaffold) as well as a suitable promoter and necessary elements for proper processing in eukaryotic cells.
- Such vectors are commercially available (see, for example, Addgene).
- the element that induces a single or a double strand break in the target cell’s genome is a nucleic acid construct or constructs encoding a zinc finger nuclease (ZFN).
- ZFNs are typically fusion proteins that include a DNA-binding domain derived from a zinc-finger protein linked to a cleavage domain.
- the most common cleavage domain is the Type IIS enzyme FokI. FokI catalyzes double-stranded cleavage of DNA, at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, for example, U.S. Pat. Nos.
- the DNA-binding domain which can, in principle, be designed to target any genomic location of interest, can be a tandem array of Cys2His2 zinc fingers, each of which generally recognizes three to four nucleotides in the target DNA sequence.
- the Cys2His2 domain has a general structure: Phe (sometimes Tyr)-Cys-(2 to 4 amino acids)-Cys-(3 amino acids)- Phe(sometimes Tyr)-(5 amino acids)-Leu-(2 amino acids)-His-(3 amino acids)-His.
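The general Cys2His2 structure given above can be transcribed literally into a regular expression over one-letter amino acid codes. This is a sketch of that literal pattern only; natural zinc fingers vary (for example, in spacing around the first cysteine), so it should not be read as a complete zinc finger detector. The names are hypothetical.

```python
import re

# Phe/Tyr - Cys - (2 to 4 aa) - Cys - (3 aa) - Phe/Tyr - (5 aa) - Leu - (2 aa) - His - (3 aa) - His
C2H2_PATTERN = re.compile(r"[FY]C.{2,4}C.{3}[FY].{5}L.{2}H.{3}H")


def find_c2h2_motifs(protein: str):
    """Return every non-overlapping substring matching the literal pattern."""
    return [m.group(0) for m in C2H2_PATTERN.finditer(protein.upper())]
```

Running it on a synthetic sequence such as `"FCAACAAAFAAAAALAAHAAAH"` returns that sequence as a single match, because it follows the stated spacing exactly.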
- Rational design includes, for example, using databases including triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, U.S. Pat. Nos. 6,140,081; 6,453,242; 6,534,261; 6,610,512; 6,746,838; 6,866,997; 7,067,617; U.S. Published Application Nos.
- the element that induces a single or a double strand break in the target cell’s genome is a nucleic acid construct or constructs encoding a transcription activator-like effector nuclease (TALEN).
- TALENs have an overall architecture similar to that of ZFNs, with the main difference that the DNA-binding domain comes from TAL effector proteins, transcription factors from plant pathogenic bacteria.
- the DNA-binding domain of a TALEN is a tandem array of amino acid repeats, each about 34 residues long. The repeats are very similar to each other; typically they differ principally at two positions (amino acids 12 and 13, called the repeat variable diresidue, or RVD).
- Each RVD specifies preferential binding to one of the four possible nucleotides, meaning that each TALEN repeat binds to a single base pair, though the NN RVD is known to bind adenines in addition to guanine.
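The one-repeat-per-base relationship can be sketched as a lookup from RVDs to bases. Only the NN behavior (guanine, and also adenine) is stated in the text; the NI, HD, and NG assignments below are the commonly reported TALE code and are included here as an assumption for illustration. The degenerate NN position is written with the IUPAC code R (A or G).

```python
# Commonly reported RVD preferences (assumed, except NN which is described in the text).
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "R",  # binds G, and also A
}


def rvds_to_target(rvds) -> str:
    """Translate an ordered list of RVDs into the predicted target, one base per repeat."""
    try:
        return "".join(RVD_TO_BASE[rvd.upper()] for rvd in rvds)
    except KeyError as err:
        raise ValueError(f"unknown RVD: {err.args[0]}") from None
```

For instance, `rvds_to_target(["NI", "HD", "NG", "NN"])` gives `"ACTR"`, a four-base target in which the last position may be G or A.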
- TAL effector DNA binding is mechanistically less well understood than that of zinc-finger proteins, but their seemingly simpler code could prove very beneficial for engineered-nuclease design.
- TALENs also cleave as dimers, have relatively long target sequences (the shortest reported so far binds 13 nucleotides per monomer) and appear to have less stringent requirements than ZFNs for the length of the spacer between binding sites.
- Monomeric and dimeric TALENs can include more than 10, more than 14, more than 20, or more than 24 repeats.
- Replication protein A is a heterotrimeric, single-stranded DNA-binding protein.
- RPA is conserved in all eukaryotes and is essential for DNA replication, DNA repair, and recombination.
- RPA also plays a role in coordinating DNA metabolism and the cellular response to DNA damage.
- the three cDNAs encoding the subunits of human replication protein A (70, 32, and 14 kDa) have been expressed individually and in combination in Escherichia coli (Henricksen, et al., J Biol Chem. 269(15):11121-32), the methods of which are incorporated herein by reference.
- RPA has high affinity for ssDNA. It has three subunits, RPA1, RPA2, and RPA3, that can form a heterotrimer. Data in the present application show that introducing a single subunit of RPA protein into cells subjected to CRISPR/Cas gene editing can dramatically reduce large deletion frequency (Figure 3C, 3D).
- the disclosed methods include expressing one or more subunits of RPA, i.e., RPA1, RPA2, and/or RPA3 in a cell.
- RPA1, RPA2, and/or RPA3 can be from a source such as a mammal, for example, a human, and the nucleic acid source can be selected to correspond with the organism whose genes are being edited.
- RPA1 Protein - 616 aa (Uniprot Accession ID: P27694) (SEQ ID NO:1) is shown below.
- RPA2 Protein - 270 aa (Uniprot Accession ID: P15927) (SEQ ID NO:2) MWNSGFESYGSSSYGGAGGYTQSPGGFGSPAPSQAEKKSRARAQHIVPCTISQLLSATLVDEVFRIGNVEIS QVTIVGIIRHAEKAPTNIVYKIDDMTAAPMDVRQWVDTDDTSSENTVVPPETYVKVAGHLRSFQNKKSLV AFKIMPLEDMNEFTTHILEVINAHMVLSKANSQPSAGRAPISNPGMSEAGNFGGNSFMPANGLTVAQNQVL NLIKACPRPEGLNFQDLKNQLKHMSVSSIKQAVDFLSNEGHIYSTVDDDHFKSTDAE;
- RPA3 Protein - 121 aa (Uniprot Accession ID: P35244) (SEQ ID NO:3) MVDMMDLPRSRINAGMLAQFIDKPVCFVGRLEKIHPTGKMFILSDGEGKNGTIELMEPLDEEISGIVEVVGR VTAKATILCTSYVQFKEDSHPFDLGLYNEAVKIIHDFPQFYPLGIVQHD.
- compositions include nucleic acids encoding RPA1, RPA2, and/or RPA3, preferably, in a vector for delivery and expression in cells, for example, mammalian cells.
- Plasmids containing genes encoding human RPA 1, 2 and 3 are commercially available, for example, Addgene, Cat # 46948.
- the nucleic acid molecule is a messenger RNA (mRNA).
- the term “messenger RNA” (mRNA) refers to any polynucleotide which encodes a polypeptide of interest and which is capable of being translated to produce the encoded polypeptide of interest in vitro, in vivo, in situ or ex vivo.
- Nucleic acids in vectors can be operably linked to one or more expression control sequences.
- the control sequence can be incorporated into a genetic construct so that expression control sequences effectively control expression of a coding sequence of interest.
- expression control sequences include promoters, enhancers, and transcription terminating regions.
- a promoter is an expression control sequence composed of a region of a DNA molecule, typically within 100 nucleotides upstream of the point at which transcription starts (generally near the initiation site for RNA polymerase II). To bring a coding sequence under the control of a promoter, it is necessary to position the translation initiation site of the translational reading frame of the polypeptide between one and about fifty nucleotides downstream of the promoter.
- Enhancers provide expression specificity in terms of time, location, and level. Unlike promoters, enhancers can function when located at various distances from the transcription site. An enhancer also can be located downstream from the transcription initiation site.
- a coding sequence is “operably linked” and “under the control” of expression control sequences in a cell when RNA polymerase is able to transcribe the coding sequence into mRNA, which then can be translated into the protein encoded by the coding sequence.
- Suitable expression vectors include, without limitation, plasmids and viral vectors derived from, for example, bacteriophage, baculoviruses, tobacco mosaic virus, herpes viruses, cytomegalovirus, retroviruses, vaccinia viruses, adenoviruses, and adeno-associated viruses.
- Numerous vectors and expression systems are commercially available from such corporations as Novagen (Madison, WI), Clontech (Palo Alto, CA), Stratagene (La Jolla, CA), and Invitrogen Life Technologies (Carlsbad, CA). Recent transfection studies have investigated minicircle DNA (mcDNA), nucleic acids that are derived from pDNA by recombination that removes bacterial sequences.
- the vectors including the nucleic acid of interest can be administered to subjects in need thereof resulting in transfection or transformation of the cells in the subject which in turn express the protein/peptide encoded by the nucleic acid.
- the disclosed methods employ fluorophore-modified nucleic acids used in gene editing, including, for example, fluorophore-labelled dODN and/or sgRNA.
- a polynucleotide including a donor sequence to be inserted is also provided to the cell.
- by a “donor sequence” or “donor polynucleotide” or “donor oligonucleotide” is meant a nucleic acid sequence to be inserted at the cleavage site (referred to collectively herein as “donor oligonucleotide” or “dON”).
- the donor polynucleotide typically contains sufficient homology to a genomic sequence at the cleavage site, e.g., 70%, 80%, 85%, 90%, 95%, or 100% homology with the nucleotide sequences flanking the cleavage site, e.g., within about 50 bases or less of the cleavage site, e.g., within about 30 bases, within about 15 bases, within about 10 bases, within about 5 bases, or immediately flanking the cleavage site, to support homology-directed repair between it and the genomic sequence to which it bears homology.
- the donor sequence is typically not identical to the genomic sequence that it replaces.
- the donor sequence may contain at least one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology-directed repair.
- the donor sequence includes a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region.
- Donor sequences can also include a vector backbone containing sequences that are not homologous to the DNA region of interest and that are not intended for insertion into the DNA region of interest.
- the homologous region(s) of a donor sequence will have at least 50% sequence identity to a genomic sequence with which recombination is desired. In certain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
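The sequence identity between a donor homology region and the genomic sequence can be computed directly once the two are aligned. The sketch below assumes the simplest case, two pre-aligned, ungapped sequences of equal length; the function names and the 50% default (taken from the minimum stated above) are illustrative assumptions, and a real workflow would typically use an alignment tool first.

```python
def percent_identity(donor_arm: str, genomic: str) -> float:
    """Percent of positions that match between two equal-length, ungapped sequences."""
    donor_arm, genomic = donor_arm.upper(), genomic.upper()
    if len(donor_arm) != len(genomic) or not donor_arm:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(donor_arm, genomic))
    return 100.0 * matches / len(donor_arm)


def meets_identity(donor_arm: str, genomic: str, minimum: float = 50.0) -> bool:
    """True if the homology arm reaches the minimum percent identity."""
    return percent_identity(donor_arm, genomic) >= minimum
```

A 10-base arm with one mismatch, for example, scores 90% identity, which would satisfy a 90% requirement but not a 95% one.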
- the donor sequence can include certain sequence differences as compared to the genomic sequence, e.g., restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes etc.), etc., which can be used to assess for successful insertion of the donor sequence at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus).
- sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.
- the donor sequence can be a single-stranded DNA, single-stranded RNA, double-stranded DNA, or double-stranded RNA. It can be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence can be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3' terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang et al. Proc. Natl. Acad. Sci. USA 84:4959-4963 (1987); Nehls et al.
- Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues.
- additional lengths of sequence can be included outside of the regions of homology that can be degraded without impacting recombination.
- a donor sequence can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance.
- the genome editing composition includes a modified donor oligonucleotide used in CRISPR/Cas-mediated HDR
- the ssODN or sgRNA is labelled with a fluorophore at its 5’ and/or 3’ end by a covalent bond between the ssODN/sgRNA and the fluorophore.
- fluorophore modified nucleic acid is a 5’ fluorophore-modified ssODN.
- fluorophore modified nucleic acid is a 5’ and 3’ fluorophore-modified ssODN.
- fluorophore modified nucleic acid is a 5’ fluorophore-modified sgRNA.
- fluorophore modified nucleic acid is a 5’ and 3’ fluorophore-modified sgRNA.
- the nucleic acids are labelled with the fluorophore via covalent binding with/without a linker separating the nucleic acid and fluorophore.
- the ssODN or sgRNA is covalently linked to the fluorophore, preferably at its 5’ end.
- Exemplary fluorophore molecules include but are not limited to cyanine (Cy), 1, Cy2, Cy5, Cy5.5, Cy3, Cy3.5, Cy7, Cy1.5, etc. (reviewed in Yuan, et al. Chem. Soc. Rev., 2025, 54, 341-366).
- One embodiment to reduce large deletions includes inhibiting the activity of PolQ, the main player of the MMEJ repair pathway, preferably by treating cells with a PolQ inhibitor such as novobiocin (NVB), for example about 24 hours before and after electroporation.
- Another embodiment includes delivering recombinant replication protein A (RPA) proteins to cells subjected to CRISPR/Cas gene editing to avoid annealing of single-stranded DNA resected after CRISPR-induced DSBs.
- the RPA proteins can prevent the donor ssDNA from degradation in the cells and can activate the HDR repair pathway, and the inhibition of MMEJ may switch the DNA repair pathway from MMEJ to HDR.
- the disclosed strategy can reduce CRISPR-induced on-target large deletions by 50% and can increase HDR efficiency two-fold.
- the methods are also based in some forms on the discovery that modifying nucleic acids involved in CRISPR/Cas gene editing can dramatically improve CRISPR/Cas gene editing efficiency.
- Double strand breaks can be repaired by the cell in one of two ways: non-homologous end joining and homology-directed repair (HDR).
- HDR homology-directed repair
- a donor polynucleotide with homology to the cleaved target DNA sequence is used as a template for repair of the cleaved target DNA sequence, resulting in the transfer of genetic information from a donor polynucleotide to the target DNA.
- new nucleic acid material can be inserted/copied into the site.
- the modifications of the target DNA due to HDR repair can be used to induce gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc.
- HDR genome editing compositions include a donor polynucleotide sequence that includes at least a segment with homology to the target DNA sequence
- the methods can be used to add, i.e., insert or replace, nucleic acid material to a target DNA sequence (e.g., to “knock in” a nucleic acid that encodes for a protein, an siRNA, an miRNA, etc.), to add a tag (e.g., 6xHis, a fluorescent protein (e.g., a green fluorescent protein; a yellow fluorescent protein, etc.), hemagglutinin (HA), FLAG, etc.), to add a regulatory sequence to a gene (e.g., promoter, polyadenylation signal, internal ribosome entry sequence (IRES), 2A peptide, start codon, stop codon, splice signal, localization signal, etc.), to modify a nucleic acid sequence (e.g., introduce a mutation), and the like.
- the compositions can be used to
- compositions and methods can be used to improve the HDR of the point mutation of the HBB gene in sickle cell disease and reduce the risk of CRISPR-induced large deletions in genome edited hematopoietic stem cells.
- This strategy is easy to handle by pretreating the cells with NVB molecule and delivering RPA protein together with Cas9/sgRNA and donor ssDNA.
- The H1 hESC line was purchased from WiCell Institute. The H1-iCas9 ESC line was a gift from Danwei Huangfu’s lab. The wild-type iPSC line was reprogrammed and well characterized in previous studies [44, 55, 56]. The study was approved by the KAUST Institutional Biosafety and Bioethics Committee (IBEC). All hPSCs were cultured in Essential 8 medium (ThermoFisher, Cat# A1517001) in rhLaminin-521 (ThermoFisher, Cat# A29249) coated wells with daily medium change.
- Essential 8 medium ThermoFisher, Cat# A1517001
- rhLaminin-521 ThermoFisher, Cat# A29249
- peripheral blood mononuclear cells were isolated from the whole blood of a healthy donor via a standard Ficoll-Paque based protocol and further cultured in StemSpan™-ACF Erythroid Expansion medium (STEMCELL Technologies, Cat# 09860) for 13 days with medium change every 3 days to expand the erythroid progenitors.
- the erythroid progenitors were analyzed by FACS before CRISPR-Cas9 editing.
- Oligonucleotides containing the gRNA sequence were annealed and then cloned into a lentiGuide-puro plasmid (Addgene Cat# 52963) following the published protocol [57].
- the full-length RPA1, RPA2, and RPA3 open reading frames (ORFs) were cloned from cDNA of H1 ESCs, and the GFP ORF was cloned from pInducer21 (Addgene, Cat# 46948). Subsequently, the ORFs of RPA and GFP were inserted into pInducer21 using the Gateway cloning method. The sequences were confirmed by Sanger sequencing.
- the gRNA lentiGuide-puro, newly constructed vectors, and pEGIP*35 were packaged into lentivirus individually. Briefly, the plasmid was premixed with packaging vectors, then transfected into HEK293T cells using Lipofectamine 3000. The lentivirus was harvested twice, at 48 hours and 72 hours. The lentivirus was concentrated with PEG-it Virus Precipitation Solution (System Biosciences) and stored in a -80 °C freezer.
- siRNA transfection
- the esiRNA transfection protocol was adapted from the instructions for Lipofectamine RNAiMAX reagent (ThermoFisher, Cat# 13778150). H1-iCas9 cells were harvested after 1 hour of 10 µM Y-27632 (Abcam, Cat# ab120129) treatment.
- the esiRNA/RNAiMAX solution was prepared for 3 wells per siRNA in a 12-well plate format as follows: Mix 1 was prepared by adding 13.5 µl RNAiMAX reagent into 225 µl Opti-MEM and vortexing for a few seconds.
- Mix 2 was prepared by adding 90 pmol esiRNA into 225 µl Opti-MEM and pipetting a few times. The esiRNA/RNAiMAX solution was made by adding Mix 2 into Mix 1 and incubating for
- RNA was extracted using an RNeasy Mini kit (Qiagen, Cat# 74106) and reverse transcribed to cDNA using iScript Reverse Transcription Supermix (BioRad, Cat# 1708840).
- the qPCR was performed on a CFX384 real-time PCR detection system (BioRad) using SsoAdvanced Universal SYBR Green Supermix (BioRad, Cat# 725270).
- the qPCR primers are shown in Table 1.
- the genomic DNA was extracted after 3 days post-electroporation using a DNeasy Blood
- the ddPCR was performed on a Bio-Rad QX200 system using ddPCR Supermix for Probes (No dUTP) (BioRad, Cat #1863024) following the manufacturer’s protocols.
- the 20x assay mix was comprised of 18 µM each primer and 5 µM each probe.
- One reaction contained 5 ng genomic DNA, 1x assay mix, and 1x ddPCR Supermix. The probes and oligos are shown in Table 1.
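As a worked check of the assay-mix arithmetic above, the sketch below computes the per-reaction primer and probe concentrations obtained when a 20x mix is used at 1x. It assumes the stock concentrations are micromolar (typical for probe-based ddPCR assays); the function name is illustrative and not part of any vendor software.

```python
def final_concentration_nm(stock_um: float, stock_fold: float = 20.0,
                           final_fold: float = 1.0) -> float:
    """Return the final concentration in nM after diluting a concentrated mix.

    stock_um   -- concentration in the concentrated mix, in micromolar (assumed unit)
    stock_fold -- concentration factor of the mix (20 for a 20x mix)
    final_fold -- factor at which the mix is used in the reaction (1 for 1x)
    """
    if stock_fold <= 0 or final_fold <= 0:
        raise ValueError("fold factors must be positive")
    # Convert micromolar to nanomolar, then scale by the dilution ratio.
    return stock_um * 1000.0 * final_fold / stock_fold


if __name__ == "__main__":
    print("primer (nM):", final_concentration_nm(18.0))
    print("probe (nM):", final_concentration_nm(5.0))
```

Under these assumptions, each primer is at 900 nM and each probe at 250 nM in the final reaction.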
- the gRNA lentivirus-infected H1-iCas9 cells were treated with 2 µg/ml doxycycline for 2 days to induce Cas9 expression for gene editing. After the doxycycline treatment for 10 days, the cells were harvested, washed twice with PBS buffer containing 3% BSA, and filtered through a 70 µm strainer. For each sample, 100,000 cells were stained with 2 µl FLAER Alexa488 (Cedarlane, Cat# NC9870611) in 100 µl PBS buffer containing 3% BSA for 15 min at room temperature. The stained cells were washed once with PBS buffer containing 3% BSA and load .
- CD9 gene-edited sample cells were harvested and washed once in FACS buffer with 2% FBS. Subsequently, 10,000 cells were stained in 100 µl FACS buffer with 2% FBS and 1 µl PE anti-CD9 (BioLegend, Cat# 312106) for 30 min at 4 °C, followed by two washes with FACS buffer containing 2% FBS.
- For GFPmut correction samples, the cells were harvested 3 days post-electroporation and passed through a 70 µm strainer. The cells were resuspended in 200 µl FACS buffer containing 1 µg/ml DAPI and loaded onto a FACS Aria II cytometer for analysis.
- the cell cycle synchronization protocol was adapted from a previous publication [36].
- PIGA intr5_1 gRNA-positive H1-iCas9 ESCs were seeded at a density of 2 x 10^5 cells per well in a 12-well plate.
- a 16-hour treatment with 100 ng/ml nocodazole (Abcam, Cat# ab120630) was administered.
- the cells were washed twice with prewarmed 1x PBS and then cultured in fresh E8 medium for 4 hours or 12 hours to release them into the G1 or S phase, respectively.
- the synchronized cells were treated with 2 µg/ml doxycycline to induce Cas9 expression and genome editing, followed by 10 days of culture for LD analysis.
- Cell cycle analysis was performed using a standard protocol. Initially, cell pellets were fixed by adding cold 70% ethanol dropwise while vortexing and then incubated overnight at -20 °C. Subsequently, the cells were washed twice and resuspended in FACS buffer containing 200 µg/ml RNase. After a 20-minute incubation at room temperature, the cells were washed once and resuspended in FACS buffer containing 1 µg/ml propidium iodide (PI). Following a 10-minute incubation at room temperature, the samples were ready for FACS analysis.
- PI Propidium Iodide
- WT and Hifi-Cas9 were purchased from IDT.
- the gRNAs used in this study were designed using Benchling (https://www.benchling.com/crispr) and their sequences were shown in Table 1.
- the gRNAs were obtained either through in vitro transcription with the MEGAshortscript™ T7 Transcription Kit (ThermoFisher, Cat# AM1354) or ordered from IDT as Alt-R crRNAs or sgRNAs.
- 50 pmol of Alt-R gRNA and 50 pmol of Cas9 were mixed and incubated at room temperature for 10 min to form ribonucleoprotein (RNP).
- RNP ribonucleoprotein
- Buffer R (from the Neon system kit) was added to the RNP to make a 10 µl final volume. 200,000 single cells were electroporated using the Neon system (ThermoFisher) with the setting of 1600 V, 10 ms width and 3 pulses. For the HDR study, 30 pmol ssODN was mixed with 50 pmol RNP before the electroporation. The cells were seeded in one well of a 24-well plate immediately after electroporation.
- the genomic DNA of edited cells was extracted using a Blood & Tissue Kit (Qiagen, Cat# 69506).
- the UMI labeling was performed following the published protocol [7]. Briefly, the target locus was labeled by one-cycle PCR using a UMI primer (Table 1) in a 25 µl reaction including 50 ng genomic DNA, 1 µM UMI primer (containing a universal forward primer sequence, a 10-nt UMI barcode, and a target locus forward primer sequence; see Supplementary Table 1), and 12.5 µl 2X Platinum SuperFi PCR Master Mix (ThermoFisher, Cat# 12358010), following the program: initial denaturation at 98 °C for 70 s, gradient annealing from 70 °C to 65 °C with 1 °C/5 s ramp rate, extension at 72 °C for 7 min, and hold at 4 °C.
- the UMI-labeled DNA was purified with 0.8x AMPure XP beads, then mixed with a universal forward primer, a target locus reverse primer (Supplementary Table 1), and PrimeSTAR GXL DNA polymerase (Takara, Cat# R050A), and amplified following the program: initial denaturation at 95 °C for 2 min; 30 cycles of 98 °C for 10 s and 68 °C for 7 min; 68 °C for 5 min; and hold at 4 °C.
- the amplicons were purified with AMPure XP beads and used for PacBio or Nanopore library preparation.
- the library preparation was done using the ligation sequencing kit (Oxford Nanopore Technologies, Cat# SQK-LSK109) following its standard protocol.
- the Nanopore sequencing was performed on an Oxford Nanopore MinION sequencer using R9.4.1 flow cells. The reads were base called using Guppy basecaller (v5.0.7).
- Library preparations of PacBio sequencing were performed with the Sequel Sequencing Kit 3.0 and loaded on the PacBio Sequel instrument with SMRT Cell 1M v3 LR Tray.
- PacBio official tool termed ccs (v3.4.1) was used to generate HiFi Reads. All procedures were performed according to the manufacturer’s protocols.
- Data analysis was performed using VAULT as described previously [7]. In brief, the UMI primer sequence, fastq file and reference amplicon sequence were provided to the algorithm. VAULT extracts mappable reads and then extracts UMI sequences from the reads. Reads are then grouped based on their UMI sequences and used for parallel analysis of SNVs and SVs. The “vault summarize” command was used to generate the analysis summary.
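The UMI grouping step described above can be illustrated with a minimal sketch. This is not VAULT's implementation: it assumes exact matching of a universal primer sequence immediately upstream of a 10-nt UMI, and the function names and example sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def extract_umi(read: str, universal_primer: str, umi_length: int = 10) -> Optional[str]:
    """Return the UMI that directly follows the universal primer, or None.

    Assumes an exact primer match; real long-read data would need
    error-tolerant matching, which is omitted here.
    """
    idx = read.find(universal_primer)
    if idx == -1:
        return None
    start = idx + len(universal_primer)
    umi = read[start:start + umi_length]
    # Discard reads that end before a full-length UMI is available.
    return umi if len(umi) == umi_length else None


def group_reads_by_umi(reads: List[str], universal_primer: str,
                       umi_length: int = 10) -> Dict[str, List[str]]:
    """Group reads so that each group represents one original DNA molecule."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        umi = extract_umi(read, universal_primer, umi_length)
        if umi is not None:
            groups[umi].append(read)
    return dict(groups)


if __name__ == "__main__":
    primer = "TTGACC"  # hypothetical universal primer
    reads = [
        primer + "AAAAACCCCC" + "GATTACA",
        "GG" + primer + "AAAAACCCCC" + "GATTTACA",
        primer + "GGGGGTTTTT" + "GATTACA",
    ]
    for umi, members in group_reads_by_umi(reads, primer).items():
        print(umi, len(members))
```

Each resulting group can then be analyzed independently for SNVs and SVs, so that PCR duplicates of one molecule are counted once.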
- CRISPR-Cas9 induced LDs in human pluripotent stem cells contain microhomology at breakpoint junction
- CRISPR-Cas9 can efficiently cut the target DNA to promote gene knockout through the formation of small indels or precise installation of DNA sequence changes through homology-directed repair. However, it also causes unintended LDs and structural variations (SVs) up to megabase or even chromosome scale loss [6-8, 25, 26].
- SVs structural variations
- CRISPR-Cas9 induced LD remains unclear. Sequencing data of 329 CRISPR-Cas9 edited alleles from two published studies were collated [5, 18]. An unusually high frequency of microhomologies (MHs) at LD breakpoint junctions was identified. For example, MHs > 2 bp were present in more than 70% of the LD alleles (Fig. 1A).
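To make the notion of microhomology at a breakpoint junction concrete, the following sketch measures MH length for a deletion on a reference sequence. It is an illustrative method, not the analysis used in the cited studies: it counts how far the deletion can be shifted left or right without changing the resulting allele, and the function name, coordinates convention, and sequences are assumptions.

```python
def microhomology_length(reference: str, del_start: int, del_end: int) -> int:
    """Length of microhomology at a deletion junction.

    The deletion removes reference[del_start:del_end] (0-based, end-exclusive).
    A deletion flanked by microhomology can be shifted along the reference
    without changing the product; the total shift distance equals the MH length.
    """
    if not (0 <= del_start < del_end <= len(reference)):
        raise ValueError("invalid deletion coordinates")
    ref = reference.upper()

    # Bases at the start of the deleted segment that match the bases
    # immediately downstream of the deletion (possible right shift).
    right = 0
    while del_end + right < len(ref) and ref[del_start + right] == ref[del_end + right]:
        right += 1

    # Bases immediately upstream of the deletion that match the bases
    # at the end of the deleted segment (possible left shift).
    left = 0
    while del_start - 1 - left >= 0 and ref[del_start - 1 - left] == ref[del_end - 1 - left]:
        left += 1

    return left + right


if __name__ == "__main__":
    # Hypothetical locus: two copies of "ACGT" flank a 5-bp spacer.
    ref = "CCA" + "ACGT" + "TTTTT" + "ACGT" + "GGA"
    print(microhomology_length(ref, 3, 12))
```

The same junction gives the same MH length regardless of which equivalent coordinates the aligner reports, which is why both shift directions are summed. Long tandem repeats can yield values larger than the deletion itself; a production analysis would likely cap or treat these separately.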
- Human pluripotent stem cell (hPSC) lines (20 clones) edited in-house by CRISPR-Cas9 in the SH2B3 and H1.3 genes were also examined, and five of them were found to harbor LD alleles, four of which contained MHs (Fig. 1B, Fig. 5A).
- the PIGA and CD9 intronic regions, respectively, in H1 hESCs were edited using Cas9/gRNA ribonucleoprotein (RNP) complex (Fig. 1C and Fig. 5B).
- RNP Cas9/gRNA ribonucleoprotein
- the distance between the intronic gRNA and the nearest exons is more than 200 bp. Therefore, the edited cells that lose cell surface expression of CD9 or PIGA, as monitored by fluorescence-activated cell sorting (FACS), are considered to contain LDs that extend at least from the CRISPR-Cas9 cleavage site to the nearest exon.
- FACS fluorescence-activated cell sorting
- the MMEJ repair pathway plays an important role in mediating CRISPR-induced LDs.
- RPA and POLQ regulate LD formation, but not PARP1 and LIG3
- the LD frequency was first investigated using sgRNAs targeting different intronic regions of the X-linked PIGA gene, an established model [3, 6, 14] for the study of CRISPR-Cas9 editing outcomes (sgRNA positions are shown in Fig. 5E and Fig. 7A).
- Thirteen intronic sgRNAs targeting PIGA and seven intronic sgRNAs targeting CD9 were individually expressed in H1-iCas9 ESCs using a constitutive lentiviral vector. Upon doxycycline induction, these sgRNAs guided Cas9 to generate DSBs located 126-489 bp from the nearest exon (Fig. 5E and 7A).
- LDs were specifically caused by Cas9-induced DSBs. LD frequency could not be predicted based solely on the orientation of the sgRNA (targeting the + or - strand) or the distance between the sgRNA and the nearest exon (Fig. 5E and 5F), suggesting a dependency on the sequence context.
- CRISPR/Cas9 mediated genome editing has been associated with potential induction of various severe chromosome structural abnormalities, such as chromosome loss [26, 28], truncation [29-31], and translocation [32-34].
- chromosome loss [26, 28]
- truncation [29-31]
- translocation [32-34]
- LD frequency could be controlled by modulating the activity of the MMEJ pathway.
- Four key players of the MMEJ pathway, PARP1, LIG3, RPA (including RPA1, RPA2, and RPA3), and POLQ, were knocked down in H1-iCas9 cells expressing the PIGA intr5_1 sgRNA (Figs. 2A-2C and 6A-6C), and Cas9 expression was induced 24 hours later.
- LD frequency was monitored by FACS analysis of FLAER staining as described in the preceding paragraph (Figs. 2B, 2C, and 6A).
- IDMseq [7] of the PIGA locus was performed. Briefly, individual genomic DNA molecules flanking the Cas9 cut site were labelled with unique molecular identifiers (UMIs) and amplified for long-read PacBio sequencing (Fig. 2E). In the subsequent sequencing data analyses, deletions > 30 bp were referred to as LDs. IDMseq showed that the vast majority of SVs detected in Cas9-edited cells were LDs.
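The LD definition above (deletions larger than 30 bp) translates directly into a frequency calculation. The sketch below assumes that each original molecule (UMI group) has already been reduced to a single deletion size in bp, with 0 meaning no deletion; the function name and input format are illustrative assumptions.

```python
from typing import Iterable


def ld_frequency_percent(deletion_sizes_bp: Iterable[int], threshold_bp: int = 30) -> float:
    """Percentage of molecules whose deletion exceeds the LD threshold.

    deletion_sizes_bp -- one deletion size per original molecule (assumed input)
    threshold_bp      -- deletions strictly larger than this are counted as LDs
    """
    sizes = list(deletion_sizes_bp)
    if not sizes:
        raise ValueError("no molecules provided")
    n_ld = sum(1 for size in sizes if size > threshold_bp)
    return 100.0 * n_ld / len(sizes)


if __name__ == "__main__":
    # Hypothetical per-molecule deletion sizes.
    print(ld_frequency_percent([0, 5, 30, 31, 500]))
```

Note that a 30-bp deletion is not counted, because the definition uses a strict inequality.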
- the baseline LD frequency of the control siRNA per IDMseq was higher than that estimated by FACS, which is expected, because FLAER neg% underestimates LD frequency as discussed above, and because LDs of 30-278 bp in size (i.e., noncoding deletions) are only detectable by IDMseq (Fig. 5B and 5C).
- the LD length spectrum exhibits striking similarities across all groups (Fig. 7K), implying that the LD size remains unaffected by the MMEJ deficiency. Consistent with the FACS analysis, the IDMseq results showed that knocking down POLQ decreased LD frequency and RPA knockdown increased LD frequency (Fig. 2F).
- LDs induced by CRISPR-Cas9 can be controlled by modulating POLQ and RPA
- POLQ is an error-prone polymerase and is often upregulated in numerous cancers [23-27].
- the antibiotic novobiocin (NVB) has recently been identified as a specific inhibitor of POLQ.
- NVB inhibits the ATPase activity of POLQ through direct binding to the ATPase domain and thus phenocopies POLQ depletion and impairs MMEJ DNA repair in human cells [39].
- NVB was used to test if targeting a specific MMEJ-related activity of POLQ could recapitulate the reduction of LD frequency by knocking down POLQ level globally.
- NVB was introduced to the cells during induction of Cas9 expression by doxycycline in the PIGA intr5_1 sgRNA-positive H1-iCas9 ESCs.
- NVB decreased LD frequency by up to 50% in a dose-dependent manner (Fig. 3A).
- NVB showed no discernible effect on the pluripotency of treated hESCs (Fig. 6I).
- High concentrations of NVB (50 µM) showed signs of cytotoxicity, while lower concentrations were well tolerated by the cells (Fig. 2J).
- ART558 treatment significantly decreased the frequency of LDs by up to 61.78% in a dose-dependent manner (Fig. 3A and Fig. 7C).
- The three RPA subunits (RPA1, RPA2, RPA3) and GFP (as a control) were individually cloned into the inducible lentiviral expression vector pInducer21, which expresses GFP constitutively (Fig. 3B).
- Successfully transduced cells were sorted based on GFP positivity and transgene expression was induced by doxycycline.
- the expression level of the transgenic RPA proteins increased 6-fold to 26-fold after doxycycline induction without affecting the expression of other RPA subunits (Fig. 3B).
- Such levels of overexpression of all three RPA proteins resulted in significant reductions in LD frequency as detected by FACS (Fig. 3C, Fig. 7C).
- An hPSC line containing a mutant GFP transgene that can be rescued to express wild-type GFP through HDR mediated by CRISPR-Cas9 was established (Fig. 4A).
- We treated the cells with 25 µM NVB for 24 hours before and after electroporation of the Cas9/sgRNA RNP and an ssODN donor and observed a significant increase in HDR efficiency compared to the control (Fig. 4B, Fig. 8A-8C).
- LIG3 is a predominant ligase of the MMEJ pathway that seals the nicks in DNA
- its function could be replaced by other ligases, such as LIG1.
- Knocking down or inhibiting POLQ caused a significant reduction of LDs, which suggested limited functional redundancy between POLQ and other DNA polymerases and reaffirmed the central role of MMEJ in Cas9-induced LD.
- RPA is involved in DNA replication and repair. We discovered that RPA deficiency led to more frequent LDs induced by CRISPR-Cas9, potentially because RPA prevents the annealing of resected ssDNA at MHs.
- HDR homology-directed repair
- All oligos, WT-Cas9 (Alt-R™ S.p. Cas9 Nuclease V3) and GFP-Cas9 (Alt-R™ S.p. Cas9-GFP V3) were purchased from Integrated DNA Technologies (IDT). Details about oligos including ssODNs (with or without modification), gRNAs, primers and probes used in this study are presented in Supplementary Table 2.
- Table 2. Oligonucleotides. The pInducer20-Cas9 (iCas9) plasmid was reconstructed by inserting the Cas9 sequence into a pInducer20 vector (Addgene, Cat#44012) using the Gateway cloning method. Lenti-NHEl-TRE-EF1a-rTta (control) and Lenti-NHEl-TRE-EF1a-H1.0 (OE H1.0) were reconstructed using an In-Fusion cloning kit (Takara, Cat#638948). Information regarding antibodies is provided in Table 3.
- GFP-mutant pEGIP*35 plasmid, Addgene, cat#26776
- iPSC and SC9N (mutant HBB) -iPSC lines were generated and well characterized in previous studies (7- 3).
- GLP1R-mutant iPSC clone H4 was kindly provided by Professor Antonio Adamo. All iPSCs were cultured in Essential 8 medium (ThermoFisher, cat#A1517001) in rhLaminin-521 (ThermoFisher, cat#A29249) coated wells with daily medium change.
- Naive iPSCs were cultured in PXGL medium, consisting of N2B27 basal medium supplemented with 1 µM PD0325901 (PD) (STEMCELL Technologies, cat#A10256), 2 µM XAV-939 (VWR, cat# ALEXBML-WN100-0005), 2 µM Go6983 (Selleck, cat#S2911), and 10 ng/mL LIF (Cell Signaling Technology, cat#62226S).
- the N2B27 medium was a 1:1 mixture of DMEM/F12 (ThermoFisher, cat#11330032) and Neurobasal (ThermoFisher, cat#21103049) supplemented with 1x N2 supplement (Gibco, cat#17502-048), 1x B27 supplement without Vitamin A (Life Technologies, cat#12587010), 2 mM GlutaMAX (ThermoFisher, cat#5050-061), 10 mM HEPES (ThermoFisher, cat#15630080), 0.055 mM 2-mercaptoethanol (ThermoFisher, cat#21-985-023), MEM NEAA (ThermoFisher, cat#11140050), and 1% penicillin-streptomycin (ThermoFisher, cat#15140-122).
- GFP-mutant naive iPSC was epigenetically reset using the previously established protocol (5).
- Primed state iPSCs were transferred onto irradiated mouse embryonic fibroblasts (iMEFs) and treated with 1 µM PD, 10 ng/mL LIF, and 1 mM valproic acid sodium salt (Merck, cat#P4543) for 3 days. Subsequently, cells were transferred to PXGL medium. By day 10, naive dome-shaped colonies emerged and were purified through FACS sorting for SUSD2 (Biolegend, cat#327406) or through several passages.
- the pInducer20-Cas9, Lenti-NHEl-TRE-EF1a-rTta (control) and Lenti-NHEl-TRE-EF1a-H1.0 (OE H1.0) constructs were packaged into lentivirus individually. Briefly, the plasmid was co-transfected with psPAX2 and pMD2.G following the TransIT-293 Transfection Reagent (Mirus Bio, cat#MIR2704) standard protocol. The lentivirus was harvested after 48 hr and concentrated with PEG-it Virus Precipitation Solution (System Biosciences, cat#LV810A-1).
- GFP-mutant iPSCs were transduced with iCas9 lentivirus and 1 µg/ml polybrene (Merck, cat#TR-1003-G) for 24 h, then 1-8 µg/ml G418 (Invitrogen, cat#108321-42-2) was added to the refreshed medium. After a 14-day culture with G418, the cells were expanded as a GFP-mutant-iCas9 line. Similarly, the GFP-mutant OE H1.0/control line was established with the same protocol using 2-10 µg/ml blasticidin for positive selection (ThermoFisher, cat#A1113903).
- CRISPR-Cas9 Genome Editing with Different Cell Lines
- RNP ribonucleoprotein
- a total volume of 10 µl including 30 pmol ssODN, 50 pmol RNP and buffer R was prepared for electroporation.
- a total of 200,000 single cells were electroporated using the Neon Transfection System with the setting of 1600 V, 10 ms width and 3 pulses. The cells were seeded in one well of a 24-well plate immediately after electroporation, and their dynamic changes were monitored using the Incucyte S3 system.
- hHOs Human heart organoids
- GFP-mutant-iCas9 iPSCs were suspended in Essential 8 medium supplemented with 10 µM Rho kinase (ROCK) inhibitor Y-27632 (Abcam, cat#ab120129) and seeded at 10,000 cells/well in round bottom ultra-low 96-well plates (CELLSTAR, cat#650970) on day -2 at a volume of 100 µl per well. The plate was then centrifuged at 100 g for 3 min and placed in an incubator at 37 °C, 5% CO2.
- ROCK Rho kinase
- B27-I containing Wnt-C59 (Abcam, cat#ab142216) was added for a final concentration of 2 µM Wnt-C59 and the samples were incubated for 48 h.
- the medium was changed again on day 4 with fresh B27-I and day 6 with fresh RPMI1640/B-27 (ThermoFisher, cat#17504044).
- RPMI1640/B-27 ThermoFisher, cat#17504044
- GFP-mutant hHO correction was performed following an adapted Lipofectamine RNAiMAX reagent (ThermoFisher, cat#13778150) standard protocol. In brief, hHOs after day 20 were selected for GFP correction. Medium from each well was replaced with fresh RPMI 1640/B-27 medium containing 2 µg/ml doxycycline to induce Cas9 expression. After 48 h, the gRNA/ssODN/RNAiMAX solution was prepared for 5 hHOs per condition as follows: Mix 1 was prepared by adding 9 µl RNAiMAX reagent into 50 µl Opti-MEM (Gibco, cat#10149832) and mixing briefly.
- Mix 2 consisted of 150 pmol gRNA and 150 pmol ssODN in 50 µl Opti-MEM. The final solution was made by adding Mix 2 into Mix 1 and incubating for 5 min. Five hHOs were washed with PBS and then incubated with the final solution supplemented with 2 µg/ml doxycycline. After 30 min incubation, the hHOs and the final solution were separately added into one well of a round bottom ultra-low 96-well plate and cultured in a 37 °C, 5% CO2 incubator. The culture medium was changed every 48 h and the samples were collected after 96 h for analysis.
- Blastoids were generated according to a well-established PALLY protocol (7).
- Cells were suspended in N2B27 medium with 10 µM Y-27632 and seeded at a density of 75 cells per microwell in 400 µm AggreWell plates (STEMCELL Technologies, cat#34425).
- the medium was replaced with N2B27 supplemented with PALLY components (1 µM PD, 1 µM A83-01 (Axon, cat#909910-43-6), 5 µM lysophosphatidic acid (LPA) (Merck, cat#L7260), 10 ng/mL LIF, and 10 µM ROCK inhibitor Y-27632).
- LPA lysophosphatidic acid
- After two days, the medium was refreshed with N2B27 supplemented with 5 µM LPA and 10 µM ROCK inhibitor. The structures were matured for two more days.
- the blastoid correction was performed using the LONZA 4D-Nucleofector system (Lonza Bioscience). Briefly, 100 pmol gRNA and 100 pmol of WT-Cas9/GFP-Cas9 were mixed and incubated at room temperature for 10 min to form ribonucleoprotein (RNP). 60 pmol ssODN was mixed with 100 pmol RNP before the electroporation. Nucleofector Solution P3 (LONZA, cat#V4XP-3032) was added to the RNP to make a 20 µl final volume. Approximately 400 aggregates from day 0 were mixed with the prepared solution and loaded into a Nucleocuvette (LONZA, cat#V4XP-3032).
- the 4D-Nucleofector X Unit was employed, and transfections were carried out using the program “hES cell, H9”, pulse code CB150. Immediately after electroporation, blastoids were transferred to AggreWell plates following the standard PALLY protocol.
- the in vitro attachment assay was performed based on a previously established protocol. On day 4, well-formed blastoids were manually selected. An 8-well µ-Slide chamber (ibidi, cat# 80827) was pre-coated with fibronectin (Merck, cat#F0895, diluted 1:40 in cold PBS) for 30 minutes. About 10 blastoids were introduced into each chamber containing IVC1 medium and incubated at 37 °C with 5% CO2 and 5% O2.
- the IVC1 medium consisted of Advanced DMEM/F12 (ThermoFisher, cat#12634-010), 20% heat-inactivated FBS (ThermoFisher, cat#30044333), 2 mM GlutaMAX, 0.5% penicillin-streptomycin, 1% ITS-X (ThermoFisher, cat#51500-056), and 1% sodium pyruvate (Sigma-Aldrich, cat#S8636).
- Additional components included 8 nM β-estradiol (Sigma-Aldrich, cat#E8875), 200 ng/mL progesterone (Sigma-Aldrich, cat#P0130), 25 µM N-acetyl-L-cysteine (Sigma-Aldrich, cat#A7250), and ROCK inhibitor (Selleckchem, cat#S1049).
- IVC2 which mirrored the composition of IVC1, but substituted 20% FBS with 30% knockout serum replacement (ThermoFisher, cat#10828-028). Culturing continued for 48 hours, after which the media was collected for hCG testing with a commercial pregnancy kit and the structures underwent further immunofluorescence analysis.
- For GFP-mutant iPSC correction samples, the cells were harvested 3 days post-electroporation using TrypLE (Life Technologies, cat#12604-021) and passed through a 70 µm strainer. The cells were resuspended in 200 µl FACS buffer (1x Ca2+/Mg2+-free PBS, 2 mM EDTA) containing 1 µg/ml DAPI and loaded onto a BD FACSAria™ Fusion cytometer for analysis. GFP-mutant hHO correction samples were harvested 3 days post-electroporation using 0.25% Trypsin-EDTA (Gibco, cat#25200056) and filtered through a 70 µm strainer.
- Cell cycle analysis was conducted using a standard protocol. Initially, cell pellets were fixed by adding cold 70% ethanol dropwise while vortexing and then incubated overnight at -20 °C. Subsequently, the cells were washed twice and resuspended in FACS buffer containing 200 µg/ml RNase A. After a 20-minute incubation at room temperature, the cells were washed once and resuspended in FACS buffer containing 1 µg/ml propidium iodide (PI) staining solution (Tonbo Biosciences, cat#13-6990-T200). Following a 10-minute incubation at room temperature, the samples were ready for FACS analysis. Data were processed using the BD FlowJo software (10.8.1), employing the Watson model for cell cycle assessment.
- PI propidium iodide
- the genomic DNA was extracted three days post-electroporation using a DNeasy Blood & Tissue kit (Qiagen, cat#69506) and quantified with a Qubit instrument.
- the HBB/GLP1R-corrected iPSCs were harvested three days post-electroporation and subjected to genomic DNA extraction using a DNeasy Blood & Tissue kit (Qiagen, Cat #69506).
- the genomic DNA was quantified using a Qubit instrument.
- 10 HBB gene-edited blastoids were lysed in 20 µl STE buffer (10 mM Tris-HCl (pH 8.0), 1 mM EDTA, 100 mM NaCl) following the program: 55 °C for 180 min, 95 °C for 10 min, and hold at 4 °C.
- the ddPCR was performed on a Bio-Rad QX200 system using ddPCR Supermix for Probes (No dUTP) (Bio-Rad, cat#1863024) following the manufacturer’s protocols.
- the 20x assay mix was comprised of 18 µM each primer and 5 µM each probe.
- One reaction contained 5 ng genomic DNA or 3 µl blastoid STE lysate, 1x assay mix, and 1x ddPCR Supermix, and was transferred into a sample well of a DG8 Cartridge for droplet generation using a QX200 Droplet Generator following the manufacturer’s manual.
- the PCR was performed immediately following the droplet generation.
- HBB targets amplification followed the program: 95 °C for 10 min, then 94 °C for 30 s, 53 °C for 1 min for 40 cycles, 98 °C for 10 min, and hold at 4 °C.
- the genomic DNA of HBB-corrected iPSCs was extracted 3 days post-electroporation using a DNeasy Blood & Tissue kit and quantified with a Qubit instrument.
- the UMI labeling was performed following the published protocol (10). Briefly, the target locus was labeled by one-cycle PCR using a UMI primer (Supplementary Table 1) in a 25 µl reaction including 50 ng genomic DNA, 1 µM UMI primer (containing a universal forward primer sequence, a 10-nt UMI barcode, and a target locus forward primer sequence; see Supplementary Table 1), and 12.5 µl 2x Platinum SuperFi PCR Master Mix (ThermoFisher, cat#12358010), following the program: initial denaturation at 98 °C for 70 s, gradient annealing from 70 °C to 65 °C with 1 °C/5 s ramp rate, extension at 72 °C for 7 min, and hold at 4 °C.
- the UMI-labeled DNA was purified with 0.8x AMPure XP beads (Beckman Coulter, cat#A63881), then mixed with a universal forward primer, a target locus reverse primer (Supplementary Table 1), and Phusion Hot Start II High-Fidelity PCR Master Mix (ThermoFisher, Cat#F-566), and amplified following the program: initial denaturation at 98 °C for 30 s; 35 cycles of 98 °C for 10 s, 57 °C for 30 s, and 72 °C for 30 s; 68 °C for 5 min; and hold at 4 °C.
- the amplicons were purified by 0.8x AMPure XP beads and used for library preparation.
- the library preparation was done using the ligation sequencing kit (Oxford Nanopore Technologies, cat#SQK-NBD112.24) following its standard protocol.
- the Nanopore sequencing was performed on an Oxford Nanopore MinION sequencer using FLO-MIN112 flow cells.
- the reads were base called using Guppy basecaller (v5.0.7).
- Data analysis was performed using VAULT as described previously (ref). In brief, the UMI primer sequence, fastq file and reference amplicon sequence were provided to the algorithm. VAULT will extract mappable reads followed by identification of UMI sequences from reads. Reads will then be grouped by their UMI sequences and used for parallel analysis of SNVs and SVs.
- HDR events were calculated as the percentage of UMI groups (representing original molecules) carrying the desired mutation.
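The HDR calculation above can be sketched as follows. The text specifies only that HDR is the percentage of UMI groups carrying the desired mutation; how a group is called from its member reads is not stated, so the majority-vote rule, the function name, and the input format below are assumptions for illustration.

```python
from typing import Mapping, Sequence


def hdr_percent(umi_group_calls: Mapping[str, Sequence[bool]],
                min_fraction: float = 0.5) -> float:
    """Percentage of UMI groups (original molecules) carrying the desired edit.

    umi_group_calls -- UMI -> per-read flags (True if the read shows the edit)
    min_fraction    -- a group counts as HDR when more than this fraction of
                       its reads carry the edit (assumed majority-vote rule)
    """
    if not umi_group_calls:
        raise ValueError("no UMI groups provided")
    n_hdr = 0
    for calls in umi_group_calls.values():
        if calls and sum(calls) / len(calls) > min_fraction:
            n_hdr += 1
    return 100.0 * n_hdr / len(umi_group_calls)


if __name__ == "__main__":
    # Hypothetical read-level calls for four molecules.
    example = {
        "AAAAACCCCC": [True, True, False],
        "GGGGGTTTTT": [False, False],
        "ACACACACAC": [True],
        "TGTGTGTGTG": [True, False],
    }
    print(hdr_percent(example))
```

Counting per UMI group rather than per read keeps amplification bias from inflating or deflating the HDR estimate.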
- the ONT sequencing reads were aligned to the hg38 reference genome by minimap2 (v2.11) to check for MMEJ events.
- the MMEJ frequency was calculated as the percentage of MMEJ reads from alignment results.
- the genomic DNA of HBB/GLPIR-corrected iPSCs was extracted after 3 days postelectroporation using a DNeasy Blood & Tissue kit and quantified by a Qubit instrument.
- PCR reactions were set up in a 50 pL volume with Phusion Hot Start II High-Fidelity PCR Master Mixes, and amplified following the program: initial denaturation at 98 °C for 30 s, then 98°C for 10 s, 57 °C for 30 s, 72 °C for 30 s for 35 cycles, lastly 68 °C for 5min, and hold at 4 °C.
- Primers flanking the editing site are shown in Supplementary Table 1.
- PCR products were purified with 0.8x AMPure XP beads, and 200 ng of purified DNA was mixed with 2 µL 10x NEBuffer 2 (NEB, cat#B7002S) and nuclease-free water to a 19 µL volume. Heteroduplex formation followed: 98 °C for 30 s, gradual cooling from 95 to 85 °C at a rate of -2 °C/s, gradual cooling from 85 to 25 °C at a rate of -0.1 °C/s, and hold at 4 °C.
- 1 µL of T7 Endonuclease I was added to the reaction, which was incubated at 37 °C for 15 min.
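The heteroduplex ramp rates above translate into step durations by simple division. The sketch below works that out; the helper name is hypothetical, and the instrument's own transition between 98 °C and 95 °C is ignored.

```python
def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Seconds needed to cool from start_c to end_c at a constant (negative) rate in °C/s."""
    if rate_c_per_s >= 0:
        raise ValueError("cooling rate must be negative")
    return (end_c - start_c) / rate_c_per_s


steps = [
    ("denature at 98 °C", 30.0),
    ("cool 95 -> 85 °C at -2 °C/s", ramp_seconds(95, 85, -2.0)),
    ("cool 85 -> 25 °C at -0.1 °C/s", ramp_seconds(85, 25, -0.1)),
]
total_seconds = sum(seconds for _, seconds in steps)

for name, seconds in steps:
    print(f"{name}: {seconds:.0f} s")
print(f"total before the 4 °C hold: {total_seconds:.0f} s")
```

The slow second ramp dominates the run time (about 10 minutes), which is where most strand re-annealing takes place.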
- hHOs Human heart organoids
- RNA-seq data were processed using the online platform A.I.R. (Sequentia Biotech SL; https://transcriptomics.sequentiabiotech.com/). Genes with read counts of at least 15 in two or more samples were selected, and log2 counts per million (CPM) values were used for Principal Component Analysis (PCA). Differentially expressed (DE) genes were identified using DESeq2, with a q-value threshold of 0.05. Gene Ontology (GO) analysis was performed using DAVID (v2023q4) (77) to provide functional annotation and enrichment insights for the DE genes. For selected DE genes, normalized FPKM values were z-score transformed and used to generate a gene expression heatmap.
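The filtering and transformation steps above can be sketched in plain Python. This is an illustration only, not the A.I.R. platform's code: the gene names and counts are made up, the pseudocount in the log2 CPM step is an assumption, and the z-score uses the population standard deviation.

```python
import math


def filter_genes(counts, min_count=15, min_samples=2):
    """Keep genes with at least min_count reads in at least min_samples samples."""
    return {
        gene: row
        for gene, row in counts.items()
        if sum(c >= min_count for c in row) >= min_samples
    }


def library_sizes(counts):
    """Total read count per sample, taken over all genes."""
    n_samples = len(next(iter(counts.values())))
    return [sum(row[i] for row in counts.values()) for i in range(n_samples)]


def log2_cpm(row, lib_sizes, pseudocount=1.0):
    """log2 counts per million for one gene; the pseudocount (assumed) avoids log2(0)."""
    return [math.log2(c / n * 1e6 + pseudocount) for c, n in zip(row, lib_sizes)]


def z_scores(values):
    """Z-score transform one gene's normalized values across samples."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [0.0 if sd == 0 else (v - mean) / sd for v in values]


# Hypothetical raw counts: gene -> counts in three samples.
raw = {
    "GENE_A": [20, 30, 5],
    "GENE_B": [3, 16, 2],
    "GENE_C": [100, 150, 120],
}
kept = filter_genes(raw)
sizes = library_sizes(raw)
cpm = {gene: log2_cpm(row, sizes) for gene, row in kept.items()}
z = z_scores([1.0, 2.0, 3.0])  # e.g. hypothetical FPKM values for one gene
print(sorted(kept), z)
```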
- CPM log2 counts per million
- the esiRNA transfection protocol was adapted from the instructions for the Lipofectamine RNAiMAX reagent.
- GFP mutant iPSCs were harvested after 1 hour of treatment with 10 µM ROCK inhibitor Y-27632.
- MISSION® siRNA Universal Negative Control #1 (Merck, cat#SIC001)
- esiRNA human H1F0 (Merck, cat#EHU135871) were used in this study.
- the esiRNA/RNAiMAX solution was prepared as follows: Mix 1 was prepared by adding 13.5 µL RNAiMAX reagent to 225 µL Opti-MEM and vortexing for a few seconds.
- Mix 2 was prepared by adding 90 pmol esiRNA to 225 µL Opti-MEM and pipetting a few times.
- the esiRNA/RNAiMAX solution was completed by adding Mix 2 to Mix 1 and incubating for 5 min, and was then used to resuspend a pellet of 1.5 million cells. After 30 min of incubation with the esiRNA/RNAiMAX solution, the cells were aliquoted equally into 3 rhLaminin-521-coated wells and cultured in a 37 °C, 5% CO2 incubator. The cell samples were collected after 48 h for GFP correction electroporation and knockdown efficiency analysis.
- Results are reported as the mean ± standard error of the mean (SEM) unless indicated otherwise. Statistical comparisons were conducted using GraphPad Prism software. A p-value less than 0.05 was considered statistically significant.
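For reference, the mean ± SEM summary used throughout can be computed as below. This is a generic sketch with hypothetical replicate values; it is not the GraphPad Prism workflow, and it assumes the usual definition of SEM as the sample standard deviation divided by the square root of n.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("SEM needs at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


replicates = [10.0, 12.0, 14.0]  # hypothetical biological replicates, n = 3
mean, sem = mean_sem(replicates)
print(f"{mean:.2f} ± {sem:.2f}")
```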
- the series of cyanine attached bases were modelled first as non-standard residues.
- the 5’Cy5-G fragment was regarded as a non-standard residue with its structure built by GaussView. Structure optimization was performed through Gaussian 16 by density functional theory (DFT) calculations at B3LYP-D3(BJ)/6-31d* level(74). The optimized structure was modelled using the generalized amber force field (GAFF)(75), assigned with restrained electrostatic potentials (RESPs) calculated using the Multiwfn software based on the geometry optimization results(76). After all the parameters (including bond, angle, dihedral angle, etc.
- the relevant files of all the non-standard residues were included in the Gromacs top library, including 5’Cy5-G, 5’Cy5.5-G, 5’Cy3-G, 3’Cy5-G, 3’Cy5.5-G, 3’Cy3-G for the GFP sequence, and 5’Cy5-T, 5’Cy5.5-T, 5’Cy3-T, 3’Cy5-A, 3’Cy5.5-A, 3’Cy3-A for the HBB sequence.
- the unmodified or cyanine attached ssODN/DNA structures were built by Discovery Studio and modelled using pdb2gmx by Gromacs for MD simulations.
- Unmodified D10 was also simulated as a control.
- a similar molecular modeling method was employed for the HBB HDR system.
- GFP sequence (unmodified) GATGCTCCTG, 5’Cy5(3,5.5)-GATGCTCCTG and GATGCTCCTG-3’Cy5(3,5.5) were extracted with 5 bases from the head and 5 bases from the tail of GFP sequence.
- HBB sequence (unmodified) TCACTGTGGA, 5’Cy5(3,5.5)-TCACTGTGGA, TCACTGTGGA-3’Cy5(3,5.5) were extracted with 5 bases from the head and 5 bases from the tail of the HBB sequence.
- Production runs for each system were conducted for 100 ns (DNA system) or 50 ns (ssODN system) after 1 ns of equilibration, with a time step of 2 fs. Coordinates were saved every 1 ps, yielding 100,000 or 50,000 frames for further analysis. The simulation length was sufficient to provide converged data for the ssODN/DNA systems.
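The frame counts quoted above follow directly from the run length and the save interval. A short check of that arithmetic, with a hypothetical helper name:

```python
def md_bookkeeping(length_ns, timestep_fs=2.0, save_every_ps=1.0):
    """Return (integration steps, saved frames) for a production run."""
    steps = round(length_ns * 1e6 / timestep_fs)     # 1 ns = 1e6 fs
    frames = round(length_ns * 1e3 / save_every_ps)  # 1 ns = 1e3 ps
    return steps, frames


dna_steps, dna_frames = md_bookkeeping(100)      # DNA system, 100 ns
ssodn_steps, ssodn_frames = md_bookkeeping(50)   # ssODN system, 50 ns
print(dna_steps, dna_frames, ssodn_steps, ssodn_frames)
```

With a 2 fs time step and 1 ps save interval, one frame is written every 500 integration steps.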
- Simulation data were analyzed with Gromacs, Gaussian, Multiwfn software and visualized using Origin and VMD packages.
- the radial distribution functions (RDF) of various cyanines around the closest pairing base were calculated by Gromacs to quantitatively assess their interactions.
- the spatial distribution functions (SDF) of various anions around the cage were calculated by Gromacs and visualized using VMD packages at a certain isolevel. Free energy surfaces, hydrogen bond analysis, and the radius of gyration Rg were calculated by Gromacs and processed by Origin.
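The radius of gyration reported by the trajectory analysis is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below computes it for a single frame; it is a plain-Python illustration of the definition, not the Gromacs tool, and the coordinates and masses are made up.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) tuples; masses: list of the same length.
    """
    total = sum(masses)
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Four equal masses on a unit circle: every atom is 1 unit from the center.
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
masses = [1.0, 1.0, 1.0, 1.0]
rg = radius_of_gyration(coords, masses)
print(rg)
```

Evaluating this per frame gives an Rg time series: a flat trace indicates a compact, stable conformation, while a rising trace indicates unfolding.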
- the binding energy was calculated by Gaussian using the optimized system: the first base pair for the unmodified DNA systems, or the cyanine-attached base with its pairing base for the cyanine-modified DNA systems.
- the 5’Cy5-G:C were adopted to calculate their binding energy according to the following equation:
- E(5’Cy5-G:C) is the Gibbs energy of the 5’Cy5-G:C part
- E(5’Cy5-G) is the Gibbs energy of the 5’Cy5-G part
- E(C) is the Gibbs energy of the base C part.
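The equation itself is not reproduced in this text. Under the usual convention for a binding energy built from these three terms (an assumption: the sign convention is not confirmed by the source), it would take the form:

```latex
\[
E_{\text{bind}} = E_{\text{5'Cy5-G:C}} - \left( E_{\text{5'Cy5-G}} + E_{\text{C}} \right)
\]
```

With this convention, a more negative value corresponds to a more stable dye-modified base pair.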
- Fluorophore-modified ssODNs enhance CRISPR/Cas9 mediated HDR efficiency
- Fluorophores are small molecules widely used for single-stranded DNA (ssDNA) modifications in various applications, such as qPCR probes and antisense oligonucleotides (ASOs).
- the cells were also continuously monitored using a live fluorescence imaging system after delivering 5’Cy5 or unmodified ssODNs together with CRISPR-Cas9 ribonucleoprotein (RNP). GFP-positive cells were detected around 20 hours post-electroporation, with significantly higher levels of GFP signal observed in the 5’Cy5-ssODN group at all time points, consistent with the FACS results.
- the GLP1R variants are associated with type 2 diabetes and are promising targets for genome editing therapy.
- 5’Cy5-ssODN was corrected in a patient iPSC line.
- the results showed that 5’Cy5-ssODN significantly elevated the HDR efficiency compared to unmodified ssODN (Fig. 9F).
- 5’Cy5-ssODN demonstrated better cell survival and proliferation following CRISPR-Cas9 editing (Fig. 9G).
- the cell cycle phase distribution in 5’Cy5-ssODN-edited cells was similar to that of unedited cells (mock), whereas unmodified-ssODN-edited cells exhibited an abnormal increase in the G2/M phase and a decrease in the G1 phase (data not shown).
- These results suggest that cells experienced less stress when edited with 5' Cy5-ssODN compared to unmodified ssODN.
- HDR primarily occurs during the S and G2 phases of the cell cycle 23
- the improvement in HDR efficiency observed with 5’Cy5-ssODN is unlikely to be due to differences in the cell cycle.
- hHO human heart organoid
- the hHOs were transfected with mutant GFP-targeting sgRNA and either unmodified ssODNs or 5’Cy5-ssODNs (FIG. 10A). Consistent with the results in 2D cell culture systems, 5’Cy5-ssODNs increased HDR efficiency in hHOs by more than twofold, as quantified by both FACS and immunofluorescence analyses (FIG. 10B) of GFP- and CD106- or cardiac troponin T (cTNT)-positive cardiomyocytes.
- cTNT cardiac troponin T
- Cas9 editing was performed using structures collected on day 0, 1, 2, or 3 of the protocol, which all formed blastoids.
- the blastoids exhibited GFP and Cy5 fluorescence signals, indicating the successful delivery of Cas9 RNP and ssODNs via electroporation (Fig. 10E, left).
- the edited blastoids were morphologically indistinguishable from the unedited controls (mock). No significant difference in diameter was observed between unmodified ssODN and mock groups or between the 5’Cy5- ssODN and unmodified control groups (Fig. 10E, right).
- 5’Cy5-ssODN significantly improved the HDR efficiency of HBB correction, by nearly 2.5-fold, as quantified by ddPCR. Specifically, 28.9% of the mutation was corrected with 5’Cy5-ssODN, compared to only 11.7% with unmodified ssODN (Fig. 10G).
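The two correction rates reported for the blastoid HBB experiment (28.9% versus 11.7%) can be compared either as a ratio or as a relative increase, and the two numbers are easy to confuse. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def fold_change(treated, control):
    """Ratio of the treated value to the control value."""
    return treated / control


def percent_increase(treated, control):
    """Relative increase of the treated value over the control, in percent."""
    return 100.0 * (treated - control) / control


fc = fold_change(28.9, 11.7)
inc = percent_increase(28.9, 11.7)
print(f"fold change: {fc:.2f}x, relative increase: {inc:.0f}%")
```

The treated rate is therefore roughly 2.5 times the control rate, which corresponds to an increase of roughly 147% over the control.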
- RNA sequencing was performed on GFP-mutant iPSCs 24 hours after electroporation (Fig. 11A).
- Principal component analysis (PCA) demonstrated good reproducibility among replicates (data not shown).
- the 5’Cy5-modified group was positioned closer to the mock group (non-editing) in the PC1-PC2 space than the unmodified group was (data not shown).
- Differentially expressed genes (DEGs) of the 5’Cy5-ssODN condition are enriched in biological processes including DNA damage response, DNA repair, and cell proliferation (Fig. 11C), which are distinct from those of the unmodified condition (data not shown).
- Genes related to DNA damage and repair were then analyzed (Fig. 11D).
- genes associated with DNA binding e.g., RAD21, H1.0, SMC3, RECQL4, SSRP1, and APEX1
- single-stranded DNA binding e.g., RAD52, RAD23A, RPA1, SMC2, and PCBP1
- RAD50, RAD52, FEN1, and NBN (nibrin)
- NER nucleotide excision repair
- XPC nucleotide excision repair
- DDB2 translesion DNA synthesis
- PCNA nucleotide excision repair
- H1.0 a variant of the linker histone family, plays a role in higher-order chromatin compaction and is emerging as a key factor in DNA repair. It binds to the linker DNA between nucleosomes, helps to stabilize the 30-nm chromatin fiber, and has been shown to affect chromatin remodeling during DNA repair, including HDR.
- To examine the role of H1.0 in 5’Cy5-ssODN-mediated HDR, we knocked down its expression, which led to decreased HDR efficiency in the 5’Cy5 condition, while the unmodified condition remained unaffected (Fig. 11F, and data not shown).
- H1.0 overexpression enhanced HDR efficiency with 5’Cy5-ssODNs (Fig. 11G, and data not shown).
- CssDNA circular single-stranded DNA
- the 90-nt 5’Cy5-ssODN90 and unmodified ssODN90 were initially modeled in near-circular conformations, with their head and tail in close proximity to reduce computational time. If the construction is favorable, the structure maintains a “doughnut-like” shape; otherwise, an unwound conformation is expected.
- 5’Cy5-ssODN90 stabilized itself into a circular architecture, with the 5’Cy5 head inserted between two adjacent bases (G and C) at the ssODN90 tail end (TGCG fragment, data not shown).
- the unmodified ssODN90 expanded into a loose, arch-like structure (data not shown).
- the introduced cyanine group significantly affects the conformation of the ssODN, which is further supported by the distinct radius of gyration (Rg) of 5’Cy5-ssODN90 and unmodified ssODN90 (Fig. 12A).
- Rg radius of gyration
- the near-constant Rg of 5’Cy5-ssODN90 indicates that the cyanine group maintains the ssODN90 in a dynamic yet compact circular structure, whereas the increasing Rg of the unmodified ssODN90 suggests a progressively diverging and loosening conformation.
- Free energy surface (FES) analysis was first used to scan dye-DNA interactions and conformations, such as stacking motifs and unstacked structures.
- 5’Cy5-GFPD10 exhibited the lowest free energy (-8.8 kcal/mol) compared to 5’Cy5.5-GFPD10 and 5’Cy3-GFPD10, reflecting its superior stability (data not shown).
- the 5’ modified ssODNs also showed more focused conformations with lower free energy (-8.8 to -8.6 kcal/mol) than 3’ modifications (-8.2 to -7.9 kcal/mol), indicating the 5’ end as the more favorable site. Similar results were demonstrated for the HBB sequence (data not shown).
- Hydrogen bonding analysis suggested that 5’Cy5-D10 exhibited a significant increase in hydrogen bond numbers compared to unmodified D10. The number of hydrogen bonds decreases in the following order: 5’Cy5 > 5’Cy5.5 > 5’Cy3 > 3’Cy5 > 3’Cy5.5 > 3’Cy3 > unmodified (data not shown). Hydrogen bond distribution analysis corroborates this trend (data not shown).
- RDF Radial distribution function
- the supramolecular interactions between cyanine and DNA strands were analyzed using the independent gradient model based on Hirshfeld partition (IGMH), which highlights inter- and intra-fragment interactions.
- the 5’ or 3’ cyanine and their adjacent G:C base pair were modeled as a Cy-G:C system based on MD-optimized structures.
- IGMH maps revealed clear interaction regions, with notable π-π stacking interactions between cyanine and the C base, indicated by broad green isosurfaces between the 5’ cyanine and the adjacent G:C base pair segment (data not shown).
- 5’Cy5-GFPD10 exhibited the largest interaction area, suggesting the highest stability, followed by 5’Cy5.5 and 5’Cy3.
- CRISPR-Cas9, one of the most widely studied genome editing tools, has been applied across numerous fields of research and therapy.
- the low efficiency of HDR remains a significant bottleneck for its broader success.
- This study offers a new perspective on how a class of chemical modifications to ssODNs can enhance HDR following Cas9-induced DSBs (Fig. 9A-9G).
- Fluorophore modifications, particularly the 5' Cy5 modification, consistently enhanced HDR efficiency across different targets (GFP/HBB/GLP1R) and multiple experimental models, including pluripotent stem cells, human heart organoids (hHOs) (Fig. 10A and 10B), and human blastoids (Fig. 10C-10H).
- the 5'Cy5-ssODNs improved HDR efficiency without affecting CRISPR-Cas9 cleavage efficiency and DNA repair by the microhomology-mediated end joining (MMEJ) pathway. They also promoted improved cell survival (Fig. 9G) and preserved normal cell cycle progression.
- MMEJ microhomology-mediated end joining
- 5'Cy5-ssODNs enhanced HDR efficiency in complex 3D stem cell models like hHOs and blastoids.
- the 5'Cy5 modification increased the proportion of GFP-positive cardiomyocytes without affecting differentiation and key functional properties, such as rhythmic beating (Fig. 10B).
- 5'Cy5-ssODNs not only improved HDR efficiency but also cavitation formation, a critical step in blastocyst development (Fig. 10C-10F).
- the size of a human blastocyst is primarily determined by the dynamics of blastocoel formation, not only cellular growth.
- the blastocoel is the fluid-filled cavity that forms within the blastocyst, mainly depending on ion transport and aquaporin channels.
- our new in situ editing method using 5'Cy5-ssODNs did not affect the morphological or functional integrity of the delicate blastoids, demonstrating that 5'Cy5-ssODNs can be safely applied in early-stage human developmental models without causing toxicity or loss of pluripotency (Fig. 10E, 10F).
- Integrating 5'Cy5-ssODNs with advanced CRISPR-Cas9 delivery technologies, such as lipid nanoparticles, nanoscale zeolitic imidazolate frameworks, or enveloped delivery vehicles (EDVs), could further enhance their clinical translatability.
- Cy5 is a simple chemical modification and also allows visual tracking of donor DNA delivery (Fig. 10E), without causing any significant cytotoxicity to the host (Fig. 9G). Moreover, Cy5-labeled oligonucleotides can be freely taken up by cells via endocytosis, significantly reducing the challenges of cellular delivery. 5'Cy5-ssODNs could be taken up by cells when simply added to the culture medium, and interestingly, the Cy5 signal remained stable for several days. This finding suggests that 5'Cy5 modification may facilitate the delivery of ssODNs to host cells, potentially enhancing HDR to some degree.
- Cy5-antisense oligonucleotides which specifically target RNA to knock down gene expression
- Cy5 modification can be extended to the sgRNA itself. Modifications to the sgRNA could enhance the specificity and stability of the CRISPR-Cas9 complex, reducing off-target effects while improving on-target editing.
- the synergy between modified ssODNs and sgRNAs could lead to even greater improvements in HDR efficiency, minimizing off-target effects and maximizing therapeutic outcomes.
- RNA-seq analysis revealed that 5'Cy5-ssODNs upregulated HDR-related genes, such as RAD51, RAD52, and RAD50 (Fig. 11B-11E), which may directly contribute to improved HDR efficiency. Chromatin-binding proteins, including H1.0 (Fig. 11B, 11D and 11E), which are known to influence chromatin remodeling and DNA accessibility at repair sites, were also upregulated. Knocking down H1.0 specifically reduced HDR efficiency in the 5'Cy5-ssODN condition (Fig. 11F), while its overexpression enhanced HDR (Fig. 11G).
- H1.0 is a key mediator of the 5'Cy5-ssODN-induced repair response, potentially facilitating greater accessibility of the 5'Cy5-ssODNs to DSBs and stabilizing the repair machinery.
- Although further studies are needed to identify the cellular sensors of 5'Cy5-ssODNs and to understand how they trigger HDR responses, these findings reveal a genetic interaction between the chromatin factor H1 and modified ssODNs, offering deeper insights into the mechanisms driving enhanced HDR.
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Chemical & Material Sciences (AREA)
- Wood Science & Technology (AREA)
- Organic Chemistry (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Plant Pathology (AREA)
- Biophysics (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Compositions and methods for increasing the homology directed repair (HDR) efficiency following CRISPR/Cas-mediated gene editing are disclosed, including (A) agents that reduce PolQ levels/activity in a cell; (B) agents that increase RPA levels/activity in a cell; and (C) fluorophore-modified nucleic acids used in gene editing.
Description
METHODS FOR IMPROVING PRECISE GENOME MODIFICATION AND REDUCING UNWANTED MUTATIONS BY CRISPR-CAS EDITING
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of and priority to U.S. Serial No. 63/639,573, filed April 26, 2024 and U.S. Serial no. 63/723,422, filed November 21, 2024, which are incorporated by reference herein in their entirety.
FIELD OF THE INVENTION
The disclosed invention is generally in the field of gene editing and specifically in the area of CRISPR/Cas-mediated gene editing.
BACKGROUND OF THE INVENTION
CRISPR-Cas9 can introduce double strand breaks (DSBs) to a specific genomic locus that shares sequence complementarity with the CRISPR guide RNA (gRNA). The DSBs can be repaired through different cellular mechanisms, including the classical non-homologous end joining (C-NHEJ, hereafter referred to as NHEJ), MMEJ (also called alternative NHEJ), homologous recombination (HR) and single-stranded annealing (SSA) pathways. NHEJ often generates small insertions and deletions (indels) [1] and is believed to be the dominant repair pathway for DSBs induced by CRISPR-Cas9 [2]. MMEJ relies on small homologies for DNA repair, while SSA requires longer ones. HR is an error-free DNA repair mechanism that requires a homologous template.
The majority of on-target indels induced by CRISPR-Cas9 are less than 20 bp, as estimated in numerous studies of large-scale gRNA cutting outcomes [1-3, 5]. However, frequent on-target large deletions (LDs) and large complex rearrangements of the genome caused by CRISPR-Cas9 have been reported [6-10]. One of the reasons that LDs or complex genomic rearrangements eluded detection in earlier studies is that genome editing outcomes were analyzed with Sanger and/or short-read next-generation sequencing of short PCR amplicons (usually < 300 bp), which are unable to resolve large genomic alterations. Long-read sequencing platforms, such as PacBio and Oxford Nanopore, are much better at resolving large rearrangements and have been used for the analysis of genome editing outcomes [7, 11-13].
The LD issue can have significant implications for the application of the otherwise versatile genome editing tool CRISPR-Cas9. However, the underlying mechanism of LD formation is not fully understood, and strategies to reduce LDs are urgently needed.
There is still a need for compositions and methods for reducing CRISPR-induced large deletions in cells.
SUMMARY OF THE INVENTION
Disclosed herein are compositions and methods for increasing the homology directed repair (HDR) efficiency following CRISPR/Cas-mediated gene editing. The disclosed methods and compositions result in up to a two-fold increase in HDR efficiency and up to a 50% reduction in the frequency of large deletions. The methods are based, in some forms, on the discovery that MMEJ is the major repair pathway mediating CRISPR-induced on-target large deletions.
One embodiment for reducing large deletions includes inhibiting the levels or activity of DNA polymerase theta (PolQ), the main player in the MMEJ repair pathway. In some forms, PolQ is inhibited by treating cells with a PolQ inhibitor such as Novobiocin (NVB) or ART558, for example for about 24 hours before and after electroporation.
Another embodiment includes delivering recombinant replication protein A (RPA) proteins together with Cas9/sgRNA RNP and a single-stranded oligodeoxynucleotide (ssODN) to cells subjected to CRISPR/Cas gene editing, to avoid annealing of the single-stranded DNA resected after CRISPR-induced DSBs. The RPA proteins can protect the donor ssDNA from degradation in the cells and can activate the HDR repair pathway, and the inhibition of MMEJ may switch the DNA repair pathway from MMEJ to HDR.
The disclosed strategy reduces CRISPR-induced on-target large deletions by up to 50% and can increase HDR efficiency by up to two-fold.
The methods are also based, in some forms, on the discovery that modifying nucleic acids involved in CRISPR/Cas gene editing can dramatically improve CRISPR/Cas gene editing efficiency.
In some forms, the ssODN or sgRNA is labelled with a fluorophore at its 5’ and/or 3’ end.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1A. Left: analysis of microhomology (MH) frequency at Cas9-induced breakpoint junctions in two published datasets; right: schematic of how MMEJ could lead to CRISPR-induced LDs (created with BioRender.com). NH: no homology. FIG. 1B shows LD events detected in Cas9-edited human pluripotent cell lines (SEQ ID NO:127-135). Boxes indicate MH sequences. FIG. 1C. Left: schematic of the strategy to analyze CRISPR-induced LDs in the CD9 locus (created with BioRender.com); right: MH frequency in deletions > 310 bp quantified from long-read sequencing data. **** p < 0.0001, Fisher’s exact test.
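Microhomology at a deletion junction, as analyzed in FIG. 1A to 1C, can be measured from a reference sequence and the deletion coordinates. The sketch below is one simple way to do so and is not the analysis pipeline used here: the function name, the 0-based half-open coordinates, and the example sequences are assumptions. It counts how far the deleted segment's boundary can be shifted without changing the resulting sequence, which equals the length of the repeat shared by the two breakpoints.

```python
def microhomology_length(ref, del_start, del_end):
    """Microhomology length at the junction of a deletion removing ref[del_start:del_end]."""
    if not (0 <= del_start < del_end <= len(ref)):
        raise ValueError("invalid deletion coordinates")
    # Rightward: the start of the deleted segment matches the bases right after it.
    right = 0
    while (del_end + right < len(ref)
           and del_start + right < del_end
           and ref[del_start + right] == ref[del_end + right]):
        right += 1
    # Leftward: the end of the deleted segment matches the bases right before it.
    left = 0
    while (del_start - left - 1 >= 0
           and del_end - left - 1 >= del_start
           and ref[del_start - left - 1] == ref[del_end - left - 1]):
        left += 1
    # Microhomology cannot exceed the deletion length.
    return min(left + right, del_end - del_start)


# Hypothetical locus: TTT | ACG | GGAGG | ACG | CCC, with "ACG" repeated at both breakpoints.
ref_mh = "TTTACGGGAGGACGCCC"
mh_a = microhomology_length(ref_mh, 3, 11)   # deletion written as first ACG + middle
mh_b = microhomology_length(ref_mh, 6, 14)   # same deletion written as middle + second ACG
# Hypothetical locus with no shared sequence at the breakpoints.
mh_none = microhomology_length("AAAACCCCGGGGTTTT", 4, 12)
print(mh_a, mh_b, mh_none)
```

Scanning in both directions makes the result independent of how the aligner happened to place an ambiguous deletion.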
FIG. 2A is a schematic of the roles of four key genes in the MMEJ pathway (created with BioRender.com). FIG. 2B is a schematic of the workflow for the knockdown experiments. FIG. 2C Top: the location of the PIGA intronic gRNA, the numbers indicate the distances between gRNA cut site and the adjacent exons; bottom left: example flow cytometry analysis of PIGA expression using the FLAER assay; bottom right: normalized mRNA level of siRNA target genes biological replicates n = 3, and LD frequency quantified by FACS, biological replicates n = 4, **** p < 0.0001. FIG. 2D Top: the location of the CD9 intronic gRNAs, the number indicates the distance between the gRNA cut site and the nearest exon; bottom left: example flow cytometry analysis of CD9 expression using the PE anti-CD9; bottom right: LD frequency quantified by FACS, biological replicates n = 3, * p < 0.05, ** p < 0.01. FIG. 2E is a schematic of IDMseq analysis of LDs. FIG. 2F. Frequency of LD (> 30 bp) quantified by IDMseq. The numerator indicates the LD event number, and the denominator indicates the total event number detected by IDMseq. **** p < 0.0001, * P < 0.05, Fisher's exact test. FIG. 2G. Frequency of LD (> 30 bp) quantified by ONT long-read sequencing. The numerator indicates the LD read number, and the denominator indicates the total read number. **** p < 0.0001, Fisher's exact test.
FIG. 3A top is a schematic of the workflow for the POLQ inhibition experiment; bottom: LD frequency quantified by FACS, biological replicates n = 3, * p < 0.05, *** p < 0.001, **** p < 0.0001. FIG. 3B, top Schematics of the inducible RPA constructs; bottom: the mRNA level of RPA genes after doxycycline treatment for two days. FIG. 3C is a schematic of workflow of RPA overexpression experiments. FIG. 3D and FIG. 3E, left: LD frequency quantified by FACS, biological replicates n = 3, * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001, ns: not significant, two-sided Student’s t-test; OE: overexpression. Right: Frequency of LD (> 30 bp) quantified by IDMseq and ONT sequencing. The numerator indicates the LD event number, and the denominator indicates the total event number detected by IDMseq. * p < 0.05, ** p < 0.01, **** p < 0.0001, Fisher's exact test.
FIG. 4A is a schematic of mutant GFP correction by CRISPR-mediated HDR (left) and strategies to improve HDR efficiency (right) (created with BioRender.com). The green color indicates the restoration of green fluorescence. FIG. 4B is a quantitative analysis showing the frequency of GFP-positive cells quantified by FACS, biological replicates n = 3, ** p < 0.01, **** p < 0.0001. FIG. 4C is a schematic of the ddPCR probe-based assay design for detecting CRISPR-mediated precise mutation via HDR and representative 2D plots of ddPCR events from positive (an EPOR G6002 mutant cell line), mock (non-edited line) and control (edited in EPOR locus) H1 ESC samples (data are shown in FIG. 4D). Probe 2 is designed to specifically recognize the installed point mutation but not the wild-type sequence. Probe 1 is designed to recognize both mutant and wild-type sequences. FIG. 4D is a schematic of CRISPR-mediated HDR via editing WAS and EPOR genes (left) (created with BioRender.com), and the HDR frequency analyzed by ddPCR (right), biological replicates n = 4, * p < 0.01, ** p < 0.01, *** p < 0.001. The treatments followed the strategy illustrated in FIG. 4A using 25 µM NVB or 2.5 pmol RPA, respectively.
FIG. 5A is an agarose gel electrophoresis of long-range PCR products of SH2B3- and H1.3-edited cell lines; KO#1 (under SH2B3) and KO#5, 7, 8 and 11 (under H1.3) indicate LD clones. FIG. 5B is a schematic of the strategy to analyze CRISPR-induced LDs in the PIGA locus. FIG. 5C. Representative Integrative Genomics Viewer (IGV) tracks and coverage of long-read sequencing data on the PIGA locus from PIGA FLAER positive and negative sorted populations. The dashed arrow indicates the position of the PIGA FLAER positive LD proximal end on the PIGA locus. The scissor indicates the CRISPR/Cas9 cutting site. FIG. 5D. MH frequency in deletions > 30 bp of PIGA intr1_1 sgRNA quantified from long-read sequencing data. **** p < 0.0001, Fisher’s exact test. FIG. 5E. Top: the location of PIGA gRNAs in the PIGA genomic locus; bottom left: example flow cytometry analysis of PIGA expression using the FLAER assay; bottom right: LD frequency of PIGA gRNAs screened, biological replicates n = 3. FIG. 5F. Plot of the LD frequency and the distance between the sgRNA and its nearest exon. FIG. 5G. Top: schematic of the target gene loci on the X chromosome; bottom left: normalized copy number of target genes, biological replicates n = 3; bottom right: copy number measurements performed by ddPCR for the WAS gene; this experiment was performed with more than three biological replicates, ns: not significant.
FIG. 6A. Flow cytometry analysis of PIGA expression, the number in the gate indicates the percentage of PIGA FLAER negative population. FIG. 6B. Representative Western blotting analysis for RPA1 and POLQ expression. The grey value was quantified using ImageJ and normalized by the control siRNA (siCtrl) treated sample. FIG. 6C. Left: relative mRNA level of siRNA target genes, LIG3 and PARP1; right: LD frequency quantified by FACS, biological replicates n = 3. **** p < 0.0001, ns: not significant. FIG. 6D. Schematic of the strategy and workflow for cell cycle synchronization by nocodazole and LD analysis by FACS (created with BioRender.com). FIG. 6E. Cell cycle analysis following nocodazole treatment strategy by FACS, the percentage of cell phase was calculated by the Watson pragmatic fitting algorithm, biological replicates n = 3. FIG. 6F. LD frequency of different doxycycline exposure time in H1-iCas9 PIGA intr5_1 ESCs, quantified by FACS, biological replicates n = 6, ns: not significant. FIG. 6G. LD frequency of cell cycle synchronized cells quantified by FACS, biological replicates n = 3 (each with three technical replicates). FIG. 6H. Cell cycle analysis of RPA and POLQ knockdown cells by FACS, the percentage of cell phase was calculated by the Watson pragmatic fitting algorithm, biological replicates n = 2 (each with two technical replicates). FIG. 6I. FACS analysis of pluripotency markers for NVB treated H1 hESCs. FIG. 6J. Live cell count of H1 hESCs at 24 hr and 48 hr after NVB treatment, biological replicates n = 3, ns: not significant. FIG. 6K. LD size distribution analysis for RPA and POLQ knockdown samples. FIG. 6L. MH frequency in LDs of RPA and POLQ knockdown samples. The numerator indicates the MH > 2 bp event number, and the denominator indicates the LD event number detected by IDMseq. ns: not significant. FIG. 6M. LD size distribution analysis for NVB treated and RPA overexpression samples. FIG. 6N. MH frequency in LDs of NVB treated and RPA overexpression samples. The numerator indicates the MH > 2 bp event number, and the denominator indicates the LD event number detected by IDMseq. ** p < 0.01, * p < 0.05, Fisher’s exact test. FIG. 6O. CRISPR-Cas9 editing efficiency using an exonic PIGA gRNA, ex2_1 sgRNA, quantified by FACS, biological replicates n = 3, ns: not significant.
FIG. 7A. Top: the location of CD9 gRNAs; bottom: LD frequency of CD9 gRNAs screened, biological replicates n = 3. FIG. 7B-7C. LD frequency quantified by FACS, biological replicates n = 3, * p < 0.05, ** p < 0.01. OE: overexpression. FIG. 7D. Top: the location of LAMP2 intronic gRNAs, the numbers indicate the distances between sgRNA cutting sites and the nearest exons; bottom: example flow cytometry analysis of LAMP2 expression. FIG. 7E-7G show LD frequency quantified by FACS, biological replicates n = 3, * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001. OE: overexpression. FIG. 7H. Top: the location of the WAS intronic gRNA; bottom: frequency of LD (> 30 bp) quantified by ONT long-read sequencing. The numerator indicates the LD event number, and the denominator indicates the total event number detected by nanopore reads. **** p < 0.0001, Fisher’s exact test. FIG. 7I. Top: the location of the HBB intronic gRNA; bottom: frequency of LD (> 30 bp) quantified by ONT long-read sequencing. The numerator indicates the LD event number, and the denominator indicates the total event number detected by nanopore reads. **** p < 0.0001, Fisher’s exact test.
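Several of these legends report LD frequencies as a numerator over a denominator and compare conditions with Fisher's exact test. The sketch below shows how such counts map onto a 2x2 table and a two-sided p-value. It is a standard-library illustration rather than the software used for the figures, the function names are hypothetical, and the counts in the example are made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


def compare_ld(ld_control, total_control, ld_treated, total_treated):
    """Build the table from 'LD events / total events' counts and test it."""
    return fisher_exact_two_sided(
        ld_control, total_control - ld_control,
        ld_treated, total_treated - ld_treated,
    )


p_small = fisher_exact_two_sided(3, 1, 1, 3)   # textbook-sized table
p_ld = compare_ld(40, 200, 15, 200)            # hypothetical LD counts
print(p_small, p_ld)
```

The test conditions on the table margins, so it remains valid for the small event counts that molecule-level assays often produce.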
FIG. 8A. Representative Coomassie-stained SDS-PAGE images of the RPA protein complex including the RPA1, RPA2 and RPA3 subunits, kDa: kilodalton. FIG. 8B. Flow cytometry analysis of GFP expression. FIG. 8C. GFP positive population quantified by FACS, biological replicates n = 3; * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001, ns: not significant. FIG. 8D. Delivery efficiency of Cy3-ssODN mixed with recombinant RPA. FIG. 8E. Flow cytometry analysis of human primary peripheral blood erythroid progenitor cell surface markers.
FIG. 9A. Schematic of mutant GFP correction via CRISPR-Cas9 and ssODN-mediated HDR. FIG. 9B. Frequency of GFP positive cells quantified by FACS. Data are presented as the mean ± standard error of the mean (SEM), biological replicates n = 3. A two-tailed t-test was used, * P <0.05, ** P <0.01, *** P <0.001, **** P <0.0001, ns: not significant. FIG. 9C. Left: representative images of corrected GFP mutant cells at 0 h and 72 h post-electroporation. Right: frequency of GFP positive cells quantified by FACS. Data are presented as the mean ± SEM, biological replicates n = 3. A two-tailed t-test was used, ** P <0.01, **** P <0.0001. FIG. 9D. Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “GTG” to “GAG” via HDR. Right: frequency of HBB correction quantified by ddPCR. The numerator indicates the HBB mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. *P < 0.05, ****P < 0.0001, ns: not significant, two-sided Fisher’s exact test. FIG. 9E. Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “GTG” to “GAG” via HDR. Right: frequency of HBB correction quantified by IDMseq. HBB corrected molecule number and total UMI-labeled number were subjected to two-sided Fisher’s exact test, biological replicates n = 2, ****P < 0.0001. FIG. 9F. Left: schematic of mutant GLP1R correction (NM_002062.3:c.402+3delG, c.396A>G) via HDR. Right: frequency of GLP1R correction quantified by ddPCR. The numerator indicates the GLP1R mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. ****P < 0.0001, two-sided Fisher’s exact test. FIG. 9G. Left: live cell number collected at 72 h post-electroporation. Data are presented as the mean ± SEM, biological replicates n = 4. A two-tailed t-test was used, ***P < 0.001. Right: representative images of corrected GFP mutant cells at 0 h and 72 h post-electroporation.
FIG. 10A. Schematic of in situ mutant GFP correction in iCas9 iPSC-derived human heart organoid (hHO). FIG. 10B. Left: representative maximum projection of immunofluorescence staining of GFP correction, magenta: cardiac-specific troponin T (cTNT), gray: DAPI, green: corrected GFP, scale bar = 100 µm; Right: frequency of GFP-positive cells quantified using Fiji software. Data were presented as the mean ± SEM, biological replicates n (unmodified ssODN) = 3, n (5’ Cy5-ssODN) = 4. A two-tailed t-test was used, ** P < 0.01. FIG. 10C. Schematic of mutant HBB correction in SC9N iPSC-derived human blastoids. FIG. 10D. Representative maximum projection of immunofluorescence staining of blastoid, magenta: primitive endoderm (PrE)-GATA4, yellow: naive epiblast (EPI)-SOX2, cyan: trophectoderm (TE)-GATA3, scale bar = 20 µm. FIG. 10E. Left: the violin plot of the formed blastoid diameter. Data were presented as the mean ± SEM. A two-tailed t-test was used, ns: not significant; Right: frequency of blastoids with cavitated structures. Data were presented as the mean ± SEM of four independent experiments. A two-tailed t-test was used, ** P < 0.01. FIG. 10F. Top: schematic of in vitro human blastoid attachment assay. Middle left: representative phase-contrast images of the attached blastoids after correction using different ssODNs. Scale bar = 50 µm. Right: detection of stimulated CGP of edited blastoids using commercial pregnancy test kit. Bottom: representative maximum projection of immunofluorescence images for the attached blastoids, magenta: GATA4, yellow: Oct4, cyan: GATA3. Scale bar, 100 µm. FIG. 10G. Left: schematic of mutant HBB correction (NM_000518.5:c.20A>T) from “CTG” to “CAG” in human blastoids. Right: frequency of HBB correction quantified by ddPCR. The numerator indicates the HBB mutant corrected event number, and the denominator indicates the total event number detected by ddPCR. ****p < 0.0001, two-sided Fisher’s exact test. FIG. 10H. Area of GFP-positive cells quantified from wide-field images. Data were presented as the mean ± SEM, biological replicates n = 15. A two-tailed t-test was used, **** p < 0.0001.
FIG. 11A. Schematic of GFP correction followed by bulk RNA-seq and western blot workflow. FIG. 11B. Volcano plot comparing gene expression fold changes between 5’-Cy5-ssODN (left) and unmodified ssODN conditions. FIG. 11C. Gene Ontology (GO) enrichment analysis of differentially upregulated genes in the 5’-Cy5-ssODN condition compared to the unmodified ssODN condition. FIG. 11D. Heatmap of the expression levels of genes involved in DNA repair and chromatin remodeling pathways. Gene expression is represented by a gradient color scale, with red indicating higher expression and blue representing lower expression. FIG. 11E. Representative Western blot analysis of H1.0, RAD51, and RAD52 expression across three conditions. H3 was used as a loading control. Relative expression values were normalized to the mock condition. FIG. 11F. Frequency of GFP-positive cells quantified by FACS. Data are presented as mean ± SEM, biological replicates n = 18. A two-tailed t-test was used; ****P < 0.0001, ns: not significant. FIG. 11G. Frequency of GFP-positive cells quantified by FACS. Data are presented as mean ± SEM, biological replicates n = 6. A two-tailed t-test was used; *P < 0.05, ****P < 0.0001.
FIG. 12A. The radius of gyration Rg as a function of simulation time for 5’Cy5-ssODN90 and unmodified ssODN90. FIG. 12B. Hydrogen bond number analysis as a function of simulation time for 5’Cy5-ssODN90 and unmodified ssODN90. FIG. 12C. Calculated RDF plots of various cyanines around the C fragment in their corresponding GFPD10 systems. FIG. 12D. Calculated binding energy of unmodified and various cyanine modified Cy-G:C systems in GFP sequence. 5Cy5, 5Cy5.5, and 5Cy3 fragments are the last three rightmost bars, with 5Cy5 at the rightmost and 5Cy5.5 in the middle.
DETAILED DESCRIPTION OF THE INVENTION
Although the genome editing efficiency of CRISPR/Cas9 is relatively high, this mainly applies to gene knockout (KO); HDR efficiency is still considered low. The high frequency of unintended on-target large deletions or other complex genomic rearrangements induced by CRISPR/Cas9 is a potential risk for both scientific research and clinical application. Genetic diseases, particularly point-mutation diseases, can be treated by using CRISPR/Cas9 to knock in the correct sequences and fix the mutations or deletions via the HDR repair pathway. However, the relatively low HDR efficiency and high frequency of on-target large deletions of CRISPR/Cas9 constrain its applications.
The large deletion (LD) issue significantly impacts the application of the gene editing tool CRISPR-Cas9. The non-limiting Examples demonstrate that LDs induced by CRISPR-Cas9 could be mainly mediated by the MMEJ pathway. The Examples describe the roles of some key MMEJ genes in LDs induced by CRISPR-Cas9; replication protein A (RPA) and DNA polymerase theta play crucial roles in the MMEJ pathway.
Inhibition of POLQ or overexpression of RPA significantly reduced LD frequency. By contrast, no changes in LD frequency were found when knocking down PARP1 or LIG3. The results were confirmed using IDMseq, an individual molecule sequencing method. These discoveries highlight the role of MMEJ in CRISPR-Cas9 mediated gene editing and provide targets and strategies for safe editing.
Data in the present application also demonstrate that modifying the 5’ and/or 3’ end of nucleic acids involved in CRISPR/Cas gene editing can dramatically improve CRISPR/Cas gene editing efficiency.
The disclosed method and compositions may be understood more readily by reference to the following detailed description of particular embodiments and the Example included therein and to the Figures and their previous and following description.
It is to be understood that the disclosed method and compositions are not limited to specific synthetic methods, specific analytical techniques, or to particular reagents unless otherwise specified, and, as such, may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
DEFINITIONS
As used herein, the term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that includes coding sequences necessary for the production of a polypeptide, RNA (e.g., including,
but not limited to, mRNA, tRNA and rRNA) or precursor. The polypeptide, RNA, or precursor can be encoded by a full-length coding sequence or by any portion thereof. The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The term “gene” encompasses both cDNA and genomic forms of a gene, which may be made of DNA, or RNA. A genomic form or clone of a gene may contain the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation.
As used herein, the term “inhibit” or other forms of the word such as “inhibiting” or “inhibition” means to hinder or restrain a particular characteristic, for example, to reduce, decrease or prevent, either partially or entirely. It is understood that this is typically in relation to some standard or expected value, in other words it is relative, but that it is not always necessary for the standard or relative value to be referred to. For example, “inhibits POLQ” means hindering or restraining the activity of the protein relative to a standard or a control. “Inhibits POLQ” can also mean to hinder or restrain the synthesis or expression of the protein, or mRNA encoding the protein, relative to a standard or control.
As used herein, “mammal” includes both humans and non-humans and includes, but is not limited to, humans, non-human primates, canines, felines, murines, bovines, equines, and porcines.
As used herein, a “vector” is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment. The vectors described herein can be expression vectors.
As used herein, an “expression vector” is a vector that includes one or more expression control sequences.
As used herein, an “expression control sequence” is a DNA sequence that controls and regulates the transcription and/or translation of another DNA sequence.
As used herein, the term “treating” includes alleviating the symptoms associated with a specific disorder or condition and/or preventing or eliminating the symptoms.
“Operably linked” refers to a juxtaposition wherein the components are configured so as to perform their usual function. For example, control sequences or promoters operably linked to
a coding sequence are capable of effecting the expression of the coding sequence, and an organelle localization sequence operably linked to protein will direct the linked protein to be localized at the specific organelle.
As used herein, “transformed” and “transfected” encompass the introduction of a nucleic acid (e.g. a vector) into a cell by a number of techniques known in the art.
"Effective amount" and “therapeutically effective amount,” used interchangeably, as applied to the nanoparticles, therapeutic agents, and pharmaceutical compositions described herein, mean the quantity necessary to render the desired therapeutic result. For example, an effective amount is a level effective to treat, cure, or alleviate the symptoms of a disease for which the composition and/or therapeutic agent, or pharmaceutical composition, is/are being administered.
Throughout the description and claims of this specification, the word “comprise” and variations of the word, such as “comprising” and “comprises,” means “including but not limited to,” and is not intended to exclude, for example, other additives, components, integers or steps.
“Optional” or “optionally” means that the subsequently described event, circumstance, or material may or may not occur or be present, and that the description includes instances where the event, circumstance, or material occurs or is present and instances where it does not occur or is not present.
Ranges may be expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, also specifically contemplated and considered disclosed is the range from the one particular value and/or to the other particular value unless the context specifically indicates otherwise. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another, specifically contemplated embodiment that should be considered disclosed unless the context specifically indicates otherwise. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint unless the context specifically indicates otherwise. It should be understood that all of the individual values and sub-ranges of values contained within an explicitly disclosed range are also specifically contemplated and should be considered disclosed unless the context specifically indicates otherwise. Finally, it should be understood that all ranges refer both to the recited range as a range and as a collection of individual numbers from and including the first endpoint to and including the second endpoint. In the latter case, it should be understood that any of the individual numbers can be selected as one form of the quantity, value, or feature to which the range refers. In this way, a range describes a set of numbers or values from and including the
first endpoint to and including the second endpoint from which a single member of the set (i.e. a single number) can be selected as the quantity, value, or feature to which the range refers. The foregoing applies regardless of whether in particular cases some or all of these embodiments are explicitly disclosed.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed method and compositions belong. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present method and compositions, the particularly useful methods, devices, and materials are as described. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such disclosure by virtue of prior invention. No admission is made that any reference constitutes prior art. The discussion of references states what their authors assert, and applicants reserve the right to challenge the accuracy and pertinency of the cited documents. It will be clearly understood that, although a number of publications are referred to herein, such reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the method and compositions described herein.
I. COMPOSITIONS
The compositions in some forms include (A) Agents for reducing POLQ level/activity in a cell; (B) agents for increasing RPA levels/activity in a cell and (C) fluorophore-modified nucleic acids used in gene editing.
Agents for reducing POLQ level/activity include agents that inhibit POLQ expression or activity in a cell, including functional nucleic acids such as siRNA and shRNA, and small molecules such as Novobiocin.
Agents for increasing RPA in cells, for example, by increasing the transcription or translation of the RPA, include nucleic acids encoding RPA.
Fluorophore-modified nucleic acids used in gene editing include for example, fluorophore labelled ssODN and/or sgRNA.
A. Molecules for Inhibiting POLQ
Data in the present application demonstrate that inhibiting the activity of PolQ, the main player of the MMEJ repair pathway, can increase HDR efficiency two-fold and reduce the frequency of large deletions by 50%. Therefore, compositions for reducing PolQ activity, such as small molecule inhibitors, siRNA and shRNA, are useful in the disclosed methods.
1. Small molecules
In some forms, the agent used to reduce PolQ activity is a small molecule. A preferred small molecule is Novobiocin. However, other PolQ inhibitors can be used, including, but not limited to, ART4215 (a potent and selective inhibitor of deoxyribonucleic acid (DNA) polymerase (pol) theta) (https://clinicaltrials.gov/ct2/show/NCT04991480) and ART558 (Zatreanu, D., Robinson, Alkhatib, O. et al. Polθ inhibitors elicit BRCA-gene synthetic lethality and target PARP inhibitor resistance. Nat Commun 12, 3636 (2021). https://doi.org/10.1038/s41467-021-23463-8).
2. Functional Nucleic acid inhibitors
The inhibitor can be a functional nucleic acid. Protein expression and/or activity of a desired protein can be inhibited using a functional nucleic acid (herein, inhibiting NA), or vector encoding the same, which reduces expression of the desired protein i.e., POLQ.
Functional nucleic acids (FNAs) refer to those nucleic acids whose functions are beyond the conventional genetic roles of nucleic acids. As discussed in more detail below, functional nucleic acid molecules can be divided into the following non-limiting categories: antisense molecules, siRNA, miRNA, aptamers, ribozymes, triplex forming molecules, RNAi, external guide sequences, and other gene editing compositions. The functional nucleic acid molecules can act as effectors, inhibitors, modulators, and stimulators of a specific activity possessed by a target molecule, or the functional nucleic acid molecules can possess a de novo activity independent of any other molecules.
Functional nucleic acid molecules can interact with any macromolecule, such as DNA, RNA, polypeptides, or carbohydrate chains. Thus, functional nucleic acids can interact with the mRNA or the genomic DNA of a target polypeptide or they can interact with the polypeptide itself. Often functional nucleic acids are designed to interact with other nucleic acids based on sequence homology between the target molecule and the functional nucleic acid molecule. In other situations, the specific recognition between the functional nucleic acid molecule and the target molecule is not based on sequence homology between the functional nucleic acid molecule and the target molecule, but rather is based on the formation of tertiary structure that allows specific recognition to take place.
Therefore the compositions can include one or more functional nucleic acids designed to reduce expression of the POLQ gene, or a gene product thereof. For example, the functional nucleic acid or polypeptide can be designed to target and reduce or inhibit expression or
translation of POLQ; or to reduce or inhibit expression, reduce activity, or increase degradation of POLQ protein. In some embodiments, the composition includes a vector suitable for in vivo expression of the functional nucleic acid.
In some embodiments, a functional nucleic acid or polypeptide is designed to target a segment of the nucleic acid sequence encoding POLQ, or the complement thereof, or a genomic sequence corresponding therewith, or variants thereof having a nucleic acid sequence at least 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to a sequence encoding POLQ.
In some embodiments, a functional nucleic acid or polypeptide is designed to target a segment of the nucleic acid encoding the amino acid sequence of POLQ, or the complement thereof, or variants thereof having a nucleic acid sequence at least 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to a nucleic acid encoding the amino acid sequence of POLQ.
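The percent-identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical, non-gap columns over the alignment length; the function names are hypothetical, and other identity definitions (for example, dividing by the shorter sequence) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap base.

    Assumption: both inputs are pre-aligned, equal-length strings that
    use '-' for gaps. A gap column is never counted as a match.
    """
    a, b = aligned_a.upper(), aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only (not POLQ sequences).
    reference = "ACGTACGTAC"
    variant = "ACGTACGTAA"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant, threshold=95.0))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the counting step.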
In some embodiments, the functional nucleic acid hybridizes to the nucleic acid encoding POLQ, or a complement thereof, for example, under stringent conditions. In some embodiments, the functional nucleic acid hybridizes to a nucleic acid sequence that encodes the amino acid sequence of POLQ, or a complement thereof, for example, under stringent conditions.
i. RNA Interference
In some embodiments, PolQ RNA expression is reduced through RNA interference (RNAi). This silencing was originally observed with the addition of double stranded RNA (dsRNA) (Fire, et al. (1998) Nature, 391:806-11; Napoli, et al. (1990) Plant Cell 2:279-89; Hannon, (2002) Nature, 418:244-51). Once dsRNA enters a cell, it is cleaved by an RNase III-like enzyme, Dicer, into double stranded small interfering RNAs (siRNA) 21-23 nucleotides in length that contain 2-nucleotide overhangs on the 3’ ends (Elbashir, et al. (2001) Genes Dev., 15:188-200; Bernstein, et al. (2001) Nature, 409:363-6; Hammond, et al. (2000) Nature, 404:293-6). In an ATP dependent step, the siRNAs become integrated into a multi-subunit protein complex, commonly known as the RNAi induced silencing complex (RISC), which guides the siRNAs to the target RNA sequence (Nykanen, et al. (2001) Cell, 107:309-21). At some point, the siRNA duplex unwinds, and it appears that the antisense strand remains bound to RISC and directs degradation of the complementary mRNA sequence by a combination of endo- and exonucleases (Martinez, et al. (2002) Cell, 110:563-74). However, the effect of RNAi or siRNA or their use is not limited to any type of mechanism.
Short Interfering RNA (siRNA) is a double-stranded RNA that can induce sequence-specific post-transcriptional gene silencing, thereby decreasing or even inhibiting gene expression. In one example, a siRNA triggers the specific degradation of homologous RNA molecules, such as mRNAs, within the region of sequence identity between both the siRNA and the target RNA. For example, WO 02/44321 discloses siRNAs capable of sequence-specific degradation of target mRNAs when base-paired with 3’ overhanging ends, herein incorporated by reference for the method of making these siRNAs.
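The duplex geometry described above (two 21-nucleotide strands pairing over 19 bases, each with a 2-nucleotide 3’ overhang) can be sketched as follows. This is an illustration only: the 23-nucleotide target window, the function names, and the example sequence are assumptions, and no design rules (GC content, off-target screening) are applied.

```python
from typing import Tuple

_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(rna))


def design_sirna(target_site: str) -> Tuple[str, str]:
    """Derive (sense, antisense) 21-nt strands from a 23-nt mRNA window.

    The sense strand is window positions 3-23 and the antisense strand is
    the reverse complement of positions 1-21, so the strands pair over
    19 nt and each keeps a 2-nt 3' overhang.
    """
    site = target_site.upper().replace("T", "U")  # accept DNA or RNA input
    if len(site) != 23 or set(site) - set("ACGU"):
        raise ValueError("target_site must be 23 nt of A/C/G/U (or T)")
    sense = site[2:]                           # 5'->3', same as the mRNA
    antisense = reverse_complement(site[:21])  # 5'->3', pairs with the mRNA
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical 23-nt window; not a POLQ sequence.
    sense, antisense = design_sirna("AAGCUGACCCUGAAGUUCAUCUU")
    print("sense     5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
```

The antisense strand is fully complementary to the first 21 nucleotides of the window, which matches the description of the antisense strand directing degradation of the complementary mRNA.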
Sequence specific gene silencing can be achieved in mammalian cells using synthetic, short double-stranded RNAs that mimic the siRNAs produced by the enzyme Dicer (Elbashir, et al. (2001) Nature, 411:494-498) (Ui-Tei, et al. (2000) FEBS Lett 479:79-82). siRNA can be chemically or in vitro-synthesized or can be the result of short double-stranded hairpin-like RNAs (shRNAs) that are processed into siRNAs inside the cell. Synthetic siRNAs are generally designed using algorithms and a conventional DNA/RNA synthesizer. Suppliers include Ambion (Austin, Texas), ChemGenes (Ashland, Massachusetts), Dharmacon (Lafayette, Colorado), Glen Research (Sterling, Virginia), MWG Biotech (Ebersberg, Germany), Proligo (Boulder, Colorado), and Qiagen (Venlo, The Netherlands). siRNA can also be synthesized in vitro using kits such as Ambion’s SILENCER® siRNA Construction Kit.
The production of siRNA from a vector is more commonly done through the transcription of short hairpin RNAs (shRNAs). Kits for the production of vectors comprising shRNA are available, such as, for example, Imgenex’s GENESUPPRESSOR™ Construction Kits and Invitrogen’s BLOCK-iT™ inducible RNAi plasmid and lentivirus vectors.
ii. Antisense
PolQ expression can be reduced using antisense molecules. Antisense molecules are designed to interact with a target nucleic acid molecule through either canonical or non-canonical base pairing. The interaction of the antisense molecule and the target molecule is designed to promote the destruction of the target molecule through, for example, RNase H mediated RNA-DNA hybrid degradation. Alternatively, the antisense molecule is designed to interrupt a processing function that normally would take place on the target molecule, such as transcription or replication. Antisense molecules can be designed based on the sequence of the target molecule. There are numerous methods for optimization of antisense efficiency by finding the most accessible regions of the target molecule. Exemplary methods include in vitro selection experiments and DNA modification studies using DMS and DEPC. It is preferred that antisense molecules bind the target molecule with a dissociation constant (Kd) less than or equal to 10⁻⁶, 10⁻⁸, 10⁻¹⁰, or 10⁻¹².
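To give a sense of what these Kd thresholds mean, the sketch below computes the fraction of target bound at equilibrium. It rests on assumptions not stated in the source: Kd is expressed in molar units, binding is simple 1:1, and the free concentration of the binding molecule is held fixed; under those assumptions the bound fraction is [L] / ([L] + Kd). The function name and the 10 nM concentration are illustrative.

```python
def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of target occupied for simple 1:1 binding.

    Assumes the free ligand concentration is known and constant
    (ligand in excess over target). Both arguments are in mol/L.
    """
    if free_ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_molar / (free_ligand_molar + kd_molar)


if __name__ == "__main__":
    ligand = 1e-8  # 10 nM, an arbitrary illustrative concentration
    for kd in (1e-6, 1e-8, 1e-10, 1e-12):
        occupancy = fraction_bound(ligand, kd)
        print(f"Kd = {kd:.0e} M -> fraction bound = {occupancy:.4f}")
```

A lower Kd gives higher occupancy at the same concentration, which is why tighter-binding molecules are described as preferred.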
An “antisense” nucleic acid sequence (antisense oligonucleotide) can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to a nucleic acid encoding POLQ. Antisense nucleic acid sequences and delivery methods are well known in the art (Goodchild, Curr. Opin. Mol. Ther., 6(2): 120-128 (2004); Clawson, et al., Gene Ther., 11(17): 1331-1341 (2004)). The antisense nucleic acid can be complementary to an entire coding strand of a target sequence, or to only a portion thereof. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
An antisense nucleic acid can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).
Other examples of useful antisense oligonucleotides (AONs/ASOs) include an alpha-anomeric nucleic acid. An alpha-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual beta-units, the strands run parallel to each other (Gaultier et al., Nucleic Acids. Res. 15:6625-6641 (1987)). The antisense nucleic acid molecule can also comprise a 2’-O-methylribonucleotide (Inoue et al. Nucleic Acids Res. 15:6131-6148 (1987)) or a chimeric RNA-DNA analogue (Inoue et al. FEBS Lett., 215:327-330 (1987)).
iii. Triplex forming molecules
PolQ expression can be reduced using triplex forming molecules. Triplex forming functional nucleic acid molecules are molecules that can interact with either double-stranded or single-stranded nucleic acid. When triplex molecules interact with a target region, a structure called a triplex is formed in which there are three strands of DNA forming a complex dependent on both Watson-Crick and Hoogsteen base-pairing. Triplex molecules are preferred because
they can bind target regions with high affinity and specificity. It is preferred that the triplex forming molecules bind the target molecule with a Kd less than 10⁻⁶, 10⁻⁸, 10⁻¹⁰, or 10⁻¹².
iv. External guide sequences
PolQ RNA expression can be reduced using external guide sequences. External guide sequences (EGSs) are molecules that bind a target nucleic acid molecule forming a complex, which is recognized by RNase P, which then cleaves the target molecule. EGSs can be designed to specifically target an RNA molecule of choice. RNase P aids in processing transfer RNA (tRNA) within a cell. Bacterial RNase P can be recruited to cleave virtually any RNA sequence by using an EGS that causes the target RNA:EGS complex to mimic the natural tRNA substrate. Similarly, eukaryotic EGS/RNase P-directed cleavage of RNA can be utilized to cleave desired targets within eukaryotic cells. Representative examples of how to make and use EGS molecules to facilitate cleavage of a variety of different target molecules are known in the art.
Methods for delivering nucleic acid payloads are known in the art (reviewed in Paunovska, et al. Nat. Rev. Genet., 23:265-280 (2022)).
v. Aptamers
The functional nucleic acids can be aptamers. Aptamers are molecules that interact with a target molecule, preferably in a specific way. Typically aptamers are small nucleic acids ranging from 15-50 bases in length that fold into defined secondary and tertiary structures, such as stem-loops or G-quartets. Aptamers can bind small molecules, such as ATP and theophylline, as well as large molecules, such as reverse transcriptase and thrombin. Aptamers can bind very tightly with Kd’s for the target molecule of less than 10⁻¹² M. It is preferred that the aptamers bind the target molecule with a Kd less than 10⁻⁶, 10⁻⁸, 10⁻¹⁰, or 10⁻¹². Aptamers can bind the target molecule with a very high degree of specificity. For example, aptamers have been isolated that have greater than a 10,000 fold difference in binding affinities between the target molecule and another molecule that differ at only a single position on the molecule. It is preferred that the aptamer have a Kd with the target molecule at least 10, 100, 1000, 10,000, or 100,000 fold lower than the Kd with a background binding molecule. It is preferred when doing the comparison for a molecule such as a polypeptide, that the background molecule be a different polypeptide.
vi. Ribozymes
The functional nucleic acids can be ribozymes. Ribozymes are nucleic acid molecules that are capable of catalyzing a chemical reaction, either intramolecularly or intermolecularly. It is preferred that the ribozymes catalyze intermolecular reactions. There are a number of different types of ribozymes that catalyze nuclease or nucleic acid polymerase type reactions which are based on ribozymes found in natural systems, such as hammerhead ribozymes. There are also a
number of ribozymes that are not found in natural systems, but which have been engineered to catalyze specific reactions de novo. Preferred ribozymes cleave RNA or DNA substrates, and more preferably cleave RNA substrates. Ribozymes typically cleave nucleic acid substrates through recognition and binding of the target substrate with subsequent cleavage. This recognition is often based mostly on canonical or non-canonical base pair interactions. This property makes ribozymes particularly good candidates for target specific cleavage of nucleic acids because recognition of the target substrate is based on the target substrate’s sequence.
3. Other Gene Editing Compositions
In some embodiments the functional nucleic acids are gene editing compositions. Gene editing compositions can include nucleic acids that encode an element or elements that induce a single or a double strand break in the target cell’s genome, and optionally a polynucleotide. The compositions can be used, for example, to reduce or otherwise modify expression of POLQ.
i. Strand Break Inducing Elements
CRISPR/Cas
In some embodiments, the element that induces a single or a double strand break in the target cell’s genome is a CRISPR/Cas system. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is an acronym for DNA loci that contain multiple, short, direct repetitions of base sequences. The prokaryotic CRISPR/Cas system has been adapted for gene editing (silencing, enhancing or changing specific genes) in eukaryotes (see, for example, Cong, Science, 15:339(6121):819-823 (2013) and Jinek, et al., Science, 337(6096):816-21 (2012)). By transfecting a cell with the required elements including a Cas gene and specifically designed CRISPRs, the organism's genome can be cut and modified at any desired location. Methods of preparing compositions for use in genome editing using the CRISPR/Cas systems are described in detail in WO 2013/176772 and WO 2014/018423, which are specifically incorporated by reference herein in their entireties.
Double strand breaks can be repaired by the cell in one of two ways: non-homologous end joining, and homology-directed repair (HDR) (discussed further below). In non-homologous end joining (NHEJ), the double-strand breaks are repaired by direct ligation of the break ends to one another. As such, no new nucleic acid material is inserted into the site, although some nucleic acid material may be lost, resulting in a deletion. In homology-directed repair, a donor polynucleotide with homology to the cleaved target DNA sequence is used as a template for repair of the cleaved target DNA sequence, resulting in the transfer of genetic information from a donor polynucleotide to the target DNA. As such, new nucleic acid material can be inserted/copied into the site.
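The difference between the two repair outcomes can be shown with a toy string model. This is a conceptual sketch only, not a simulation of the biology: the sequences, the cut position, the homology arm length, and the function names are all hypothetical, and the donor is assumed to consist of a left arm, an edited segment, and a right arm of equal arm length.

```python
def nhej_repair(target: str, cut_index: int,
                lost_left: int = 0, lost_right: int = 0) -> str:
    """Join the two break ends directly, optionally losing bases at each end."""
    if not 0 <= cut_index <= len(target):
        raise ValueError("cut_index is outside the target sequence")
    left_end = target[: max(cut_index - lost_left, 0)]
    right_end = target[cut_index + lost_right:]
    return left_end + right_end  # no new material is added


def hdr_repair(target: str, donor: str, arm_len: int) -> str:
    """Copy the donor into the target between its two homology arms."""
    if arm_len <= 0 or 2 * arm_len > len(donor):
        raise ValueError("arm_len does not fit the donor")
    left_arm, right_arm = donor[:arm_len], donor[-arm_len:]
    left_pos = target.find(left_arm)
    if left_pos == -1:
        raise ValueError("left homology arm not found in target")
    right_pos = target.find(right_arm, left_pos + arm_len)
    if right_pos == -1:
        raise ValueError("right homology arm not found in target")
    # Everything between the arms is replaced by the donor's middle segment.
    return target[:left_pos] + donor + target[right_pos + arm_len:]


if __name__ == "__main__":
    # Toy sequences: the donor changes "GTG" to "GAG" between 9-nt arms.
    target = "TTGACCTGACTCCTGTGGAGAAGTCTGCAA"
    donor = "CTGACTCCT" + "GAG" + "GAGAAGTCT"
    print("NHEJ:", nhej_repair(target, cut_index=15, lost_left=2, lost_right=3))
    print("HDR: ", hdr_repair(target, donor, arm_len=9))
```

In this model NHEJ can only shorten or preserve the sequence, whereas HDR transfers the donor's edited segment into the target, mirroring the distinction drawn in the paragraph above.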
Therefore, in some embodiments, the genome editing composition includes a donor polynucleotide. The modifications of the target DNA due to NHEJ and/or homology-directed repair can be used to induce gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc.
Accordingly, cleavage of DNA by the genome editing composition can be used to delete nucleic acid material from a target DNA sequence by cleaving the target DNA sequence and allowing the cell to repair the sequence in the absence of an exogenously provided donor polynucleotide. Thus, the subject methods can be used to knock out a gene (resulting in complete lack of transcription or altered transcription) or to knock in genetic material into a locus of choice in the target DNA.
In general, “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus. One or more tracr mate sequences operably linked to a guide sequence (e.g., direct repeat-spacer-direct repeat) can also be referred to as pre-crRNA (pre-CRISPR RNA) before processing or crRNA after processing by a nuclease.
In some embodiments, a tracrRNA and crRNA are linked and form a chimeric crRNA-tracrRNA hybrid where a mature crRNA is fused to a partial tracrRNA via a synthetic stem loop to mimic the natural crRNA:tracrRNA duplex, as described in Cong, Science, 339(6121):819-823 (2013) and Jinek, et al., Science, 337(6096):816-21 (2012). A single fused crRNA-tracrRNA construct can also be referred to as a guide RNA or gRNA (or single-guide RNA (sgRNA)). Within an sgRNA, the crRNA portion can be identified as the ‘target sequence’ and the tracrRNA portion is often referred to as the ‘scaffold’.
In some embodiments, one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system. In some embodiments, one or more elements of a CRISPR system is derived from a particular organism including an endogenous CRISPR system, such as Streptococcus pyogenes.
In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence can be any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
In the target nucleic acid, each protospacer is associated with a protospacer adjacent motif (PAM) whose recognition is specific to individual CRISPR systems. In the Streptococcus pyogenes CRISPR/Cas system, the PAM is the nucleotide sequence NGG. In the Streptococcus thermophilus CRISPR/Cas system, the PAM is the nucleotide sequence NNAGAAW. The crRNA:tracrRNA duplex directs Cas to the DNA target consisting of the protospacer and the requisite PAM via heteroduplex formation between the spacer region of the crRNA and the protospacer DNA.
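The relationship between a protospacer and its PAM can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `find_protospacers`, the 20-nucleotide protospacer length, the single-strand scan, and the example sequence are all hypothetical and are not taken from the disclosure; the PAM is assumed to lie immediately 3' of the protospacer.

```python
import re

# IUPAC codes needed for the two PAMs named in the text
# (N = any base, W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_protospacers(dna, pam, spacer_len=20):
    """Return (start, protospacer, pam_hit) for each PAM match on this strand.

    Only the strand passed in is scanned; the reverse complement would
    have to be scanned separately.
    """
    dna = dna.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # The lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical 30-nt sequence, used only to exercise the function.
example = "ACGTACGTACGTACGTACGTTGGCATGCAT"
hits = find_protospacers(example, "NGG")          # S. pyogenes-type PAM
st_hits = find_protospacers("A" * 20 + "CCAGAAT", "NNAGAAW")  # S. thermophilus-type PAM
```

A practical design tool would also scan the opposite strand and score candidate guides for off-target matches, which this sketch does not attempt.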
Typically, in the context of an endogenous CRISPR system, formation of a CRISPR complex (including a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage of one or both strands in or near (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence. All or a portion of the tracr sequence may also form part of a CRISPR complex, such as by hybridization to all or a portion of a tracr mate sequence that is operably linked to the guide sequence.
There are many resources available for helping practitioners determine suitable target sites once a desired DNA target sequence is identified. For example, numerous public resources, including a bioinformatically generated list of about 190,000 potential sgRNAs, targeting more than 40% of human exons, are available to aid practitioners in selecting target sites and designing the associated sgRNA to effect a nick or double strand break at the site. See also crispr.u-psud.fr/, a tool designed to help scientists find CRISPR targeting sites in a wide range of species and generate the appropriate crRNA sequences.
In some embodiments, one or more vectors driving expression of one or more elements of a CRISPR system are introduced into a target cell such that expression of the elements of the CRISPR system directs formation of a CRISPR complex at one or more target sites.
For example, a Cas enzyme, a guide sequence linked to a tracr-mate sequence, and a tracr sequence could each be operably linked to separate regulatory elements on separate vectors. Alternatively, two or more of the elements expressed from the same or different regulatory elements may be combined in a single vector, with one or more additional vectors providing any components of the CRISPR system not included in the first vector. CRISPR system elements that are combined in a single vector may be arranged in any suitable orientation, such as one element located 5' with respect to (“upstream” of) or 3' with respect to (“downstream” of) a second element. The coding sequence of one element can be located on the same or opposite strand of the coding sequence of a second element, and oriented in the same or opposite direction. In some embodiments, a single promoter drives expression of a transcript encoding a CRISPR enzyme and one or more of the guide sequence, tracr mate sequence (optionally operably linked to the guide sequence), and a tracr sequence embedded within one or more intron sequences (e.g., each in a different intron, two or more in at least one intron, or all in a single intron). In some embodiments, the CRISPR enzyme, guide sequence, tracr mate sequence, and tracr sequence are operably linked to and expressed from the same promoter.
In some embodiments, a vector includes one or more insertion sites, such as a restriction endonuclease recognition sequence (also referred to as a “cloning site”). In some embodiments, one or more insertion sites (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more insertion sites) are located upstream and/or downstream of one or more sequence elements of one or more vectors. In some embodiments, a vector includes an insertion site upstream of a tracr mate sequence, and optionally downstream of a regulatory element operably linked to the tracr mate sequence, such that following insertion of a guide sequence into the insertion site and upon expression the guide sequence directs sequence-specific binding of a CRISPR complex to a target sequence in a eukaryotic cell. In some embodiments, a vector includes two or more insertion sites, each insertion site being located between two tracr mate sequences so as to allow insertion of a guide sequence at each site. In such an arrangement, the two or more guide sequences can include two or more copies of a single guide sequence, two or more different guide sequences, or combinations of these. When multiple different guide sequences are used, a single expression construct may be used to target CRISPR activity to multiple different, corresponding target sequences within a cell. For example, a single vector can include about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 guide sequences. In some embodiments, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, such guide-sequence-containing vectors may be provided, and optionally delivered to a cell.
In some embodiments, a vector includes a regulatory element operably linked to an enzyme-coding sequence encoding a CRISPR enzyme, such as a Cas protein. Non-limiting examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, homologues thereof, or modified versions thereof. In some embodiments, the unmodified CRISPR enzyme has DNA cleavage activity, such as Cas9. In some embodiments, the CRISPR enzyme directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence. In some embodiments, the CRISPR enzyme directs cleavage of one or both strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence.
In some embodiments, a vector encodes a CRISPR enzyme that is mutated with respect to a corresponding wild-type enzyme such that the mutated CRISPR enzyme lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. For example, an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand). Other examples of mutations that render Cas9 a nickase include, without limitation, H840A, N854A, and N863A. As a further example, two or more catalytic domains of Cas9 (RuvC I, RuvC II, and RuvC III) can be mutated to produce a mutated Cas9 substantially lacking all DNA cleavage activity. In some embodiments, a D10A mutation is combined with one or more of H840A, N854A, or N863A mutations to produce a Cas9 enzyme substantially lacking all DNA cleavage activity. In some embodiments, a CRISPR enzyme is considered to substantially lack all DNA cleavage activity when the DNA cleavage activity of the mutated enzyme is less than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or lower with respect to its non-mutated form.
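The mutation logic described above can be sketched in Python as a simple lookup. The function names, the category labels, and the 25% default threshold are hypothetical choices made for illustration (the threshold is one of the values listed in the text); the sketch only encodes the combinations the text names and says nothing about other variants.

```python
# Substitutions named in the text. D10A lies in the RuvC I domain; the
# others are the additional nickase mutations the text lists.
RUVC_MUTATIONS = {"D10A"}
OTHER_NICKASE_MUTATIONS = {"H840A", "N854A", "N863A"}


def classify_cas9(mutations):
    """Classify an S. pyogenes Cas9 variant from its set of substitutions."""
    muts = set(mutations)
    has_ruvc = bool(muts & RUVC_MUTATIONS)
    has_other = bool(muts & OTHER_NICKASE_MUTATIONS)
    if has_ruvc and has_other:
        # D10A combined with H840A, N854A or N863A
        return "cleavage-deficient"
    if has_ruvc or has_other:
        return "nickase"
    return "nuclease"


def substantially_lacks_cleavage(mutant_activity, wild_type_activity, threshold=0.25):
    """True when mutant activity is below `threshold` times wild-type activity.

    Both activities must be measured in the same (arbitrary) units.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return mutant_activity / wild_type_activity < threshold
```

For instance, `classify_cas9(["D10A"])` returns `"nickase"`, while `classify_cas9(["D10A", "H840A"])` returns `"cleavage-deficient"`.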
In some embodiments, an enzyme coding sequence encoding a CRISPR enzyme is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells can be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules.
The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database”, and these tables can be adapted in a number of ways. See Nakamura, Y., et al., Nucl. Acids Res., 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, for example Gene Forge (Aptagen; Jacobus, PA), are also available. In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
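Codon optimization as described above, replacing codons with host-preferred synonyms while keeping the amino acid sequence unchanged, can be sketched as follows. The preferred-codon table and the example coding sequence are hypothetical placeholders rather than values from a real codon usage table, and the function names are illustrative; the standard genetic code is used for translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, preferred):
    """Swap each codon for the host-preferred synonym, where one is given.

    `preferred` maps one-letter amino acids to a codon; amino acids
    missing from the map keep their native codon.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        new_codon = preferred.get(aa, codon)
        if CODON_TABLE[new_codon] != aa:
            raise ValueError("preferred codon %s does not encode %s" % (new_codon, aa))
        out.append(new_codon)
    return "".join(out)


# Hypothetical preferences and input, for illustration only.
PREFERRED = {"L": "CTG", "A": "GCC", "R": "AGA"}
native = "ATGTTAGCGCGTTAA"
optimized = codon_optimize(native, PREFERRED)
```

Checking that `translate(native) == translate(optimized)` confirms that the native amino acid sequence is maintained, which is the constraint the text places on codon optimization.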
In some embodiments, a vector encodes a CRISPR enzyme including one or more nuclear localization sequences (NLSs). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N-or C-terminus.
In general, the one or more NLSs are of sufficient strength to drive accumulation of the CRISPR enzyme in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the CRISPR enzyme, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique. For example, a detectable marker may be fused to the CRISPR enzyme, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of CRISPR complex formation (e.g., assay for DNA cleavage or mutation at the target sequence, or assay for altered gene expression activity affected by CRISPR complex formation and/or CRISPR enzyme activity), as compared to a control not exposed to the CRISPR enzyme or complex, or exposed to a CRISPR enzyme lacking the one or more NLSs.
In some embodiments, one or more of the elements of CRISPR system are under the control of an inducible promoter, which can include inducible Cas, such as Cas9.
Cong, Science, 339(6121):819-823 (2013) reported that heterologous expression of Cas9, tracrRNA, and pre-crRNA (or Cas9 and sgRNA) can achieve targeted cleavage of mammalian chromosomes. Therefore, the CRISPR system utilized in the methods disclosed herein can be encoded within a vector system, which can include one or more vectors, which can include a first regulatory element operably linked to a CRISPR/Cas system chimeric RNA (chiRNA) polynucleotide sequence, wherein the polynucleotide sequence includes (a) a guide sequence capable of hybridizing to a target sequence in a eukaryotic cell, (b) a tracr mate sequence, and (c) a tracr sequence; and a second regulatory element operably linked to an enzyme-coding sequence encoding a CRISPR enzyme which can optionally include at least one or more nuclear localization sequences. Elements (a), (b) and (c) can be arranged in a 5' to 3' orientation, wherein components I and II are located on the same or different vectors of the system, wherein when transcribed, the tracr mate sequence hybridizes to the tracr sequence and the guide sequence directs sequence-specific binding of a CRISPR complex to the target sequence, and wherein the CRISPR complex can include the CRISPR enzyme complexed with (1) the guide sequence that is hybridized to the target sequence, and (2) the tracr mate sequence that is hybridized to the tracr sequence, wherein the enzyme coding sequence encoding the CRISPR enzyme further encodes a heterologous functional domain. In some embodiments, one or more of the vectors also encodes a suitable Cas enzyme, for example, Cas9. The different genetic elements can be under the control of the same or different promoters.
While the specifics can be varied in different engineered CRISPR systems, the overall methodology is similar. A practitioner interested in using CRISPR technology to target a DNA sequence (such as Brd9, Ankib1, Cacng1, and Gtl3 (Cfap20)) can insert a short DNA fragment containing the target sequence into a guide RNA expression plasmid. The sgRNA expression plasmid contains the target sequence (about 20 nucleotides), a form of the tracrRNA sequence (the scaffold) as well as a suitable promoter and necessary elements for proper processing in eukaryotic cells. Such vectors are commercially available (see, for example, Addgene). Many of the systems rely on custom, complementary oligos that are annealed to form a double stranded DNA and then cloned into the sgRNA expression plasmid. Co-expression of the sgRNA and the appropriate Cas enzyme from the same or separate plasmids in transfected cells results in a single or double strand break (depending on the activity of the Cas enzyme) at the desired target site.
ii. Zinc Finger Nucleases
In some embodiments, the element that induces a single or a double strand break in the target cell’s genome is a nucleic acid construct or constructs encoding a zinc finger nuclease (ZFN). ZFNs are typically fusion proteins that include a DNA-binding domain derived from a zinc-finger protein linked to a cleavage domain.
The most common cleavage domain is the Type IIS enzyme FokI. FokI catalyzes double-stranded cleavage of DNA, at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, for example, U.S. Pat. Nos. 5,356,802; 5,436,150 and 5,487,994; as well as Li et al. Proc. Natl. Acad. Sci. USA 89:4275-4279 (1992); Li et al. Proc. Natl. Acad. Sci. USA 90:2764-2768 (1993); Kim et al. Proc. Natl. Acad. Sci. USA 91:883-887 (1994a); Kim et al. J. Biol. Chem. 269:31,978-31,982 (1994b). One or more of these enzymes (or enzymatically functional fragments thereof) can be used as a source of cleavage domains.
The DNA-binding domain, which can, in principle, be designed to target any genomic location of interest, can be a tandem array of Cys2His2 zinc fingers, each of which generally recognizes three to four nucleotides in the target DNA sequence. The Cys2His2 domain has a general structure: Phe (sometimes Tyr)-Cys-(2 to 4 amino acids)-Cys-(3 amino acids)- Phe(sometimes Tyr)-(5 amino acids)-Leu-(2 amino acids)-His-(3 amino acids)-His. By linking together multiple fingers (the number varies: three to six fingers have been used per monomer in published studies), ZFN pairs can be designed to bind to genomic sequences 18-36 nucleotides long.
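The Cys2His2 motif and the target-length arithmetic in the preceding paragraph can be written out as a short Python sketch. The regular expression is a direct transcription of the general structure stated in the text; the function names and the synthetic test peptide are hypothetical, and the length calculation assumes three nucleotides per finger and two monomers per ZFN pair.

```python
import re

# General structure given in the text:
# [F/Y]-C-(2 to 4 aa)-C-(3 aa)-[F/Y]-(5 aa)-L-(2 aa)-H-(3 aa)-H
C2H2_PATTERN = re.compile(r"[FY]C.{2,4}C.{3}[FY].{5}L.{2}H.{3}H")


def find_c2h2_fingers(protein):
    """Return (start, matched_segment) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]


def zfn_pair_target_length(fingers_per_monomer, nt_per_finger=3):
    """Total nucleotides bound by a ZFN pair (two monomers)."""
    return 2 * fingers_per_monomer * nt_per_finger


# Synthetic peptide built to satisfy the motif; not a natural zinc finger.
synthetic = "MK" + "FCAACAAAFAAAAALAAHAAAH" + "GG"
fingers = find_c2h2_fingers(synthetic)
```

With three fingers per monomer the pair spans 2 x 3 x 3 = 18 nucleotides and with six fingers 2 x 6 x 3 = 36 nucleotides, which reproduces the 18-36 nucleotide range stated in the text.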
Engineering methods include, but are not limited to, rational design and various types of empirical selection methods. Rational design includes, for example, using databases including triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, U.S. Pat. Nos. 6,140,081; 6,453,242; 6,534,261; 6,610,512; 6,746,838; 6,866,997; 7,067,617; U.S. Published Application Nos. 2002/0165356; 2004/0197892; 2007/0154989; 2007/0213269; and International Patent Application Publication Nos. WO 98/53059 and WO 2003/016496.
iii. Transcription Activator-Like Effector Nucleases
In some embodiments, the element that induces a single or a double strand break in the target cell’s genome is a nucleic acid construct or constructs encoding a transcription activator-like effector nuclease (TALEN). TALENs have an overall architecture similar to that of ZFNs, with the main difference that the DNA-binding domain comes from TAL effector proteins, transcription factors from plant pathogenic bacteria. The DNA-binding domain of a TALEN is a tandem array of amino acid repeats, each about 34 residues long. The repeats are very similar to each other; typically they differ principally at two positions (amino acids 12 and 13, called the repeat variable diresidue, or RVD). Each RVD specifies preferential binding to one of the four possible nucleotides, meaning that each TALEN repeat binds to a single base pair, though the NN RVD is known to bind adenine in addition to guanine. TAL effector DNA binding is mechanistically less well understood than that of zinc-finger proteins, but their seemingly simpler code could prove very beneficial for engineered-nuclease design. TALENs also cleave as dimers, have relatively long target sequences (the shortest reported so far binds 13 nucleotides per monomer) and appear to have less stringent requirements than ZFNs for the length of the spacer between binding sites. Monomeric and dimeric TALENs can include more than 10, more than 14, more than 20, or more than 24 repeats.
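The one-repeat-per-base-pair RVD code can be illustrated with a small Python sketch. Only the NN behavior (guanine, and also adenine) is stated in the text; the NI, HD and NG assignments used here are the commonly reported ones and are included as an assumption, and the function names are hypothetical.

```python
# Commonly reported RVD preferences (assumed here); the text itself only
# states that NN binds guanine and also adenine.
BASES_FOR_RVD = {"NI": {"A"}, "HD": {"C"}, "NG": {"T"}, "NN": {"G", "A"}}
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target):
    """Pick one RVD per target base, one repeat per base pair."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


def rvd_array_binds(rvds, site):
    """True if every repeat's RVD accepts the base at its position in `site`."""
    site = site.upper()
    return len(rvds) == len(site) and all(
        base in BASES_FOR_RVD[rvd] for rvd, base in zip(rvds, site)
    )


array = rvds_for_target("GATC")
```

Because NN accepts both guanine and adenine in this model, `rvd_array_binds(array, "AATC")` is also true, which illustrates why NN-containing arrays can be less specific than the one-to-one code suggests.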
Methods of engineering TAL effectors to bind to specific nucleic acids are described in Cermak, et al., Nucl. Acids Res. 1-11 (2011) and in US Published Application No. 2011/0145940, which discloses TAL effectors and methods of using them to modify DNA. Miller et al. Nature Biotechnol 29:143 (2011) reported making TALENs for site-specific nuclease architecture by linking TAL truncation variants to the catalytic domain of FokI nuclease. The resulting TALENs were shown to induce gene modification in immortalized human cells. General design principles for TALE binding domains can be found in, for example, WO 2011/072246.
B. RPA Activators
Replication protein A (RPA) is a heterotrimeric, single-stranded DNA-binding protein. RPA is conserved in all eukaryotes and is essential for DNA replication, DNA repair, and recombination. RPA also plays a role in coordinating DNA metabolism and the cellular response to DNA damage. The three cDNAs encoding the subunits of human replication protein A (70, 32, and 14 kDa) have been expressed individually and in combination in Escherichia coli (Henricksen, et al., J Biol Chem. 269(15):11121-32), the methods of which are incorporated herein by reference.
RPA has high affinity for ssDNA. It has three subunits, RPA1, RPA2, and RPA3, that can form a heterotrimer. Data in the present application show that introducing a single subunit of RPA protein into cells subjected to CRISPR/Cas gene editing can dramatically reduce large deletion frequency (Figure 3C, 3D).
The disclosed methods include expressing one or more subunits of RPA, i.e., RPA1, RPA2, and/or RPA3, in a cell. The RPA1, RPA2, and/or RPA3 can be from a mammalian source, for example, human, and the nucleic acid source can be selected to correspond with the organism whose genes are being edited.
Sequences encoding RPA1, RPA2, and RPA3 from different organisms are known in the art. Human RPA1 Protein - 616 aa (Uniprot Accession ID: P27694) (SEQ ID NO:1) is shown below.
MVGQLSEGAIAAIMQKGDTNIKPILQVINIRPITTGNSPPRYRLLMSDGLNTLSSFMLATQLNPLVEEEQLSS NCVCQIHRFIVNTLKDGRRVVILMELEVLKSAEAVGVKIGNPVPYNEGLGQPQVAPPAPAASPAASSRPQP QNGSSGMGSTVSKAYGASKTFGKAAGPSLSHTSGGTQSKVVPIASLTPYQSKWTICARVTNKSQIRTWSNS RGEGKLFSLELVDESGEIRATAFNEQVDKFFPLIEVNKVYYFSKGTLKIANKQFTAVKNDYEMTFNNETSV MPCEDDHHLPTVQFDFTGIDDLENKSKDSLVDIIGICKSYEDATKITVRSNNREVAKRNIYLMDTSGKVVTA TLWGEDADKFDGSRQPVLAIKGARVSDFGGRSLSVLSSSTIIANPDIPEAYKLRGWFDAEGQALDGVSISDL KSGGVGGSNTNWKTLYEVKSENLGQGDKPDYFSSVATVVYLRKENCMYQACPTQDCNKKVIDQQNGLY RCEKCDTEFPNFKYRMILSVNIADFQENQWVTCFQESAEAILGQNAAYLGELKDKNEQAFEEVFQNANFRS FIFRVRVKVETYNDESRIKATVMDVKPVDYREYGRRLVMSIRRSALM;
RPA2 Protein - 270 aa (Uniprot Accession ID: P15927) (SEQ ID NO:2) MWNSGFESYGSSSYGGAGGYTQSPGGFGSPAPSQAEKKSRARAQHIVPCTISQLLSATLVDEVFRIGNVEIS QVTIVGIIRHAEKAPTNIVYKIDDMTAAPMDVRQWVDTDDTSSENTVVPPETYVKVAGHLRSFQNKKSLV AFKIMPLEDMNEFTTHILEVINAHMVLSKANSQPSAGRAPISNPGMSEAGNFGGNSFMPANGLTVAQNQVL NLIKACPRPEGLNFQDLKNQLKHMSVSSIKQAVDFLSNEGHIYSTVDDDHFKSTDAE;
RPA3 Protein - 121 aa (Uniprot Accession ID: P35244) (SEQ ID NO:3) MVDMMDLPRSRINAGMLAQFIDKPVCFVGRLEKIHPTGKMFILSDGEGKNGTIELMEPLDEEISGIVEVVGR VTAKATILCTSYVQFKEDSHPFDLGLYNEAVKIIHDFPQFYPLGIVQHD.
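Because the sequence listings above carry spaces from line wrapping and trailing list punctuation, they need light cleaning before use in any software. The Python sketch below shows one way to do this and to check a cleaned sequence against its stated length, using the RPA3 listing (stated as 121 aa). The helper name `clean_sequence` is hypothetical.

```python
# The 20 standard amino acids, one-letter codes.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Remove whitespace and trailing ';' or '.' from a pasted sequence."""
    return "".join(raw.split()).rstrip(";.")


# RPA3 (SEQ ID NO:3) exactly as printed above, including the wrap space
# and the final period.
RPA3_RAW = (
    "MVDMMDLPRSRINAGMLAQFIDKPVCFVGRLEKIHPTGKMFILSDGEGKNGTIELMEPLDEEISGIVEVVGR "
    "VTAKATILCTSYVQFKEDSHPFDLGLYNEAVKIIHDFPQFYPLGIVQHD."
)

rpa3 = clean_sequence(RPA3_RAW)
length_matches = len(rpa3) == 121            # stated length in the listing
only_standard = set(rpa3) <= STANDARD_AA      # no stray characters remain
```

The same check can be applied to the RPA1 and RPA2 listings against their stated lengths of 616 and 270 amino acids.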
Thus, in some forms, useful compositions include nucleic acids encoding RPA1, RPA2, and/or RPA3, preferably in a vector for delivery and expression in cells, for example, mammalian cells. Plasmids containing genes encoding human RPA1, RPA2, and RPA3 are commercially available, for example, from Addgene (Cat # 46948).
In preferred embodiments, the nucleic acid molecule is a messenger RNA (mRNA). As used herein, the term "messenger RNA" (mRNA) refers to any polynucleotide which encodes a polypeptide of interest and which is capable of being translated to produce the encoded polypeptide of interest in vitro, in vivo, in situ or ex vivo.
Nucleic acids in vectors can be operably linked to one or more expression control sequences. For example, the control sequence can be incorporated into a genetic construct so that expression control sequences effectively control expression of a coding sequence of interest. Examples of expression control sequences include promoters, enhancers, and transcription terminating regions. A promoter is an expression control sequence composed of a region of a DNA molecule, typically within 100 nucleotides upstream of the point at which transcription starts (generally near the initiation site for RNA polymerase II). To bring a coding sequence under the control of a promoter, it is necessary to position the translation initiation site of the translational reading frame of the polypeptide between one and about fifty nucleotides downstream of the promoter. Hamann, et al., J. Biol. Eng., 13:7 (2019) demonstrated that gene expression in hBMSCs driven by the cytomegalovirus (CMV) promoter resulted in 10-fold higher transgene expression than transfection with plasmids containing elongation factor 1α (EF1α) or Rous sarcoma virus (RSV) promoters.
Enhancers provide expression specificity in terms of time, location, and level. Unlike promoters, enhancers can function when located at various distances from the transcription site. An enhancer also can be located downstream from the transcription initiation site. A coding sequence is “operably linked” and “under the control” of expression control sequences in a cell when RNA polymerase is able to transcribe the coding sequence into mRNA, which then can be translated into the protein encoded by the coding sequence.
Suitable expression vectors include, without limitation, plasmids and viral vectors derived from, for example, bacteriophage, baculoviruses, tobacco mosaic virus, herpes viruses, cytomegalovirus, retroviruses, vaccinia viruses, adenoviruses, and adeno-associated viruses. Numerous vectors and expression systems are commercially available from such corporations as Novagen (Madison, WI), Clontech (Palo Alto, CA), Stratagene (La Jolla, CA), and Invitrogen Life Technologies (Carlsbad, CA). Recent transfection studies have investigated minicircle DNA (mcDNA), nucleic acids that are derived from pDNA by recombination that removes bacterial sequences.
The vectors including the nucleic acid of interest can be administered to subjects in need thereof resulting in transfection or transformation of the cells in the subject which in turn express the protein/peptide encoded by the nucleic acid.
C. Fluorophore-modified nucleic acids used in gene editing
The disclosed methods employ fluorophore-modified nucleic acids used in gene editing, including, for example, fluorophore-labelled dODN and/or sgRNA.
In applications in which it is desirable to insert a polynucleotide sequence into a target DNA sequence, a polynucleotide including a donor sequence to be inserted is also provided to the cell. By a “donor sequence” or “donor polynucleotide” or “donor oligonucleotide” it is meant a nucleic acid sequence to be inserted at the cleavage site (referred to collectively herein as “donor oligonucleotide” or “dON”). The donor polynucleotide typically contains sufficient homology to a genomic sequence at the cleavage site, e.g., 70%, 80%, 85%, 90%, 95%, or 100% homology with the nucleotide sequences flanking the cleavage site, e.g., within about 50 bases or less of the cleavage site, e.g., within about 30 bases, within about 15 bases, within about 10 bases, within about 5 bases, or immediately flanking the cleavage site, to support homology-directed repair between it and the genomic sequence to which it bears homology. The donor sequence is typically not identical to the genomic sequence that it replaces. Rather, the donor sequence may contain at least one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology-directed repair. In some embodiments, the donor sequence includes a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region.
Donor sequences can also include a vector backbone containing sequences that are not homologous to the DNA region of interest and that are not intended for insertion into the DNA region of interest. Generally, the homologous region(s) of a donor sequence will have at least 50% sequence identity to a genomic sequence with which recombination is desired. In certain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
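The donor layout (a non-homologous insert flanked by two homology regions) and the sequence-identity thresholds discussed above can be sketched in Python. This is a minimal illustration: the function names and the short example sequences are hypothetical, and identity is computed position by position over equal-length, pre-aligned sequences rather than by a true alignment.

```python
def build_donor(left_arm, insert, right_arm):
    """Assemble a donor: left homology arm + insert + right homology arm."""
    return left_arm + insert + right_arm


def percent_identity(seq_a, seq_b):
    """Percent identity over two equal-length, already-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def arms_meet_threshold(donor_arms, genomic_arms, min_identity=50.0):
    """True if every donor arm reaches `min_identity` percent identity
    with its corresponding genomic flank (default: the 50% floor in the text)."""
    return all(
        percent_identity(donor, genome) >= min_identity
        for donor, genome in zip(donor_arms, genomic_arms)
    )


# Hypothetical 10-nt flanks; real homology arms are typically longer.
genomic_left, genomic_right = "ACGTACGTAC", "TTGACCGGTA"
donor_left, donor_right = "ACGTACGTAA", "TTGACCGGTA"   # one mismatch on the left
donor = build_donor(donor_left, "GGGCCC", donor_right)
```

Here the left arm is 90% identical to its genomic flank and the right arm 100%, so the donor passes the default 50% floor but would fail a stricter 95% requirement.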
The donor sequence can include certain sequence differences as compared to the genomic sequence, e.g., restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes etc.), etc., which can be used to assess for successful insertion of the donor sequence at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus). In some cases, if located in a coding region, such nucleotide sequence differences will not change the amino acid sequence, or will make silent amino acid changes (i.e., changes which do not affect the structure or function of the protein). Alternatively, these sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.
The donor sequence can be a single-stranded DNA, single-stranded RNA, double-stranded DNA, or double-stranded RNA. It can be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence can be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3' terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang et al. Proc. Natl. Acad. Sci. USA 84:4959-4963 (1987); Nehls et al. Science 272:886-889 (1996). Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues.
As an alternative to protecting the termini of a linear donor sequence, additional lengths of sequence can be included outside of the regions of homology that can be degraded without impacting recombination. A donor sequence can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance.
Therefore, in some embodiments, the genome editing composition includes a modified donor oligonucleotide used in CRISPR/Cas-mediated HDR.
In some forms, the ssODN or sgRNA is labelled with a fluorophore at its 5’ and/or 3’ end by a covalent bond between the ssODN/sgRNA and the fluorophore.
In some forms, the fluorophore-modified nucleic acid is a 5’ fluorophore-modified ssODN.
In some forms, the fluorophore-modified nucleic acid is a 5’ and 3’ fluorophore-modified ssODN.
In some forms, the fluorophore-modified nucleic acid is a 5’ fluorophore-modified sgRNA.
In some forms, the fluorophore-modified nucleic acid is a 5’ and 3’ fluorophore-modified sgRNA.
The nucleic acids are labelled with the fluorophore via covalent binding, with or without a linker separating the nucleic acid and fluorophore.
In preferred forms, the ssODN or sgRNA is covalently linked to the fluorophore, preferably at its 5’ end.
Exemplary fluorophore molecules include but are not limited to cyanine (Cy) dyes, e.g., Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, Cy1.5, etc. (reviewed in Yuan, et al. Chem. Soc. Rev., 2025, 54, 341-366).
II. METHODS
The disclosed methods increase homology-directed repair (HDR) efficiency following CRISPR/Cas-mediated gene editing by two-fold and reduce the frequency of large deletions by 50% (Figure 1). The methods are based in some forms on the discovery that MMEJ is the major repair pathway mediating CRISPR-induced on-target large deletions.
A. PolQ inhibition
One embodiment for reducing large deletions includes inhibiting the activity of PolQ, the main player of the MMEJ repair pathway, preferably by treating cells with a PolQ inhibitor such as novobiocin (NVB), for example for about 24 hours before and after electroporation.
B. RPA Activation
Another embodiment includes delivering recombinant replication protein A (RPA) proteins to cells subjected to CRISPR/Cas gene editing to avoid annealing of single-stranded DNA resected after CRISPR-induced DSBs. In addition, the RPA proteins can protect the donor ssDNA from degradation in the cells and can activate the HDR repair pathway, and the inhibition of MMEJ may switch the DNA repair pathway from MMEJ to HDR. As a consequence, the disclosed strategy can reduce CRISPR-induced on-target large deletions by 50% and can increase HDR efficiency two-fold.
C. Modification of HDR nucleic acids
The methods are also based in some forms on the discovery that modifying nucleic acids involved in CRISPR/Cas gene editing can dramatically improve CRISPR/Cas gene editing efficiency.
Double-strand breaks can be repaired by the cell in one of two ways: non-homologous end joining and homology-directed repair (HDR). In HDR, a donor polynucleotide with homology to the cleaved target DNA sequence is used as a template for repair of the cleaved target DNA sequence, resulting in the transfer of genetic information from the donor polynucleotide to the target DNA. As such, new nucleic acid material can be inserted/copied into the site.
The modifications of the target DNA due to HDR repair can be used to induce gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc.
HDR genome editing compositions include a donor polynucleotide sequence that includes at least a segment with homology to the target DNA sequence. The methods can be used to add, i.e., insert or replace, nucleic acid material to a target DNA sequence (e.g., to “knock in” a nucleic acid that encodes for a protein, an siRNA, an miRNA, etc.), to add a tag (e.g., 6xHis, a fluorescent protein (e.g., a green fluorescent protein, a yellow fluorescent protein, etc.), hemagglutinin (HA), FLAG, etc.), to add a regulatory sequence to a gene (e.g., promoter, polyadenylation signal, internal ribosome entry sequence (IRES), 2A peptide, start codon, stop codon, splice signal, localization signal, etc.), to modify a nucleic acid sequence (e.g., introduce a mutation), and the like. As such, the compositions can be used to modify DNA in a site-specific, i.e., “targeted”, way, for example gene knock-out, gene knock-in, gene editing, gene tagging, etc., as used in, for example, gene therapy.
The disclosed methods improve HDR efficiency and can be widely used for any genome editing purpose, such as establishment of gene-modified cell lines, genome editing therapy, etc. For example, the compositions and methods can be used to improve HDR of the point mutation of the HBB gene in sickle cell disease and to reduce the risk of CRISPR-induced large deletions in genome-edited hematopoietic stem cells. This strategy is easy to implement: the cells are pretreated with NVB, and RPA protein is delivered together with Cas9/sgRNA and donor ssDNA.
The disclosed methods can be further understood by way of the following non-limiting examples, which are a disclosure of one preferred embodiment.
Examples
A. Modulation of the microhomology-mediated end joining pathway suppresses large deletions and enhances homology-directed repair following CRISPR-Cas9-induced DNA breaks
Materials and Methods
Cell culture
The H1 hESC line was purchased from WiCell Institute. The H1-iCas9 ESC line was a gift from Danwei Huangfu’s lab. The wild-type iPSC line was reprogrammed and well characterized in previous studies [44, 55, 56]. The study was approved by the KAUST Institutional Biosafety and Bioethics Committee (IBEC). All hPSCs were cultured in Essential 8 medium (ThermoFisher, Cat# A1517001) in rhLaminin-521 (ThermoFisher, Cat# A29249) coated wells with daily medium change. The peripheral blood mononuclear cells were isolated from the whole blood of a healthy donor via a standard Ficoll-Paque based protocol and further cultured in StemSpan™-ACF Erythroid Expansion medium (STEMCELL Technologies, Cat# 09860) for 13 days with medium change every 3 days to expand the erythroid progenitors. The erythroid progenitors were analyzed by FACS before CRISPR-Cas9 editing.
Plasmids and lentiviral packaging
Oligonucleotides containing the gRNA sequence were annealed and then cloned into a lentiGuide-puro plasmid (Addgene Cat# 52963) following the published protocol [57]. The full-length RPA open reading frames (ORFs), including RPA1, RPA2, and RPA3, were cloned from cDNA of H1 ESCs, and the GFP ORF was cloned from pInducer21 (Addgene, Cat# 46948). Subsequently, the ORFs of RPA and GFP were inserted into pInducer21 using the Gateway cloning method. The sequences were confirmed by Sanger sequencing. The gRNA lentiGuide-puro, the newly constructed vectors, and pEGIP*35 (Addgene, Cat# 26776) were packaged into lentivirus individually. Briefly, the plasmid was premixed with packaging vectors, then transfected into HEK293T cells using Lipofectamine 3000. The lentivirus was harvested two times, after 48 hours and 72 hours. The lentivirus was concentrated with PEG-it Virus Precipitation Solution (System Biosciences) and stored in a -80 °C freezer.
siRNA transfection
The esiRNA transfection protocol was adapted from the instructions for the Lipofectamine RNAiMAX reagent (ThermoFisher, Cat# 13778150). H1-iCas9 cells were harvested after 1 hour of treatment with 10 µM Y-27632 (Abcam, Cat# ab120129). The esiRNA/RNAiMAX solution was prepared for 3 wells of a 12-well plate per siRNA according to the following recipe: Mix 1 was prepared by adding 13.5 µl RNAiMAX reagent to 225 µl Opti-MEM and vortexing for a few seconds. Mix 2 was prepared by adding 90 pmol esiRNA to 225 µl Opti-MEM and pipetting a few times. The esiRNA/RNAiMAX solution was made by adding Mix 2 to Mix 1 and incubating for 5 min, and was then used to resuspend a pellet of 1.5 million cells. After a 30 min incubation with the esiRNA/RNAiMAX solution, the cells were aliquoted equally into 3 rhLaminin-521 coated wells and cultured in a 37 °C, 5% CO2 incubator. The cell samples were collected after 24 hours for knockdown efficiency analysis.
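As a bookkeeping aid, the three-well recipe above can be scaled to a different number of wells. The following is a minimal sketch only: the helper name `scale_recipe` and the assumption that all amounts scale linearly with well count are illustrative and are not part of the disclosed protocol.

```python
# Hypothetical helper: linear scaling of the 3-well esiRNA/RNAiMAX recipe.
# The per-3-well amounts come from the protocol text; linear scaling
# to other well counts is an assumption made for illustration.
RECIPE_3_WELLS = {
    "rnaimax_ul": 13.5,        # RNAiMAX reagent in Mix 1
    "optimem_mix1_ul": 225.0,  # Opti-MEM in Mix 1
    "esirna_pmol": 90.0,       # esiRNA in Mix 2
    "optimem_mix2_ul": 225.0,  # Opti-MEM in Mix 2
    "cells_million": 1.5,      # cells resuspended in the combined mix
}


def scale_recipe(n_wells):
    """Return the recipe amounts scaled from 3 wells to n_wells."""
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    factor = n_wells / 3
    return {name: round(amount * factor, 3)
            for name, amount in RECIPE_3_WELLS.items()}


if __name__ == "__main__":
    for name, amount in scale_recipe(6).items():
        print(f"{name}: {amount}")
```

Under this assumption, a single well corresponds to 4.5 µl RNAiMAX, 30 pmol esiRNA, and 0.5 million cells.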
Quantitative PCR (qPCR)
The RNA was extracted using an RNeasy Mini kit (Qiagen, Cat# 74106) and reverse transcribed to cDNA using iScript Reverse Transcription Supermix (BioRad, Cat# 1708840). The qPCR was performed on a CFX384 real-time PCR detection system (BioRad) using SsoAdvanced Universal SYBR Green Supermix (BioRad, Cat# 725270). The qPCR primers are shown in Table 1.
Droplet Digital PCR (ddPCR)
The genomic DNA was extracted 3 days post-electroporation using a DNeasy Blood & Tissue kit (Qiagen, Cat# 69506) and quantified with a Qubit instrument. The ddPCR was performed on a Bio-Rad QX200 system using ddPCR Supermix for Probes (No dUTP) (BioRad, Cat# 1863024) following the manufacturer’s protocols. The 20x assay mix comprised 18 µM of each primer and 5 µM of each probe. One reaction contained 5 ng genomic DNA, 1x assay mix, and 1x ddPCR Supermix. The probes and oligos are shown in Table 1.
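For orientation, the arithmetic behind the assay mix dilution and the usual Poisson correction for droplet counts can be sketched as follows. This is an illustrative sketch rather than the vendor software's calculation: the function names are hypothetical, and the default droplet volume of 0.85 nl is an assumption that should be checked against the instrument documentation.

```python
import math


def working_concentration(stock, fold):
    """Concentration after diluting a `fold`-x stock to 1x (same units as stock)."""
    return stock / fold


def copies_per_droplet(positive, total):
    """Poisson-corrected mean target copies per droplet.

    If a fraction p of droplets is positive, the mean occupancy is -ln(1 - p).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1 - positive / total)


def copies_per_ul(positive, total, droplet_nl=0.85):
    """Copies per microliter of reaction; droplet_nl is an assumed droplet volume."""
    return copies_per_droplet(positive, total) / (droplet_nl * 1e-3)


if __name__ == "__main__":
    # 20x assay mix diluted to 1x: 18 -> 0.9 and 5 -> 0.25 (stock units)
    print(working_concentration(18, 20), working_concentration(5, 20))
    print(round(copies_per_ul(2000, 15000), 1))
```

Reading the stock values as micromolar, the 1x reaction contains 0.9 µM of each primer and 0.25 µM of each probe.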
Table 1. Oligonucleotides
Flow Cytometry
For PIGA gene-edited samples, the gRNA lentivirus-infected H1-iCas9 cells were treated with 2 µg/ml doxycycline for 2 days to induce Cas9 expression for gene editing. After the doxycycline treatment for 10 days, the cells were harvested, washed twice with PBS buffer containing 3% BSA, and filtered through a 70 µm strainer. For each sample, 100,000 cells were stained with 2 µl FLAER Alexa488 (Cedarlane, Cat# NC9870611) in 100 µl PBS buffer containing 3% BSA for 15 min at room temperature. The stained cells were washed once with PBS buffer containing 3% BSA and loaded for analysis. For CD9 gene-edited samples, cells were harvested and washed once in FACS buffer with 2% FBS. Subsequently, 10,000 cells were stained in 100 µl FACS buffer with 2% FBS and 1 µl PE anti-CD9 (BioLegend, Cat# 312106) for 30 min at 4 °C, followed by two washes with FACS buffer containing 2% FBS. For GFPmut correction samples, the cells were harvested 3 days post-electroporation and passed through a 70 µm strainer. The cells were resuspended in 200 µl FACS buffer containing 1 µg/ml DAPI and loaded onto a FACS Aria II cytometer for analysis. For LAMP2 gene-edited samples, the manufacturer's protocol for the BD Cytofix/Cytoperm™ Fixation/Permeabilization Kit (BD, Cat# 554714) was followed. Briefly, the cells were fixed in Fixation/Permeabilization solution at 4 °C for 20 min and washed twice in BD Perm/Wash™ Buffer, followed by staining with 50 µl BD Perm/Wash™ Buffer containing 2 µl FITC anti-LAMP2 (eBioscience, Cat# 11-1078-42) for 30 min at 4 °C. After two washes, the cells were resuspended in FACS buffer and loaded onto a BD FACSymphony™ A3 Cell Analyzer.
Cell cycle synchronization and analysis
The cell cycle synchronization protocol was adapted from a previous publication [36]. In brief, PIGA intr5_1 gRNA-positive H1-iCas9 ESCs were seeded at a density of 2 × 10^5 cells per well in a 12-well plate. To synchronize the cells at the G2/M phase, a 16-hour treatment with 100 ng/ml nocodazole (Abcam, Cat# ab120630) was administered. Subsequently, the cells were washed twice with prewarmed 1x PBS and then cultured in fresh E8 medium for 4 hours or 12 hours to release them into the G1 and S phases, respectively. Alternatively, the synchronized cells were treated with 2 µg/ml doxycycline to induce Cas9 expression and genome editing, followed by 10 days of culture for LD analysis.
Cell cycle analysis was performed using a standard protocol. Initially, cell pellets were fixed by adding cold 70% ethanol dropwise while vortexing and then incubated overnight at -20 °C. Subsequently, the cells were washed twice and resuspended in FACS buffer containing 200 µg/ml RNase. After a 20-minute incubation at room temperature, the cells were washed once and resuspended in FACS buffer containing 1 µg/ml propidium iodide (PI). Following a 10-minute incubation at room temperature, the samples were ready for FACS analysis.
CRISPR-Cas9 genome editing
WT and HiFi Cas9 were purchased from IDT. The gRNAs used in this study were designed using Benchling (https://www.benchling.com/crispr), and their sequences are shown in Table 1. The gRNAs were obtained either by in vitro transcription with the MEGAshortscript™ T7 Transcription kit (ThermoFisher, Cat# AM1354) or ordered from IDT as Alt-R crRNAs or sgRNAs. For each electroporation, 50 pmol of Alt-R gRNA and 50 pmol of Cas9 were mixed and incubated at room temperature for 10 min to form ribonucleoprotein (RNP). Buffer R (from the Neon system kit) was added to the RNP to a final volume of 10 µl. 200,000 single cells were electroporated using the Neon system (ThermoFisher) with settings of 1600 V, 10 ms width, and 3 pulses. For the HDR study, 30 pmol ssODN was mixed with 50 pmol RNP before the electroporation. The cells were seeded in one well of a 24-well plate immediately after electroporation.
PacBio and Nanopore sequencing
The genomic DNA of edited cells was extracted using a Blood & Tissue Kit (Qiagen, Cat# 69506). The UMI labeling was performed following the published protocol [7]. Briefly, the target locus was labeled by one-cycle PCR using a UMI primer (Table 1) in a 25 µl reaction including 50 ng genomic DNA, 1 µM UMI primer (containing a universal forward primer sequence, a 10-nt UMI barcode, and a target locus forward primer sequence; see Supplementary Table 1), and 12.5 µl 2X Platinum SuperFi PCR Master Mix (ThermoFisher, Cat# 12358010), with the following program: initial denaturation at 98 °C for 70 s, gradient annealing from 70 °C to 65 °C with a 1 °C/5 s ramp rate, extension at 72 °C for 7 min, and hold at 4 °C. The UMI-labeled DNA was purified with 0.8x AMPure XP beads, then mixed with a universal forward primer, a target locus reverse primer (Supplementary Table 1), and PrimeSTAR GXL DNA polymerase (Takara, Cat# R050A), and amplified with the following program: initial denaturation at 95 °C for 2 min; 30 cycles of 98 °C for 10 s and 68 °C for 7 min; 68 °C for 5 min; and hold at 4 °C. The amplicons were purified with AMPure XP beads and used for PacBio or Nanopore library preparation.
For Nanopore sequencing, the library preparation was done using the ligation sequencing kit (Oxford Nanopore Technologies, Cat# SQK-LSK109) following its standard protocol. The Nanopore sequencing was performed on an Oxford Nanopore MinION sequencer using R9.4.1 flow cells. The reads were base called using the Guppy basecaller (v5.0.7). Library preparations for PacBio sequencing were performed with the Sequel Sequencing Kit 3.0 and loaded on the PacBio Sequel instrument with SMRT Cell 1M v3 LR Tray. The official PacBio tool ccs (v3.4.1) was used to generate HiFi reads. All procedures were performed according to the manufacturer’s protocols.
Data analysis was performed using VAULT as described previously [7]. In brief, the UMI primer sequence, the fastq file, and the reference amplicon sequence were provided to the algorithm. VAULT extracts mappable reads and then extracts the UMI sequences from those reads. Reads are then grouped based on their UMI sequences and used for parallel analysis of SNVs and SVs. The “vault summarize” command was used to generate the analysis summary.
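The grouping step described above can be pictured with a short sketch. This is not VAULT's implementation; it is a simplified illustration that assumes the 10-nt UMI sits immediately after an exactly matching universal primer sequence, and the function names and the toy primer sequence are hypothetical.

```python
from collections import defaultdict

UMI_LENGTH = 10  # the UMI primer carries a 10-nt barcode


def extract_umi(read, universal_primer):
    """Return the UMI that follows the universal primer, or None if not found.

    Exact string matching is a simplifying assumption; real long reads
    contain errors and need tolerant alignment.
    """
    idx = read.find(universal_primer)
    if idx == -1:
        return None
    start = idx + len(universal_primer)
    umi = read[start:start + UMI_LENGTH]
    return umi if len(umi) == UMI_LENGTH else None


def group_by_umi(reads, universal_primer):
    """Group reads that share the same UMI; reads without a UMI are dropped."""
    groups = defaultdict(list)
    for read in reads:
        umi = extract_umi(read, universal_primer)
        if umi is not None:
            groups[umi].append(read)
    return dict(groups)


if __name__ == "__main__":
    primer = "ACGTACGT"  # toy sequence, not the primer used in the study
    reads = [primer + "AAAAAAAAAA" + "GGG", primer + "CCCCCCCCCC" + "GGG"]
    print({umi: len(members) for umi, members in group_by_umi(reads, primer).items()})
```

Each resulting group corresponds to one original DNA molecule, so variants can be called per group rather than per read.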
Recombinant human RPA protein
Human RPA was expressed and purified as described previously [58, 59]. Briefly, the cloned plasmid was transformed into BL21 (DE3) E. coli. The cells were grown in 2YT media at 37 °C to an OD600 of 0.7, protein expression was induced with 0.5 mM IPTG, and the culture was further incubated for 4-6 hr at 37 °C. The cells were collected by centrifugation and lysed by lysozyme and sonication. The supernatant was loaded onto a HisTrap HP 5 ml column (Cytiva) followed by a HiTrap Blue affinity column (Cytiva). RPA fractions containing all subunits were concentrated and loaded onto a HiLoad 16/600 Superdex 200 pg column (Cytiva). RPA protein fractions were flash-frozen and stored at -80 °C.
Statistical Analysis
The data in the figures are shown as the mean ± SD unless indicated otherwise. Comparisons were performed with a two-sided Student’s t-test unless indicated otherwise.
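For readers who want to reproduce the summary statistics, the following sketch computes the mean ± SD and the equal-variance Student's t statistic using only the Python standard library. The function names are hypothetical; converting the t statistic to a two-sided p-value needs the t distribution, which the standard library does not provide, so that step is left out here.

```python
import math
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)


def student_t(a, b):
    """Equal-variance (pooled) Student's t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two values")
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


if __name__ == "__main__":
    control = [1.0, 2.0, 3.0]   # toy numbers, not data from the study
    treated = [2.0, 3.0, 4.0]
    print(mean_sd(control), student_t(control, treated))
```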
Results
Most CRISPR-Cas9 induced LDs in human pluripotent stem cells contain microhomology at breakpoint junction
CRISPR-Cas9 can efficiently cut the target DNA to promote gene knockout through the formation of small indels, or precise installation of DNA sequence changes through homology-directed repair. However, it also causes unintended LDs and structural variations (SVs), up to megabase-scale or even chromosome-scale loss [6-8, 25, 26]. The underlying mechanism of CRISPR-Cas9-induced LD remains unclear. Sequencing data of 329 CRISPR-Cas9-edited alleles from two published studies were collated [5, 18]. An unusually high frequency of microhomologies (MHs) at LD breakpoint junctions was identified. For example, MHs > 2 bp were present in more than 70% of the LD alleles (Fig. 1A). Human pluripotent stem cell (hPSC) lines (20 clones) edited in-house by CRISPR-Cas9 in the SH2B3 and H1.3 genes were also examined, and it was discovered that five of them harbored LD alleles, of which four contained MHs (Fig. 1B, Fig. 5A).
To provide quantitative evidence for prevalent MHs in Cas9-edited hPSCs, intronic regions of PIGA and CD9 in H1 hESCs were edited using a Cas9/gRNA ribonucleoprotein (RNP) complex (Fig. 1C and Fig. 5B). At both loci, the distance between the intronic gRNA and the nearest exon is more than 200 bp. Therefore, edited cells that lose cell surface expression of CD9 or PIGA, as monitored by fluorescence-activated cell sorting (FACS), are considered to contain LDs that extend at least from the CRISPR-Cas9 cleavage site to the nearest exon. PacBio circular consensus sequencing was performed on a 7-kb region flanking the intronic gRNA target, amplified from the CD9-negative cells or from both PIGA-positive and PIGA-negative cells, and from the respective unsorted cells (Fig. 1C, Fig. 5B). The sequencing data showed that most PIGA-negative sorted cells contain LDs in which the nearby exon was deleted either entirely or partially, while the PIGA-positive sorted cells often contain small indels and occasionally LDs in which the nearby exon was not disrupted (Fig. 5C), which indicates that the FACS-based quantification can be used for LD studies. Examination of reads with deletions > 30 bp at the Cas9 cut site revealed a strong enrichment of MHs (> 2 bp) at the breakpoint junctions in the negative populations (78.61% and 97.91% at the CD9 and PIGA loci, respectively), which was lessened in the unsorted populations (Fig. 1C, Fig. 5D).
Since MMEJ-mediated DNA repair results in MHs at the breakpoints (Fig. 1A), the high occurrence of MHs suggests that the MMEJ repair pathway plays an important role in mediating CRISPR-induced LDs.
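To make the notion of microhomology at a breakpoint junction concrete, the sketch below measures how many bases are shared between the two ends of a deletion. It is a simplified illustration under stated assumptions, not the analysis pipeline used in the study: the deletion is given as 0-based, half-open coordinates on a reference string, only exact matches are counted, and the function names are hypothetical.

```python
def microhomology_length(ref, start, end):
    """Length of microhomology flanking the deletion of ref[start:end].

    Counts bases that match between the start of the deleted segment and the
    sequence just after it, plus bases that match between the end of the
    deleted segment and the sequence just before it. The result does not
    depend on which of the equivalent alignments of the deletion is reported.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("need 0 <= start < end <= len(ref)")
    del_len = end - start
    right = 0
    while (end + right < len(ref) and right < del_len
           and ref[start + right] == ref[end + right]):
        right += 1
    left = 0
    while (start - 1 - left >= 0 and left < del_len
           and ref[start - 1 - left] == ref[end - 1 - left]):
        left += 1
    return min(left + right, del_len)


def has_microhomology(ref, start, end, min_exclusive=2):
    """True if the junction carries an MH longer than min_exclusive bp (MH > 2 bp by default)."""
    return microhomology_length(ref, start, end) > min_exclusive


if __name__ == "__main__":
    # Toy reference: two copies of "ACG" separated by 6 bp.
    ref = "CCT" + "ACG" + "TTGAGA" + "ACG" + "CAA"
    print(microhomology_length(ref, 3, 12))  # deletion removes one ACG copy plus the spacer
```

In the toy example, deleting one copy of the repeat together with the spacer leaves a single "ACG" at the junction, which is the signature expected from MMEJ repair.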
RPA and POLQ regulate LD formation, but not PARP1 and LIG3
To better understand the role of the MMEJ pathway in the formation of LDs, the function of four genes (PARP1, RPA, POLQ, LIG3) was modulated in hPSCs undergoing CRISPR-Cas9 editing (Figs. 2A and 2B). To achieve consistent and uniform induction of CRISPR-Cas9 editing, an H1 ESC line with a doxycycline-inducible Cas9 expression system knocked in to the AAVS1 safe harbor locus (H1-iCas9) was used [27]. The LD frequency was first investigated using sgRNAs targeting different intronic regions of the X-linked PIGA gene, an established model [3, 6, 14] for the study of CRISPR-Cas9 editing outcomes (sgRNA positions are shown in Fig. 5E and Fig. 7A). Thirteen intronic sgRNAs targeting PIGA and seven intronic sgRNAs targeting CD9 were individually expressed in H1-iCas9 ESCs using a constitutive lentiviral vector. Upon doxycycline induction, Cas9 guided by these sgRNAs generated DSBs located 126-489 bp from the nearest exon (Fig. 5E and 7A). Subsequent DNA repair could lead to small indels that did not reach coding sequences and preserved PIGA expression, or to LDs that extended into nearby exons and disrupted PIGA expression, which resulted in cells that stained positively and negatively with FLAER reagent, respectively (Fig. 5C). Control sgRNAs targeting exons led to a nearly complete PIGA loss (100% PIGA knockout) based on FACS quantification of FLAER staining, suggesting that the disclosed system achieved a saturating level of editing (Fig. 6D).
The intronic sgRNA data showed that the frequency of PIGA-deficient cells (FLAERneg) ranged from 0.23% to 9.05% (average 3.52 ± 0.83%) in Cas9-edited cells (Fig. 5E). In the case of the autosomal CD9 gene, intronic sgRNAs led to lower frequencies of CD9-negative cells, ranging from 0.19% to 4.4% (average 1.25% ± 1.44%) (Fig. 7A). Note that the negatively stained population is likely a conservative estimate of LD, because LDs extending to the opposite side of the nearest exon may not result in loss of expression (i.e., FLAERneg) (Fig. 5C) and in-frame LDs may lead to hypomorphic levels (i.e., cells with intermediate FLAER staining in Fig. 5E). In cells without Cas9 expression (no dox), background LD events were almost undetectable (Fig. 2C, 2D, and Fig. 6A). This observation demonstrates that the LDs were specifically caused by Cas9-induced DSBs. LD frequency could not be predicted based solely on the orientation of the sgRNA (targeting the + or - strand) or the distance between the sgRNA and the nearest exon (Fig. 5E and 5F), suggesting a dependency on the sequence context. CRISPR/Cas9-mediated genome editing has been associated with potential induction of various severe chromosome structural abnormalities, such as chromosome loss [26, 28], truncation [29-31], and translocation [32-34]. To rule out the effect of such events on the LD%, we edited the H1-iCas9 ESCs using the intr5_1 sgRNA, the intronic gRNA with the highest LD% in our tests (Fig. 5E), and quantified the X chromosome copy number using a well-established qPCR-based assay [35] targeting multiple gene loci (VCX, PNPLA4, TSPAN7, USP9X, USP27X and HRRT1) flanking the PIGA gene in both non-edited cells and PIGA FLAER-negative sorted cells. The data did not show significant chromosome loss at these loci in PIGA FLAER-negative sorted cells (Fig. 5G). To corroborate this result, a ddPCR assay for the WAS gene (located on the same X chromosome p-arm as the PIGA gene) was also conducted; ddPCR is considered a more sensitive quantification method for detecting chromosome copy number variation. Consistently, the ddPCR result did not show any significant difference in X chromosome copy number. Thus, a sensitive setup was established for evaluating the effects of modulating the MMEJ pathway on the occurrence of LDs in the rest of the study.
Considering the prevalence of MMEJ-associated MHs in LDs, it was hypothesized that LD frequency could be controlled by modulating the activity of the MMEJ pathway. Four key players of the MMEJ pathway were knocked down: PARP1, LIG3, RPA (including RPA1, RPA2, and RPA3), and POLQ, in H1-iCas9 cells expressing the PIGA intr5_1 sgRNA (Figs. 2A-2C and 6A-6C), and Cas9 expression was induced 24 hours later. LD frequency was monitored by FACS analysis of FLAER staining as described in the preceding paragraph (Figs. 2B, 2C, and 6A). The results showed that knocking down POLQ caused a 40% reduction in LD frequency, while knocking down the RPAs led to a 40% increase in LD frequency (Fig. 2C). Interestingly, knocking down PARP1 or LIG3 did not affect LD frequency (Fig. 6C).
To investigate whether the LD frequency is influenced by the cell state, the impact of the cell cycle on LD prevalence was examined. Human PSCs are notoriously difficult to arrest in the G1 phase. Among all the cell cycle synchronization drugs and protocols tested, only nocodazole at 100 ng/ml could synchronize hPSCs reliably without toxicity (Fig. 6D and 6E). The H1-iCas9/PIGA intr5_1 sgRNA system used herein is highly sensitive to doxycycline exposure, reaching the maximum LD frequency within a 12-hour doxycycline treatment (Fig. 6F). A slight increase in LD frequency was observed when cells were arrested at the G1 phase, although no significant differences were found across the cell cycle phases (Fig. 6G). The effect of knocking down MMEJ proteins on the cell cycle was also investigated. The cell cycle profile of the knockdown samples did not differ significantly from that of the control (Fig. 6H). Therefore, the findings suggest that the changes in LD frequency are mostly attributable to DNA repair pathway inhibition.
To better quantify the Cas9 editing outcomes in MMEJ-knockdown H1-iCas9 cells at base resolution, IDMseq [7] of the PIGA locus was performed. Briefly, individual genomic DNA molecules flanking the Cas9 cut site were labelled with unique molecular identifiers (UMIs) and amplified for long-read PacBio sequencing (Fig. 2E). In the subsequent sequencing data analyses, deletions > 30 bp were referred to as LDs. IDMseq showed that the vast majority of SVs detected in Cas9-edited cells were LDs. The baseline LD frequency of the control siRNA per IDMseq was higher than that estimated by FACS, which is expected because FLAERneg% underestimates LD frequency as discussed above, and because LDs of 30-278 bp in size (i.e., noncoding deletions) are only detectable by IDMseq (Fig. 5B and 5C). The LD length spectrum exhibits striking similarities across all groups (Fig. 7K), implying that the LD size remains unaffected by the MMEJ deficiency. Consistent with the FACS analysis, the IDMseq results showed that knocking down POLQ decreased LD frequency and RPA knockdown increased LD frequency (Fig. 2F). Consistent observations were made at the CD9 locus using Oxford Nanopore Technologies (ONT) long-read sequencing to quantify LD frequency (Fig. 2G). To gain insights into the impact of POLQ and RPA knockdown on DSB repair, we conducted MH analysis for LD events of PIGA. Our results did not yield statistically significant differences when comparing the MMEJ knockdown groups with the control group, possibly due to the constraints imposed by the sequencing depth of long-read data. However, a discernible trend emerged in the data, indicating an increase in MH > 2 bp in RPA-knockdown cells and a corresponding decrease in POLQ-knockdown cells (Fig. 6L). This trend aligns with the observed patterns in the LD data (Fig. 2F).
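The classification rule used in the sequencing analysis (deletions > 30 bp are counted as LDs) translates directly into a frequency calculation. The sketch below is illustrative only: it assumes each UMI group has already been reduced to the size of its largest deletion in bp (0 if none), and the function names are hypothetical.

```python
LD_THRESHOLD_BP = 30  # deletions larger than this are referred to as LDs


def is_large_deletion(deletion_bp):
    """True if the deletion size exceeds the 30-bp threshold."""
    return deletion_bp > LD_THRESHOLD_BP


def ld_frequency(deletion_sizes):
    """Percentage of alleles (one entry per UMI group) that carry an LD."""
    if not deletion_sizes:
        raise ValueError("no alleles provided")
    n_ld = sum(1 for size in deletion_sizes if is_large_deletion(size))
    return 100.0 * n_ld / len(deletion_sizes)


if __name__ == "__main__":
    # Toy allele list, not data from the study.
    print(ld_frequency([0, 5, 30, 31, 500]))
```

Note that a 30-bp deletion is not counted as an LD under this rule, because the threshold is strict.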
LDs induced by CRISPR-Cas9 can be controlled by modulating POLQ and RPA
POLQ is an error-prone polymerase and is often upregulated in numerous cancers [23-27]. The antibiotic novobiocin (NVB) has recently been identified as a specific inhibitor of POLQ. NVB inhibits the ATPase activity of POLQ through direct binding to the ATPase domain and thus phenocopies POLQ depletion and impairs MMEJ DNA repair in human cells [39]. NVB was used to test whether targeting a specific MMEJ-related activity of POLQ could recapitulate the reduction of LD frequency achieved by knocking down the POLQ level globally. NVB was introduced to the cells during induction of Cas9 expression by doxycycline in the PIGA intr5_1 sgRNA-positive H1-iCas9 ESCs. NVB decreased LD frequency by up to 50% in a dose-dependent manner (Fig. 3A). NVB showed no discernible effect on the pluripotency of treated hESCs (Fig. 6I). High concentrations of NVB (50 µM) showed signs of cytotoxicity, while lower concentrations were well tolerated by the cells (Fig. 2J). LD analysis was also performed at both the PIGA and CD9 loci using ART558, a potent and selective inhibitor of the polymerase function of POLQ. Similarly, ART558 treatment significantly decreased the frequency of LDs by up to 61.78% in a dose-dependent manner (Fig. 3A and Fig. 7C). These results show that transient inhibition of POLQ activity is sufficient to reduce the formation of LDs following repair of Cas9-induced DSBs.
The RPA proteins prevent ssDNA annealing, thus blocking MMEJ repair [20, 43], and the data herein showed that knocking down RPA increases LD frequency. It was hypothesized that increasing RPA availability could divert DNA repair away from the MMEJ pathway during Cas9 editing and lead to a reduction of LDs. Therefore, three RPA subunits (RPA1, RPA2, RPA3) and GFP (as a control) were individually cloned into the inducible lentiviral expression vector pInducer21, which expresses GFP constitutively (Fig. 3B). Successfully transduced cells were sorted based on GFP positivity, and transgene expression was induced by doxycycline. The expression level of the transgenic RPA proteins increased from 6-fold to 26-fold after doxycycline induction without affecting the expression of the other RPA subunits (Fig. 3B). Such levels of overexpression of all three RPA proteins resulted in significant reductions in LD frequency as detected by FACS (Fig. 3C, Fig. 7C).
To gain a sequence-level understanding of the effect of POLQ inhibition and RPA overexpression on the DNA repair outcome of CRISPR-Cas9 editing, IDMseq [7] of the PIGA locus and ONT long-read sequencing of the CD9 locus were performed, as in the knockdown experiments. Similar to the knockdown results, no substantial disparity in the LD size spectrum was discernible across the samples (Fig. 6M). Both POLQ inhibition and RPA overexpression decreased CRISPR-Cas9-induced LDs (Fig. 3D and 3E), consistent with the FACS analysis results. The frequency of MH > 2 bp decreased in the NVB treatment and RPA overexpression groups compared to the wild-type group (Fig. 6N).
To investigate the generality of the effects of POLQ and RPA on LDs, we conducted additional experiments on the X-linked gene LAMP2 in an induced pluripotent stem cell (iPSC) line [44] and quantified the LD frequency using flow cytometry. Knocking down RPA subunits significantly increased LD frequency, while inhibiting POLQ with NVB or ART558, or overexpressing RPA subunits, significantly reduced it (Fig. 7D-G). Additionally, we performed bulk ONT long-read sequencing to quantify LDs in two disease-associated CRISPR-edited genes (WAS and HBB) in the same cellular models used for PIGA and CD9 editing. Consistent with the other gene loci, the LD frequency significantly increased in the RPA-knockdown group and dramatically decreased in the POLQ-deficient and RPA-overexpression groups (Fig. 7H, 7I).
To examine whether modulating POLQ activity or RPA overexpression affects the desirable small indel formation, the editing efficiency of an sgRNA targeting PIGA exon 2 was analyzed by FACS (Fig. 6O). The data showed that treatment with NVB or overexpression of RPA1 and RPA3 did not change the frequency of PIGA knockout cells (the majority of which contain small indels). These results showed that LDs can be controlled by inhibiting POLQ activity and overexpressing RPA without changing the overall editing efficiency. Thus, small-molecule inhibition of POLQ and RPA overexpression offer convenient and safe ways to reduce unwanted LDs following Cas9 editing without compromising editing efficiency.
Modulation of POLQ and RPA can enhance HDR efficiency
To test the hypothesis that inhibition of the MMEJ pathway could increase HDR efficiency, an hPSC line containing a mutant GFP transgene that can be rescued to express wild-type GFP through CRISPR-Cas9-mediated HDR was established (Fig. 4A). We treated the cells with 25 µM NVB for 24 hours before and after electroporation of the Cas9/sgRNA RNP and an ssODN donor and observed a significant increase in HDR efficiency compared to the control (Fig. 4B, Fig. 8A-8C). We also investigated the effect of recombinant RPA on HDR efficiency and found that low doses (less than 10 pmol) of RPA, premixed with the Cas9/sgRNA RNP and ssODN donor before electroporation, improved HDR efficiency (Fig. 4B; Fig. 8A-8C). However, higher doses of RPA diminished HDR efficiency; while not being bound by theory, this is possibly due to dose-dependent interference with ssODN delivery into cells. This was demonstrated by FACS analysis of Cy3-labelled ssODN after co-electroporation with Cas9/sgRNA RNP and varying doses of RPA (Fig. 8D). To validate this strategy in clinically relevant genes and/or cell types, we installed, via Cas9-mediated HDR, an EPOR gene mutation (G6002A) that can cause benign human erythrocytosis [45] in both human ESCs and primary peripheral blood erythroid progenitors, and an activating WAS mutation (T882C) that is associated with X-linked neutropenia [46]. The findings demonstrated a consistent enhancement of HDR efficiency in the editing of both genes in different cell types when treated with NVB or recombinant RPA, compared to the control group (Fig. 4C and 4D; Fig. 8E). Hence, these data demonstrate that modulating POLQ activity and RPA level can increase HDR efficiency for precise genome editing.
Discussion
The data suggest that continuous recutting of the target by CRISPR-Cas9 after error-free DNA repair (which regenerates the target) could increase the chance of LD-prone repair through the MMEJ pathway. Although it is unclear how asymmetrical release of the 3’ end of the non-target DNA strand after Cas9 cleavage and long-term residence of Cas9 on the broken ends of DNA [48] affect DNA repair pathway choice, the finding that PARP1 knockdown does not affect LD frequency suggests that CRISPR-Cas9-induced DSBs initiate the MMEJ repair pathway in a PARP1-independent manner (Fig. 6C). Although LIG3 is a predominant ligase of the MMEJ pathway that seals the nicks in DNA, its function could be replaced by other ligases, such as LIG1.
Knocking down or inhibiting POLQ caused a significant reduction of LDs, which suggested limited functional redundancy between POLQ and other DNA polymerases and reaffirmed the central role of MMEJ in Cas9-induced LD. RPA is involved in DNA replication and repair. We discovered that RPA deficiency led to more frequent LDs induced by CRISPR-Cas9, potentially because RPA prevents the annealing of resected ssDNA at MHs. This study discovered that two key players of MMEJ (POLQ and RPA) regulate CRISPR-Cas9-induced LD formation and provided a mechanistic understanding of Cas9-induced LD. The studies then demonstrated that small-molecule inhibition of POLQ or supplying recombinant RPA together with Cas9/sgRNA RNP and ssODN can significantly increase HDR in hPSCs. Thus, small-molecule inhibition of POLQ and/or delivery of recombinant RPA offers a simple, convenient, and potentially safe way to reduce the risk of unwanted LDs and improve HDR efficiency.
B. Modified ssODNs Improve CRISPR-mediated HDR Efficiency
The rapid advancement of CRISPR genome editing technologies has revolutionized biomedicine; however, precise control over editing outcomes remains challenging. Achieving precise genome editing via homology-directed repair (HDR) is essential in many respects, especially for clinical applications, yet current limitations (such as low HDR efficiency, high costs, and safety concerns) restrict broader implementation. The studies herein demonstrate a significant enhancement in HDR efficiency by employing fluorophore-modified single-stranded oligodeoxynucleotides (ssODNs) with CRISPR-Cas9. Of the modifications tested, the 5' cyanine 5 (5’ Cy5) modification achieved the highest HDR efficiency with minimal cytotoxicity across multiple gene loci in human stem cell lines.
Materials and Methods
All oligos, WT-Cas9 (Alt-R™ S.p. Cas9 Nuclease V3) and GFP-Cas9 (Alt-R™ S.p. Cas9-GFP V3) were purchased from Integrated DNA Technologies (IDT). Details about oligos including ssODNs (with or without modification), gRNAs, primers and probes used in this study are presented in Supplementary Table 2.
Table 2 - Oligonucleotides
The pInducer20-Cas9 (iCas9) plasmid was reconstructed by inserting the Cas9 sequence into a pInducer20 vector (Addgene, Cat#44012) using a Gateway cloning method. Lenti-NHEl-TRE-EF1a-rTta (control) and Lenti-NHEl-TRE-EF1a-H1.0 (OE H1.0) were reconstructed using an In-Fusion cloning kit (Takara, Cat#638948). Information regarding antibodies is provided in Table 3.
Table 3
Cell Lines
The study was reviewed and approved by the KAUST Institutional Biosafety and Bioethics Committee. The GFP-mutant (pEGIP*35 plasmid, Addgene, cat#26776) iPSC and SC9N (mutant HBB) iPSC lines were generated and well characterized in previous studies (1-3). The GLP1R-mutant iPSC clone H4 was kindly provided by Professor Antonio Adamo. All iPSCs were cultured in Essential 8 medium (ThermoFisher, cat#A1517001) in rhLaminin-521 (ThermoFisher, cat#A29249) coated wells with daily medium change. Human naive pluripotent stem cells (SC-9N iPSCs) were established in a previous study (4). Naive iPSCs were cultured in PXGL medium, consisting of N2B27 basal medium supplemented with 1 µM PD0325901 (PD) (STEMCELL Technologies, cat#A10256), 2 µM XAV-939 (VWR, cat# ALEXBML-WN100-0005), 2 µM Go6983 (Selleck, cat#S2911), and 10 ng/mL LIF (Cell Signaling Technology, cat#62226S). The N2B27 medium was a 1:1 mixture of DMEM/F12 (ThermoFisher, cat#11330032) and Neurobasal (ThermoFisher, cat#21103049), supplemented with 1x N2 supplement (Gibco, cat#17502-048), 1x B27 supplement without Vitamin A (Life Technologies, cat#12587010), 2 mM GlutaMAX (ThermoFisher, cat#5050-061), 10 mM HEPES (ThermoFisher, cat#15630080), 0.055 mM 2-mercaptoethanol (ThermoFisher, cat#21-985-023), MEM NEAA (ThermoFisher, cat#11140050), and 1% penicillin-streptomycin (ThermoFisher, cat#15140-122).
The GFP-mutant naive iPSC line was epigenetically reset using a previously established protocol (5). Primed-state iPSCs were transferred onto irradiated mouse embryonic fibroblasts (iMEF) and treated with 1 µM PD, 10 ng/mL LIF, and 1 mM valproic acid sodium salt (Merck, cat#P4543) for 3 days. Subsequently, cells were transferred to PXGL medium. By day 10, naive dome-shaped colonies emerged and were purified by FACS sorting for SUSD2 (Biolegend, cat#327406) or through several passages.
The pInducer20-Cas9, Lenti-NHEl-TRE-EF1a-rTta (control) and Lenti-NHEl-TRE-EF1a-H1.0 (OE H1.0) constructs were packaged into lentivirus individually. Briefly, each plasmid was co-transfected with psPAX2 and pMD2.G following the standard protocol for TransIT-293 Transfection Reagent (Mirus Bio, cat#MIR2704). The lentivirus was harvested after 48 h and concentrated with PEG-it Virus Precipitation Solution (System Biosciences, cat#LV810A-1). GFP-mutant iPSCs were transduced with iCas9 lentivirus and 1 µg/ml polybrene (Merck, cat#TR-1003-G) for 24 h, then 1-8 µg/ml G418 (Invitrogen, cat#108321-42-2) was added to the refreshed medium. After a 14-day culture with G418, the surviving cells were expanded as the GFP-mutant-iCas9 line. Similarly, the GFP-mutant OE H1.0/control line was established with the same protocol using 2-10 µg/ml blasticidin for positive selection (ThermoFisher, cat#A1113903).
CRISPR-Cas9 Genome Editing with Different Cell Lines
For genome editing, cells were electroporated using the Neon Transfection System (ThermoFisher). First, 50 pmol of gRNA and 50 pmol of WT-Cas9 were mixed and incubated at room temperature for 10 min to form ribonucleoprotein (RNP). A total volume of 10 µl including 30 pmol ssODN, 50 pmol RNP and buffer R (from the Neon Transfection System kit, cat# MPK1096) was prepared for electroporation. A total of 200,000 single cells were electroporated with the settings of 1600 V, 10 ms width and 3 pulses. The cells were seeded in one well of a 24-well plate immediately after electroporation, and their dynamic changes were monitored using the Incucyte S3 system.
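As a bench-planning aid, the per-reaction amounts stated above can be scaled to several reactions. The following is a minimal sketch, not part of the described method: the function name, the dictionary keys, and the optional overage factor are illustrative assumptions, and the reaction volume is taken to be in microlitres.

```python
# Illustrative helper (an assumption, not from the source protocol):
# scale the per-reaction electroporation mix to several reactions.
PER_REACTION = {
    "gRNA_pmol": 50,        # gRNA used to form the RNP
    "Cas9_pmol": 50,        # WT-Cas9 used to form the RNP
    "ssODN_pmol": 30,       # donor oligonucleotide
    "total_volume_ul": 10,  # final volume, topped up with buffer R
    "cells": 200_000,       # single cells per electroporation
}


def scale_mix(n_reactions, overage=0.0):
    """Return component amounts for n_reactions.

    overage is a fractional excess (e.g. 0.1 for 10%) to cover pipetting
    loss; it is a hypothetical convenience, not a value from the text.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + overage)
    return {name: amount * factor for name, amount in PER_REACTION.items()}


if __name__ == "__main__":
    for name, amount in scale_mix(4).items():
        print(f"{name}: {amount:g}")
```

The gRNA and Cas9 amounts scale together, so the 1:1 molar ratio used to form the RNP is preserved for any number of reactions.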
CRISPR-Cas9 Genome Editing with Human Heart Organoid (hHO)
Human heart organoids (hHOs) were generated according to a well-established protocol (6). GFP-mutant-iCas9 iPSCs were suspended in Essential 8 medium supplemented with 10 µM Rho kinase (ROCK) inhibitor Y-27632 (Abcam, cat#ab120129) and seeded at 10,000 cells/well in round bottom ultra-low 96-well plates (CELLSTAR, cat#650970) on day -2 at a volume of 100 µl per well. The plate was then centrifuged at 100 g for 3 min and placed in an incubator at 37 °C, 5% CO2. After 24 h (day -1), 50 µl of medium was carefully removed from each well, and 200 µl of fresh Essential 8 medium was added for a final volume of 250 µl/well. On day 0, 166 µl (~2/3 of total well volume) of medium was removed from each well and 166 µl of RPMI 1640 (ThermoFisher, cat#21875034)/B-27, minus insulin (Life Technologies, cat#A1895601) (B27-I hereafter) containing CHIR99021 (Selleck, cat#SML1046) was added at a final concentration of 4 µM/well along with 1.25 ng/ml BMP4 (Abcam, cat#ab51998) and 1 ng/ml Activin A (Abcam, cat#ab151687) for 24 h. On day 1, 166 µl of medium was removed and replaced with fresh B27-I. On day 2, B27-I containing Wnt-C59 (Abcam, cat#ab142216) was added for a final concentration of 2 µM Wnt-C59 and the samples were incubated for 48 h. The medium was changed again on day 4 with fresh B27-I and on day 6 with fresh RPMI 1640/B-27 (ThermoFisher, cat#17504044). On day 7, a second 2 µM CHIR99021 exposure was conducted for 1 h in RPMI 1640/B-27. Subsequently, the culture medium was changed every 48 h.
GFP-mutant hHO correction was performed following an adapted standard protocol for Lipofectamine RNAiMAX reagent (ThermoFisher, cat#13778150). In brief, hHOs after day 20 were selected for GFP correction. Medium from each well was replaced with fresh RPMI 1640/B-27 medium containing 2 µg/ml doxycycline to induce Cas9 expression. After 48 h, the gRNA/ssODN/RNAiMAX solution was prepared for 5 hHOs per condition as follows: Mix 1 was prepared by adding 9 µl RNAiMAX reagent into 50 µl Opti-MEM (Gibco, cat#10149832) and mixing briefly. Mix 2 consisted of 150 pmol gRNA and 150 pmol ssODN in 50 µl Opti-MEM.
The final solution was prepared by adding Mix 2 into Mix 1 and incubating for 5 min. Five hHOs were washed with PBS and then incubated with the final solution supplemented with 2 µg/ml doxycycline. After 30 min incubation, the hHOs and the final solution were separately added into one well of a round bottom ultra-low 96-well plate and cultured in a 37 °C, 5% CO2 incubator. The culture medium was changed every 48 h and the samples were collected after 96 h for analysis.
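The partial medium exchanges above imply that the added medium must carry a higher concentration than the stated final concentration in the well. A minimal sketch of that arithmetic follows, assuming volumes in microlitres, concentrations in micromolar, and that the medium left in the well contains none of the compound; the function names are illustrative.

```python
def exchange(volume_ul, removed_ul, added_ul):
    """Well volume after removing and then adding medium."""
    if removed_ul > volume_ul:
        raise ValueError("cannot remove more medium than the well holds")
    return volume_ul - removed_ul + added_ul


def feed_concentration(final_conc, final_volume_ul, added_ul):
    """Concentration needed in the added medium to reach final_conc in the
    well, assuming the medium remaining in the well contains none."""
    return final_conc * final_volume_ul / added_ul


volume = 100.0                           # seeding volume on day -2
volume = exchange(volume, 50.0, 200.0)   # day -1: remove 50, add 200
day0_volume = exchange(volume, 166.0, 166.0)  # day 0: swap 166 of 250
fraction_replaced = 166.0 / volume       # about two thirds of the well
chir_feed = feed_concentration(4.0, day0_volume, 166.0)

print(f"well volume after day -1: {volume:g}")
print(f"fraction replaced on day 0: {fraction_replaced:.3f}")
print(f"CHIR99021 in added medium for 4 final: {chir_feed:.2f}")
```

Under these assumptions, the medium added on day 0 would need roughly 6 µM CHIR99021 to give 4 µM in the well.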
CRISPR-Cas9 Genome Editing with Human Blastoid
Blastoids were generated according to a well-established PALLY protocol (7). Cells were suspended in N2B27 medium with 10 µM Y-27632 and seeded at a density of 75 cells per microwell in 400 µm AggreWell plates (STEMCELL Technologies, cat#34425). After compaction on day 0, the medium was replaced with N2B27 supplemented with PALLY components (1 µM PD, 1 µM A83-01 (Axon, cat#909910-43-6), 5 µM lysophosphatidic acid (LPA) (Merck, cat#L7260), 10 ng/mL LIF, and 10 µM ROCK inhibitor Y-27632). After two days, the medium was refreshed with N2B27 supplemented with 5 µM LPA and 10 µM ROCK inhibitor. The structures were matured for two more days.
The blastoid correction was performed using the LONZA 4D-Nucleofector system (Lonza Bioscience). Briefly, 100 pmol gRNA and 100 pmol of WT-Cas9/GFP-Cas9 were mixed and incubated at room temperature for 10 min to form ribonucleoprotein (RNP). Then, 60 pmol ssODN was mixed with 100 pmol RNP before electroporation. Nucleofector Solution P3 (LONZA, cat#V4XP-3032) was added to the RNP to make a 20 µl final volume. Approximately 400 aggregates from day 0 were mixed with the prepared solution and loaded into a Nucleocuvette (LONZA, cat#V4XP-3032). The 4D-Nucleofector X Unit was employed, and transfections were carried out using the program “hES cell, H9”, pulse code CB150. Immediately after electroporation, blastoids were transferred to AggreWell plates following the standard PALLY protocol.
In Vitro Blastoid Attachment Assay
The in vitro attachment assay was performed based on a previously established protocol. On day 4, well-formed blastoids were manually selected. An 8-well µ-Slide chamber (ibidi, cat# 80827) was pre-coated with fibronectin (Merck, cat#F0895, diluted 1:40 in cold PBS) for 30 minutes. About 10 blastoids were introduced into each chamber containing IVC1 medium and incubated at 37 °C with 5% CO2 and 5% O2. The IVC1 medium consisted of Advanced DMEM/F12 (ThermoFisher, cat#12634-010), 20% heat-inactivated FBS (ThermoFisher, cat#30044333), 2 mM GlutaMAX, 0.5% penicillin-streptomycin, 1% ITS-X (ThermoFisher, cat#51500-056), and 1% sodium pyruvate (Sigma-Aldrich, cat#S8636). Additional components included 8 nM β-estradiol (Sigma-Aldrich, cat#E8875), 200 ng/mL progesterone (Sigma-Aldrich, cat#P0130), 25 µM N-acetyl-L-cysteine (Sigma-Aldrich, cat#A7250), and ROCK inhibitor (Selleckchem, cat#S1049). After 48 hours, the medium was replaced with IVC2, which mirrored the composition of IVC1 but substituted 30% knockout serum replacement (ThermoFisher, cat#10828-028) for the 20% FBS. Culturing continued for 48 hours, after which the medium was collected for hCG testing with a commercial pregnancy kit and the structures underwent further immunofluorescence analysis.
Flow Cytometry
For GFP-mutant iPSC correction samples, the cells were harvested 3 days post-electroporation using TrypLE (Life Technologies, cat#12604-021) and passed through a 70 µm strainer. The cells were resuspended in 200 µl FACS buffer (1x Ca2+/Mg2+-free PBS, 2 mM EDTA) containing 1 µg/ml DAPI and loaded onto a BD FACSAria™ Fusion cytometer for analysis. GFP-mutant hHO correction samples were harvested 3 days post-electroporation using 0.25% Trypsin-EDTA (Gibco, cat#25200056) and filtered through a 70 µm strainer. For each sample, 100,000 cells were stained with 2 µl APC-CD106 (BioLegend, cat#305810) in 100 µl FACS buffer containing 3% FBS for 15 min at room temperature. The stained cells were washed once with FACS buffer containing 3% FBS and resuspended in 200 µl FACS buffer containing 1 µg/ml DAPI for FACS analysis. For naive GFP-mutant iPSC sorting, cells were harvested using TrypLE containing 10% DNase I (VWR, cat#APA3778.0500) and passed through a 40 µm strainer, then stained with SUSD2 (Biolegend, cat#327406).
Cell cycle analysis was conducted using a standard protocol. Initially, cell pellets were fixed by adding cold 70% ethanol dropwise while vortexing and then incubated overnight at -20 °C. Subsequently, the cells were washed twice and resuspended in FACS buffer containing 200 µg/ml RNase A. After a 20-minute incubation at room temperature, the cells were washed once and resuspended in FACS buffer containing 1 µg/ml propidium iodide (PI) staining solution (Tonbo Biosciences, cat#13-6990-T200). Following a 10-minute incubation at room temperature, the samples were ready for FACS analysis. Data were processed using the BD FlowJo software (10.8.1), employing the Watson model for cell cycle assessment.
Droplet Digital PCR (ddPCR)
HBB/GLP1R-corrected iPSCs were harvested three days post-electroporation, and genomic DNA was extracted using a DNeasy Blood & Tissue kit (Qiagen, cat#69506) and quantified using a Qubit instrument. Ten HBB gene-edited blastoids were lysed in 20 µl STE buffer (10 mM Tris-HCl (pH 8.0), 1 mM EDTA, 100 mM NaCl) following the program: 55 °C for 180 min, 95 °C for 10 min, and hold at 4 °C. The ddPCR was performed on a Bio-Rad QX200 system using ddPCR Supermix for Probes (No dUTP) (Bio-Rad, cat#1863024) following the manufacturer’s protocols. The 20x assay mix comprised 18 µM of each primer and 5 µM of each probe. Each reaction contained 5 ng genomic DNA or 3 µl blastoid STE lysate, 1x assay mix, and 1x ddPCR Supermix, and was transferred into a sample well of a DG8 Cartridge for droplet generation using a QX200 Droplet Generator following the manufacturer’s manual. The PCR was performed immediately after droplet generation. Amplification of HBB targets followed the program: 95 °C for 10 min, then 40 cycles of 94 °C for 30 s and 53 °C for 1 min, then 98 °C for 10 min, and hold at 4 °C. Amplification of GLP1R targets followed the program: 95 °C for 10 min, then 40 cycles of 94 °C for 30 s and 63 °C for 1 min, then 98 °C for 10 min, and hold at 4 °C. After the PCR, the samples were read using a QX200 Droplet Reader and analyzed with QuantaSoft software. All probes and oligos are shown in Supplementary Table 2.
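The working concentrations in each ddPCR reaction follow from diluting the 20x assay mix to 1x. A minimal sketch of that conversion is given below, assuming the stock concentrations are micromolar; the function name is illustrative.

```python
def working_concentration_nM(stock_uM, stock_fold, final_fold=1):
    """Concentration in nM after diluting a stock_fold-x stock to final_fold-x.

    Assumes the stock concentration is given in micromolar.
    """
    if stock_fold <= 0 or final_fold <= 0:
        raise ValueError("fold values must be positive")
    return stock_uM * 1000.0 * final_fold / stock_fold


primer_nM = working_concentration_nM(18, 20)  # each primer in the reaction
probe_nM = working_concentration_nM(5, 20)    # each probe in the reaction

print(f"primer: {primer_nM:g} nM, probe: {probe_nM:g} nM")
```

Under that assumption, each reaction contains 900 nM of each primer and 250 nM of each probe.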
IDMseq Nanopore sequencing
The genomic DNA of HBB-corrected iPSCs was extracted 3 days post-electroporation using a DNeasy Blood & Tissue kit and quantified using a Qubit instrument. The UMI labeling was performed following the published protocol (10). Briefly, the target locus was labeled by one-cycle PCR in a 25 µl reaction including 50 ng genomic DNA, 1 µM UMI primer (containing a universal forward primer sequence, a 10-nt UMI barcode, and a target locus forward primer sequence; see Supplementary Table 1), and 12.5 µl 2x Platinum SuperFi PCR Master Mix (ThermoFisher, cat#12358010), following the program: initial denaturation at 98 °C for 70 s, gradient annealing from 70 °C to 65 °C at a 1 °C/5 s ramp rate, extension at 72 °C for 7 min, and hold at 4 °C. The UMI-labeled DNA was purified with 0.8x AMPure XP beads (Beckman Coulter, cat#A63881), then mixed with a universal forward primer, a target locus reverse primer (Supplementary Table 1), and Phusion Hot Start II High-Fidelity PCR Master Mix (ThermoFisher, Cat#F-566), and amplified following the program: initial denaturation at 98 °C for 30 s, then 35 cycles of 98 °C for 10 s, 57 °C for 30 s, and 72 °C for 30 s, lastly 68 °C for 5 min, and hold at 4 °C. The amplicons were purified with 0.8x AMPure XP beads and used for library preparation. The library preparation was done using the ligation sequencing kit (Oxford Nanopore Technologies, cat#SQK-NBD112.24) following its standard protocol. The Nanopore sequencing was performed on an Oxford Nanopore MinION sequencer using FLO-MIN112 flow cells. The reads were base called using Guppy basecaller (v5.0.7).
Data analysis was performed using VAULT as described previously (ref). In brief, the UMI primer sequence, fastq file and reference amplicon sequence were provided to the algorithm. VAULT extracts mappable reads and then identifies UMI sequences in the reads. Reads are then grouped by their UMI sequences and used for parallel analysis of SNVs and SVs. HDR frequency was calculated as the percentage of UMI groups (representing original molecules) carrying the desired mutation. The ONT sequencing reads were aligned to the hg38 reference genome with minimap2 (v2.11) to check for MMEJ events. The MMEJ frequency was calculated as the percentage of MMEJ reads in the alignment results.
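The counting principle behind the HDR calculation (group reads by UMI, then score each group once) can be illustrated with a short sketch. This is not the VAULT implementation: the majority-vote rule for calling a group, the minimum group size parameter, and the toy reads are assumptions made only for illustration.

```python
from collections import defaultdict


def hdr_percentage(reads, min_reads_per_umi=1):
    """Percentage of UMI groups carrying the desired edit.

    reads: iterable of (umi, has_desired_edit) pairs, one per read.
    A group counts as edited when more than half of its reads carry the
    edit (a simplifying assumption, not the published algorithm).
    """
    groups = defaultdict(list)
    for umi, has_edit in reads:
        groups[umi].append(bool(has_edit))

    # Optionally drop poorly supported UMI groups.
    kept = [calls for calls in groups.values() if len(calls) >= min_reads_per_umi]
    if not kept:
        return 0.0

    edited = sum(1 for calls in kept if 2 * sum(calls) > len(calls))
    return 100.0 * edited / len(kept)


# Toy input with 10-nt UMIs; values are made up for illustration.
example_reads = [
    ("AACGTTGCAA", True), ("AACGTTGCAA", True), ("AACGTTGCAA", False),
    ("GGTTACCATG", False), ("GGTTACCATG", False),
    ("TTGACCGTAA", True),
    ("CCATGGTTAC", False),
]

print(hdr_percentage(example_reads))
```

Because each UMI group stands for one original molecule, a molecule amplified into many reads contributes the same weight as one amplified into few.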
T7 endonuclease assay
The genomic DNA of HBB/GLP1R-corrected iPSCs was extracted 3 days post-electroporation using a DNeasy Blood & Tissue kit and quantified using a Qubit instrument. PCR reactions were set up in a 50 µL volume with Phusion Hot Start II High-Fidelity PCR Master Mix, and amplified following the program: initial denaturation at 98 °C for 30 s, then 35 cycles of 98 °C for 10 s, 57 °C for 30 s, and 72 °C for 30 s, lastly 68 °C for 5 min, and hold at 4 °C. Primers flanking the editing site are shown in Supplementary Table 1. The PCR products were purified with 0.8x AMPure XP beads, and 200 ng of purified DNA was mixed with 2 µL 10x NEBuffer 2 (NEB, cat#B7002S) and nuclease-free water to a 19 µL volume. Heteroduplex formation followed the program: 98 °C for 30 s, gradual cooling from 95 to 85 °C at a rate of -2 °C/s, gradual cooling from 85 to 25 °C at a rate of -0.1 °C/s, and hold at 4 °C. For the digestion reaction, 1 µL of T7 Endonuclease I (NEB, cat#M0302S) was added to the reaction and incubated at 37 °C for 15 min.
Immunofluorescence Analysis
Human heart organoids (hHOs) were transferred to 1.5 ml centrifuge tubes and fixed in 4% paraformaldehyde solution. The organoids were washed with PBS, permeabilized, and blocked using PBS containing 10% donkey normal serum (VWR, cat#S2170-100), 0.5% Triton X-100 (Alfa Aesar, cat#A16046.AE), and 0.5% bovine serum albumin (BSA) (Sigma-Aldrich, cat#A8577) at 4 °C overnight. Following washing with PBS, the hHOs were incubated with primary antibodies in an antibody solution containing 1% donkey normal serum, 0.5% Triton X-100, and 0.5% BSA at 4 °C for 24 hours. After incubation, the samples were washed with antibody solution and incubated with secondary antibodies in the same antibody solution at 4 °C for 24 hours. Subsequently, the hHOs were washed with PBS and maintained in PBS with DAPI. Images were acquired using Leica THUNDER Imaging Systems and reconstructed using the standard THUNDER algorithm.
Blastoids were transferred to 1.5 ml centrifuge tubes and post-implantation samples were grown on 8-well chambers (ibidi, cat#80826). After washing with PBS, the samples were fixed with 4% paraformaldehyde for 15 minutes, followed by additional washing with PBS. The samples were permeabilized and blocked using antibody solution containing 0.2% Triton X-100 and 6% normal donkey serum for one hour. After washing with PBS, the blastoids were incubated with primary antibodies in antibody solution at 4 °C overnight. The samples were then washed again and incubated with secondary antibodies in the same antibody solution for one hour at 37 °C. The samples were subsequently washed with PBS and kept in PBS with DAPI. Finally, images were acquired using Zeiss Cell Discoverer 7 (CD7) with LSM 900 confocal microscopes and processed using Fiji software (2.14.0/1.54f).
RNA-seq
RNA was isolated using the RNeasy Mini Kit (Qiagen, cat#74106). The quality of the RNA samples was evaluated using the Tapestation (Agilent), with an RNA Integrity Number (RIN) equal to 10. RNA-seq was performed using the Illumina 6000 PE150 platform to generate paired-end 150 bp reads by NovogeneAIT Genomics Singapore Pte Ltd. The libraries were sequenced, resulting in approximately 12 GB of raw data reads per library.
Bioinformatics Analysis
The raw RNA-seq data were processed using the online platform A.I.R. from Sequentia Biotech SL (https://transcriptomics.sequentiabiotech.com/). Genes with read counts of at least 15 in two or more samples were selected, and log2 counts per million (CPM) values were used for Principal Component Analysis (PCA). Differentially expressed (DE) genes were identified using DESeq2, with a q-value threshold of <0.05. Gene Ontology (GO) analysis was performed using DAVID (v2023q4) (11) to provide functional annotation and enrichment insights for the DE genes. For selected DE genes, normalized FPKM values were z-score transformed and used to generate a gene expression heatmap.
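The expression filter and the per-gene z-score transformation described above can be sketched with the Python standard library. The function names are illustrative, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev


def keep_gene(counts, min_count=15, min_samples=2):
    """True if at least min_samples samples have >= min_count reads."""
    return sum(c >= min_count for c in counts) >= min_samples


def zscores(values):
    """Z-score one gene's expression values across samples.

    Uses the population standard deviation (an assumption); a gene with
    no variation is returned as all zeros.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


# Toy values, made up for illustration only.
print(keep_gene([20, 3, 16]))    # two samples pass the threshold
print(keep_gene([20, 3, 1]))     # only one sample passes
print(zscores([1.0, 2.0, 3.0]))
```

Z-scoring each gene separately puts genes with very different absolute expression on one color scale in a heatmap.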
Western Blotting
Western blotting was performed following a standard protocol. For HDR pathway analysis, cells were lysed 24 hours after electroporation or passage using Laemmli sample buffer (Bio-Rad, cat#1610737), supplemented with protease inhibitors (Sigma, cat#11836170001) and phosphatase inhibitors (Life Technologies, cat#78440). For H1.0 knockdown/overexpression validation, cells were lysed 48 hours after esiRNA transfection or 2 µg/ml doxycycline induction. Samples were separated on 4-12% Bis-Tris Plus Gels (ThermoFisher, cat#NW04125BOX) and transferred to 0.2 µm PVDF membranes (ThermoFisher, cat#88520). Membranes were blocked with 5% BSA in 0.2% TBS-Tween 20 for 2 hours at room temperature, followed by overnight incubation at 4 °C with primary antibodies. After washing, the membranes were incubated with secondary antibodies for 1 hour at room temperature. The Chemiluminescent HRP substrate (ThermoFisher, cat#34076) was used for signal detection. Antibody details and concentrations are provided in Supplementary Table 2. The normalized expression values were calculated using Fiji (2.14.0/1.54f).
siRNA Transfection
The esiRNA transfection protocol was adapted from the instructions for Lipofectamine RNAiMAX reagent. GFP-mutant iPSCs were harvested after 1 hour of treatment with 10 µM ROCK inhibitor Y-27632. MISSION® siRNA Universal Negative Control #1 (Merck, cat#SIC001) and esiRNA human H1F0 (Merck, cat#EHU135871) were used in this study. The esiRNA/RNAiMAX solution was prepared as follows: Mix 1 was prepared by adding 13.5 µl RNAiMAX reagent into 225 µl Opti-MEM and vortexing for a few seconds. Mix 2 was prepared by adding 90 pmol esiRNA into 225 µl Opti-MEM and pipetting a few times. The esiRNA/RNAiMAX solution was prepared by adding Mix 2 into Mix 1 and incubating for 5 min, and was then used to resuspend a pellet of 1.5 million cells. After 30 min incubation with the esiRNA/RNAiMAX solution, the cells were aliquoted equally into 3 rhLaminin-521 coated wells and cultured in a 37 °C, 5% CO2 incubator. The cell samples were collected after 48 h for GFP correction electroporation and knockdown efficiency analysis.
Statistical Analysis
Results are reported as the mean ± standard error of the mean (SEM) unless indicated otherwise. Statistical comparisons were conducted using GraphPad Prism software. A p-value less than 0.05 was considered statistically significant.
DFT calculations and MD Simulations
To model the ssODN sequence using pdb2gmx in Gromacs for simulations (12, 13), the series of cyanine-attached bases were first modelled as non-standard residues. For example, the 5’Cy5-G fragment was regarded as a non-standard residue with its structure built in GaussView. Structure optimization was performed with Gaussian 16 by density functional theory (DFT) calculations at the B3LYP-D3(BJ)/6-31d* level (14). The optimized structure was modelled using the generalized amber force field (GAFF) (15), assigned with restrained electrostatic potentials (RESPs) calculated using the Multiwfn software based on the geometry optimization results (16). After all the parameters (including bond, angle, dihedral angle, etc.) were settled, the relevant files of all the non-standard residues were included in the Gromacs top library, including 5’Cy5-G, 5’Cy5-G, 5’Cy3-G, 3’Cy5-G, 3’Cy5-G, 3’Cy3-G for the GFP sequence, and 5’Cy5-T, 5’Cy5-T, 5’Cy3-T, 3’Cy5-A, 3’Cy5-A, 3’Cy3-A for the HBB sequence.
The unmodified or cyanine-attached ssODN/DNA structures were built in Discovery Studio and modelled using pdb2gmx in Gromacs for MD simulations. For the ssODN, the unmodified 90-base GFP sequence (ssODN90) and 5’Cy5-GFP-ssODN90 were fully modelled for simulation and comparison. A wave-like circular conformation, with the head and tail kept close together, was adopted as the initial state. A water box with 61937 water molecules was built for 5’Cy5-ssODN90 (61862 water molecules for unmodified ssODN90). For the D10 DNA strands, those modified at the 5’ or 3’ end with covalently linked Cy3, Cy5, and Cy5.5 dyes were designated 5’/3’Cy3-GFPD10, 5’/3’Cy5-GFPD10, and 5’/3’Cy5.5-GFPD10, respectively. Unmodified D10 was also simulated as a control. A similar molecular modeling method was employed for the HBB HDR system. For the GFP sequence, (unmodified) GATGCTCCTG, 5’Cy5(3,5.5)-GATGCTCCTG and GATGCTCCTG-3’Cy5(3,5.5) were extracted with 5 bases from the head and 5 bases from the tail of the GFP sequence. For the HBB sequence, (unmodified) TCACTGTGGA, 5’Cy5(3,5.5)-TCACTGTGGA and TCACTGTGGA-3’Cy5(3,5.5) were extracted with 5 bases from the head and 5 bases from the tail of the HBB sequence. After reaching equilibrium and a 100 ns production run, the optimized structures and supramolecular interactions of the different DNA systems were analyzed and compared in detail.
All MD simulations were performed with the AMBER 10 program package with the parmbsc0 modification (28) of the parm99 force field (29) for DNA. Water molecules were modelled using the explicit TIP3P water model, and Na+ and Cl- ions were added to neutralize the system electrostatically at a low ionic strength of roughly 0.1 M. Simulations were performed at 25 °C and 1 bar with periodic boundary conditions. The particle mesh Ewald (PME) summation was used to calculate the long-range electrostatic interactions (17), with a cutoff of 1.0 nm for the separation of the direct and reciprocal space summation. The temperature and pressure were controlled using a modified Berendsen thermostat (18) and a Parrinello-Rahman barostat (19), respectively. Production runs for each system were conducted for 100 ns (DNA system) or 50 ns (ssODN system) after 1 ns of equilibration, with a time step of 2 fs. Coordinates were saved every 1 ps, yielding 100000 or 50000 frames for further analysis. The simulation length was sufficient to provide converged data for the ssODN/DNA systems.
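The frame counts quoted above follow directly from the production length and the save interval, and the number of integration steps follows from the time step. A minimal sketch of that bookkeeping, with illustrative function names:

```python
def n_frames(production_ns, save_interval_ps):
    """Number of saved coordinate frames in a production run."""
    return round(production_ns * 1000 / save_interval_ps)


def n_steps(production_ns, timestep_fs):
    """Number of integration steps in a production run."""
    return round(production_ns * 1_000_000 / timestep_fs)


print(n_frames(100, 1))  # DNA systems
print(n_frames(50, 1))   # ssODN systems
print(n_steps(100, 2))   # steps for a 100 ns run at a 2 fs time step
```

At a 2 fs time step and a 1 ps save interval, one frame is written every 500 steps.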
Simulation data were analyzed with Gromacs, Gaussian, and Multiwfn software and visualized using the Origin and VMD packages. The radial distribution functions (RDF) of various cyanines around the closest pairing base were calculated with Gromacs to assess their interactions quantitatively. The spatial distribution functions (SDF) of various anions around the cage were calculated with Gromacs and visualized using VMD at a certain isolevel. Free energy surfaces, hydrogen bond analysis, and the radius of gyration Rg were calculated with Gromacs and processed with Origin.
Independent gradient model based on Hirshfeld partition of molecular density (IGMH) analysis was used to visualize the supramolecular interactions between the various cyanines and the closest pairing base (20). Based on the Gaussian geometry optimization results for the cyanine-attached base with its pairing base, the Multiwfn software was used for the calculation and VMD was used for visualization.
For direct comparison, the binding energy was calculated with Gaussian using the optimized system: the unmodified first base pair for the unmodified DNA systems, or the cyanine-attached base with its pairing base for the cyanine-modified DNA systems. Specifically, taking the 5’Cy5-DNA10 system as an example, the 5’Cy5-G:C pair was used to calculate the binding energy according to the following equation:
$$\Delta E = E_{5'\text{Cy5-G:C}} - \left(E_{5'\text{Cy5-G}} + E_{\text{C}}\right)$$
where $E_{5'\text{Cy5-G:C}}$ is the Gibbs energy of the 5’Cy5-G:C part, $E_{5'\text{Cy5-G}}$ is the Gibbs energy of the 5’Cy5-G part, and $E_{\text{C}}$ is the Gibbs energy of the base C part.
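The binding-energy definition above is a simple difference between the energy of the paired system and the sum of its two isolated parts. A minimal sketch is shown below; the energy values are placeholders invented for illustration and are not results from this study, and the unit conversion assumes energies reported in Hartree.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def binding_energy(e_complex, e_part_a, e_part_b):
    """Binding energy: complex energy minus the sum of the two parts.

    All inputs must be in the same unit; a negative result indicates
    that the paired system is lower in energy than the separated parts.
    """
    return e_complex - (e_part_a + e_part_b)


# Placeholder energies in Hartree (hypothetical, for illustration only).
e_pair = -10.50       # stands in for the 5'Cy5-G:C part
e_cy5_g = -6.20       # stands in for the 5'Cy5-G part
e_c = -4.28           # stands in for the base C part

delta_e = binding_energy(e_pair, e_cy5_g, e_c)
print(f"{delta_e:.4f} Hartree")
print(f"{delta_e * HARTREE_TO_KCAL_PER_MOL:.2f} kcal/mol")
```

The same function applies to the unmodified first base pair, which allows the modified and unmodified systems to be compared on one scale.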
Results
Fluorophore-modified ssODNs enhance CRISPR/Cas9 mediated HDR efficiency
Fluorophores are small molecules widely used for single-stranded DNA (ssDNA) modifications in various applications, such as qPCR probes and antisense oligonucleotides (ASOs).
To investigate whether fluorophore modification of ssODNs affects HDR, we performed genome editing to correct a mutant GFP transgene in an induced pluripotent stem cell (iPSC) line using ssODNs with single-end or dual-end fluorophore modifications (Fig. 9A), with commercially available modifications commonly used to enhance HDR (5' or 3' phosphorylation and dual-end Alt-R™ (IDT)) as controls. The mutant GFP correction rate was quantified using Fluorescence-Activated Cell Sorting (FACS) and used as a measure of HDR efficiency. Cyanine 3 (Cy3) modification at either the 5' or 3' end improved HDR efficiency, with 5' modification being more effective (Fig. 9B). 5' Cy3 outperformed both 5' and 3' phosphorylation and was comparable to the Alt-R™ modification (Fig. 9B). Another structurally distinct fluorophore, ATTO532 (Fig. S1b), also improved HDR efficiency but was significantly less effective than cyanines and, therefore, was not pursued further. Unlike the dual-end Alt-R™ modification, dual-end Cy3 modification performed worse than single-end Cy3 modifications. Notably, 5' Cy5 and 5' Cy5.5 were much more effective than 5' Cy3 or Alt-R™, with 5' Cy5 performing better than 5' Cy5.5 (Fig. 1c). To assess the potential role of fluorescence energy release in HDR improvement, we tested a 5' Cy5.5-ssODN with a 3' quencher, which yielded an HDR efficiency comparable to the 5' Cy5.5 modification.
The cells were also continuously monitored using a live fluorescence imaging system after delivering 5’ Cy5 or unmodified ssODNs together with CRISPR-Cas9 ribonucleoprotein (RNP). GFP-positive cells were detected around 20 hours post-electroporation, with significantly higher levels of GFP signals observed in the 5' Cy5-ssODN group at all time points, consistent with the FACS results.
We further validated the impact of fluorophore modifications on HDR by performing gene correction of the sickle cell disease (SCD) mutation in the HBB gene (NM_000518.5:c.20A>T) in a previously characterized SCD patient iPSC line (18). The rate of gene correction was quantified using ddPCR as described previously. All single-end fluorophore modifications significantly enhanced HBB correction, with 5’ Cy5-ssODNs exhibiting the highest efficiency and 5' Cy3-ssODNs outperforming dual-end Cy3 modifications (Fig. 9D). Similarly, adding a 3' quencher to 5’Cy5.5-ssODN resulted in an HDR efficiency comparable to that of the 5' Cy5.5-modified ssODN (Fig. S1h). Additionally, we used individual-molecule sequencing (IDMseq) (19), a sensitive long-read sequencing method for the quantitative analysis of diverse types of variants (5, 19, 20), to quantify the frequency of mutant HBB correction events at the single-molecule level. The data showed that 5' Cy5-ssODNs achieved approximately a two-fold increase in correction efficiency compared to unmodified ssODNs (Fig. 9E). Since the 5' Cy5 modification consistently demonstrated superior enhancement of HDR, it was used for subsequent studies.
GLP1R variants are associated with type 2 diabetes and are promising targets for genome editing therapy. To test the effect of 5’Cy5-ssODN on HDR at another locus, we performed gene correction of a GLP1R mutation (NM_002062.3:c.402 + 3delG, c.396A>G) in a patient iPSC line. The results showed that 5’Cy5-ssODN significantly elevated the HDR efficiency compared to unmodified ssODN (Fig. 9F).
Importantly, cells edited with 5’Cy5-ssODN showed better survival and proliferation following CRISPR-Cas9 editing (Fig. 5G). Moreover, the cell cycle phase distribution in 5' Cy5-ssODN-edited cells was similar to that of unedited cells (mock), whereas unmodified-ssODN-edited cells exhibited an abnormal increase in the G2/M phase and a decrease in the G1 phase (data not shown). These results suggest that cells experienced less stress when edited with 5' Cy5-ssODN compared to unmodified ssODN. Given that HDR primarily occurs during the S and G2 phases of the cell cycle (23), the improvement in HDR efficiency observed with 5’ Cy5-ssODN is unlikely to be due to a difference in the cell cycle. Moreover, the introduction of 5’ Cy5-ssODN did not affect the CRISPR-Cas9 cutting efficiency (data not shown). Thus, 5’ cyanine-modified ssODNs, particularly those modified with Cy5, exhibit lower cytotoxicity and higher HDR efficiency during Cas9 genome editing.
5’Cy5-ssODNs improve CRISPR-Cas9 precise genome editing in human cardiac organoids
Subsequent studies investigated whether 5’Cy5-ssODNs could improve in situ precise genome editing in a complex model such as the human heart organoid (hHO), a sophisticated in vitro system that closely mimics the complexity of the human heart. To test this hypothesis, hHOs were first generated using a three-step Wnt signaling modulation strategy. To achieve consistent CRISPR-Cas9 editing, a GFP-mutant iPSC line with a doxycycline-inducible Cas9 expression system (GFPmutant-iCas9) was used.
Following induction of Cas9 expression by doxycycline, the hHOs were transfected with mutant GFP-targeting sgRNA and either unmodified ssODNs or 5’Cy5-ssODNs (FIG. 10A). Consistent with the results in 2D cell culture systems, 5’Cy5-ssODNs significantly increased HDR efficiency in hHOs by more than twofold, as quantified by both FACS and immunofluorescence analyses (FIG. 10B) of GFP- and CD106- or cardiac troponin T (cTNT)-positive cardiomyocytes. The baseline HDR rate in the 3D hHO model was lower than in 2D cultures because of technical limitations in transfection. Nevertheless, GFP-corrected hHOs transfected with either unmodified ssODN or 5’Cy5-ssODN displayed normal rhythmic beating behavior. These results demonstrate that 5’Cy5 modification enhances HDR efficiency without disrupting function in a complex 3D organoid system.
5’Cy5-ssODNs enable whole mount in situ genome editing of human blastoids
Human blastoids, generated from naive pluripotent stem cells, can recapitulate the key stages of blastocyst development. Genome editing in blastoids could serve as a model for genome editing therapy and mechanistic study of human early development. However, genome editing technology during early embryo development is limited to microinjection at the zygote stage, which is tedious and incompatible with the aggregation-based blastoid models. Therefore, there is an unmet need for methods for in situ genome editing of human embryo models such as blastoids.
Given the enhanced GFP correction observed in hHOs, subsequent studies were conducted to determine whether 5’Cy5-ssODNs could enable precise genome editing, such as correction of human disease-relevant mutations, in human blastoids. To test this, CRISPR-Cas9 RNP and ssODNs were directly electroporated into aggregates of chemically reset naive SCD patient-derived iPSCs following an efficient PALLY blastoid protocol, and day 4 blastoids were harvested for gene correction validation and implantation testing (FIG. 10C). To monitor Cas9 RNP delivery, a GFP-tagged Cas9 (GFP-Cas9) was employed for this experiment. The human SCD blastoids showed normal cavitation and lineage composition (Fig. 10D, 10E). Cas9 editing was performed using structures collected on day 0, 1, 2, or 3 of the protocol, which all formed blastoids. The blastoids exhibited GFP and Cy5 fluorescence signals, indicating the successful delivery of Cas9 RNP and ssODNs via electroporation (Fig. 10E, left). The edited blastoids were morphologically indistinguishable from the unedited controls (mock). No significant difference in diameter was observed between the unmodified ssODN and mock groups or between the 5’Cy5-ssODN and unmodified control groups (Fig. 10E, right). Interestingly, the 5’Cy5-ssODN edited group showed a significantly higher cavitation efficiency than the unmodified control, suggesting that 5’Cy5-ssODNs are less toxic than unmodified ssODNs. Like the mock group, most edited blastoids successfully attached to ECM-coated plates within a day and displayed outgrowths (Fig. 10F), resembling human implantation. By day 4, human chorionic gonadotropin (hCG) was detectable in the culture medium (Fig. 10F, middle), which indicated that genome editing can be successfully applied to human blastoid models without affecting their developmental potential. Consistent with previous results, 5’Cy5-ssODN significantly improved the HDR efficiency of HBB correction, to nearly 250% of the level obtained with unmodified ssODN, as quantified by ddPCR. Specifically, the mutation was corrected at a rate of 28.9% with 5’Cy5-ssODN, compared to only 11.7% with unmodified ssODN (Fig. 10G).
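The two ddPCR percentages quoted above can be compared either as a ratio or as a relative increase, and the two figures are easy to confuse. The following is a minimal Python sketch of that arithmetic. The function names are hypothetical and are not part of the disclosed methods; only the values 28.9% and 11.7% come from the text.

```python
def fold_change(treated_pct: float, control_pct: float) -> float:
    """Ratio of treated to control HDR rate (1.0 means no change)."""
    if control_pct <= 0:
        raise ValueError("control percentage must be positive")
    return treated_pct / control_pct


def percent_increase(treated_pct: float, control_pct: float) -> float:
    """Relative gain over the control in percent (0.0 means no change)."""
    return (fold_change(treated_pct, control_pct) - 1.0) * 100.0


# HBB correction rates in blastoids as reported in the text (Fig. 10G).
cy5_pct = 28.9
unmodified_pct = 11.7

fc = fold_change(cy5_pct, unmodified_pct)
inc = percent_increase(cy5_pct, unmodified_pct)
print(f"fold change: {fc:.2f}x, relative increase: {inc:.0f}%")
```

With these inputs the ratio is about 2.47, so the corrected fraction is close to 250% of the control value, which corresponds to a relative increase of about 147%.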
To facilitate the visualization of HDR events in blastoids, genome editing was performed to correct a mutant GFP transgene. A primed GFP-mutant iPSC line was first reverted to the naive state (SUSD2 positive). Next, GFP-mutant blastoids were generated and the GFP ORF was corrected with the same protocol, using wild-type Cas9 instead of GFP-Cas9. We observed a significant increase in the GFP signal in the 5’Cy5-ssODN group (Fig. 10H). Together, these data demonstrate that disease-relevant mutations can be corrected by CRISPR-Cas9 directly in human blastoids without affecting their developmental potential, and that 5’Cy5-ssODNs significantly improve HDR efficiency in in situ genome editing of human blastoids.
5’Cy5 modification boosts HDR activity by stimulating the expression of HDR-associated genes
To gain a comprehensive view of the transcriptional response to 5’Cy5-ssODNs, RNA sequencing (RNA-seq) was performed on GFP-mutant iPSCs 24 hours after electroporation (Fig. 11A). Principal component analysis (PCA) demonstrated good reproducibility among replicates (data not shown). Interestingly, the 5’Cy5-modified group was positioned closer to the mock (non-editing) group in the PC1-PC2 space than the unmodified group was (data not shown). Differentially expressed genes (DEGs) of the 5’Cy5-ssODN condition were enriched in biological processes including DNA damage response, DNA repair, and cell proliferation (Fig. 11C), which are distinct from those of the unmodified condition (data not shown).
Genes related to DNA damage and repair were then analyzed (Fig. 11D). As expected, genes associated with DNA binding (e.g., RAD21, H1.0, SMC3, RECQL4, SSRP1, and APEX1), particularly single-stranded DNA binding (e.g., RAD52, RAD23A, RPA1, SMC2, and PCBP1), exhibited higher expression in the 5’Cy5-ssODN samples. Several genes involved in the HDR pathway were upregulated in the 5’Cy5-ssODN condition, including RAD51, TP53BP1, RBBP8 (CtIP), RAD50, RAD52, FEN1, and NBN (Nibrin), consistent with elevated HDR activity (Fig. 11D). In contrast, we observed relatively lower expression of genes associated with nucleotide excision repair (NER) and translesion DNA synthesis (TLS), such as XPC, POLH, DDB2, and PCNA, in the 5’Cy5-ssODN condition, suggesting that NER and TLS were not the dominant DNA repair pathways in 5’Cy5-ssODN samples. Chromatin-binding proteins not only facilitate chromosomal stability, DNA replication, and DNA repair, but are also critical for modulating DNA repair pathway choice, particularly by enabling access to DNA damage sites during HR. Several chromatin remodeling and structural maintenance genes, including H1.0, H2AX, SMC2, SMC3, and MCM7, were also upregulated in 5’Cy5-ssODN samples. The protein expression level of the linker histone H1.0, as well as of two key DNA repair pathway regulators, RAD51 and RAD52, was further evaluated via immunoblotting (Fig. 11E). These findings suggest that the enhanced HDR efficiency observed in the 5’Cy5-ssODN condition is driven by a shift in gene expression from the NER and TLS pathways toward HDR, accompanied by increased expression of chromatin remodeling proteins that promote the HDR pathway.
H1.0, a variant of the linker histone family, plays a role in higher-order chromatin compaction and is emerging as a key factor in DNA repair. It binds to the linker DNA between nucleosomes, helps to stabilize the 30-nm chromatin fiber, and has been shown to affect chromatin remodeling during DNA repair, including HDR. To further understand the role of H1.0 in 5’Cy5-ssODN-mediated HDR, we knocked down its expression, which led to decreased HDR efficiency in the 5’Cy5 condition, while the unmodified condition remained unaffected (Fig. 11F, and data not shown). Conversely, H1.0 overexpression enhanced HDR efficiency with 5’Cy5-ssODNs (Fig. 11G, and data not shown).
Together, these data suggest that 5’Cy5-ssODN could trigger elevated HDR activity following a DSB by upregulating the expression of HDR-associated genes, with H1.0 as a potential novel regulator in this process.
5’Cy5 modification enhances system stability during ssODN transport and HDR processes
The significant improvement of HDR efficiency with cyanine-modified ssODNs motivated us to further investigate the chemical underpinnings of cyanine modifications. To gain molecular-level insights into the interactions and effects of covalently attached cyanine dyes on the ssODN, we employed molecular dynamics (MD) simulations using the GROMACS package, along with density-functional theory with dispersion corrections (DFT-D) using the Gaussian package. Two types of molecular models were constructed to simulate and explore two distinct processes: ssODN90 models for the ssODN transport process and dsDNA models for the HDR process. In the transport process, a stable conformation could facilitate successful targeting and improve HDR efficiency. Studies have shown that circular single-stranded DNA (CssDNA) donors provide improved efficacy over linear ssDNA donors in HDR because their circular structure protects them from exonucleases. Thus, further studies investigated whether 5’Cy5-ssODNs adopt a CssDNA-like conformation that enhances stability during transport.
The 90-nt 5’Cy5-ssODN90 and unmodified ssODN90 were initially modeled into near-circular conformations, with their head and tail in close proximity to reduce computational time. If the construction is favorable, the structure maintains a “doughnut-like” shape; otherwise, an unwound conformation is expected. After sufficient simulation time (50 ns), 5’Cy5-ssODN90 stabilized itself into a circular architecture, with the 5’Cy5 head inserted between two adjacent bases (G and C) at the ssODN90 tail end (TGCG fragment, data not shown). In contrast, the unmodified ssODN90 expanded into a loose, arch-like structure (data not shown). The introduced cyanine group significantly affects the conformation of the ssODN, which is further supported by the distinct radius of gyration (Rg) of 5’Cy5-ssODN90 and unmodified ssODN90 (Fig. 12A). The near-constant Rg of 5’Cy5-ssODN90 indicates that the cyanine group maintains the ssODN90 in a dynamic yet compact circular structure, whereas the increasing Rg of the unmodified ssODN90 suggests a progressively diverging and loosening conformation. An independent gradient model based on the Hirshfeld partition of molecular density (IGMH) analysis further depicts the multiple supramolecular interactions between the cyanine dye and the ssODN90 skeleton (data not shown). In particular, there are strong π-π stacking interactions between the cyanine aromatic rings and the guanine base below. Quantitative analysis of the hydrogen bond count further revealed that the circular 5’Cy5-ssODN forms a greater number of hydrogen bonds compared to the loosely structured, unmodified ssODN (Fig. 12B). These results collectively suggest that cyanine modification significantly stabilizes ssODNs during the transport process, promoting the formation of a circular-like structure that may contribute to improved HDR efficiency.
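The radius of gyration used above to compare compact and loosened conformations is the mass-weighted root-mean-square distance of the atoms from their centre of mass. The following is a minimal Python sketch of that definition applied to a single frame. The bead coordinates are toy values chosen for illustration, not simulation output, and the function name is hypothetical; in practice the value would be taken from the trajectory analysis tools of the MD package.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points."""
    if not coords:
        raise ValueError("need at least one atom")
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses
    total_mass = sum(masses)
    # Centre of mass, one component at a time.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total_mass
    return math.sqrt(msd)


# Toy comparison: 12 beads arranged as a closed ring versus a straight chain,
# with the same spacing between neighbouring beads (illustrative value, in nm).
n = 12
spacing = 0.6
ring_radius = spacing / (2 * math.sin(math.pi / n))
ring = [
    (ring_radius * math.cos(2 * math.pi * i / n),
     ring_radius * math.sin(2 * math.pi * i / n),
     0.0)
    for i in range(n)
]
chain = [(i * spacing, 0.0, 0.0) for i in range(n)]

rg_ring = radius_of_gyration(ring)
rg_chain = radius_of_gyration(chain)
print(f"ring Rg = {rg_ring:.2f} nm, chain Rg = {rg_chain:.2f} nm")
```

For the same contour length the ring gives the smaller Rg, which is why a flat, low Rg trace is read as a compact circular state and a rising trace as unwinding.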
The second process examined was HDR itself, in which homologous DNA sequences form a duplex through Watson-Crick base pairing. Considering that base pairing between the ssODN and its genomic target forms at both the 5’ and 3’ ends, a 10-nt double-stranded DNA segment (D10) containing the 5’GATGCTCCTG3’ sequence of GFP was extracted and analyzed as a simplified molecular model of the DNA structures formed by ssODN90 and its complementary target DNA (data not shown).
Free energy surface (FES) analysis was first used to scan dye-DNA interactions and conformations, such as stacking motifs and unstacked structures. 5’Cy5-GFPD10 exhibited the lowest free energy (-8.8 kcal/mol) compared to 5’Cy5.5-GFPD10 and 5’Cy3-GFPD10, reflecting its superior stability (data not shown). The 5’-modified ssODNs also showed more focused conformations with lower free energy (-8.8 to -8.6 kcal/mol) than 3’ modifications (-8.2 to -7.9 kcal/mol), indicating the 5’ end as the more favorable site. Similar results were demonstrated for the HBB sequence (data not shown). Hydrogen bonding analysis suggested that 5’Cy5-D10 exhibited a significant increase in hydrogen bond numbers compared to unmodified D10. The number of hydrogen bonds decreases in the following order: 5’Cy5 > 5’Cy5.5 > 5’Cy3 > 3’Cy5 > 3’Cy5.5 > 3’Cy3 > unmodified (data not shown). Similarly, hydrogen bond distribution analysis corroborates this trend (data not shown).
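The text does not state how the free energy surface was constructed. One common way to obtain a free energy profile from unbiased sampling is Boltzmann inversion of the sampled populations, F = -kT ln P. The following Python sketch illustrates that relation only, under the assumption of a single collective variable and unbiased sampling; the function name, the bin settings, and the sample values are hypothetical and do not reproduce the reported energies.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(samples, lo, hi, n_bins, temperature=300.0):
    """Return bin centres and F = -kT ln P, shifted so the minimum is 0."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            counts[int((s - lo) / width)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    kt = K_B_KCAL * temperature
    # Unvisited bins get infinite free energy.
    energies = [
        -kt * math.log(c / total) if c > 0 else math.inf for c in counts
    ]
    f_min = min(energies)
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centres, [e - f_min for e in energies]


# Toy collective variable: a "stacked" state sampled ten times more often
# than an "unstacked" state (hypothetical values).
samples = [0.25] * 100 + [0.75] * 10
centres, profile = free_energy_profile(samples, lo=0.0, hi=1.0, n_bins=2)
for c, f in zip(centres, profile):
    print(f"CV = {c:.2f}: F = {f:.2f} kcal/mol")
```

A tenfold population difference at 300 K corresponds to about 1.4 kcal/mol, which gives a sense of scale for the sub-kcal/mol differences between the 5’ and 3’ modified systems.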
Radial distribution function (RDF) analysis can capture cyanine movement relative to the DNA strand. 5’Cy5-GFPD10 exhibited a sharp RDF peak at 0.5 nm, which decreased progressively from 5’Cy5 to 5’Cy5.5 and 5’Cy3 (Fig. 12C). In contrast, 3’-modified systems exhibit broader, lower RDF peaks (0.4-0.9 nm), indicating greater cyanine deviation and mismatch with the cytosine (C) segment at the end of the complementary strand. Spatial distribution functions (SDF) provide a 3D visualization of cyanine distribution around the terminal C base of the complementary strand. 5’ cyanine was more focused on the C base, facilitating strong supramolecular interactions, especially π-π stacking (data not shown). The SDF of 5’Cy5 spanned the entire upper region of the C base, while 3’ cyanines drifted away, resulting in more flexible structures. Similar trends were observed for the HBB sequence (data not shown). Collectively, these results highlight the critical roles of both the modification site and cyanine length. 5’ modifications, especially 5’Cy5 with its optimal length and attachment site, show the strongest stabilizing effect via strengthened supramolecular interactions (data not shown). This enhanced targeting and matching of modified ssODNs could ultimately facilitate HDR.
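A radial distribution function of the kind described above is a histogram of dye-to-base distances normalized by the volume of each spherical shell, so that a sharp, tall peak indicates a dye held at a well-defined distance. The following Python sketch shows that normalization under simplifying assumptions: one dye-base distance per frame, a user-supplied reference number density, and toy distance lists that are hypothetical rather than taken from the simulations.

```python
import math


def radial_distribution(distances, r_max, n_bins, n_frames, number_density):
    """Histogram distances and normalize each bin by its shell volume."""
    dr = r_max / n_bins
    counts = [0] * n_bins
    for d in distances:
        idx = int(d / dr)
        if d >= 0 and idx < n_bins:
            counts[idx] += 1
    centres, g = [], []
    for i, c in enumerate(counts):
        r_lo, r_hi = i * dr, (i + 1) * dr
        shell_volume = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        # Count expected in this shell for an ideal, uncorrelated system.
        expected = number_density * shell_volume * n_frames
        centres.append(0.5 * (r_lo + r_hi))
        g.append(c / expected)
    return centres, g


# Hypothetical per-frame distances (nm): a tightly held dye versus a drifting one.
tight = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50]
broad = [0.40, 0.55, 0.70, 0.85, 0.62, 0.88]
density = 1.0  # assumed reference density (one partner per nm^3)

centres, g_tight = radial_distribution(tight, 1.2, 8, len(tight), density)
_, g_broad = radial_distribution(broad, 1.2, 8, len(broad), density)
peak = g_tight.index(max(g_tight))
print(f"tight peak near {centres[peak]:.3f} nm; "
      f"peak height tight/broad = {max(g_tight) / max(g_broad):.1f}")
```

The tightly held toy series concentrates all counts in one bin around 0.5 nm, while the drifting series spreads them across several bins and gives a lower maximum, mirroring the qualitative contrast drawn between 5’ and 3’ modifications.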
The supramolecular interactions between cyanine and DNA strands were analyzed using IGMH, which highlights inter- and intra-fragment interactions. The 5’ or 3’ cyanine and their adjacent G:C base pair were modeled as a Cy-G:C system based on MD-optimized structures. IGMH maps revealed clear interaction regions, with notable π-π stacking interactions between cyanine and the C base, indicated by broad green isosurfaces between the 5’ cyanine and the adjacent G:C base pair segment (data not shown). Among the systems, 5’Cy5-GFPD10 exhibited the largest interaction area, suggesting the highest stability, followed by 5’Cy5.5 and 5’Cy3. In contrast, 3’-modified systems showed reduced π-π stacking and limited stabilization. Density functional theory (DFT) calculations further quantified these interactions. Binding energy analysis (Fig. 12D) confirmed that 5’ modifications, particularly 5’Cy5, significantly enhanced binding affinity and increased system stability compared to unmodified G:C, agreeing with the MD and HDR results. These findings underscore that the 5’ end is a more effective attachment site than the 3’ end for achieving stable conformations, a conclusion consistently observed across different sequences, including HBB (data not shown).
Based on molecular simulations and supramolecular interaction analyses, we conclude that the cyanine modifications of ssODNs not only help the ssODN adopt a supramolecular interaction-driven circular conformation during transport, but also strengthen interactions between the ssODN donor and its homologous target DNA, thereby enhancing homology search and strand annealing and stimulating the HDR process. The modification site and the length of the cyanine dye are critical factors influencing the dynamics of these processes, with 5’Cy5 delivering the optimal performance.
Discussion
CRISPR-Cas9, one of the most widely studied genome editing tools, has been applied across numerous fields of research and therapy. However, the low efficiency of HDR remains a significant bottleneck for its broader success. This study offers a new perspective on how a class of chemical modifications to ssODNs can enhance HDR following Cas9-induced DSBs (Fig. 9A-9G). Fluorophore modifications, particularly the 5'Cy5 modification, consistently enhanced HDR efficiency across different targets (GFP, HBB, and GLP1R) and multiple experimental models, including pluripotent stem cells, human heart organoids (hHOs) (Fig. 10A and 10B), and human blastoids (Fig. 10C-10H). Additionally, the 5'Cy5-ssODNs improved HDR efficiency without affecting CRISPR-Cas9 cleavage efficiency or DNA repair by the microhomology-mediated end joining (MMEJ) pathway. They also improved cell survival (Fig. 9G) and preserved normal cell cycle progression.
Notably, 5'Cy5-ssODNs enhanced HDR efficiency in complex 3D stem cell models like hHOs and blastoids. In hHOs, the 5'Cy5 modification increased the proportion of GFP-positive cardiomyocytes without affecting differentiation and key functional properties, such as rhythmic beating (Fig. 10B). Similarly, in the blastoid model, 5'Cy5-ssODNs improved not only HDR efficiency but also cavitation, a critical step in blastocyst development (Fig. 10C-10F). The size of a human blastocyst is primarily determined by the dynamics of blastocoel formation rather than by cellular growth alone. The blastocoel is the fluid-filled cavity that forms within the blastocyst, and its formation depends mainly on ion transport and aquaporin channels. The data demonstrated that 5'Cy5-ssODN can mediate efficient correction of the SCD mutation in blastoids derived from patient iPSCs (Fig. 10F). Importantly, our new in situ editing method using 5'Cy5-ssODNs did not affect the morphological or functional integrity of the delicate blastoids, demonstrating that 5'Cy5-ssODNs can be safely applied in early-stage human developmental models without causing toxicity or loss of pluripotency (Fig. 10E, 10F). These data underscore the potential of 5'Cy5-ssODNs to enhance precise genome editing in complex multicellular systems, suggesting broader in vivo therapeutic potential. The findings presented here have significant implications for genome editing therapy, particularly in enhancing CRISPR-Cas9-mediated precision genome editing. The improved HDR efficiency, combined with reduced cytotoxicity, positions 5'Cy5-ssODNs as a highly promising, simple, and cost-effective tool for therapeutic applications. This is especially relevant for correcting disease-causing mutations, such as HBB mutations associated with sickle cell disease and β-thalassemia. Integrating 5'Cy5-ssODNs with advanced CRISPR-Cas9 delivery technologies, such as lipid nanoparticles, nanoscale zeolitic imidazolate frameworks, or enveloped delivery vehicles (EDVs), could further enhance their clinical translatability.
Cy5 is a simple chemical modification and also allows visual tracking of donor DNA delivery (Fig. 10E), without causing any significant cytotoxicity to the host (Fig. 9G). Moreover, Cy5-labeled oligonucleotides can be freely taken up by cells via endocytosis, significantly reducing the challenges of cellular delivery. 5'Cy5-ssODNs could be taken up by cells when simply added to the culture medium, and interestingly, the Cy5 signal remained stable for several days. This finding suggests that 5'Cy5 modification may facilitate the delivery of ssODNs to host cells, potentially enhancing HDR to some degree.
With these advantages, Cy5-antisense oligonucleotides (which specifically target RNA to knock down gene expression) are emerging in drug development, encouraging the application of 5'Cy5-ssODNs for CRISPR-based genome editing therapy in clinical settings. Cy5 modification can be extended to the sgRNA itself. Modifications to the sgRNA could enhance the specificity and stability of the CRISPR-Cas9 complex, reducing off-target effects while improving on-target editing. The synergy between modified ssODNs and sgRNAs could lead to even greater improvements in HDR efficiency, minimizing off-target effects and maximizing therapeutic outcomes.
The underlying mechanism of the enhanced HDR efficiency observed with 5'Cy5-ssODNs was investigated. RNA-seq analysis revealed that 5'Cy5-ssODNs upregulated HDR-related genes, such as RAD51, RAD52, and RAD50 (Fig. 11B-11E), which may directly contribute to improved HDR efficiency. Chromatin-binding proteins, including H1.0, which are known to influence chromatin remodeling and DNA accessibility at repair sites, were also upregulated (Fig. 11B, 11D and 11E). Knocking down H1.0 specifically reduced HDR efficiency in the 5'Cy5-ssODN condition (Fig. 11F), while its overexpression enhanced HDR (Fig. 11G). Although the exact role of H1.0 in boosting HDR under 5'Cy5-ssODN conditions remains unclear, our data indicate that H1.0 is a key mediator of the 5'Cy5-ssODN-induced repair response, potentially facilitating greater accessibility of the 5'Cy5-ssODNs to DSBs and stabilizing the repair machinery. Although further studies are needed to identify the cellular sensors of 5'Cy5-ssODNs and understand how they trigger HDR responses, these findings reveal a genetic interaction between the chromatin factor H1.0 and modified ssODNs, offering deeper insights into the mechanisms driving enhanced HDR.
To further explore how cyanine modifications affect ssODNs, DFT calculations and MD simulations were performed. The data showed that 5'Cy5-ssODNs, unlike covalently closed CssDNA, stabilize into a circular-like conformation through multiple supramolecular interactions during transport. During the HDR process, the attached cyanine strengthens intermolecular interactions between the matched DNA strands, thereby improving homology search, strand annealing, and HDR. Notably, 5'Cy5 modifications of the optimal length and placement demonstrated the highest binding affinity to the complementary sequences. Interestingly, higher system stability conferred by 5'Cy5 modification correlated with increased HDR efficiency. These findings highlight the critical role of cyanine modifications in both transport and HDR processes at the molecular level, offering valuable insights and design strategies for advancing HDR efficiency in precision genome editing applications.
In summary, this application provides a strategy, distinct from previous approaches, that achieves high HDR efficiency by simply introducing a Cy5 moiety at the 5' end of ssODNs. Significant HDR improvements were demonstrated at multiple loci using human cell lines, heart organoids, and blastoids, and mechanistic insights into these enhancements were provided. Overall, the results underscore the utility of 5'Cy5 modifications for precise genome editing, offering important implications for therapeutic genome correction.
References
1. Chakrabarti et al. Mol Cell 2019, 73(4):699-713 e696.
2. Guo et al. Genome Biol 2018, 19(1):170.
3. Koike-Yusa et al. Nat Biotechnol 2014, 32(3):267-273.
4. van Overbeek et al. Mol Cell 2016, 63(4):633-646.
5. Tan et al. Genesis 2015, 53(2):225-236.
6. Kosicki et al. Nat Biotechnol 2018, 36(8):765-771.
7. Bi et al. Genome Biol 2020, 21(1):213.
8. Adikusuma et al. Nature 2018, 560(7717):E8-E9.
9. Zuccaro et al. Cell 2020, 183(6):1650-1664 e1615.
10. Alanis-Lobato et al. Proc Natl Acad Sci U S A 2021, 118(22).
11. Wen et al. Genome Biol 2021, 22(1):236.
12. Hoijer et al. Nat Commun 2022, 13(1):627.
13. Song et al. Mol Ther Nucleic Acids 2020, 21:523-526.
14. Kosicki et al. Nat Commun 2022, 13(1):3422.
15. Owens et al. Nucleic Acids Res 2019, 47(14):7402-7417.
16. Wang et al. Nucleic Acids Res 2006, 34(21):6170-6182.
17. Luedeman et al. Nat Commun 2022, 13(1):4547.
18. Mimitou et al. Trends Biochem Sci 2009, 34(5):264-272.
19. San Filippo et al. Annu Rev Biochem 2008, 77:229-257.
20. McVey et al. Nat Struct Mol Biol 2014, 21(4):348-349.
21. Kent et al. Nat Struct Mol Biol 2015, 22(3):230-237.
22. Wang et al. Cell Biosci 2017, 7:6.
23. Sfeir et al. Trends Biochem Sci 2015, 40(11):701-714.
24. Audebert et al. J Biol Chem 2004, 279(53):55117-55126.
25. Boutin et al. Nat Commun 2021, 12(1):4922.
26. Papathanasiou et al. Nat Commun 2021, 12(1):5855.
27. Shi et al. Cell Stem Cell 2017, 20(5):675-688 e676.
28. Cullot et al. Nat Commun 2019, 10(1):1136.
29. Rayner et al. CRISPR J 2019, 2(6):406-416.
31. Przewrocka et al. Ann Oncol 2020, 31(9):1270-1273.
32. Brunet et al. Adv Exp Med Biol 2018, 1044:15-25.
33. Hunt et al. Hum Genet 2023, 142(6):705-720.
34. Beying et al. Nat Plants 2020, 6(6):638-645.
35. Quilter et al. Hum Reprod 2010, 25(8):2139-2150.
36. Yiangou et al. Stem Cell Reports 2019, 12(1):165-179.
37. Kawamura et al. Int J Cancer 2004, 109(1):9-16.
38. Schrempf et al. Trends Cancer 2021, 7(2):98-111.
39. Zhou J, Gelot C, Pantelidou C, Li A, Yucel H, Davis RE, Farkkila A, Kochupurakkal B, Syed A, Shapiro GI et al: A first-in-class Polymerase Theta Inhibitor selectively targets Homologous-Recombination-Deficient Tumors. Nat Cancer 2021, 2(6):598-610.
40. Lemee et al. Proc Natl Acad Sci U S A 2010, 107(30):13390-13395.
41. Allera-Moreau et al. Oncogenesis 2012, 1:e30.
42. Zatreanu et al. Nat Commun 2021, 12(1):3636.
43. Deng et al. Nat Struct Mol Biol 2014, 21(4):405-412.
44. Kreitzer et al. Am J Stem Cells 2013, 2(2):119-131.
45. de la Chapelle et al. Proc Natl Acad Sci U S A 1993, 90(10):4495-4499.
46. Beel et al. Brit J Haematol 2009, 144(1):120-126.
47. et al. Proc Natl Acad Sci U S A 2013, 110(19):7720-7725.
48. Richardson et al. Nat Biotechnol 2016, 34(3):339-344.
49. Yin et al. Nat Commun 2022, 13(1):1204.
50. Schimmel et al. Cell Rep 2023, 42(2):112019.
51. Wimberger et al. Nat Commun 2023, 14(1):4761.
52. Shy et al. Nat Biotechnol 2022.
53. Mateos-Gomez et al. Nat Struct Mol Biol 2017, 24(12):1116-1123.
54. Mara et al. New Phytol 2019, 222(3):1380-1391.
55. Sheridan et al. Nat Biotechnol 2024, 42(1):3-4.
56. Li et al. Cell Res 2011, 21(12):1740-1744.
57. Suzuki et al. Cell Stem Cell 2014, 15(1):31-36.
58. Sanjana et al. Nat Methods 2014, 11(8):783-784.
59. Lancey et al. Nat Commun 2020, 11(1):1109.
60. Henricksen et al. J Biol Chem 1994, 269(15):11121-11132.
It is understood that the disclosed methods and compositions are not limited to the particular methodology, protocols, and reagents described, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which will be limited only by the appended claims.
Claims
1. A method for increasing the homology directed repair (HDR) efficiency following CRISPR/Cas-mediated gene editing in a cell, comprising contacting the cell with: (A) one or more POLQ inhibitory agents in an amount effective to reduce POLQ level/activity in the cell; (B) one or more RPA activators in an effective amount to increase RPA levels/activity in the cell; and/or (C) one or more fluorophore-modified nucleic acids used in gene editing.
2. The method of claim 1, comprising contacting the cell with one or more POLQ inhibitory agents in an amount effective to reduce POLQ level/activity in the cell.
3. The method of claim 1, comprising contacting the cell with one or more RPA activators in an effective amount to increase RPA levels/activity in the cell.
4. The method of claim 1, comprising contacting the cell with one or more fluorophore-modified nucleic acids used in gene editing.
5. The method of claim 1 or 2, wherein the PolQ inhibitory agent is Novobiocin (NVB) or ART558.
6. The method of any one of claims 1, 2 or 4, wherein the PolQ inhibitory agent is Novobiocin.
7. The method of claim 1 or 2, wherein the PolQ inhibitory agent is a functional nucleic acid inhibitor specific for PolQ.
8. The method of claim 7, wherein the functional nucleic acid inhibitor specific for PolQ is selected from the group consisting of siRNA, shRNA, and an antisense oligonucleotide.
9. The method of any one of claims 1, 2, or 5-8, comprising contacting the cell with the PolQ inhibitor about 24 hours before and after electroporation.
10. The method of claim 1 or 3, wherein the one or more RPA activators is a nucleic acid encoding RPA1, RPA2, and/or RPA3.
11. The method of claim 1 or 4, wherein one or more fluorophore-modified nucleic acids used in gene editing is a donor oligonucleotide, or a gRNA.
12. The method of any one of claims 1, 4 or 11, wherein the one or more fluorophore-modified nucleic acids used in gene editing is a single-stranded oligodeoxynucleotide (ssODN) or sgRNA.
13. The method of any one of claims 1, 4, or 11-12, wherein the one or more fluorophore-modified nucleic acids used in gene editing is ssODN.
14. The method of any one of claims 1, 4, or 11-12, wherein the one or more fluorophore-modified nucleic acids used in gene editing is sgRNA.
15. The method of any one of claims , 4, or 11-13, wherein the ssODN is modified at its 5’ end via a covalent bond with a fluorophore.
16. The method of any one of claims 1, 4, or 11-12 or 14, wherein the sgRNA is modified at its 5’ end by covalent bonding with a fluorophore.
17. The method of any one of claims 1, 4 and 12-16, wherein the fluorophore is a cyanine dye.
18. The method of claim 17, wherein the cyanine dye is selected from the group consisting of cyanine (Cy) 2 (Cy2), Cy5, Cy5.5, Cy3, Cy3.5, and Cy7.
19. The method of claim 18, wherein the fluorophore is Cy5.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463639573P | 2024-04-26 | 2024-04-26 | |
| US63/639,573 | 2024-04-26 | ||
| US202463723422P | 2024-11-21 | 2024-11-21 | |
| US63/723,422 | 2024-11-21 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025224715A1 (en) | 2025-10-30 |
Family
ID=95745214
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2025/054391 Pending WO2025224715A1 (en) | 2024-04-26 | 2025-04-28 | Methods for improving precise genome modification and reducing unwanted mutations by crispr-cas editing |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025224715A1 (en) |
Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US150A (en) | 1837-03-25 | Island | ||
| US5436A (en) | 1848-02-08 | Air-heating furnace | ||
| US5356802A (en) | 1992-04-03 | 1994-10-18 | The Johns Hopkins University | Functional domains in flavobacterium okeanokoites (FokI) restriction endonuclease |
| US5487994A (en) | 1992-04-03 | 1996-01-30 | The Johns Hopkins University | Insertion and deletion mutants of FokI restriction endonuclease |
| WO1998053059A1 (en) | 1997-05-23 | 1998-11-26 | Medical Research Council | Nucleic acid binding proteins |
| US6140081A (en) | 1998-10-16 | 2000-10-31 | The Scripps Research Institute | Zinc finger binding domains for GNN |
| WO2002044321A2 (en) | 2000-12-01 | 2002-06-06 | MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. | Rna interference mediating small rna molecules |
| US6453242B1 (en) | 1999-01-12 | 2002-09-17 | Sangamo Biosciences, Inc. | Selection of sites for targeting by zinc finger proteins and methods of designing zinc finger proteins to bind to preselected sites |
| US20020165356A1 (en) | 2001-02-21 | 2002-11-07 | The Scripps Research Institute | Zinc finger binding domains for nucleotide sequence ANN |
| WO2003016496A2 (en) | 2001-08-20 | 2003-02-27 | The Scripps Research Institute | Zinc finger binding domains for cnn |
| US6534261B1 (en) | 1999-01-12 | 2003-03-18 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
| US6746838B1 (en) | 1997-05-23 | 2004-06-08 | Gendaq Limited | Nucleic acid binding proteins |
| US20040197892A1 (en) | 2001-04-04 | 2004-10-07 | Michael Moore | Composition binding polypeptides |
| US20070154989A1 (en) | 2006-01-03 | 2007-07-05 | The Scripps Research Institute | Zinc finger domains specifically binding agc |
| US20070213269A1 (en) | 2005-11-28 | 2007-09-13 | The Scripps Research Institute | Zinc finger binding domains for tnn |
| WO2011072246A2 (en) | 2009-12-10 | 2011-06-16 | Regents Of The University Of Minnesota | Tal effector-mediated dna modification |
| WO2013176772A1 (en) | 2012-05-25 | 2013-11-28 | The Regents Of The University Of California | Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription |
| WO2014018423A2 (en) | 2012-07-25 | 2014-01-30 | The Broad Institute, Inc. | Inducible dna binding proteins and genome perturbation tools and applications thereof |
2025
- 2025-04-28 WO PCT/IB2025/054391 patent/WO2025224715A1/en active Pending
Patent Citations (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US150A (en) | | 1837-03-25 | | Island |
| US5436A (en) | | 1848-02-08 | | Air-heating furnace |
| US5356802A (en) | 1992-04-03 | 1994-10-18 | The Johns Hopkins University | Functional domains in flavobacterium okeanokoites (FokI) restriction endonuclease |
| US5487994A (en) | 1992-04-03 | 1996-01-30 | The Johns Hopkins University | Insertion and deletion mutants of FokI restriction endonuclease |
| WO1998053059A1 (en) | 1997-05-23 | 1998-11-26 | Medical Research Council | Nucleic acid binding proteins |
| US6866997B1 (en) | 1997-05-23 | 2005-03-15 | Gendaq Limited | Nucleic acid binding proteins |
| US6746838B1 (en) | 1997-05-23 | 2004-06-08 | Gendaq Limited | Nucleic acid binding proteins |
| US6140081A (en) | 1998-10-16 | 2000-10-31 | The Scripps Research Institute | Zinc finger binding domains for GNN |
| US6610512B1 (en) | 1998-10-16 | 2003-08-26 | The Scripps Research Institute | Zinc finger binding domains for GNN |
| US6534261B1 (en) | 1999-01-12 | 2003-03-18 | Sangamo Biosciences, Inc. | Regulation of endogenous gene expression in cells using zinc finger proteins |
| US6453242B1 (en) | 1999-01-12 | 2002-09-17 | Sangamo Biosciences, Inc. | Selection of sites for targeting by zinc finger proteins and methods of designing zinc finger proteins to bind to preselected sites |
| WO2002044321A2 (en) | 2000-12-01 | 2002-06-06 | MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. | Rna interference mediating small rna molecules |
| US20020165356A1 (en) | 2001-02-21 | 2002-11-07 | The Scripps Research Institute | Zinc finger binding domains for nucleotide sequence ANN |
| US7067617B2 (en) | 2001-02-21 | 2006-06-27 | The Scripps Research Institute | Zinc finger binding domains for nucleotide sequence ANN |
| US20040197892A1 (en) | 2001-04-04 | 2004-10-07 | Michael Moore | Composition binding polypeptides |
| WO2003016496A2 (en) | 2001-08-20 | 2003-02-27 | The Scripps Research Institute | Zinc finger binding domains for cnn |
| US20070213269A1 (en) | 2005-11-28 | 2007-09-13 | The Scripps Research Institute | Zinc finger binding domains for tnn |
| US20070154989A1 (en) | 2006-01-03 | 2007-07-05 | The Scripps Research Institute | Zinc finger domains specifically binding agc |
| WO2011072246A2 (en) | 2009-12-10 | 2011-06-16 | Regents Of The University Of Minnesota | Tal effector-mediated dna modification |
| US20110145940A1 (en) | 2009-12-10 | 2011-06-16 | Voytas Daniel F | Tal effector-mediated dna modification |
| WO2013176772A1 (en) | 2012-05-25 | 2013-11-28 | The Regents Of The University Of California | Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription |
| WO2014018423A2 (en) | 2012-07-25 | 2014-01-30 | The Broad Institute, Inc. | Inducible dna binding proteins and genome perturbation tools and applications thereof |
Non-Patent Citations (87)
| Title |
|---|
| ADIKUSUMA ET AL., NATURE, vol. 560, no. 7717, 2018, pages E8 - E9 |
| ALANIS-LOBATO ET AL., PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 118, no. 22, 2021 |
| ALLERA-MOREAU ET AL., ONCOGENESIS, vol. 1, 2012, pages e30 |
| AUDEBERT ET AL., J BIOL CHEM, vol. 279, no. 53, 2004, pages 55117 - 55126 |
| BEEL ET AL., BRIT J HAEMATOL, vol. 144, no. 1, 2009, pages 120 - 126 |
| BERNSTEIN ET AL., NATURE, vol. 411, 2001, pages 494 - 498 |
| BEYING ET AL., NAT PLANTS, vol. 6, no. 6, 2020, pages 638 - 645 |
| BI ET AL., GENOME BIOLOGY, vol. 21, no. 1, 2020, pages 213 |
| BRUNET ET AL., ADV EXP MED BIOL, vol. 1044, 2018, pages 15 - 25 |
| CERMAK ET AL., NUCL. ACIDS RES., 2011, pages 1 - 11 |
| CHAKRABARTI ET AL., MOL CELL, vol. 73, no. 4, 2019, pages 699 - 713 |
| CHANG ET AL., PROC. NATL. ACAD. SCI. USA, vol. 84, 1987, pages 4959 - 4963 |
| CLAWSON ET AL., GENE THER., vol. 11, no. 17, 2004, pages 1331 - 1341 |
| CONG ET AL., SCIENCE, vol. 339, no. 6121, 2013, pages 819 - 823 |
| CULLOT ET AL., NATURE COMMUNICATIONS, vol. 10, no. 1, 2019, pages 1136 |
| DE LA CHAPELLE A ET AL., PROC NATL ACAD SCI U S A, vol. 90, no. 10, 1993, pages 4495 - 4499 |
| ELBASHIR ET AL., GENES DEV., vol. 15, 2001, pages 188 - 200 |
| FIRE ET AL., NATURE, vol. 391, 1998, pages 806 - 11 |
| GAULTIER ET AL., NUCLEIC ACIDS. RES., vol. 15, 1987, pages 6625 - 6641 |
| GOODCHILD, CURR. OPIN. MOL. THER., vol. 6, no. 2, 2004, pages 120 - 128 |
| GUO ET AL., GENOME BIOL, vol. 19, no. 1, 2018, pages 170 |
| HAMANN ET AL., J. BIOL. ENG., vol. 13, 2019, pages 7 |
| HAMMOND ET AL., NATURE, vol. 404, 2000, pages 293 - 6 |
| HANNON, NATURE, vol. 418, 2002, pages 244 - 51 |
| HENRICKSEN ET AL., J BIOL CHEM, vol. 269, no. 15, 1994, pages 11121 - 11132 |
| HUNT JMT ET AL., HUM GENET, vol. 142, no. 6, 2023, pages 705 - 720 |
| INOUE ET AL., FEBS LETT., vol. 215, 1987, pages 327 - 330 |
| INOUE ET AL., NUCLEIC ACIDS RES., vol. 15, 1987, pages 6131 - 6148 |
| JINEK ET AL., SCIENCE, vol. 337, no. 6096, 2012, pages 816 - 21 |
| KAWAMURA ET AL., INT J CANCER, vol. 109, no. 1, 2004, pages 9 - 16 |
| KENT ET AL., NAT STRUCT MOL BIOL, vol. 22, no. 3, 2015, pages 230 - 237 |
| KIM ET AL., J. BIOL. CHEM., vol. 269, 1994, pages 31978 - 31982 |
| KIM ET AL., PROC. NATL. ACAD. SCI. USA., vol. 91, 1994, pages 883 - 887 |
| KOIKE-YUSA ET AL., NAT BIOTECHNOL, vol. 32, no. 3, 2014, pages 267 - 273 |
| KOSICKI ET AL., NATURE BIOTECHNOLOGY, vol. 36, no. 8, 2018, pages 765 - 771 |
| KREITZER ET AL., AM J STEM CELLS, vol. 2, no. 2, 2013, pages 119 - 131 |
| LANCEY ET AL., NAT COMMUN, vol. 11, no. 1, 2020, pages 1109 |
| LEMEE ET AL., PROC NATL ACAD SCI U S A, vol. 107, no. 30, 2010, pages 13390 - 13395 |
| LI ET AL., CELL RES, vol. 21, no. 12, 2011, pages 1740 - 1744 |
| LI ET AL., PROC. NATL. ACAD. SCI. USA, vol. 90, 1993, pages 2764 - 2768 |
| LI ET AL., PROC. NATL. ACAD. SCI. USA, vol. 89, 1992, pages 4275 - 4279 |
| LUEDEMAN ET AL., NAT COMMUN, vol. 13, no. 1, 2022, pages 1204 |
| MARA ET AL., NEW PHYTOL, vol. 222, no. 3, 2019, pages 1380 - 1391 |
| MARA KOSTLEND ET AL: "POLQ plays a key role in the repair of CRISPR/Cas9-induced double-stranded breaks in the moss Physcomitrella patens", NEW PHYTOLOGIST, vol. 222, no. 3, 1 May 2019 (2019-05-01), GB, pages 1380 - 1391, XP093298585, ISSN: 0028-646X, DOI: 10.1111/nph.15680 * |
| MARTINEZ ET AL., CELL, vol. 110, 2002, pages 563 - 74 |
| MATEOS-GOMEZ ET AL., NAT STRUCT MOL BIOL, vol. 24, no. 12, 2017, pages 1116 - 1123 |
| MCVEY ET AL., NAT STRUCT MOL BIOL, vol. 21, no. 4, 2014, pages 405 - 412 |
| MILLER ET AL., NATURE BIOTECHNOL, vol. 29, 2011, pages 143 |
| MIMITOU ET AL., TRENDS BIOCHEM SCI, vol. 34, no. 5, 2009, pages 264 - 272 |
| NAPOLI ET AL., PLANT CELL, vol. 2, 1990, pages 279 - 89 |
| NEHLS ET AL., SCIENCE, vol. 272, 1996, pages 886 - 889 |
| NYKANEN ET AL., CELL, vol. 107, 2001, pages 309 - 21 |
| OWENS ET AL., NUCLEIC ACIDS RES, vol. 47, no. 14, 2019, pages 7402 - 7417 |
| PAPATHANASIOU ET AL., NAT COMMUN, vol. 12, no. 1, 2021, pages 5855 |
| PAUNOVSKA ET AL., NAT. REV. GEN, vol. 23, 2022, pages 265 - 280 |
| PISMATARO MARIA CHIARA ET AL: "Small Molecules Targeting DNA Polymerase Theta (POLθ) as Promising Synthetic Lethal Agents for Precision Cancer Therapy", JOURNAL OF MEDICINAL CHEMISTRY, vol. 66, no. 10, 1 May 2023 (2023-05-01), US, pages 6498 - 6522, XP093298375, ISSN: 0022-2623, DOI: 10.1021/acs.jmedchem.2c02101 * |
| PROC NATL ACAD SCI U S A, vol. 110, no. 19, 2013, pages 7720 - 7725 |
| PRZEWROCKA ET AL., ANN ONCOL, vol. 31, no. 9, 2020, pages 1270 - 1273 |
| QUILTER ET AL., HUM REPROD, vol. 25, no. 8, 2010, pages 2139 - 2150 |
| RAYNER ET AL., CRISPR J, vol. 2, no. 6, 2019, pages 406 - 416 |
| RICHARDSON ET AL., NAT BIOTECHNOL, vol. 34, no. 3, 2016, pages 339 - 344 |
| SAN FILIPPO ET AL., ANNU REV BIOCHEM, vol. 77, 2008, pages 229 - 257 |
| SANJANA ET AL., NAT METHODS, vol. 11, no. 8, 2014, pages 783 - 784 |
| SCHIMMEL ET AL., CELL REPORTS, vol. 42, no. 2, 2023, pages 112019 |
| SCHIMMEL JOOST ET AL: "Modulating mutational outcomes and improving precise gene editing at CRISPR-Cas9-induced breaks by chemical inhibition of end-joining pathways", CELL REPORTS, vol. 42, no. 2, 1 February 2023 (2023-02-01), US, pages 112019, XP093298462, ISSN: 2211-1247, DOI: 10.1016/j.celrep.2023.112019 * |
| SCHREMPF ET AL., TRENDS CANCER, vol. 7, no. 2, 2021, pages 98 - 111 |
| SFEIR ET AL., TRENDS BIOCHEM SCI, vol. 40, no. 11, 2015, pages 701 - 714 |
| SHERIDAN ET AL., NATURE BIOTECHNOLOGY, vol. 42, no. 1, 2024, pages 3 - 4 |
| SHI ET AL., CELL STEM CELL, vol. 20, no. 5, 2017, pages 675 - 688 |
| SHY ET AL., NAT BIOTECHNOL, 2022 |
| SONG ET AL., MOL THER NUCLEIC ACIDS, vol. 21, 2020, pages 523 - 526 |
| SUZUKI ET AL., CELL STEM CELL, vol. 15, no. 1, 2014, pages 31 - 36 |
| TAN ET AL., GENESIS, vol. 53, no. 2, 2015, pages 225 - 236 |
| UI-TEI ET AL., FEBS LETT, vol. 479, 2000, pages 79 - 82 |
| VAN OVERBEEK M ET AL., MOL CELL, vol. 63, no. 4, 2016, pages 633 - 646 |
| WANG ET AL., CELL BIOSCI, vol. 7, 2017, pages 6 |
| WANG ET AL., NUCLEIC ACIDS RES, vol. 34, no. 21, 2006, pages 6170 - 6182 |
| WEN ET AL., GENOME BIOL, vol. 22, no. 1, 2021, pages 236 |
| WIMBERGER ET AL., NATURE COMMUNICATIONS, vol. 14, no. 1, 2023, pages 4761 |
| WIMBERGER SANDRA ET AL: "Simultaneous inhibition of DNA-PK and Polθ improves integration efficiency and precision of genome editing", NATURE COMMUNICATIONS, vol. 14, no. 1, 1 August 2023 (2023-08-01), UK, pages 4761 - 18, XP093298384, ISSN: 2041-1723, DOI: 10.1038/s41467-023-40344-4 * |
| YIANGOU ET AL., STEM CELL REPORTS, vol. 12, no. 1, 2019, pages 165 - 179 |
| YUAN BAOLEI ET AL: "Modulation of the microhomology-mediated end joining pathway suppresses large deletions and enhances homology-directed repair following CRISPR-Cas9-induced DNA breaks", BMC BIOLOGY, vol. 22, no. 1, 29 April 2024 (2024-04-29), GB, pages 101 - 15, XP093298347, ISSN: 1741-7007, Retrieved from the Internet <URL:https://pmc.ncbi.nlm.nih.gov/articles/PMC11059712/pdf/12915_2024_Article_1896.pdf> DOI: 10.1186/s12915-024-01896-z * |
| YUAN ET AL., CHEM. SOC. REV., vol. 54, 2025, pages 341 - 366 |
| ZATREANU ET AL., NATURE COMMUNICATIONS, vol. 12, no. 1, 2021, pages 3636 |
| ZATREANU, D., ROBINSON, H.M.R., ALKHATIB, O. ET AL.: "Polθ inhibitors elicit BRCA-gene synthetic lethality and target PARP inhibitor resistance", NAT COMMUN, vol. 12, 2021, pages 3636, Retrieved from the Internet <URL:https://doi.org/10.1038/s41467-021-23463-8> |
| ZHOU J, GELOT C, PANTELIDOU C, LI A, YUCEL H, DAVIS RE, FARKKILA A, KOCHUPURAKKAL B, SYED A, SHAPIRO GI ET AL.: "A first-in-class Polymerase Theta Inhibitor selectively targets Homologous-Recombination-Deficient Tumors", NAT CANCER, vol. 2, no. 6, 2021, pages 598 - 610, XP093087790, DOI: 10.1038/s43018-021-00203-x |
| ZUCCARO ET AL., CELL, vol. 183, no. 6, 2020, pages 1650 - 1664 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Giuliano et al. | | Generating single cell–derived knockout clones in mammalian cells with CRISPR/Cas9 |
| Gainetdinov et al. | | Relaxed targeting rules help PIWI proteins silence transposons |
| EP3617311B1 (en) | | Method for inducing exon skipping by genome editing |
| US10760081B2 (en) | | Compositions and methods for enhancing CRISPR activity by POLQ inhibition |
| Ranganathan et al. | | Expansion of the CRISPR–Cas9 genome targeting space through the use of H1 promoter-expressed guide RNAs |
| Meister | | Argonaute proteins: functional insights and emerging roles |
| Atashpaz et al. | | ATR expands embryonic stem cell fate potential in response to replication stress |
| CA3064601A1 (en) | | Crispr/cas-adenine deaminase based compositions, systems, and methods for targeted nucleic acid editing |
| CN109415729B (en) | | Gene editing reagents with reduced toxicity |
| US20220229044A1 (en) | | In situ cell screening methods and systems |
| CN110300803B (en) | | Methods for improving efficiency of Homology Directed Repair (HDR) in cellular genomes |
| CN113348245A (en) | | Novel CRISPR enzymes and systems |
| JP2020534826A (en) | | Modification of specificity of non-coding RNA molecules for silencing gene expression in eukaryotic cells |
| US20220348910A1 (en) | | Methods and compositions for multiplex gene editing |
| WO2019010384A1 (en) | | Methods for designing guide sequences for guided nucleases |
| WO2019094984A1 (en) | | Methods for determining spatial and temporal gene expression dynamics during adult neurogenesis in single cells |
| WO2016182893A1 (en) | | Functional genomics using crispr-cas systems for saturating mutagenesis of non-coding elements, compositions, methods, libraries and applications thereof |
| CA3190991A1 (en) | | Systems, methods, and compositions for rna-guided rna-targeting crispr effectors |
| WO2019089803A1 (en) | | Methods and compositions for studying cell evolution |
| JP7210028B2 (en) | | Gene mutation introduction method |
| IL286357B1 (en) | | A CRISPR/CAS screening platform to identify genetic modifiers of tau seeding or aggregation |
| Montavon et al. | | Characterization of DCL4 missense alleles provides insights into its ability to process distinct classes of dsRNA substrates |
| Muller et al. | | An Efficient method for electroporation of small interfering RNAs into ENCODE project tier 1 GM12878 and K562 cell lines |
| WO2025224715A1 (en) | | Methods for improving precise genome modification and reducing unwanted mutations by crispr-cas editing |
| Yiu | | Investigating the role of non-coding RNAs in doxorubicin-induced cardiotoxicity |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25725903; Country of ref document: EP; Kind code of ref document: A1 |