WO2025064511A1 - Bat-associated reverse transcriptase and methods of use thereof - Google Patents

Bat-associated reverse transcriptase and methods of use thereof

Info

Publication number
WO2025064511A1
Authority
WO
WIPO (PCT)
Prior art keywords
enzyme
rna
bart
cell
virus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/047222
Other languages
French (fr)
Inventor
Thomas ZWAKA
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Icahn School of Medicine at Mount Sinai
Original Assignee
Icahn School of Medicine at Mount Sinai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Icahn School of Medicine at Mount Sinai
Publication of WO2025064511A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/111 General methods applicable to biologically active non-coding nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/1135 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against oncogenes or tumor suppressor genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0696 Artificially induced pluripotent stem cells, e.g. iPS
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C07K2319/81 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor containing a Zn-finger domain for DNA binding

Definitions

  • This disclosure relates generally to a bat-associated reverse transcriptase and methods of use thereof.
  • the broader context of this disclosure lies in the ongoing evolutionary arms race between organisms and viruses. This constant battle has driven the development of diverse and sophisticated antiviral defense mechanisms across all domains of life.
  • the CRISPR-Cas system, an adaptive immune mechanism employed by prokaryotes, has emerged as a technological tool in the field of gene editing. Its capacity to precisely target and modify specific DNA sequences has revolutionized biomedical research, enabling scientists to manipulate genes with unprecedented accuracy and efficiency.
  • the advent of CRISPR-Cas has not only accelerated the understanding of gene function but has also paved the way for innovative gene therapies, offering potential solutions for a wide range of genetic disorders.
  • this disclosure addresses the need mentioned above in a number of aspects.
  • this disclosure provides a method of modifying a target polynucleotide.
  • the method comprises delivering to the target polynucleotide an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • this disclosure provides a method of modifying expression of a target polynucleotide.
  • the method comprises: introducing into a cell or a subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity or a nucleic acid molecule encoding the enzyme, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme binds to one or more locations on the target polynucleotide such that binding of the enzyme increases or decreases expression level of the target polynucleotide.
  • the enzyme further possesses an integrase activity.
  • the one or more nucleic acid components comprise a single-stranded navigator DNA.
  • the one or more nucleic acid components further comprise a payload RNA.
  • the payload RNA comprises a stem-loop structure.
  • the enzyme reverse transcribes the payload RNA into a cDNA. In some embodiments, the enzyme integrates the cDNA into the target polynucleotide.
  • the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
  • the enzyme further comprises a C-terminal domain.
  • the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
  • the C-terminal domain comprises a zinc finger.
  • the zinc finger comprises a CCHC motif.
  • the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
  • BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
  • the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
  • the one or more nucleic acid components are provided through one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components.
  • the one or more polynucleotide molecules comprise one or more vectors.
  • the enzyme and the one or more nucleic acid components are provided in a single vector.
  • the target polynucleotide comprises a genomic locus. In some embodiments, the target polynucleotide comprises RNA or DNA.
  • the RNA comprises a viral RNA of a RNA virus.
  • the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the DNA comprises a genomic DNA or a cDNA.
  • the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
  • the target polynucleotide is contained in a nucleic acid molecule within a cell or in vitro.
  • the cell comprises a eukaryotic cell.
  • the eukaryotic cell comprises a mammalian cell.
  • the eukaryotic cell comprises a non-human animal cell, a human cell, or a plant cell.
  • this disclosure provides a method of treating or preventing a viral infection of a RNA virus in a cell or a subject.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, or a nucleic acid molecule encoding the enzyme, wherein a single-stranded guide polynucleotide hybridizes with a viral RNA of the RNA virus and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the method comprises reverse transcribing the viral RNA into a cDNA by the enzyme.
  • the enzyme has an integrase activity.
  • the method comprises integrating the cDNA to a genome of the cell or the subject by the enzyme.
  • the method comprises transcribing the cDNA into the single-stranded guide polynucleotide capable of hybridizing with the viral RNA.
  • this disclosure provides a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
  • this disclosure provides a method of generating a cell line with immunity against a viral infection of a RNA virus.
  • the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
  • the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
  • the enzyme further comprises a C-terminal domain.
  • the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
  • the C-terminal domain comprises a zinc finger.
  • the zinc finger comprises a CCHC motif.
  • the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
  • BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
  • the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
  • this disclosure provides a gene editing system for modifying a target polynucleotide.
  • the gene editing system comprises an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • the enzyme further possesses an integrase activity.
  • the one or more nucleic acid components comprise a single-stranded navigator DNA.
  • the one or more nucleic acid components further comprise a payload RNA.
  • the payload RNA comprises a stem-loop structure.
  • the enzyme reverse transcribes the payload RNA into a cDNA.
  • the enzyme integrates the cDNA into the target polynucleotide.
  • the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
  • the enzyme further comprises a C-terminal domain.
  • the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
  • the C-terminal domain comprises a zinc finger.
  • the zinc finger comprises a CCHC motif.
  • the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
  • BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
  • the gene editing system comprises one or more polynucleotide molecules encoding the enzyme. In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components. In some embodiments, the one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and the one or more nucleic acid components are provided in a single vector.
  • the RNA comprises a viral RNA of a RNA virus.
  • the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the DNA comprises a genomic DNA or a cDNA.
  • the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
  • this disclosure provides a delivery system comprising the gene editing system, and the delivery system is adapted to deliver the gene editing system into a cell or a subject.
  • the delivery system comprises nanoparticles or vesicles encapsulating the gene editing system.
  • this disclosure provides a vector system comprising one or more vectors, wherein the one or more vectors comprise one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and comprise one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • this disclosure provides a kit comprising the gene editing system, the delivery system, or the vector system, as described herein.
  • this disclosure provides a cell line comprising one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • the cell line comprises a eukaryotic cell.
  • the eukaryotic cell comprises a mammalian cell.
  • the eukaryotic cell comprises a stem cell or stem cell line.
  • the method comprises introducing to a cell one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and encoding one or more nucleic acid components.
  • Figure 1 shows a process BART utilizes to integrate a RNA payload through a navigator DNA.
  • Figures 2A and 2B show BART-mediated immunity against VSV infection. These figures show the results of VSV infection in BART-expressing cell lines.
  • Figure 2A shows the reduction in viral load in cells expressing BART compared to control cells.
  • Figure 2B shows the presence of viral cDNA in the cytoplasm of BART-expressing cells, highlighting BART’s role in reverse transcription and viral suppression.
  • Figure 3 shows a structural model of BART highlighting the APE domain, the reverse transcriptase domain, and the C-terminal domain (CTD).
  • Figure 4 shows vector maps of bart cloning constructs. The maps depict the plasmid vectors used for cloning the BART enzyme coding sequences, including the placement of the CMV promoter, neomycin resistance gene, and restriction sites for HindIII and NotI.
  • This disclosure is based, in part, on an unexpected discovery of a novel mammalian defense mechanism akin to the CRISPR-Cas system traditionally associated with prokaryotic biology.
  • the Bat-Associated Reverse Transcriptase (BART), as disclosed herein, possesses a unique feature found in bat-induced pluripotent stem cells (iPSCs) with the ability to convert viral genomic RNA into DNA.
  • the disclosed BART system represents a next-generation tool for genome and RNA editing, exhibiting high efficiency and specificity, potentially outperforming current CRISPR-Cas systems. This discovery blurs the lines between prokaryotic and eukaryotic defense mechanisms, opening new opportunities for comprehensive viral defense strategies and advanced gene therapy applications.
  • BART exhibits the remarkable ability to convert viral RNA into DNA, a process typically associated with retroviruses. Furthermore, it possesses endonuclease activity, enabling it to cleave nucleic acids, and integrase activity, facilitating the insertion of genetic material into the host genome.
  • a function of BART is to serve as an innovative antiviral defense mechanism in mammalian cells. It achieves this by targeting and degrading viral RNA, thus disrupting the viral life cycle and preventing further infection.
  • the unique aspect of BART lies in its ability to create a “genomic memory” of past viral encounters. By converting viral RNA into DNA and integrating it into the host genome, BART establishes a lasting record of the infection. This genomic archive allows for the rapid production of RNA transcripts that guide BART to recognize and neutralize the same virus upon subsequent infections, providing a form of adaptive immunity in mammalian cells.
  • the mechanism of action of BART sets it apart from other known antiviral and gene-editing systems.
  • BART utilizes its reverse transcriptase activity to convert viral RNA into complementary DNA (cDNA).
  • This cDNA is then integrated into the host genome at specific locations, creating a library of viral sequences akin to the CRISPR arrays found in bacteria.
  • the integrated cDNA is subsequently transcribed, generating RNA transcripts that guide BART to target and cleave homologous viral RNA, thereby disrupting the viral life cycle and providing long-term immunity against future infections.
  • This remarkable process not only confers antiviral defense but also offers a powerful tool for precise gene editing, as the integration and targeting mechanisms can be harnessed to introduce specific modifications into the host genome.
  • Beyond its role in antiviral defense, BART’s unique mechanism of action presents methods for precise and efficient gene editing.
  • BART exhibits superior specificity and reduced off-target effects, addressing a major limitation of existing gene-editing technologies.
  • This enhanced precision, coupled with its potential for broader applicability across diverse cell types and therapeutic contexts, makes BART a system for next-generation gene editing tools.
  • the ability to harness this naturally evolved mammalian system for targeted genome manipulation will revolutionize the field of gene therapy, offering new avenues for treating genetic disorders and other diseases.
  • BART represents a paradigm shift in the understanding of antiviral immunity and gene editing.
  • BART challenges the traditional boundaries between prokaryotic and eukaryotic defense mechanisms. This indicates a convergent evolution of adaptive immune mechanisms, highlighting the universality of strategies employed by organisms to combat viral threats.
  • BART’s dual functionality as an antiviral agent and a precise gene-editing tool provides new avenues for innovative therapies against viral infections and genetic disorders. This naturally evolved mammalian system for targeted genome manipulation will revolutionize the field of gene therapy, offering a more compatible and efficient alternative to current CRISPR-Cas-based approaches.
  • the discovery of BART thus represents a significant leap forward in addressing some of the most pressing health challenges, paving the way for a new era of antiviral and gene-editing therapeutics.
  • This disclosure encompasses methods and uses of the BART system described herein for modifying a target DNA sequence (e.g., a chromosomal sequence) or target RNA sequence, e.g., for altering or manipulating the expression of one or more genes or the one or more gene products, in prokaryotic or eukaryotic cells, in vitro, in vivo, or ex vivo.
  • the BART protein has unique characteristics in possessing both reverse transcriptase and endonuclease activities.
  • the BART protein can additionally have an integrase activity, allowing it to integrate a DNA (e.g., reverse transcribed from a viral RNA) into a host genome.
  • the disclosed BART systems provide an effective means for modifying (e.g., deleting, inserting, translocating, inactivating, activating) a target RNA or DNA (double-stranded, linear, or super-coiled) in a multiplicity of cell types.
  • this disclosure provides a method of modifying a target polynucleotide.
  • the method comprises delivering to the target polynucleotide an enzyme (e.g., BART) that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • the enzyme modifies the target polynucleotide.
  • the enzyme further possesses an integrase activity.
  • the enzyme comprises a BART protein.
  • the one or more nucleic acid components comprise a single-stranded navigator DNA (“navigator ssDNA”).
  • the one or more nucleic acid components further comprise a payload RNA.
  • the payload RNA comprises a stem-loop structure.
  • the disclosure provides a method of modifying a target polynucleotide (e.g., target sequence of interest), such as modifying expression of a target polynucleotide, in a cell.
  • the method allows a BART complex (e.g., BART/navigator ssDNA complex) to bind to the target polynucleotide, resulting in increased or decreased expression of the target polynucleotide or a gene comprising the target polynucleotide.
  • the BART complex comprises BART complexed with a ssDNA sequence hybridized to a target sequence within the polynucleotide.
  • the method of modifying a target polynucleotide comprises delivering the BART system, isolated nucleic acids encoding a BART protein and/or a navigator ssDNA, or particles containing a BART protein and/or a navigator ssDNA, to a target sequence or a cell containing the target sequence.
  • following formation of a complex of BART/navigator ssDNA and hybridization of the ssDNA to one or more nucleic acids of the target sequence, the BART protein induces a modification (e.g., cleavage) of the target sequence.
  • the modification comprises cleaving one or two strands at the location of the target sequence by the enzyme. In some embodiments, the modification results in decreased or increased transcription of a target gene. In some embodiments, the method further comprises repairing the cleaved target polynucleotide by homologous recombination with an exogenous template polynucleotide, wherein the repair results in a mutation comprising an insertion, deletion, or substitution of one or more nucleotides of the target polynucleotide. In some embodiments, the mutation results in one or more amino acid changes in a protein expressed from a gene comprising the target sequence.
  • the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
  • modification of a target polynucleotide may include modifying (e.g., increasing or decreasing) expression of a target polynucleotide as a result of binding of the BART protein to a certain location on the target polynucleotide. For example, if the BART protein binds to a location between a promoter or other regulatory element and a coding sequence in the target polynucleotide, the BART protein may serve as a repressor that decreases expression of the target polynucleotide. On the other hand, if the BART protein binds to a location upstream of a promoter or other regulatory element, the BART protein may serve as an activator that increases expression of the target polynucleotide.
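The position-dependent outcome described above can be sketched as a small lookup. This is only an illustration of the stated reasoning: the `Locus` type, its field names, and the coordinate convention (positions increasing 5' to 3' along the coding strand) are assumptions introduced here, and binding sites that fall in neither region are reported as unspecified because the text does not address them.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    """Hypothetical coordinates along the coding strand (5' to 3')."""
    promoter_start: int
    promoter_end: int
    cds_start: int


def predicted_effect(locus: Locus, binding_site: int) -> str:
    """Classify a binding site according to the two cases described in the text."""
    if binding_site < locus.promoter_start:
        # Upstream of the promoter or other regulatory element
        return "activator"
    if locus.promoter_end < binding_site < locus.cds_start:
        # Between the promoter and the coding sequence
        return "repressor"
    # The text does not say what happens elsewhere
    return "unspecified"


locus = Locus(promoter_start=100, promoter_end=200, cds_start=300)
print(predicted_effect(locus, 50))
print(predicted_effect(locus, 250))
```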
  • this disclosure also provides a method of modifying expression of a target polynucleotide.
  • the method comprises introducing into a cell or a subject an enzyme (e.g., BART) that possesses a reverse transcriptase activity and an endonuclease activity or a nucleic acid molecule encoding the enzyme, and one or more nucleic acid components (e.g., navigator ssDNA), wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme binds to one or more locations on the target polynucleotide such that binding of the enzyme increases or decreases the expression level of the target polynucleotide.
  • the enzyme may include one or more mutations that result in a catalytically inactive enzyme.
  • the catalytically inactive enzyme lacks an endonuclease activity, such that the enzyme can still bind to the target polynucleotide but does not cleave the target polynucleotide.
  • a nucleotide sequence encoding the BART protein is present in a recombinant expression vector.
  • the recombinant expression vector is a viral construct, e.g., a recombinant adeno-associated virus construct, a recombinant adenoviral construct, a recombinant lentiviral construct, etc.
  • viral vectors can be based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, human immunodeficiency virus, and the like.
  • a retroviral vector can be based on Murine Leukemia Virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, a lentivirus, human immunodeficiency virus, myeloproliferative sarcoma virus, mammary tumor virus, and the like.
  • Useful expression vectors are known to those of skill in the art, and many are commercially available.
  • the following vectors are provided by way of example for eukaryotic host cells: pXT1, pSG5, pSVK3, pBPV, pMSG, and pSVLSV40.
  • any other vector may be used if it is compatible with the host cell.
  • useful expression vectors containing a nucleotide sequence encoding a Cas9 enzyme are commercially available from, e.g., Addgene, Life Technologies, Sigma-Aldrich, and Origene.
  • any of a number of transcription and translation control elements, including promoters, transcription enhancers, transcription terminators, and the like, may be used in the expression vector.
  • Useful promoters can be derived from viruses, or any organism, e.g., prokaryotic or eukaryotic organisms.
  • Suitable promoters include, but are not limited to, the SV40 early promoter, a mouse mammary tumor virus long terminal repeat (LTR) promoter, an adenovirus major late promoter (Ad MLP), a herpes simplex virus (HSV) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter region (CMVIE), a Rous sarcoma virus (RSV) promoter, a human U6 small nuclear promoter (U6), an enhanced U6 promoter, a human H1 promoter (H1), etc.
  • the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
  • one or more nucleic acid components are provided through one or more polynucleotide molecules encoding or comprising one or more nucleic acid components.
  • one or more polynucleotide molecules comprise one or more vectors.
  • the enzyme and one or more nucleic acid components are provided in a single vector.
  • a polynucleotide comprising a first navigator ssDNA is located on the same vector with a polynucleotide encoding the BART protein.
  • the BART system further comprises a polynucleotide containing a second navigator ssDNA.
  • the polynucleotide encoding the BART protein, the polynucleotide comprising the first navigator ssDNA, and the polynucleotide comprising the second navigator ssDNA are harbored on the same vector or on two or more different vectors.
  • the BART protein or variants/fragments thereof can be introduced into a cell (e.g., an in vitro cell such as a primary cell for ex vivo therapy, or an in vivo cell such as in a patient) as a BART protein or a variant or fragment thereof, an mRNA encoding a BART protein or a variant or fragment thereof, or a recombinant expression vector comprising a nucleotide sequence encoding a BART protein or a variant or fragment thereof.
  • the method further comprises delivering one or more vectors to a host cell.
  • the vectors are delivered to the host cell in a subject.
  • the modification takes place in the eukaryotic cell in cell culture.
  • the method further comprises isolating the eukaryotic cell from a subject prior to the modification.
  • the method further comprises returning the cells derived therefrom to the subject.
  • the method further comprises maintaining cells or embryos under appropriate conditions such that the ssDNA guides the BART protein to the targeted site in the target sequence to modify the target sequence.
  • the cell can be maintained under conditions appropriate for cell growth and/or maintenance. Suitable cell culture conditions are well known in the art and are described, for example, in “Current Protocols in Molecular Biology” (Ausubel et al., John Wiley & Sons, New York, 2003) or “Molecular Cloning: A Laboratory Manual” (Sambrook & Russell, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 3rd edition, 2001); Santiago et al. (2008) PNAS 105:5809-5814; Moehle et al. (2007) PNAS 104:3055-3060; Urnov et al.
  • An embryo can be cultured in vitro (e.g., in cell culture). Typically, the embryo is cultured at an appropriate temperature and in appropriate media with the necessary O2/CO2 ratio to allow the expression of the proteins and RNA scaffold, if necessary. Suitable non-limiting examples of media include M2, M16, KSOM, BMOC, and HTF media.
  • a cell line may be derived from an in vitro-cultured embryo (e.g., an embryonic stem cell line).
  • an embryo may be cultured in vivo by transferring the embryo into the uterus of a female host.
  • the female host is from the same or similar species as the embryo.
  • the female host is pseudo-pregnant.
  • Methods of preparing pseudo-pregnant female hosts are known in the art.
  • methods of transferring an embryo into a female host are known. Culturing an embryo in vivo permits the embryo to develop and can result in a live birth of an animal derived from the embryo. Such an animal would comprise the modified chromosomal sequence in every cell of the body.
  • the BART protein may include a variant or fragment of a BART.
  • the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
  • the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease (APE) domain optionally linked to a reverse transcriptase domain.
  • Apurinic-apyrimidinic endonuclease (APE) domains are protein motifs that play a crucial role in DNA repair. These domains are responsible for recognizing and cleaving DNA strands that contain apurinic or apyrimidinic sites, which are lesions that occur when a purine or pyrimidine base is lost from the DNA backbone.
  • the enzyme further comprises a C-terminal domain.
  • the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
  • the C-terminal domain comprises a zinc finger.
  • Zinc fingers are small protein structural motifs characterized by the coordination of one or more zinc ions (Zn²⁺). These metal ions help stabilize the protein’s structure, allowing it to interact with other molecules.
  • the zinc finger comprises a CCHC motif.
  • the CCHC motif is a zinc finger motif found in many proteins, particularly those involved in DNA binding and transcriptional regulation. It consists of a conserved sequence of amino acids that contains cysteine and histidine residues. These residues coordinate zinc ions, forming a stable structure that helps the protein bind to DNA.
  • the BART protein comprises an amino acid sequence of any one of SEQ ID NOs: 1-12 or comprises an amino acid sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 1-12.
  • the terms “protein” and “polypeptide” are used interchangeably herein to refer to polymers of amino acids of any length.
  • the polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids.
  • the terms also encompass an amino acid polymer that has been modified, for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, pegylation, or any other manipulation, such as conjugation with a labeling component.
  • amino acid includes natural and/or unnatural or synthetic amino acids, including glycine and both the D or L optical isomers, and amino acid analogs and peptidomimetics.
  • a peptide or polypeptide “fragment” as used herein refers to a less than full-length peptide, polypeptide or protein.
  • a peptide or polypeptide fragment can have at least about 3, at least about 4, at least about 5, at least about 10, at least about 20, at least about 30, at least about 40 amino acids in length, or single unit lengths thereof.
  • a fragment may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or more amino acids in length.
  • peptide fragments can be less than about 500 amino acids, less than about 400 amino acids, less than about 300 amino acids or less than about 250 amino acids in length.
  • the term “variant” refers to a first composition (e.g., a first molecule) that is related to a second composition (e.g., a second molecule, also termed a “parent” molecule).
  • the variant molecule can be derived from, isolated from, based on or homologous to the parent molecule.
  • the term “variant” can be used to describe either polynucleotides or polypeptides.
  • a variant molecule can have complete nucleotide sequence identity with the original parent molecule, or alternatively, can have less than 100% nucleotide sequence identity with the parent molecule.
  • a variant of a gene nucleotide sequence can be a second nucleotide sequence that is at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99% or more identical in nucleotide sequence compared to the original nucleotide sequence.
  • Polynucleotide variants also include polynucleotides comprising the entire parent polynucleotide, and further comprising additional fused nucleotide sequences.
  • Polynucleotide variants also include polynucleotides that are portions or subsequences of the parent polynucleotide; for example, unique subsequences (e.g., as determined by standard sequence comparison and alignment techniques) of the polynucleotides disclosed herein are also encompassed by the invention.
  • polynucleotide variants include nucleotide sequences that contain minor, trivial or inconsequential changes to the parent nucleotide sequence.
  • a variant polypeptide can have complete amino acid sequence identity with the original parent polypeptide, or alternatively, can have less than 100% amino acid identity with the parent protein.
  • a variant of an amino acid sequence can be a second amino acid sequence that is at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99% or more identical in amino acid sequence compared to the original amino acid sequence.
  • Polypeptide variants include polypeptides comprising the entire parent polypeptide, and further comprising additional fused amino acid sequences. Polypeptide variants also include polypeptides that are portions or subsequences of the parent polypeptide, for example, unique subsequences (e.g., as determined by standard sequence comparison and alignment techniques) of the polypeptides disclosed herein are also encompassed by the invention.
  • polypeptide variants include polypeptides that contain minor, trivial, or inconsequential changes to the parent amino acid sequence.
  • minor, trivial, or inconsequential changes include amino acid changes (including substitutions, deletions, and insertions) that have little or no impact on the biological activity of the polypeptide, and yield functionally identical polypeptides, including additions of a non-functional peptide sequence.
  • the variant polypeptides of the invention change the biological activity of the parent molecule.
  • One of skill will appreciate that many variants of the disclosed polypeptides are encompassed by the invention.
  • polynucleotide or polypeptide variants can include variant molecules that alter, add or delete a small percentage of the nucleotide or amino acid positions, for example, typically less than about 10%, less than about 5%, less than 4%, less than 2% or less than 1%.
  • a “functional variant” of a protein as used herein refers to a variant of such protein that retains at least partially the activity of that protein.
  • Functional variants may include mutants (which may be insertion, deletion, or replacement mutants), including polymorphs, etc. Also included within functional variants are fusion products of such protein with another, usually unrelated, nucleic acid, protein, polypeptide or peptide. Functional variants may be naturally occurring or may be man-made.
  • a variant of a BART protein may include one or more conservative modifications.
  • the BART protein variant with one or more conservative modifications may retain the desired functional properties, which can be tested using the functional assays known in the art.
  • acidic side chains e.g., aspartic acid, glutamic acid
  • uncharged polar side chains e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine, tryptophan
  • nonpolar side chains e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine
  • beta-branched side chains e.g., threonine, valine, isoleucine
  • aromatic side chains e.g., tyrosine, phenylalanine, tryptophan, histidine
  • the BART protein with one or more conservative modifications may retain the desired functional properties, which can be tested using the functional assays known in the art.
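As a minimal sketch, the side-chain families listed above can be used to flag whether a substitution stays within a family. Treating a same-family replacement as conservative is the usual convention and is an assumption here; the dictionary contains only the families that appear in this passage, and the function names are hypothetical.

```python
# One-letter codes for the side-chain families listed in the text
SIDE_CHAIN_FAMILIES = {
    "acidic": {"D", "E"},
    "uncharged polar": {"G", "N", "Q", "S", "T", "Y", "C", "W"},
    "nonpolar": {"A", "V", "L", "I", "P", "F", "M"},
    "beta-branched": {"T", "V", "I"},
    "aromatic": {"Y", "F", "W", "H"},
}


def shared_families(original: str, replacement: str) -> list:
    """Return the names of every listed family that contains both residues."""
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """Assumed convention: a substitution is conservative if both residues share a family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("D", "E"))   # aspartic acid to glutamic acid
print(is_conservative("D", "L"))   # aspartic acid to leucine
```

A check like this only screens candidates; whether a given variant retains the desired functional properties still has to be confirmed with a functional assay, as the text states.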
  • the percent homology between two amino acid sequences is equivalent to the percent identity between the two sequences.
  • the comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm, as described in the non-limiting examples below.
  • the protein sequences of the present invention can further be used as a “query sequence” to perform a search against public databases to, for example, identify related sequences.
  • Such searches can be performed using the XBLAST program (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10.
  • Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402.
  • the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
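To make the percent identity figures concrete, the following sketch computes identity from two sequences that have already been aligned (for example, by one of the programs named above). It is not the BLAST algorithm. The denominator used here, the number of alignment columns excluding columns where both sequences have a gap, is one common convention and is an assumption, as are the function names.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # a column of two gaps carries no information
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """Check a pair against a minimum identity such as the 75% figure in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("MKTAYIAK", "MKTAYIAR"))
print(meets_identity_threshold("MK-A", "MKTA"))
```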
  • a variant of a BART protein can be conjugated or linked to a detectable tag or a detectable marker (e.g., a radionuclide, a fluorescent dye, or an MRI-detectable label).
  • the detectable tag can be an affinity tag.
  • the term “affinity tag,” as used herein, relates to a moiety attached to a polypeptide, which allows the polypeptide to be purified from a biochemical mixture.
  • Affinity tags can consist of amino acid sequences or can include amino acid sequences to which chemical groups are attached by post-translational modifications.
  • affinity tags include His-tag, CBP-tag (CBP: calmodulin-binding protein), CYD-tag (CYD: covalent yet dissociable NorpD peptide), Strep-tag, StrepII-tag, FLAG-tag, HPC-tag (HPC: heavy chain of protein C), GST-tag (GST: glutathione S transferase), Avi-tag, biotinylated tag, Myc-tag, a myc-myc-hexahistidine (mmh) tag, a 3xFLAG tag, a SUMO tag, and MBP-tag (MBP: maltose-binding protein). Further examples of affinity tags can be found in Kimple et al., Curr Protoc Protein Sci. 2013 Sep 24; 73: Unit 9.9.
  • the detectable tag can be conjugated or linked to the N- and/or C-terminus of a variant of a BART protein.
  • the detectable tag and the affinity tag may also be separated by one or more amino acids.
  • the detectable tag can be conjugated or linked to the variant via a cleavable element.
  • the term “cleavable element” relates to peptide sequences that are susceptible to cleavage by chemical agents or enzymatic means, such as proteases. Proteases may be sequence-specific (e.g., thrombin) or may have limited sequence specificity (e.g., trypsin).
  • Cleavable elements I and II may also be included in the amino acid sequence of a detection tag or polypeptide, particularly where the last amino acid of the detection tag or polypeptide is K or R.
  • the term “conjugate” refers to the attachment of two or more entities to form one entity.
  • a conjugate encompasses both peptide-small molecule conjugates as well as peptide-protein/peptide conjugates.
  • the term “fusion polypeptide” or “fusion protein” means a protein created by joining two or more polypeptide sequences together.
  • the fusion polypeptides encompassed in this invention include translation products of a chimeric gene construct that joins the nucleic acid sequences encoding a first polypeptide with the nucleic acid sequence encoding a second polypeptide to form a single open reading frame.
  • a “fusion polypeptide” or “fusion protein” is a recombinant protein of two or more proteins that are joined by a peptide bond or via several peptides.
  • the fusion protein may also comprise a peptide linker between the two domains.
  • the term “linker” refers to any means, entity, or moiety used to join two or more entities.
  • a linker can be a covalent linker or a non-covalent linker.
  • covalent linkers include covalent bonds or a linker moiety covalently attached to one or more of the proteins or domains to be linked.
  • the linker can also be a non-covalent bond, e.g., an organometallic bond through a metal center such as a platinum atom.
  • various functionalities can be used, such as amide groups, including carbonic acid derivatives, ethers, esters, including organic and inorganic esters, amino, urethane, urea, and the like.
  • Linker moieties include, but are not limited to, chemical linker moieties or, for example, a peptide linker moiety (a linker sequence).
  • the linker can be a peptide linker or a non-peptide linker.
  • the peptide linker may include [Ser(Gly)n]m or [Ser(Gly)n]mSer, where n may be an integer between 1 and 20.
  • the term “non-peptide linker” refers to a biocompatible polymer composed of two or more repeating units linked to each other, in which the repeating units are linked to each other by any non-peptide covalent bond. This non-peptidyl linker may have two ends or three ends.
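The peptide linker formulas [Ser(Gly)n]m and [Ser(Gly)n]mSer given above can be expanded into one-letter sequences with a short helper. This is a sketch under stated assumptions: the function name is hypothetical, the range for n is read as inclusive, and because this passage does not bound m, the code only requires m to be at least 1.

```python
def ser_gly_linker(n: int, m: int, trailing_ser: bool = False) -> str:
    """Expand [Ser(Gly)n]m, or [Ser(Gly)n]mSer when trailing_ser is True."""
    if not 1 <= n <= 20:
        raise ValueError("n is expected to be between 1 and 20")
    if m < 1:
        raise ValueError("m must be at least 1 (assumed; not bounded in the text)")
    repeat_unit = "S" + "G" * n          # one Ser followed by n Gly
    linker = repeat_unit * m             # m copies of the repeat unit
    return linker + "S" if trailing_ser else linker


print(ser_gly_linker(n=4, m=3))                      # SGGGGSGGGGSGGGG
print(ser_gly_linker(n=4, m=3, trailing_ser=True))   # SGGGGSGGGGSGGGGS
```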
  • a variant of a BART protein can be fused to a fusion partner through crosslinking with a crosslinking agent, e.g., a crosslinker.
  • Crosslinkers are reagents having reactive ends to specific functional groups (e.g., primary amines or sulfhydryls) on proteins or other molecules. Crosslinkers are capable of joining two or more molecules by a covalent bond.
  • a polynucleotide encoding a BART protein may be codon optimized.
  • the term “codon optimization” refers to a process of modifying a nucleic acid sequence to enhance expression in the host cells by substituting at least one codon of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
  • Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is, in turn, believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules.
  • Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. “Codon usage tabulated from the international DNA sequence databases: status for the year 2000” Nucl. Acids Res. 28:292 (2000).
  • computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.).
  • one or more codons e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons
  • for codon usage in yeast, reference is made to the online Yeast Genome database available at http://www.yeastgenome.org/community/codonusage.shtml, or Codon selection in yeast, Bennetzen and Hall, J Biol Chem. 1982 Mar. 25; 257(6):3026-31.
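The codon optimization process defined above, replacing codons with host-preferred synonyms while keeping the encoded amino acid sequence unchanged, can be sketched as follows. The standard genetic code table is built from its conventional compact form. The `PREFERRED_CODONS` mapping is a hypothetical, partial stand-in for a real codon usage table and is not taken from the disclosure; amino acids without an entry keep their native codon.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons for an unnamed host (illustrative values only)
PREFERRED_CODONS = {"F": "TTC", "V": "GTG", "L": "CTG", "K": "AAG"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonym, if one is listed for its amino acid."""
    protein = translate(dna)  # also validates the reading frame
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    optimized = "".join(preferred.get(aa, codon) for aa, codon in zip(protein, codons))
    if translate(optimized) != protein:
        raise ValueError("preferred codon table changes the amino acid sequence")
    return optimized


native = "ATGTTTGTTAAATTATAA"
print(codon_optimize(native, PREFERRED_CODONS))
```

The final check enforces the requirement in the definition that the native amino acid sequence is maintained.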
  • the term “navigator ssDNA” generally refers to a ssDNA molecule (or a group of DNA molecules collectively) that can bind to a BART protein and target the BART protein to a specific location within a target RNA or DNA.
  • the targeting segment comprises a nucleotide sequence that is complementary to (or at least can hybridize to under stringent conditions) a target sequence.
  • a navigator ssDNA can be any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence (e.g., target DNA or RNA sequence) to hybridize with the target sequence and direct sequence-specific binding of a BART complex to the target sequence.
  • the degree of complementarity between a guide sequence of the navigator ssDNA and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more.
  • a navigator ssDNA sequence is about 3, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195,
  • the ability of a guide sequence to direct sequence-specific binding of a BART complex to a target sequence may be assessed by any suitable assay.
  • the components of a BART system sufficient to form a BART complex, including the guide sequence to be tested may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the BART sequence, followed by an assessment of preferential cleavage within the target sequence.
  • cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a BART complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions.
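The degree of complementarity between a guide sequence and its target can be illustrated with a simple calculation. This sketch assumes an ungapped, equal-length comparison with Watson-Crick pairing only, which stands in for the "suitable alignment algorithm" mentioned above; the function names and example sequences are hypothetical.

```python
# Complement of each base when written as DNA (U in an RNA target pairs with A)
DNA_COMPLEMENT = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA strand (5' to 3') that pairs perfectly with a DNA or RNA sequence."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(guide_dna: str, target_site: str) -> float:
    """Percent of guide positions that base-pair with the target site (both given 5' to 3')."""
    if not guide_dna or len(guide_dna) != len(target_site):
        raise ValueError("guide and target site must be non-empty and of equal length")
    ideal_guide = reverse_complement_dna(target_site)
    matches = sum(g == i for g, i in zip(guide_dna.upper(), ideal_guide))
    return 100.0 * matches / len(guide_dna)


target_rna = "AUGGCUAGCU"
print(reverse_complement_dna(target_rna))                 # perfectly complementary guide
print(percent_complementarity("AGCTAGCCAA", target_rna))  # guide with one mismatch
```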
  • the navigator ssDNA comprises a synthetic nucleic acid sequence (e.g., synthetic DNA molecule). In some embodiments, the navigator ssDNA comprises one or more modifications.
  • modified nucleotide in the context of an oligonucleotide or polynucleotide includes but is not limited to (a) end modifications, e.g., 5’ end modifications or 3’ end modifications, (b) nucleobase (or “base”) modifications, including replacement or removal of bases, (c) sugar modifications, including modifications at the 2’, 3’, and/or 4’ positions, and (d) backbone modifications, including modification or replacement of the phosphodiester linkages.
  • modified nucleotide generally refers to a nucleotide having a modification to the chemical structure of one or more of the base, the sugar, and the phosphodiester linkage or backbone portions, including nucleotide phosphates.
  • one or more modifications may include 2’-O-methyl moiety, a Z base, a 2’-deoxynucleotide, a phosphorothioate internucleotide linkage, a phosphonoacetate (PACE) internucleotide linkage, a thiophosphonoacetate (thioPACE) internucleotide linkage, or combinations thereof.
  • one or more modifications comprise one or more modifications selected from the group consisting of a 2’-O-methyl nucleotide with a 3’-phosphorothioate group, a 2’-O-methyl nucleotide with a 3’-phosphonoacetate group, a 2’-O-methyl nucleotide with a 3’-thiophosphonoacetate group, or a 2’-deoxynucleotide with a 3’-phosphonoacetate group.
  • the term “target DNA” refers to a DNA polynucleotide being or comprising the target sequence.
  • the target DNA may be a DNA polynucleotide or a part of a DNA polynucleotide to which a part of the navigator ssDNA, i.e., the guide sequence, is designed to have complementarity and to which the effector function mediated by the complex comprising BART protein and a navigator ssDNA is to be directed.
  • a target sequence is located in the nucleus or cytoplasm of a cell.
  • the target polynucleotide has no sequence limitation and can be in the coding region of a gene, in an intron of a gene, in a control region between genes, etc.
  • the target polynucleotide is contained in a nucleic acid molecule within a cell or in vitro.
  • the gene can be coding or non-coding.
  • the target polynucleotide can be any polynucleotide endogenous or exogenous to the cell.
  • the target polynucleotide can be a polynucleotide residing in the nucleus of the eukaryotic cell.
  • the target polynucleotide can be a sequence coding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide).
  • the target polynucleotide comprises RNA or DNA.
  • the RNA comprises a viral RNA of an RNA virus.
  • the target polynucleotide comprises a cDNA reverse-transcribed from a viral RNA of an RNA virus.
  • the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the target polynucleotide comprises a polynucleotide sequence of SEQ ID NOs: 105, 106, and 113-118 or comprises a polynucleotide sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 105, 106, and 113-118.
  • the BART protein forms a complex with a navigator ssDNA that binds to the target polynucleotide.
  • the BART-navigator ssDNA complex comprises a BART protein having the amino acid sequence of SEQ ID NO: 1 and a navigator ssDNA comprising a polynucleotide sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 13-96 and 107-112.
  • the BART-navigator ssDNA complex comprises a BART protein having an amino acid sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 1-12 and a navigator ssDNA comprising a polynucleotide sequence of SEQ ID NOs: 13-96 and 107-112.
  • the BART protein forms a complex with a navigator ssDNA operably linked to a payload RNA.
  • the payload RNA comprises a polynucleotide sequence of SEQ ID NOs: 119-133 or comprises a polynucleotide sequence having at least
  • this disclosure further provides a method of treating a disease of a subject caused by a genetic defect in a target sequence.
  • the method comprises administering the system or the composition described above to a cell containing the target sequence in a subject in need thereof and thereby inducing a modification in the target sequence.
  • the target sequence is located at genomic loci of interest.
  • the target sequence is part of a gene, and the modification in the target sequence modulates the expression level of the gene.
  • the modification in the target sequence reduces the expression level of the gene.
  • the method comprises reverse transcribing the viral RNA into a cDNA by the enzyme.
  • the enzyme has an integrase activity.
  • the method comprises integrating the cDNA to a genome of the cell or the subject by the enzyme.
  • the method comprises transcribing the cDNA into the single-stranded guide polynucleotide capable of hybridizing with the viral RNA.
  • this disclosure provides a method of enhancing immunity against a viral infection of an RNA virus in a cell or a subject.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
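The sequence relationships in the steps above can be traced with plain string operations. This is bookkeeping only, not a model of the enzyme: the function names, the example sequences, the insertion position, and the assumption that the integrated cDNA strand serves as the transcription template are all introduced here for illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """ssDNA (5' to 3') complementary to an RNA template."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(rna))


def transcribe(template_dna: str) -> str:
    """RNA (5' to 3') complementary to a DNA template strand."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(template_dna))


def hybridizes(ssdna: str, rna: str) -> bool:
    """True when the ssDNA is perfectly complementary to the RNA."""
    return ssdna == reverse_transcribe(rna)


def immunity_cycle(viral_rna: str, host_genome: str, insert_at: int) -> dict:
    """Follow the viral sequence through the steps listed in the text."""
    cdna = reverse_transcribe(viral_rna)                                    # viral RNA to cDNA
    genome = host_genome[:insert_at] + cdna + host_genome[insert_at:]      # integration
    mrna = transcribe(cdna)                       # assumes the cDNA strand is the template
    second_ssdna = reverse_transcribe(mrna)                                 # mRNA to second ssDNA
    return {
        "cdna": cdna,
        "genome": genome,
        "mrna": mrna,
        "second_ssdna": second_ssdna,
        "guides_to_viral_rna": hybridizes(second_ssdna, viral_rna),
    }


result = immunity_cycle("AUGGCUAGCU", "GATTACATTGACC", insert_at=7)
print(result["second_ssdna"], result["guides_to_viral_rna"])
```

Under these assumptions the transcript matches the original viral RNA, so the second ssDNA is complementary to the viral RNA and can direct the enzyme to it.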
  • the term “treating” or “treatment” of a disease in a subject refers to (1) preventing the symptoms or disease from occurring in a subject that is predisposed or does not yet display symptoms of the disease; (2) inhibiting the disease or arresting its development; or (3) ameliorating or causing regression of the disease or the symptoms of the disease.
  • treatment is an approach for obtaining beneficial or desired results, including clinical results.
  • beneficial or desired results can include one or more, but are not limited to, alleviation or amelioration of one or more symptoms, diminishment of extent of a condition (including a disease), stabilized (i.e., not worsening) state of a condition (including disease), delay or slowing of condition (including disease), progression, amelioration or palliation of the condition (including disease), states and remission (whether partial or total), whether detectable or undetectable.
  • treatment excludes prevention.
  • the disease-causing mutations in patients are either acquired through inheritance from their parents or are caused by environmental factors. These diseases include, but are not limited to, the following categories.
  • Some genetic disorders are caused by germline mutations.
  • cystic fibrosis which is caused by mutations at the CFTR gene inherited from parents.
  • a second suppressor mutation in the mutant CFTR can partially restore the function of CFTR protein in somatic tissues.
  • Other examples of genetic diseases caused by a point mutation that can be corrected by the disclosed technology include Gaucher’s disease, alpha-1 antitrypsin deficiency, and sickle cell anemia, to name a few.
  • some diseases, such as chronic viral infectious diseases, are caused by exogenous environmental factors that result in genetic alterations.
  • AIDS which is caused by insertion of the human HIV viral genome into the genome of infected T-cells.
  • some neurodegenerative diseases involve genetic alterations.
  • One example is Huntington’s disease, which is caused by expansion of the CAG trinucleotide repeat in the huntingtin gene of affected patients.
  • cancers are caused by various somatic mutations accumulated in cancer cells. Therefore, correcting the disease-causing genetic mutations, or functionally correcting the sequence, provides an appealing therapeutic opportunity to treat these diseases.
  • cystic fibrosis affects one out of every 3,000 people in the US. It is caused by inheritance of a mutated CFTR gene, and 70% of the patients have the same mutation: deletion of a trinucleotide leading to a deletion of phenylalanine at position 508, which leads to the mislocalization and degradation of CFTR.
  • the system and method disclosed in this invention can be used to convert a Val 509 residue (GTT) to Phe 509 (TTT) in affected tissues (lung), thereby functionally correcting the Phe 508 mutation.
  • a second suppressor mutation such as R553Q, R553M, or V510D
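  • As an illustrative sketch only (not part of the disclosed method), the Val 509 (GTT) to Phe 509 (TTT) conversion described above amounts to a single-base substitution at the first codon position. The helper names `substitute_base` and `hamming_distance` and the two-entry codon table are hypothetical and are introduced here purely for illustration.

```python
# Illustrative only: a two-entry codon table covering the codons named in the text.
CODON_TO_AA = {"GTT": "Val", "TTT": "Phe"}


def substitute_base(codon: str, position: int, new_base: str) -> str:
    """Return `codon` with the base at 0-based `position` replaced by `new_base`."""
    if len(codon) != 3 or not 0 <= position < 3:
        raise ValueError("codon must have 3 bases and position must be 0, 1, or 2")
    return codon[:position] + new_base + codon[position + 1:]


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


original = "GTT"                             # Val codon at position 509
edited = substitute_base(original, 0, "T")   # G -> T at the first codon position

print(original, CODON_TO_AA[original], "->", edited, CODON_TO_AA[edited])
print("bases changed:", hamming_distance(original, edited))
```

The sketch only shows that one nucleotide change separates the two codons; it says nothing about how an editing system would deliver that change in a cell.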
  • the system and method as disclosed can also be used to specifically inactivate any gene in a viral genome that is incorporated into human cells/tissues.
  • the system and method disclosed in this invention allow one to create a stop codon for early termination of translation of the essential viral genes, and thereby remediate or cure the chronic debilitating infectious diseases.
  • current AIDS therapies can reduce viral load, but cannot totally eliminate dormant HIV from positive T cells.
  • the system and method disclosed herein can be used to permanently inactivate expression of one or two essential HIV genes in the integrated HIV genome in human T cells by introducing one or two stop codons.
  • Another example is the hepatitis B virus (HBV).
  • the system and method disclosed here can be used to specifically inactivate one or two essential HBV genes, which are incorporated into the human genome, and silence the HBV life cycle.
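  • As a hedged illustration of what "introducing a stop codon" can mean at the sequence level, the sketch below lists the single-base substitutions that turn a sense codon into one of the three standard DNA stop codons (TAA, TAG, TGA). The function name `stop_codon_edits` is hypothetical, and the example codons are chosen for illustration; they are not taken from any HIV or HBV gene.

```python
# Standard stop codons in DNA (coding-strand) notation.
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_codon_edits(codon: str) -> list[tuple[int, str, str]]:
    """List (position, new_base, resulting_codon) for every single-base
    substitution that converts `codon` into a stop codon."""
    if len(codon) != 3 or any(b not in BASES for b in codon):
        raise ValueError("codon must be three bases drawn from A, C, G, T")
    edits = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            candidate = codon[:pos] + base + codon[pos + 1:]
            if candidate in STOP_CODONS:
                edits.append((pos, base, candidate))
    return edits


# CAG (Gln) becomes TAG with one C -> T change at the first position.
print(stop_codon_edits("CAG"))
# TGG (Trp) can become TAG or TGA with one G -> A change.
print(stop_codon_edits("TGG"))
# GTT (Val) is not one substitution away from any stop codon.
print(stop_codon_edits("GTT"))
```

Only some codons are a single substitution away from a stop codon, so the choice of target codon within an essential gene matters for this kind of edit.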
  • SOD1G93A leads to development of amyotrophic lateral sclerosis (ALS).
  • the system and method disclosed in this invention can be used to either correct the mutation or eliminate the mutant protein expression by introducing a stop codon or by changing a splicing site.
  • Dystrophin is a cytoplasmic protein that provides structural stability to the dystroglycan complex of the cell membrane that is responsible for regulating muscle cell integrity and function.
  • the dystrophin gene or “DMD gene” as used interchangeably herein is 2.2 megabases at locus Xp21.
  • the primary transcript measures about 2,400 kb with the mature mRNA being about 14 kb.
  • the target is preferably one or more of the FAS, BID, CTLA4, PDCD1, CBLB, PTPN6, TRAC, or TRBC genes.
  • Cancer may be one or more of lymphoma, chronic lymphocytic leukemia (CLL), B cell acute lymphocytic leukemia (B-ALL), acute lymphoblastic leukemia, acute myeloid leukemia, non-Hodgkin’s lymphoma (NHL), diffuse large cell lymphoma (DLCL), multiple myeloma, renal cell carcinoma (RCC), neuroblastoma, colorectal cancer, breast cancer, ovarian cancer, melanoma, sarcoma, prostate cancer, lung cancer, esophageal cancer, hepatocellular carcinoma, pancreatic cancer, astrocytoma, mesothelioma, head and neck cancer, and medulloblastoma.
  • CAR engineered chimeric antigen receptor
  • stem cell or progenitor cell can be genetically modified using the system and method disclosed in this invention.
  • Suitable cells include, e.g., stem cells (adult stem cells, embryonic stem cells, iPS cells, etc.) and progenitor cells (e.g., cardiac progenitor cells, neural progenitor cells, etc.).
  • Suitable cells include mammalian stem cells and progenitor cells, including, e.g., rodent stem cells, rodent progenitor cells, human stem cells, human progenitor cells, etc.
  • Suitable host cells include in vitro host cells, e.g., isolated host cells.
  • the BART system can be used for targeted and precise genetic modification of tissue ex vivo, correcting the underlying genetic defects. After the ex vivo correction, the tissues may be returned to the patients.
  • the technology can be broadly used in cell-based therapies for correcting genetic diseases.
  • transgenic non-human animal or plant having one or more genetic modifications of interest.
  • the transgenic non-human animal is homozygous for the genetic modification. In some embodiments, the transgenic non-human animal is heterozygous for the genetic modification.
  • the transgenic non-human animal is a vertebrate, for example, a fish (e.g., zebrafish, goldfish, pufferfish, cavefish, etc.), an amphibian (e.g., frog, salamander, etc.), a bird (e.g., chicken, turkey, etc.), a reptile (e.g., snake, lizard, etc.), or a mammal (e.g., an ungulate, such as a pig, a cow, a goat, or a sheep; a lagomorph, such as a rabbit; a rodent, such as a rat or a mouse; or a non-human primate).
  • Suitable methods include viral infection (such as double-stranded DNA viruses), transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, silicon carbide whiskers technology, Agrobacterium-mediated transformation, and the like.
  • the choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e., in vitro, ex vivo, or in vivo).
  • this disclosure provides a method of treating or preventing a viral infection of a RNA virus in a cell or a subject.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA, wherein the ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
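  • The guide mechanism described above relies on base complementarity: a reverse-transcribed ssDNA is the reverse complement of its RNA template, so it can pair with a matching stretch of viral RNA. The following is a minimal sketch under the simplifying assumption of perfect Watson-Crick pairing with no mismatches; the function names and the short sequences are hypothetical and do not come from the disclosure.

```python
# Base-pairing rules for an RNA template read into DNA, and for DNA pairing with RNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the ssDNA (written 5'->3') complementary to an RNA template."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def binding_site(viral_rna: str, ssdna: str) -> int:
    """Return the 0-based index in `viral_rna` where `ssdna` can pair
    perfectly, or -1 if no perfectly complementary stretch exists."""
    target = "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(ssdna))
    return viral_rna.find(target)


template = "AUGGCU"                    # hypothetical fragment of a viral RNA
guide = reverse_transcribe(template)   # ssDNA produced from that fragment
print(guide)

genome = "GGAUGGCUAA"                  # hypothetical longer viral RNA
print(binding_site(genome, guide))     # index where the guide pairs
print(binding_site("GGGGGGGG", guide)) # no complementary stretch
```

Real hybridization tolerates mismatches and depends on length, temperature, and secondary structure, none of which this exact-match sketch models.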
  • this disclosure provides a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form an RNA hybrid to silence the viral RNA.
  • the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • this disclosure also provides a method of generating a cell line with immunity against a viral infection of a RNA virus.
  • the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
  • the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form an RNA hybrid to silence the viral RNA.
  • the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the BART systems described herein can be delivered to the host cell via one or more vectors, such as viral vectors.
  • the one or more viral vectors may comprise an adenovirus, a lentivirus, an adeno-associated virus, or RNA-based viral vectors, which may be replication competent or may only encode genes for self-amplification; the latter constructs will herein be referred to as replicons.
  • vector refers to a nucleic acid molecule capable of transporting another nucleic acid linked thereto.
  • Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art.
  • plasmid refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.
  • Another type of vector is a viral vector, wherein virally derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication-defective retroviruses, adenoviruses, replication-defective adenoviruses, adeno-associated viruses, and/or RNA-based replicons).
  • Viral vectors also include polynucleotides carried by a virus for transfection into a host cell.
  • Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., RNA vectors comprising their own RNA-dependent RNA polymerase, bacterial vectors having a bacterial origin of replication, and episomal mammalian vectors).
  • Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
  • certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.”
  • Vectors for and that result in expression in a eukaryotic cell can be referred to herein as “eukaryotic expression vectors.”
  • Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
  • Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that is operatively linked to the nucleic acid sequence to be expressed.
  • “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
  • regulatory element is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences as well as RNA elements required for recognition by self-encoded RNA dependent RNA polymerases).
  • Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990).
  • Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).
  • a tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver, pancreas), or particular cell types (e.g., lymphocytes).
  • Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific.
  • a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof.
  • pol III promoters include, but are not limited to, U6 and H1 promoters.
  • pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell, 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter.
  • enhancer elements such as WPRE; CMV enhancers; the R-U5’ segment in the LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); the SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981).
  • a vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., Bat-Associated Reverse Transcriptase (BART) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).
  • the vector system may include one or more viral vectors.
  • the one or more viral vectors comprise an adenovirus-based vector, a lentivirus-based vector, an adeno-associated virus-based vector, or an RNA-based replicon.
  • the host cell or cell line comprises one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • Prokaryotic cells usually lack a nucleus or any other membrane-bound organelles and are divided into two domains, bacteria and archaea. In addition to chromosomal DNA, these cells can also contain genetic information in a circular loop called an episome. Bacterial cells are very small, roughly the size of an animal mitochondrion. Prokaryotic cells feature three major shapes: rod-shaped, spherical, and spiral. Instead of going through elaborate replication processes like eukaryotes, bacterial cells divide by binary fission. Examples include but are not limited to Bacillus bacteria, E. coli bacterium, and Salmonella bacterium.
  • the cell can be a stem cell.
  • Suitable stem cells include without limit embryonic stem cells, ES-like stem cells, fetal stem cells, adult stem cells, pluripotent stem cells, induced pluripotent stem cells, multipotent stem cells, oligopotent stem cells, unipotent stem cells, and others.
  • the cell is a mammalian cell, or the embryo is a mammalian embryo.
  • the non-human mammal cell may include, but is not limited to, a primate, bovine, ovine, porcine, canine, rodent, or Leporidae cell, such as a monkey, cow, sheep, pig, dog, rabbit, rat, or mouse cell.
  • the plant cell may also be of an algae, tree or production plant, fruit or vegetable (e.g., trees such as citrus trees, e.g., orange, grapefruit or lemon trees; peach or nectarine trees; apple or pear trees; nut trees such as almond or walnut or pistachio trees; nightshade plants; plants of the genus Brassica; plants of the genus Lactuca; plants of the genus Spinacia; plants of the genus Capsicum; cotton, tobacco, asparagus, carrot, cabbage, broccoli, cauliflower, tomato, eggplant, pepper, lettuce, spinach, strawberry, blueberry, raspberry, blackberry, grape, coffee, cocoa, etc.).
  • this disclosure provides a gene editing system for modifying a target polynucleotide.
  • the gene editing system comprises an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • the BART systems disclosed herein or the gene editing systems comprising the disclosed BART-based systems may be delivered via liposomes, particles (e.g., nanoparticles), exosomes, microvesicles, a lipid, a cell-penetrating peptide (CPP), or a gene gun. Delivery vehicles, particles, nanoparticles, formulations, and components thereof for expression of one or more elements of the aforementioned BART systems are as used in PCT/US2013/074667.
  • the enzyme further possesses an integrase activity.
  • the one or more nucleic acid components comprise a single-stranded navigator DNA.
  • the one or more nucleic acid components further comprise a payload RNA.
  • the payload RNA comprises a stem-loop structure.
  • the enzyme reverse transcribes the payload RNA into a cDNA.
  • the enzyme integrates the cDNA into the target polynucleotide.
  • the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
  • the enzyme further comprises a C-terminal domain.
  • the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
  • the C-terminal domain comprises a zinc finger.
  • the zinc finger comprises a CCHC motif.
  • the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
  • BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
  • the gene editing system comprises one or more polynucleotide molecules encoding the enzyme. In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components. In some embodiments, the one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and the one or more nucleic acid components are provided in a single vector.
  • the target polynucleotide comprises RNA or DNA.
  • the RNA comprises a viral RNA of a RNA virus.
  • the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
  • the DNA comprises a genomic DNA or a cDNA.
  • the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
  • this disclosure provides a gene editing system comprising one or more vectors, liposomes, particles (e.g., nanoparticles, lipid nanoparticles), exosomes, or microvesicles that include one or more components of BART system, and optionally a pharmaceutically acceptable carrier.
  • this disclosure provides a vector system comprising one or more vectors, wherein the one or more vectors comprise one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and comprise one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
  • composition refers to a mixture of at least one component useful within the disclosure with other components, such as carriers, stabilizers, diluents, dispersing agents, suspending agents, thickening agents, and/or excipients.
  • the pharmaceutical composition facilitates administration of one or more components of the BART system to an organism.
  • the term “pharmaceutically acceptable” refers to a material, such as a carrier or diluent, which does not abrogate the biological activity or properties of the composition, and is relatively non-toxic, i.e., the material may be administered to an individual without causing undesirable biological effects or interacting in a deleterious manner with any of the components of the composition in which it is contained.
  • the term “pharmaceutically acceptable carrier” includes a pharmaceutically acceptable salt, pharmaceutically acceptable material, composition, or carrier, such as a liquid or solid filler, diluent, excipient, solvent, or encapsulating material, involved in carrying or transporting a compound(s) of the present invention within or to the subject such that it may perform its intended function. Typically, such compounds are carried or transported from one organ, or portion of the body, to another organ, or portion of the body. Each salt or carrier must be “acceptable” in the sense of being compatible with the other ingredients of the formulation, and not injurious to the subject.
  • materials that may serve as pharmaceutically acceptable carriers include: sugars, such as lactose, glucose, and sucrose; starches, such as corn starch and potato starch; cellulose, and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose and cellulose acetate; powdered tragacanth; malt; gelatin; talc; excipients, such as cocoa butter and suppository waxes; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; glycols, such as propylene glycol; polyols, such as glycerin, sorbitol, mannitol and polyethylene glycol; esters, such as ethyl oleate and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ring
  • “pharmaceutically acceptable carrier” also includes any and all coatings, antibacterial and antifungal agents, and absorption delaying agents, and the like that are compatible with the activity of one or more components of the invention, and are physiologically acceptable to the subject. Supplementary active compounds may also be incorporated into the compositions.
  • this disclosure provides a delivery system comprising the gene editing system, and the delivery system is adapted to deliver the gene editing system into a cell or a subject.
  • the delivery system comprises nanoparticles or vesicles encapsulating the gene editing system.
  • a “gene delivery vehicle” is defined as any molecule that can carry inserted polynucleotides into a host cell.
  • Examples of gene delivery vehicles are liposomes; micelles; biocompatible polymers, including natural polymers and synthetic polymers; lipoproteins; polypeptides; polysaccharides; lipopolysaccharides; artificial viral envelopes; metal particles; and bacteria, or viruses, such as baculovirus, adenovirus and retrovirus, bacteriophage, cosmid, plasmid, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression.
  • Stable maintenance typically requires that the introduced polynucleotide either contains an origin of replication compatible with the host cell or integrates into a replicon of the host cell, such as an extrachromosomal replicon (e.g., a plasmid) or a nuclear or mitochondrial chromosome.
  • the reaction components used can be provided in a variety of forms.
  • the components (e.g., enzymes, RNAs, probes, and/or primers) can be suspended in an aqueous solution or provided as a freeze-dried or lyophilized powder, pellet, or bead. In the latter case, the components, when reconstituted, form a complete mixture of components for use in an assay.
  • the kits of the invention can be provided at any suitable temperature. For example, for storage of kits, it is preferred that they are provided and maintained below 0°C, preferably at or below -20°C, or otherwise in a frozen state.
  • kits can also include packaging materials for holding the container or combination of containers.
  • packaging materials for such kits and systems include solid matrices (e.g., glass, plastic, paper, foil, micro-particles, and the like) that hold the reaction components or detection probes in any of a variety of configurations (e.g., in a vial, microtiter plate well, microarray, and the like).
  • the kits may further include instructions recorded in a tangible form for the use of the components.
  • a sample can be a sample of serum, urine, plasma, amniotic fluid, cerebrospinal fluid, cells, or tissue. Such a sample can be used directly as obtained from a patient or can be pre-treated, such as by filtration, distillation, extraction, concentration, centrifugation, inactivation of interfering components, addition of reagents, and the like, to modify the character of the sample in some manner as discussed herein or otherwise as is known in the art.
  • sample and biological sample as used herein generally refer to a biological material being tested for and/or suspected of containing an analyte of interest such as antibodies.
  • the sample may be any tissue sample from the subject.
  • the sample may comprise protein from the subject.
  • the terms “decrease,” “reduced,” “reduction,” or “inhibit” are all used herein generally to mean a decrease by a statistically significant amount. However, for avoidance of doubt, “reduced,” “reduction,” “decrease,” or “inhibit” means a decrease by at least 10% as compared to a reference level, for example, a decrease by at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% decrease (e.g., absent level as compared to a reference sample), or any decrease between 10-100% as compared to a reference level.
  • “modulate” is meant to refer to any change in biological state, i.e., increasing, decreasing, and the like.
  • the terms “increased,” “increase,” “enhance,” or “activate” are all used herein to generally mean an increase by a statistically significant amount; for the avoidance of any doubt, the terms “increased,” “increase,” “enhance,” or “activate” mean an increase of at least 10% as compared to a reference level, for example, an increase of at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, or at least about a 2-fold, or at least about a 3-fold, or at least about a 4-fold, or at least about a 5-fold or at least about a 10-fold increase, or any increase between 2-fold and 10-fold or greater as compared to a reference level.
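  • The 10% thresholds in the two definitions above can be made concrete with a short calculation. This is a sketch of the magnitude test only: the definitions also require statistical significance, which is not modeled here. The function names and the default threshold parameter are hypothetical.

```python
def percent_change(value: float, reference: float) -> float:
    """Signed percent change of `value` relative to `reference`."""
    if reference == 0:
        raise ValueError("reference level must be non-zero")
    return (value - reference) * 100.0 / reference


def is_decreased(value: float, reference: float, threshold: float = 10.0) -> bool:
    """True if `value` is at least `threshold` percent below `reference`."""
    return percent_change(value, reference) <= -threshold


def is_increased(value: float, reference: float, threshold: float = 10.0) -> bool:
    """True if `value` is at least `threshold` percent above `reference`."""
    return percent_change(value, reference) >= threshold


def fold_change(value: float, reference: float) -> float:
    """Ratio of `value` to `reference` (e.g., 2.0 means a 2-fold level)."""
    if reference == 0:
        raise ValueError("reference level must be non-zero")
    return value / reference


print(percent_change(80, 100))   # a level of 80 against a reference of 100
print(is_decreased(95, 100))     # a 5% drop does not meet the 10% threshold
print(is_increased(250, 100))    # a 150% rise meets the threshold
print(fold_change(250, 100))     # the same rise expressed as a fold change
```

Percent change and fold change describe the same comparison on different scales, so a 100% increase corresponds to a 2-fold level relative to the reference.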
  • in vitro refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, etc., rather than within a multi-cellular organism.
  • in vivo refers to events that occur within a multi-cellular organism, such as a non-human animal.
  • each when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.
  • Bat iPSCs from Rhinolophus ferrumequinum were cultured as described previously (Dejosez et al., 2023, Cell 186, 957-974). Cells were plated on irradiated MEFs and maintained using DMEM/F12 (Gibco, 11330-032), 20% KOSR (Life Technologies, 10828-028), 0.1 mM NEAA (Gibco, 11140-050), 2 mM GlutaMAX (Gibco, 35050-061), 10 U/ml and 10 µg/ml of Pen/Strep (Gibco, 15140122) respectively, 100 ng/ml FGF2 (R&D Systems, 233-FB), 100 ng/ml hSCF (StemCell Technologies, 78062.2), 10^4 U/ml mLIF (Millipore-Sigma, ESG1107), 20 nM Forskolin (Sigma, F6886), and 100 µM 2-mercaptoethanol.
  • BEFs Bat Embryonic Fibroblasts
  • Human ES cells (H9) were cultured on Vitronectin XF (Stemcell Technologies, 100-0763) using mTeSR1 (Stemcell Technologies, 85850).
  • Immunostaining was performed on µ-slides (Ibidi, 80286), where cell lines were cultured until the day of fixation. After washing the cells once with DPBS, they were fixed with Cytofix/Cytoperm solution (BD, BDB554714) for 20 min at 4 °C.
  • Plasmids containing BARTs were synthesized by GenScript, using a backbone of pcDNA3.1(+) and a His-tag in the C-terminal region, to allow protein purification. Their 3D structures were predicted by ColabFold (ColabFold v1.5.2-patch: AlphaFold2 using MMseqs2). They were transfected into H9 hES cells using Lipofectamine 3000 (ThermoFisher, L3000001) following the manufacturer’s protocol. Selection was performed 48 h after transfection using increasing concentrations of Geneticin (ThermoFisher, 10131027), with a final concentration of 150 ng/ml.
  • Cells were infected using vesicular stomatitis virus carrying an eGFP reporter at MOI 0.01 for 24 hours. The supernatant was collected by discarding floating cells and spun down at 300 × g for 5 min.
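As an illustration of what an MOI of 0.01 implies, the sketch below computes the number of infectious units and the inoculum volume for a given culture. The cell count and virus titer in the usage note are hypothetical values, not taken from the disclosure, and the Poisson estimate of the infected fraction is a standard simplifying assumption rather than a reported measurement.

```python
import math


def infectious_units_needed(moi, cell_count):
    """Infectious units (e.g., PFU) required to reach a given MOI."""
    if moi < 0 or cell_count < 0:
        raise ValueError("moi and cell_count must be non-negative")
    return moi * cell_count


def inoculum_volume_ml(moi, cell_count, titer_pfu_per_ml):
    """Volume of virus stock (mL) that delivers the required infectious units."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return infectious_units_needed(moi, cell_count) / titer_pfu_per_ml


def expected_fraction_infected(moi):
    """Poisson estimate of the fraction of cells receiving at least one particle."""
    return 1.0 - math.exp(-moi)
```

With a hypothetical 1 x 10^6 cells and a hypothetical stock of 1 x 10^8 PFU/mL, an MOI of 0.01 corresponds to about 10^4 PFU, or roughly 0.1 µL of stock, and about 1% of cells are expected to be infected in the first round.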
  • BART.0 and BART.1 were purified using a Ni-NTA Spin Kit (Qiagen, 31314), and 2 µg of each protein was used per reaction. In combination with 122 bp oligos (Sigma), 100 ng of dsDNA template was added, and genes B2M and CXCR4 were amplified by using primers in Table 3.
  • 122 bp ssDNA guides were transfected in BART-containing H9 cell lines using Lipofectamine 3000. After 24 hours, genomic DNA was extracted using a Blood & Cell Culture DNA Kit (Qiagen, 13323), and the potential mutated sites were amplified using GoTaq Master Mix (Promega, M7123). Sequences were purified using a PCR Purification Kit (Qiagen, 28104) and subjected to Sanger sequencing (GeneWiz).
  • RNA viruses in their capacity to infect mammals (e.g., SARS-CoV-2)
  • a critical feature of any mammalian-centric CRISPR-Cas-like system would be the ability to convert viral genomic RNA into DNA for subsequent processing in the nucleus. Consequently, the presence of reverse transcriptase activity, an enzymatic activity previously unreported to be encoded by any mammalian genome, was investigated.
  • a distinctive characteristic of bat-induced pluripotent stem cell (iPSC) colonies was observed - they exhibited a compact and uniform appearance, and their cytoplasm was populated with minuscule vesicles not observed in iPSCs from other mammalian species.
  • vesicles show a uniform distribution in the cytosol of the cell, and are of lipidic nature.
  • the iPS cells were cultured in lipid-deprived medium E8 and E8 with AlbuMAX for 3 days to understand how the vesicles only persisted in the presence of lipids in the medium.
  • These vesicles were also absent in bat embryonic fibroblasts cultured in a serum-containing medium.
  • vesicles had a varied content, from viral particles to autophagosome systems.
  • immunostaining was performed using antibodies against a well-known reverse transcriptase (RT) found in the HIV genome (HIV-RT-p51/p66). It was found that a large majority of vesicles exhibited strong positive staining, indicating the presence of a reverse transcriptase enzyme. Subsequently, whether these vesicles might also contain single-stranded DNA (ssDNA), the product of reverse transcription, was investigated. Immunostaining with an antibody specifically detecting ssDNA revealed substantial amounts of ssDNA within the cytoplasmic vesicles.
  • ssDNA single-stranded DNA
  • ssDNA generated in the bat iPSCs it was sought to determine its sequence.
  • the approach to this task was two-fold: first, the cytoplasmic DNA was isolated, and libraries of these single-stranded DNAs were prepared for subsequent next-generation sequencing analysis. Specifically, a mild hypotonic solution was used to extract the cytoplasmic content without bursting the nucleus. Next, the SMART (Switching Mechanism at 5’ End of RNA Template) technique was applied to generate cDNA from the extracted single-stranded DNA, which effectively avoided traces of RNA (not sure how exactly this worked when making the library).
  • SMART Switching Mechanism at 5’ End of RNA Template
  • the SMART libraries were subjected to next-generation sequencing, and the sequence reads were then mapped to the R. ferrumequinum bat genome.
  • the resulting data provided several unexpected insights into the characteristics of the ssDNA sequences. Remarkably, a large portion of the sequences could not be mapped to the bat genome. From the total of 116,550,396 sequence reads, 111,728,741 (95.86%) did not align with any known bat sequence. Subsequent attempts to map these sequences using BLAST searches against all publicly available databases failed, indicating these sequences were unlike any known genetic material.
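The read counts reported above can be checked with a short calculation. The sketch uses only the two counts stated in the text; the variable and function names are illustrative.

```python
TOTAL_READS = 116_550_396      # total sequence reads reported
UNMAPPED_READS = 111_728_741   # reads that did not align to the bat genome


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part * 100.0 / whole


mapped_reads = TOTAL_READS - UNMAPPED_READS
unmapped_pct = percent(UNMAPPED_READS, TOTAL_READS)
mapped_pct = percent(mapped_reads, TOTAL_READS)

print(f"mapped reads:   {mapped_reads:,} ({mapped_pct:.2f}%)")
print(f"unmapped reads: {UNMAPPED_READS:,} ({unmapped_pct:.2f}%)")
```

The calculation reproduces the stated 95.86% unmapped fraction and shows that about 4.8 million reads (roughly 4.14%) did align.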
  • a pivotal feature of the CRISPR-Cas defense mechanism is the spatial proximity of the defense array to the gene encoding the enzyme responsible for the system (in cis arrangement).
  • ORF open reading frame
  • BLAST search was performed. Although no known gene entries matched the sequence, the overall structure displayed resemblance to other proteins known to exhibit reverse transcriptase activity. This observation again draws parallels to the CRISPR-Cas system. This enzyme is found in proximity to the interspersed viral sequences located within heterochromatic islands of the bat genome, underscoring the potential relevance of this newly discovered system.
  • these enzymatic features are consistent with the observations, which include a specific reverse transcriptase activity for mRNA-to-cDNA conversion, a nuclease function for creating nicks in the DNA, and an integration activity for incorporating new sequences into the genome (AlphaFold 3D structure).
  • based on sequence homology, the enzyme presents a robust mechanistic framework that can account for the unique genomic events that have been uncovered.
  • BART BART-based system
  • the BART-expressing cell lines were infected with VSV at a low MOI (0.01).
  • MOI 0.01
  • the presence of viral cDNA in the cytoplasm was confirmed. Not only was it present, but it was also processed and chopped into shorter fragments.
  • This observation indicates that the BART system was operational and active against viral threats. But the most surprising revelation lay in the fate of these short fragments. As shown by genomic PCR and sequencing, they were not aimlessly floating in the cytoplasm. Instead, they had been integrated into interspersed arrays in similar islands, as observed for more ancient genomic sequences.
  • ssDNA oligonucleotides were designed to target specific sites within the genome.
  • 122 bp oligonucleotides were used, targeting the genes (e.g., B2M, CXCR4, VEGFA, CA2, KRAS, DYRK1A, HPRT1, and DMD) described in Saito M., et al. (Saito M., et al. Nature 620, 660-668 (2023)), since they are proto-oncogenes involved in the immune system, some of which have antiviral capacities, and important for development.
  • human cells were transfected with these oligonucleotides in concert with BART.
  • ssDNA sequence present in the cytoplasm revealed that the ssDNA was primarily derived from retroviral and transposon sequences. These sequences primarily aligned to unplaced scaffolds in the bat genome, genomic islands often embedded in highly repetitive heterochromatin. This unanticipated finding highlighted the existence of specific genomic arrays harboring viral sequences, analogous to bacterial CRISPR arrays. Furthermore, a unique pattern of ssDNA insertion into these genomic islands, resembling the precision and functionality of the CRISPR-Cas system, was identified.
  • the interspersed ssDNA arrays were coupled with an open reading frame (ORF) encoding a reverse transcriptase, which is in close proximity to these arrays — again, reminiscent of CRISPR-Cas.
  • VSV Vesicular Stomatitis Virus
  • the BART protein exhibits a multi-domain architecture that underpins its diverse functionalities.
  • the protein’s core houses a reverse transcriptase (RT) domain, essential for converting RNA into DNA, a hallmark of retroviral activity.
  • RT reverse transcriptase
  • BART also harbors an endonuclease domain, capable of cleaving DNA at specific sites. This endonuclease activity facilitates the integration of newly synthesized DNA into the host genome.
  • RNA sequences or structural motifs The presence of a zinc finger domain within the BART protein indicates its potential to interact directly with specific RNA sequences or structural motifs. Additionally, a stem-loop forming RNA was identified that binds to BART, acting as a guide or scaffold to facilitate the recognition and binding of viral RNA.
  • the target sequences preferred by BART in the genome appear to require double-stranded DNA adjacent to single-stranded DNA during the cell cycle, indicating a cell cycle-dependent aspect to the targeting mechanism.
  • the interplay between these elements - the zinc finger domain, the stem-loop RNA, and the specific DNA context - contributes to BART’s ability to selectively recognize and target viral RNA, enabling its antiviral and geneediting functions.
  • BART integrates viral cDNA into the host genome involves a series of orchestrated steps that echo the CRISPR-Cas system in prokaryotes.
  • the journey begins when BART encounters viral RNA. Leveraging its reverse transcriptase activity, BART meticulously transcribes the RNA into complementary DNA (cDNA).
  • cDNA complementary DNA
  • This cDNA then undergoes a processing step, yielding short fragments of approximately 122 base pairs. The precision of this processing ensures that only specific segments of the viral genome are earmarked for integration, a feature that contributes to the system’s specificity.
  • the processed cDNA fragments are then seamlessly woven into the host genome at precise locations within unplaced scaffolds, typically nestled within heterochromatin-rich regions.
  • the integration sites exhibit a distinctive pattern of interspersed arrays, bearing a striking resemblance to the CRISPR arrays observed in bacteria.
  • the culmination of this process is the formation of cDNA arrays within the host genome, serving as a ‘genomic memory’ of past viral encounters.
  • the integrated cDNA is subsequently transcribed into RNA, which then guides BART to target and cleave homologous viral RNA during subsequent infections.
  • This intricate mechanism not only confers a form of adaptive immunity but also lays the foundation for BART’s use as a powerful gene-editing tool.
  • BART gene-editing capabilities are centered on its ability to harness reverse transcription, converting RNA into DNA and integrating this newly synthesized DNA into the host genome.
  • BART uses RNA as a template, offering a unique approach to gene manipulation. This RNA-centric method expands the possibilities for gene editing, allowing for the insertion of diverse genetic modifications and the modulation of gene expression at both DNA and RNA levels.
  • BART’s dual functionality — targeting both genomic DNA and mRNA transcripts — provides a powerful tool for gene correction and transient gene regulation, expanding the scope of therapeutic interventions.
  • Data indicate that BART achieves higher on-target editing rates with fewer off-target effects than CRISPR-Cas9, positioning it as a next-generation gene-editing technology.
  • BART as a versatile and highly effective gene-editing tool, with the capability to introduce a wide range of genetic modifications, including insertions, deletions, and substitutions, at both the DNA and RNA levels.
  • This versatility combined with its efficiency and specificity, underscores BART’s use for therapeutic applications, particularly in the correction of disease-causing mutations in human cells.
  • BART classification within the L1 clade of non-LTR retrotransposons is strongly supported by the structural and phylogenetic characteristics of its RT domain.
  • the application of advanced phylogenetic tools and comprehensive sequence analysis has validated BART’s placement within this ancient clade, offering valuable insights into its evolutionary history.
  • This classification also indicates functional parallels between BART and other well-characterized RT elements within the L1 clade, highlighting its significance in the broader context of genomic evolution.
  • the structural prediction of the BART protein was conducted using the AlphaFold Protein Structure Database, a deep learning-based tool developed by DeepMind. The process began with the retrieval of the BART amino acid sequence from the sequencing data. The BART sequence was compared against a comprehensive database of known protein structures and sequences, including those from related reverse transcriptase (RT) domains (Figure 3).
  • AlphaFold generated a three-dimensional (3D) structural model of BART by predicting the spatial arrangement of amino acids within the protein, focusing on the highly conserved RT domain.
  • the model provided by AlphaFold was accompanied by a confidence score for each residue, indicating the reliability of the predicted positions within the structure.
  • the structural model was refined by energy minimization to resolve any steric clashes or unfavorable interactions predicted by AlphaFold.
  • the resulting model was visualized and analyzed using PyMOL, a molecular visualization system, where key structural features such as the fingers, palm, and thumb subdomains were identified. The conserved residues within the active site were mapped, and the overall structural integrity of the BART RT domain was assessed.
  • F605 is a highly conserved residue within the RT domain of BART and plays an important role in its enzymatic function. Positioned within the active site, F605 serves as a gatekeeper, with its aromatic side chain providing a structural barrier that selectively excludes ribonucleotides from entering the active site. This exclusion is important for ensuring that BART functions as an RNA-dependent DNA polymerase, exclusively synthesizing DNA from RNA templates. The presence of F605 effectively prevents RNA-dependent RNA polymerization, a process that could interfere with the fidelity of reverse transcription.
  • BART apurinic-apyrimidinic endonuclease
  • Figure 3 The N-terminal region of BART contains an apurinic-apyrimidinic endonuclease (APE) domain, which is important for the DNA cleavage necessary for integration.
  • APE apurinic-apyrimidinic endonuclease
  • BART’s APE domain also includes a DNase I-like motif, a feature absent in closely related retrotransposons, suggesting an evolutionary refinement of BART’s DNA cleavage mechanism. This refinement could contribute to BART’s enhanced specificity and efficiency in gene editing applications, distinguishing it from other elements within the non-LTR retrotransposon family.
  • the BART protein features a distinctive “tower” region spanning amino acids 240-440.
  • This region includes several subdomains: a baseplate (residues 254-300), tower helices (301-370), a tower lock (374-382), and a PIP box (404-419).
  • the presence of the region indicates potential roles in protein-protein interactions, RNA binding, and the regulation of BART’s activity.
  • the tower region may contribute to the stability and assembly of the BART complex, facilitating its interaction with target DNA and RNA molecules.
  • the PIP box which is known to mediate interactions with proliferating cell nuclear antigen (PCNA) — a key player in DNA replication and repair — implies a possible link between BART’s activity and the host cell cycle. This connection further underscores the intricate integration of BART with cellular processes, thereby enhancing its efficiency in gene editing and antiviral defense.
  • PCNA proliferating cell nuclear antigen
  • CTD C-terminal domain
  • the C-terminal domain (CTD) of BART while retaining some structural features from its retrotransposon ancestors, exhibits distinct adaptations that underscore its evolutionary divergence and specialized role in the host cell.
  • the “wrist” region located immediately downstream of the reverse transcriptase (RT) domain and spanning amino acids 863 to 1061, remains relatively conserved when compared to L1 retrotransposons. This conservation suggests that the wrist region continues to play an important role in BART’s function, possibly facilitating interactions with target DNA or other cellular factors during the integration process. The preservation of this region highlights the evolutionary constraints imposed on this structural element, indicating its role in both retrotransposons and their domesticated counterparts like BART.
  • the CTD of BART extending from amino acid 1062 to 1275, exhibits significant structural divergence from its retrotransposon ancestors, most notably in the absence of the RNase H domain.
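The residue ranges given for the tower region, its subdomains, the wrist, and the CTD can be collected into a simple lookup table. The sketch below is illustrative only: it assumes the stated ranges are inclusive, 1-based amino acid positions, and the dictionary and function names are introduced here. Regions whose coordinates are not stated in the text (for example, the RT and APE domains) are deliberately omitted.

```python
# Residue ranges as stated in the text, assumed inclusive and 1-based.
BART_REGIONS = {
    "tower": (240, 440),
    "baseplate": (254, 300),
    "tower helices": (301, 370),
    "tower lock": (374, 382),
    "PIP box": (404, 419),
    "wrist": (863, 1061),
    "CTD": (1062, 1275),
}


def regions_at(residue):
    """Names of all annotated regions that contain the given residue position."""
    return [
        name
        for name, (start, end) in BART_REGIONS.items()
        if start <= residue <= end
    ]


def region_length(name):
    """Number of residues in a named region (inclusive range)."""
    start, end = BART_REGIONS[name]
    return end - start + 1
```

For example, `regions_at(410)` returns both the tower and the PIP box, because the PIP box lies inside the tower region, while `regions_at(605)` returns an empty list because no coordinates are listed here for the RT domain that contains F605.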
  • the RNase H domain is integral to the replication process, playing an important role by degrading the RNA template following reverse transcription, thereby facilitating the synthesis of the complementary DNA strand.
  • the absence of this domain in BART suggests a profound departure from the conventional retroelement lifecycle, indicating that BART may no longer rely on the classical mechanism of RNA template degradation.
  • the CTD of BART exhibits notable modifications in key amino acid residues when compared to L1. While the CCHC motif is conserved, other residues that are crucial for the catalytic activity of the L1 endonuclease domain are absent or altered in BART. These modifications further support the idea that BART’s CTD has undergone adaptive changes, potentially fine-tuning its function for its new role within the host cell. The absence of certain catalytic residues may indicate a shift away from traditional endonuclease activity, suggesting that BART has evolved to fulfill alternative functions in DNA repair, gene regulation, or targeted integration.
  • the CTD of BART also exhibits significant truncation when compared to its L1 retrotransposon ancestor, resulting in the loss of several alpha-helices and key amino acid residues that are essential for retrotransposition.
  • This truncation is clearly visible in structural analyses, revealing a “stump” that effectively terminates at the zinc finger domain.
  • the absence of these structural elements and crucial residues significantly impairs BART’s ability to autonomously mobilize and replicate, a defining characteristic of active retrotransposons.
  • the loss of these retrotranspositional capabilities aligns with BART’s evolutionary shift toward domestication, where it has been repurposed for new cellular roles. In this context, the ability to move within the genome could be detrimental, potentially leading to genomic instability or harmful mutations.
  • This evolutionary refinement illustrates how BART has shed redundant or potentially harmful functions while acquiring adaptations that enhance its utility in the host genome.
  • BART.0 and BART.1 Two variants of the BART enzyme, designated BART.0 and BART.1, were cloned into a human expression vector and subsequently stably expressed in human cell lines. The expression of these bat-derived enzymes in human cells was closely monitored to assess any potential impact on cell viability and proliferation. Remarkably, the introduction and sustained expression of BART.0 and BART.1 did not adversely affect the growth or viability of the human cells, indicating that the enzymes are well-tolerated within a human cellular environment.
  • the antiviral potential of the BART system was tested using human cell lines engineered to stably express two variants of the BART enzyme, BART.0 and BART.1. Strikingly, overexpression of BART conferred partial immunity against Vesicular Stomatitis Virus (VSV) infection, underscoring its use as a novel antiviral tool. Following infection with VSV, BART-expressing cells exhibited a marked reduction in viral load, indicating that BART actively participates in the cellular defense against viral pathogens (Figure 2).
  • VSV Vesicular Stomatitis Virus
  • viral cDNA was detected in the cytoplasm of BART-expressing cells post-infection, indicating that BART facilitates the reverse transcription of viral RNA into cDNA.
  • This viral cDNA was subsequently processed into shorter fragments, which were then integrated into the host genome in an interspersed array pattern reminiscent of endogenous retroelements. This pattern of integration indicates that BART not only inhibits viral replication but also incorporates viral sequences into the host genome as a form of genomic memory.
  • BART acts as a defense mechanism against viral infections, providing both immediate protection and a form of genomic memory that could confer long-term immunity.
  • BART gene-editing potential was evaluated by introducing it into human cells along with specifically designed single-stranded DNA (ssDNA) oligonucleotides and an RNA payload. These oligonucleotides acted as “navigators,” directing BART to specific genomic loci, while the RNA payload provided the template for precise genetic modifications.
  • the system’s efficacy and specificity were assessed through sequencing, which revealed successful on-target editing without the need for a CRISPR-Cas system or its components.
  • the process began by transfecting HEK293 cells, cultured in DMEM supplemented with 10% FBS, 1% Penicillin-Streptomycin, and 2 mM L-glutamine, with a BART-encoding plasmid using Lipofectamine 3000.
  • Cells were transfected at 70-80% confluency, with the plasmid complexed with the reagent according to the manufacturer’s protocol.
  • cells were incubated for 24-48 hours at 37°C to allow for protein expression, which was monitored using the eGFP reporter gene. Afterward, cells were grown for an additional 48-72 hours to maximize BART expression.
  • the cells were then harvested by trypsin-EDTA treatment and centrifugation, followed by washing the cell pellet with phosphate-buffered saline (PBS) to remove residual media and debris.
  • PBS phosphate-buffered saline
  • the BART protein was isolated from the cell lysate using affinity chromatography, leveraging a His-tag incorporated in the BART expression construct.
  • the lysate was passed through a nickel-NTA agarose column, and after washing away unbound proteins, BART was eluted with a buffer containing 250 mM imidazole.
  • the purified BART protein was then assessed for concentration and purity via SDS-PAGE and Western blotting.
  • in vitro assay targeting specific sequences within the HPRT and HSF1 loci was conducted using two distinct navigator DNAs.
  • the target sequences and corresponding navigator sequences are detailed in Tables 4 and 5.
  • the experiment began by preparing navigator-target DNA complexes. Navigator DNA and target DNA were combined in equimolar amounts (typically 100 nM each) in a reaction buffer containing 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, and 1 mM EDTA. The mixture was heated to 95°C for 5 minutes to denature the DNA, followed by a gradual cooling to room temperature over 30 minutes to facilitate proper hybridization between the navigator and target DNA strands.
  • the reaction was assembled by adding 1 µL of Navigator DNA (100 nM final concentration), 1 µL of Target DNA, 0.5 µL of BART enzyme, and 1 µL of 10X BART Reaction Buffer (100 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, pH 7.9) to the annealed DNA mixture, with nuclease-free water added to bring the final volume to 10 µL.
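The reaction assembly above amounts to a dilution calculation, sketched below. The sketch assumes that the four listed components are the only volumes contributing to the final reaction (the volume of the annealed DNA mixture is not stated separately), and all names are illustrative.

```python
# Component volumes in microliters, as listed for the cleavage reaction.
COMPONENT_VOLUMES_UL = {
    "navigator DNA": 1.0,
    "target DNA": 1.0,
    "BART enzyme": 0.5,
    "10X BART Reaction Buffer": 1.0,
}
FINAL_VOLUME_UL = 10.0

# Composition of the 10X buffer stock in mM, as listed.
BUFFER_10X_MM = {"Tris-HCl": 100.0, "NaCl": 500.0, "MgCl2": 10.0}


def water_volume_ul(component_volumes, final_volume):
    """Nuclease-free water needed to reach the final reaction volume."""
    used = sum(component_volumes.values())
    if used > final_volume:
        raise ValueError("components exceed the final volume")
    return final_volume - used


def diluted_concentration(stock_conc, stock_volume, final_volume):
    """Final concentration after dilution (C1 * V1 = C2 * V2)."""
    return stock_conc * stock_volume / final_volume


def required_stock_concentration(final_conc, stock_volume, final_volume):
    """Stock concentration needed to reach `final_conc` after dilution."""
    return final_conc * final_volume / stock_volume


water = water_volume_ul(COMPONENT_VOLUMES_UL, FINAL_VOLUME_UL)
buffer_final_mm = {
    name: diluted_concentration(conc, 1.0, FINAL_VOLUME_UL)
    for name, conc in BUFFER_10X_MM.items()
}
navigator_stock_nm = required_stock_concentration(100.0, 1.0, FINAL_VOLUME_UL)
```

Under these assumptions, 6.5 µL of water completes the reaction, the buffer is at 1X (10 mM Tris-HCl, 50 mM NaCl, 1 mM MgCl2), and a 1 µM navigator DNA stock gives the stated 100 nM final concentration.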
  • the reaction mixture was incubated at 37°C for 30 minutes to allow BART to mediate the cleavage at the target sites.
  • 1 µL of strand-displacing polymerase was added to extend the 3’ end of the nicked strand, followed by the addition of flap endonuclease 1 (FEN1) to create a double-stranded break at the target site.
  • FEN1 flap endonuclease 1
  • Table 4 Target Sequences for In vitro Cleavage Assay. Summary of the specific DNA sequences within the HPRT and HSF1 loci used as targets for the BART-mediated cleavage assay including 250 nucleotides upstream and downstream of the cleavage site (underlined).
  • Table 5 Navigator DNA Sequences for In vitro Cleavage Assay. List of navigator DNA sequences designed to hybridize with the target sequences and direct BART activity.
  • navDNAs navigator DNAs
  • PBS primer-binding site
  • plRNA payload RNA
  • the PBS typically ranging from 13-17 nucleotides in length, was designed to anneal upstream of the editing site, facilitating the initiation of reverse transcription by the BART enzyme.
  • the plRNA detailed in Table 7, was constructed to include a short disruptive sequence, followed by an insertion sequence encoding the enhanced green fluorescent protein (EGFP) gene. This design aimed to ensure that the BART enzyme accurately targeted and modified the desired loci.
  • EGFP enhanced green fluorescent protein
  • the navDNAs were synthesized using GenScript’s biosynthesis platform to maintain high fidelity and prevent any sequence errors that could compromise the editing process.
  • GenScript GenScript
  • This strategic design of the navDNAs and plRNAs was intended to allow BART to effectively and accurately modify the target sequences, facilitating successful gene editing outcomes.
  • Table 6 Targeting Sequences for navDNAs. Summary of the 20-nucleotide targeting sequences selected for the HPRT1 and HSF1 loci, including the specific BART target sites.
  • Table 7 plRNA Sequences for Gene Editing. Details of the plRNA sequences designed for the BART editing process, including the disruptive and EGFP insertion sequences.
  • the next step involved cloning the BART enzyme coding sequences into a plasmid vector.
  • the BART enzyme was placed under the control of the cytomegalovirus (CMV) promoter, which is known for its strong and constitutive expression in mammalian cells, to ensure robust production of the enzyme.
  • CMV cytomegalovirus
  • the plasmid included a neomycin resistance gene to enable the selection of successfully transfected cells.
  • the cloning process began with the digestion of both the vector and insert DNA using EcoRI and HindIII restriction enzymes in NEBuffer 2 (10 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, pH 7.9) at 37°C for 1 hour.
  • Transformed cells were plated on LB agar plates containing 50 µg/mL neomycin and incubated overnight at 37°C. Positive clones were selected, and plasmid DNA was extracted using a Qiagen Plasmid Plus Midi Kit. The correct insertion of BART sequences was then verified through Sanger sequencing (see Figure 4 for vector maps).
  • HEK293T cells a human embryonic kidney cell line commonly used for transfection experiments due to their high transfection efficiency and robust growth, were selected for this study.
  • the cells were seeded at a density of 2.5 × 10^5 cells per well in a 6-well plate containing 2 mL of Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS), 1% Penicillin-Streptomycin, and 2 mM L-glutamine.
  • DMEM Dulbecco’s Modified Eagle Medium
  • FBS fetal bovine serum
  • Penicillin-Streptomycin 1% Penicillin-Streptomycin
  • 2 mM L-glutamine 2 mM L-glutamine
  • the BART plasmid DNA (1-2 µg per well) and navDNA were diluted in 250 µL of Opti-MEM medium, a reduced-serum medium that enhances transfection efficiency.
  • 2.5 µL of Lipofectamine 3000 reagent was added to the diluted DNA.
  • the DNA and Lipofectamine 3000 mixture was gently mixed and incubated at room temperature for 15-20 minutes to allow the formation of DNA-lipid complexes, which are essential for the efficient delivery of DNA into the cells.
  • the HEK293T cells were pre-washed with 1 mL of sterile phosphate-buffered saline (PBS) to remove any residual growth medium that could interfere with the transfection process. After washing, 2 mL of fresh Opti-MEM medium was added to each well. The DNA-lipid complexes were then carefully added dropwise to the cells, ensuring even distribution across the well. The cells were gently rocked back and forth to facilitate uniform exposure to the transfection complexes.
  • PBS sterile phosphate-buffered saline
  • the cells were incubated at 37°C in a humidified incubator with 5% CO2 for 4-6 hours to allow for optimal uptake of the plasmid DNA. This incubation period is important as it provides sufficient time for the DNA-lipid complexes to fuse with the cell membrane, allowing the plasmid DNA to enter the cells. Following the initial transfection period, the medium containing the transfection reagent was carefully aspirated and replaced with 2 mL of fresh DMEM supplemented with 10% FBS, 1% Penicillin-Streptomycin, and 2 mM L-glutamine. The cells were then incubated for an additional 48-72 hours to allow for optimal expression of the BART enzyme and navDNAs.
  • Neomycin was added to the culture medium 24 hours post-transfection. Neomycin selection was maintained for 5-7 days, during which the cells were monitored daily. Dead cells were removed by gently washing with PBS, and the medium was replaced with fresh neomycin-containing DMEM every 2-3 days to ensure continuous selection pressure. The surviving cells were expanded for further analysis. The efficiency of the transfection and selection process was later confirmed through fluorescence microscopy (for eGFP expression) and PCR analysis of genomic DNA extracted from the selected cell populations, verifying the presence and expression of the BART and navDNA constructs.
  • genomic DNA was extracted from the transfected HEK293T cells using a Qiagen DNeasy Blood & Tissue Kit, following the manufacturer’s protocol.
  • the cells were first harvested by trypsinization, and the cell pellet was washed with phosphate-buffered saline (PBS) to remove any remaining culture medium.
  • Genomic DNA was then extracted by lysing the cells with Proteinase K and a lysis buffer provided in the kit, followed by binding the DNA to a silica membrane in a spin column. The membrane was washed with a series of ethanol-containing buffers to remove contaminants, and the DNA was eluted in nuclease-free water.
  • the concentration and purity of the extracted DNA were measured using a NanoDrop spectrophotometer, with an A260/A280 ratio of approximately 1.8 indicating high-quality DNA suitable for downstream applications.
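The spectrophotometric check described above can be sketched as follows. The conversion factor of 50 ng/µL per A260 unit is the standard value for double-stranded DNA at a 1 cm path length; the acceptance window of 1.7 to 2.0 around the stated ratio of approximately 1.8 is an assumption chosen for illustration, as are the function names.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard factor for dsDNA, 1 cm path length


def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor > 0")
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio used as an indicator of protein contamination."""
    if a280 <= 0:
        raise ValueError("a280 must be positive")
    return a260 / a280


def is_acceptable_purity(ratio, low=1.7, high=2.0):
    """True if the ratio falls inside the (assumed) acceptance window."""
    return low <= ratio <= high
```

For instance, readings of A260 = 0.9 and A280 = 0.5 give a ratio of 1.8 and an estimated concentration of 45 ng/µL for an undiluted sample.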
  • PCR primers were designed to flank the regions targeted for editing within the HPRT1 and HSF1 loci (see Table 8 for primer sequences). These primers were synthesized to amplify both the unmodified and modified sequences, allowing for the detection of successful gene editing events.
  • the PCR reaction was set up in a 50 µL volume containing 1X ThermoPol Reaction Buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8), 200 µM of each dNTP, 0.2 µM of each primer, 1.25 U of Taq DNA polymerase, and 100 ng of template DNA.
  • the PCR cycling conditions included an initial denaturation step at 95°C for 3 minutes to fully denature the DNA, followed by 35 cycles of 95°C for 30 seconds, 58°C for 30 seconds (for primer annealing), and 72°C for 1 minute (for DNA extension). A final extension step at 72°C for 5 minutes ensured complete synthesis of all PCR products.
  • the resulting PCR products were analyzed by electrophoresis on a 1.5% agarose gel containing 0.5 µg/mL ethidium bromide in 1X TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 8.3). The gel was run at 100 V for 45 minutes to separate the DNA fragments by size. Visualization was performed using a UV transilluminator, where successful insertion of the EGFP sequence was indicated by the presence of a PCR product approximately 720 bp larger than the wild-type amplicon, corresponding to the size of the EGFP insertion. A comparison of the band sizes on the gel confirmed the presence of the desired genetic modification.
  • PCR products were purified using a Qiagen PCR Purification Kit, which involved binding the DNA fragments to a silica membrane in a spin column, washing with ethanol-based buffers, and eluting the purified DNA in nuclease-free water.
  • the purified products were then sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit on an ABI 3730 DNA Analyzer. Sequencing reactions were prepared by combining the purified PCR product, sequencing primers, and BigDye reagent in a thermal cycler, following the manufacturer’s instructions. Sequence analysis was performed using Geneious software, where the edited sequences were aligned against the wild-type reference sequence to confirm the precise integration of the EGFP gene.
  • EGFP expression in the transfected HEK293T cells was analyzed using both fluorescence microscopy and flow cytometry. After the transfection and selection period, the cells were gently washed three times with phosphate-buffered saline (PBS) to remove any residual culture medium and dead cells. The cells were then fixed with 4% paraformaldehyde in PBS for 10 minutes at room temperature to preserve cellular structures and fluorescent signals. Following fixation, the cells were washed three more times with PBS to remove any remaining paraformaldehyde.
  • PBS phosphate-buffered saline
  • VECTASHIELD HardSet Antifade Mounting Medium with DAPI, a nuclear counterstain that fluoresces blue under UV light, to allow for the visualization of cell nuclei.
  • Fluorescent images were captured using a Zeiss Axio Observer fluorescence microscope equipped with an EGFP filter set (excitation at 488 nm, emission at 530 nm) to specifically detect EGFP expression. Multiple fields of view were imaged to ensure representative sampling of the cell population. Image analysis was conducted using ImageJ software, where the presence of EGFP-positive cells was quantified to confirm successful gene editing at the targeted loci. The images were analyzed for both the intensity of EGFP fluorescence and the number of EGFP-positive cells, providing qualitative evidence of gene editing success.
  • the transfected cells were harvested by trypsinization.
  • the cells were incubated with 0.25% trypsin-EDTA at 37°C for 2-3 minutes to detach them from the culture plate, followed by neutralization with DMEM containing 10% FBS.
  • the cell suspension was then passed through a 40 µm cell strainer to obtain a single-cell suspension, which was critical for accurate flow cytometry analysis.
  • the cells were resuspended in 500 µL of ice-cold PBS to maintain cell viability and integrity during analysis.
  • the prepared cell suspension was analyzed using a BD FACSCanto II flow cytometer.
  • EGFP expression was detected by exciting the cells at 488 nm and measuring the emission at 530 nm. Data acquisition was set to collect at least 10,000 events per sample to ensure statistically significant results.
  • the percentage of EGFP-positive cells was determined using FlowJo software, which allowed for gating of the cell population and precise quantification of EGFP expression. This analysis provided a robust indication of the efficiency of the Prime Editing process, with a high percentage of EGFP-expressing cells confirming the successful introduction of the desired genetic modifications by the BART enzyme.
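
The arithmetic implied by the PCR and flow cytometry steps above can be sketched in a few lines of Python. This is a minimal illustration only, not part of the disclosed protocol: the function names, the wild-type amplicon size, and the event counts are hypothetical placeholders, and the cycling time counts programmed hold times only (thermal ramping between steps is ignored).

```python
# Worked arithmetic for the PCR and flow cytometry steps described above.
# Function names and all sample inputs are hypothetical illustrations.

def pcr_hold_time_seconds(initial_denaturation_s, cycles, step_durations_s,
                          final_extension_s):
    """Sum of programmed hold times; ignores ramping between temperatures."""
    return initial_denaturation_s + cycles * sum(step_durations_s) + final_extension_s


def expected_edited_amplicon_bp(wild_type_bp, insert_bp=720):
    """Edited amplicon size: wild-type size plus the ~720 bp EGFP insertion."""
    return wild_type_bp + insert_bp


def percent_positive(positive_events, total_events):
    """Percentage of gated events scored as EGFP-positive."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100.0 * positive_events / total_events


if __name__ == "__main__":
    # 95 C for 3 min; 35 cycles of 30 s + 30 s + 60 s; 72 C for 5 min.
    total_s = pcr_hold_time_seconds(3 * 60, 35, (30, 30, 60), 5 * 60)
    print(f"Programmed hold time: {total_s / 60:.0f} min")

    # Hypothetical 500 bp wild-type amplicon.
    print(f"Expected edited band: {expected_edited_amplicon_bp(500)} bp")

    # Hypothetical gate: 2,500 EGFP-positive events out of 10,000 collected.
    print(f"EGFP-positive: {percent_positive(2500, 10000):.1f}%")
```

With the stated cycling conditions, the programmed hold time comes to 78 minutes. The expected band size and the EGFP-positive percentage depend entirely on the actual amplicon and gating, which the example values do not represent.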

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Oncology (AREA)
  • Virology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Mycology (AREA)
  • Transplantation (AREA)
  • Cell Biology (AREA)
  • Communicable Diseases (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Developmental Biology & Embryology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

The Bat-Associated Reverse Transcriptase (BART), as disclosed herein, possesses a unique feature found in bat induced pluripotent stem cells (iPSCs) with the ability to convert viral genomic RNA into DNA. The disclosed BART system represents a next-generation tool for genome and RNA editing, exhibiting high efficiency and specificity, potentially outperforming current CRISPR-Cas systems. This disclosure also presents a significant stride towards addressing the most pressing health challenges, including pandemics caused by RNA viruses and gene editing for therapy.

Description

BAT-ASSOCIATED REVERSE TRANSCRIPTASE AND METHODS OF USE THEREOF
REFERENCE TO A SEQUENCE LISTING
This application contains a Sequence Listing which has been submitted electronically in xml format and is hereby incorporated by reference in its entirety. Said xml copy, created on September 17, 2024, is named “Sequence Listing_084284.00298.xml” and is 162,353 bytes in size.
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Patent Application No. 63/583,648, filed September 19, 2023. The foregoing application is incorporated by reference herein in its entirety.
FIELD OF THE INVENTION
This disclosure relates generally to a bat-associated reverse transcriptase and methods of use thereof.
BACKGROUND OF THE INVENTION
The broader context of this disclosure lies in the ongoing evolutionary arms race between organisms and viruses. This constant battle has driven the development of diverse and sophisticated antiviral defense mechanisms across all domains of life. The CRISPR-Cas system, an adaptive immune mechanism employed by prokaryotes, has emerged as a groundbreaking tool in the field of gene editing. Its capacity to precisely target and modify specific DNA sequences has revolutionized biomedical research, enabling scientists to manipulate genes with unprecedented accuracy and efficiency. The advent of CRISPR-Cas has not only accelerated the understanding of gene function but has also paved the way for innovative gene therapies, offering potential solutions for a wide range of genetic disorders.
The current gene-editing landscape, while significantly advanced by CRISPR-Cas systems, is not without its limitations. The primary concern lies in the potential for off-target effects, where the system inadvertently modifies unintended DNA sequences, leading to unforeseen consequences. Additionally, the delivery of these bacterial-derived systems into mammalian cells can be challenging, and their prolonged presence may trigger immune responses. The inherent incompatibility between prokaryotic CRISPR-Cas systems and the intricate machinery of eukaryotic cells further underscores the need for alternative gene-editing tools that are better suited to the mammalian environment. The development of such tools could potentially enhance the precision and safety of gene editing, paving the way for more effective and ethically sound therapeutic applications.
The limitations of current CRISPR-Cas systems, particularly their struggles against rapidly evolving RNA viruses and potential incompatibility with mammalian cellular machinery, highlight the pressing need for a mammalian equivalent. The discovery of such a system could not only offer a more tailored and effective approach to gene editing in humans but also unveil the unique antiviral strategies employed by mammals, potentially leading to breakthroughs in combating viral infections. Thus, there is a strong need for improved genetic editors to address the shortcomings of existing tools, enabling more precise gene editing with reduced off-target effects and paving the way for safer and more efficient gene therapies.
SUMMARY OF THE INVENTION
This disclosure addresses the need mentioned above in a number of aspects. In one aspect, this disclosure provides a method of modifying a target polynucleotide. In some embodiments, the method comprises delivering to the target polynucleotide an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In another aspect, this disclosure provides a method of modifying expression of a target polynucleotide. In some embodiments, the method comprises: introducing into a cell or a subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity or a nucleic acid molecule encoding the enzyme, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme binds to one or more locations on the target polynucleotide such that binding of the enzyme increases or decreases expression level of the target polynucleotide.
In some embodiments, the enzyme further possesses an integrase activity. In some embodiments, the one or more nucleic acid components comprise a single-stranded navigator DNA. In some embodiments, the one or more nucleic acid components further comprise a payload RNA. In some embodiments, the payload RNA comprises a stem-loop structure.
In some embodiments, the enzyme reverse transcribes the payload RNA into a cDNA. In some embodiments, the enzyme integrates the cDNA into the target polynucleotide.
In some embodiments, the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain. In some embodiments, the enzyme further comprises a C-terminal domain. In some embodiments, the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide. In some embodiments, the C-terminal domain comprises a zinc finger. In some embodiments, the zinc finger comprises a CCHC motif.
In some embodiments, the enzyme comprises a Bat-Associated Reverse Transcriptase (BART). In some embodiments, the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
In some embodiments, the enzyme is provided through one or more polynucleotide molecules encoding the enzyme. In some embodiments, the one or more nucleic acid components are provided through one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components. In some embodiments, the one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and the one or more nucleic acid components are provided in a single vector.
In some embodiments, the target polynucleotide comprises a genomic locus. In some embodiments, the target polynucleotide comprises RNA or DNA.
In some embodiments, the RNA comprises a viral RNA of a RNA virus. In some embodiments, the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses. In some embodiments, the DNA comprises a genomic DNA or a cDNA.
In some embodiments, the modification of the target polynucleotide comprises cleavage of the target polynucleotide. In some embodiments, the target polynucleotide is contained in a nucleic acid molecule within a cell or in vitro.
In some embodiments, the cell comprises a eukaryotic cell. In some embodiments, the eukaryotic cell comprises a mammalian cell. In some embodiments, the eukaryotic cell comprises a non-human animal cell, a human cell, or a plant cell.
In another aspect, this disclosure provides a method of treating or preventing a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, or a nucleic acid molecule encoding the enzyme, wherein a single-stranded guide polynucleotide hybridizes with a viral RNA of the RNA virus and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
In some embodiments, the method comprises reverse transcribing the viral RNA into a cDNA by the enzyme.
In some embodiments, the enzyme has an integrase activity, and wherein the method comprises integrating the cDNA to a genome of the cell or the subject by the enzyme.
In some embodiments, the method comprises transcribing the cDNA into the single-stranded guide polynucleotide capable of hybridizing with the viral RNA.
In another aspect, this disclosure provides a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
Also within the scope of this disclosure is a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
In another aspect, this disclosure provides a method of generating a cell line with immunity against a viral infection of a RNA virus. In some embodiments, the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
Also within the scope of this disclosure is a method of generating a cell line with immunity against a viral infection of a RNA virus. In some embodiments, the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
In some embodiments, the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
In some embodiments, the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain. In some embodiments, the enzyme further comprises a C-terminal domain. In some embodiments, the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide. In some embodiments, the C-terminal domain comprises a zinc finger. In some embodiments, the zinc finger comprises a CCHC motif.
In some embodiments, the enzyme comprises a Bat-Associated Reverse Transcriptase (BART). In some embodiments, the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
In some embodiments, the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
In yet another aspect, this disclosure provides a gene editing system for modifying a target polynucleotide. In some embodiments, the gene editing system comprises an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In some embodiments, the enzyme further possesses an integrase activity.
In some embodiments, the one or more nucleic acid components comprise a single-stranded navigator DNA.
In some embodiments, the one or more nucleic acid components further comprise a payload RNA. In some embodiments, the payload RNA comprises a stem-loop structure. In some embodiments, the enzyme reverse transcribes the payload RNA into a cDNA. In some embodiments, the enzyme integrates the cDNA into the target polynucleotide. In some embodiments, the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain. In some embodiments, the enzyme further comprises a C-terminal domain. In some embodiments, the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide. In some embodiments, the C-terminal domain comprises a zinc finger. In some embodiments, the zinc finger comprises a CCHC motif.
In some embodiments, the enzyme comprises a Bat-Associated Reverse Transcriptase (BART). In some embodiments, the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding the enzyme. In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components. In some embodiments, the one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and the one or more nucleic acid components are provided in a single vector.
In some embodiments, the target polynucleotide comprises RNA or DNA.
In some embodiments, the RNA comprises a viral RNA of a RNA virus. In some embodiments, the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
In some embodiments, the DNA comprises a genomic DNA or a cDNA.
In some embodiments, the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
In another aspect, this disclosure provides a delivery system comprising the gene editing system, and the delivery system is adapted to deliver the gene editing system into a cell or a subject. In some embodiments, the delivery system comprises nanoparticles or vesicles encapsulating the gene editing system.
In another aspect, this disclosure provides a vector system comprising one or more vectors, wherein the one or more vectors comprise one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and comprise one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In another aspect, this disclosure provides a kit comprising the gene editing system, the delivery system, or the vector system, as described herein.
In another aspect, this disclosure provides a cell line comprising one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In some embodiments, the cell line comprises a eukaryotic cell. In some embodiments, the eukaryotic cell comprises a mammalian cell. In some embodiments, the eukaryotic cell comprises a stem cell or stem cell line.
Also provided is a method for preparing the cell line. In some embodiments, the method comprises introducing to a cell one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and encoding one or more nucleic acid components.
The foregoing summary is not intended to define every aspect of the disclosure, and additional aspects are described in other sections, such as the following detailed description. The entire document is intended to be related as a unified disclosure, and it should be understood that all combinations of features described herein are contemplated, even if the combination of features are not found together in the same sentence, or paragraph, or section of this document. Other features and advantages of the invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the disclosure, are given by way of illustration only, because various changes and modifications within the spirit and scope of the disclosure will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a process BART utilizes to integrate a RNA payload through a navigator DNA.
Figures 2A and 2B show BART-mediated immunity against VSV infection. This figure shows the results of VSV infection in BART-expressing cell lines. Figure 2A shows the reduction in viral load in cells expressing BART compared to control cells. Figure 2B shows the presence of viral cDNA in the cytoplasm of BART-expressing cells, highlighting BART’s role in reverse transcription and viral suppression.
Figure 3 shows a structural model of BART highlighting the APE domain, the reverse transcriptase domain, and the C-terminal domain (CTD).
Figure 4 shows vector maps of BART cloning constructs. The figure depicts the plasmid vector maps used for cloning the BART enzyme coding sequences, including the placement of the CMV promoter, neomycin resistance gene, and restriction sites for HindIII and NotI.
DETAILED DESCRIPTION OF THE INVENTION
This disclosure is based, in part, on an unexpected discovery of a novel mammalian defense mechanism akin to the CRISPR-Cas system traditionally associated with prokaryotic biology. The Bat-Associated Reverse Transcriptase (BART), as disclosed herein, possesses a unique feature found in bat induced pluripotent stem cells (iPSCs) with the ability to convert viral genomic RNA into DNA. The disclosed BART system represents a next-generation tool for genome and RNA editing, exhibiting high efficiency and specificity, potentially outperforming current CRISPR-Cas systems. This discovery blurs the lines between prokaryotic and eukaryotic defense mechanisms, opening new opportunities for comprehensive viral defense strategies and advanced gene therapy applications. This disclosure also presents a significant stride towards addressing the most pressing health challenges, including pandemics caused by RNA viruses and gene editing for therapy. BART exhibits the remarkable ability to convert viral RNA into DNA, a process typically associated with retroviruses. Furthermore, it possesses endonuclease activity, enabling it to cleave nucleic acids, and integrase activity, facilitating the insertion of genetic material into the host genome. The unique combination of these enzymatic activities in a mammalian protein distinguishes BART as a groundbreaking innovation with significant implications for both antiviral defense and gene editing.
A function of BART is to serve as an innovative antiviral defense mechanism in mammalian cells. It achieves this by targeting and degrading viral RNA, thus disrupting the viral life cycle and preventing further infection. The unique aspect of BART lies in its ability to create a “genomic memory” of past viral encounters. By converting viral RNA into DNA and integrating it into the host genome, BART establishes a lasting record of the infection. This genomic archive allows for the rapid production of RNA transcripts that guide BART to recognize and neutralize the same virus upon subsequent infections, providing a form of adaptive immunity in mammalian cells.
The unique mechanism of action employed by BART sets it apart from other known antiviral and gene-editing systems. Upon encountering a viral RNA, BART utilizes its reverse transcriptase activity to convert the RNA into complementary DNA (cDNA). This cDNA is then integrated into the host genome at specific locations, creating a library of viral sequences akin to the CRISPR arrays found in bacteria. The integrated cDNA is subsequently transcribed, generating RNA transcripts that guide BART to target and cleave homologous viral RNA, thereby disrupting the viral life cycle and providing long-term immunity against future infections. This remarkable process not only confers antiviral defense but also offers a powerful tool for precise gene editing, as the integration and targeting mechanisms can be harnessed to introduce specific modifications into the host genome.
Beyond its role in antiviral defense, BART’s unique mechanism of action presents methods for precise and efficient gene editing. The ability to introduce specific genetic modifications at both the DNA and RNA levels, guided by programmable ssDNA oligonucleotides, positions BART as a successor to current CRISPR-Cas systems. BART exhibits superior specificity and reduced off-target effects, addressing a major limitation of existing gene-editing technologies. This enhanced precision, coupled with its potential for broader applicability across diverse cell types and therapeutic contexts, makes BART a system for next-generation gene editing tools. The ability to harness this naturally evolved mammalian system for targeted genome manipulation will revolutionize the field of gene therapy, offering new avenues for treating genetic disorders and other diseases.
The discovery of BART represents a paradigm shift in the understanding of antiviral immunity and gene editing. By identifying a mammalian system that functionally parallels the prokaryotic CRISPR-Cas system, BART challenges the traditional boundaries between these two domains of life. This indicates a convergent evolution of adaptive immune mechanisms, highlighting the universality of strategies employed by organisms to combat viral threats. BART’s dual functionality as an antiviral agent and a precise gene-editing tool provides new avenues for innovative therapies against viral infections and genetic disorders. This naturally evolved mammalian system for targeted genome manipulation will revolutionize the field of gene therapy, offering a more compatible and efficient alternative to current CRISPR-Cas-based approaches. The discovery of BART thus represents a significant leap forward in addressing some of the most pressing health challenges, paving the way for a new era of antiviral and gene-editing therapeutics.
BART System and Methods of Use
This disclosure encompasses methods and uses of the BART system described herein for modifying a target DNA sequence (e.g., a chromosomal sequence) or target RNA sequence, e.g., for altering or manipulating the expression of one or more genes or the one or more gene products, in prokaryotic or eukaryotic cells, in vitro, in vivo, or ex vivo. The BART protein has unique characteristics in possessing both reverse transcriptase and endonuclease activities. In some embodiments, the BART protein can additionally have an integrase activity, allowing it to integrate a DNA (e.g., reverse transcribed from a viral RNA) into a host genome.
The disclosed BART systems provide an effective means for modifying (e.g., deleting, inserting, translocating, inactivating, activating) a target RNA or DNA (double-stranded, linear, or super-coiled) in a multiplicity of cell types. Thus, the disclosed BART systems have a broad spectrum of applications, e.g., gene therapy, drug screening, and disease diagnosis/prognosis.
Methods of Modifying Target Polynucleotides
In one aspect, this disclosure provides a method of modifying a target polynucleotide. In some embodiments, the method comprises delivering to the target polynucleotide an enzyme (e.g., BART) that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In some embodiments, the enzyme further possesses an integrase activity.
In some embodiments, the enzyme comprises a BART protein. In some embodiments, the one or more nucleic acid components comprise a single-stranded navigator DNA (“navigator ssDNA”).
In some embodiments, the one or more nucleic acid components further comprise a payload RNA. In some embodiments, the payload RNA comprises a stem-loop structure.
In some embodiments, the enzyme reverse transcribes the payload RNA into a cDNA. In some embodiments, the enzyme integrates the cDNA into the target polynucleotide.
In another aspect, the disclosure provides a method of modifying a target polynucleotide (e.g., target sequence of interest), such as modifying expression of a target polynucleotide, in a cell. In some embodiments, the method allows a BART complex (e.g., BART/navigator ssDNA complex) to bind to the target polynucleotide, resulting in increased or decreased expression of the target polynucleotide or a gene comprising the target polynucleotide. In some embodiments, the BART complex comprises BART complexed with a ssDNA sequence hybridized to a target sequence within the polynucleotide.
In some embodiments, the method of modifying a target polynucleotide comprises delivering the BART system, isolated nucleic acids encoding a BART protein and/or a navigator ssDNA, or particles containing a BART protein and/or a navigator ssDNA, to a target sequence or a cell containing the target sequence. In some embodiments, following formation of a complex of BART/navigator ssDNA and hybridization of the ssDNA to one or more nucleic acids of the target sequence, the BART protein induces a modification (e.g., cleavage) of the target sequence.
In some embodiments, the modification comprises cleaving one or two strands at the location of the target sequence by the enzyme. In some embodiments, the modification results in decreased or increased transcription of a target gene. In some embodiments, the method further comprises repairing the cleaved target polynucleotide by homologous recombination with an exogenous template polynucleotide, wherein the repair results in a mutation comprising an insertion, deletion, or substitution of one or more nucleotides of the target polynucleotide. In some embodiments, the mutation results in one or more amino acid changes in a protein expressed from a gene comprising the target sequence.
In some embodiments, the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
In some embodiments, modification of a target polynucleotide may include modifying (e.g., increasing or decreasing) expression of a target polynucleotide as a result of binding of the BART protein to a certain location on the target polynucleotide. For example, if the BART protein binds to a location between a promoter or other regulatory element and a coding sequence in the target polynucleotide, the BART protein may serve as a repressor that decreases expression of the target polynucleotide. On the other hand, if the BART protein binds to a location upstream of a promoter or other regulatory element, the BART protein may serve as an activator that increases expression of the target polynucleotide.
Accordingly, in one aspect, this disclosure also provides a method of modifying expression of a target polynucleotide. In some embodiments, the method comprises introducing into a cell or a subject an enzyme (e.g., BART) that possesses a reverse transcriptase activity and an endonuclease activity or a nucleic acid molecule encoding the enzyme, and one or more nucleic acid components (e.g., navigator ssDNA), wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme binds to one or more locations on the target polynucleotide such that binding of the enzyme increases or decreases the expression level of the target polynucleotide.
In this application, the enzyme may include one or more mutations that result in a catalytically inactive enzyme. In some embodiments, the catalytically inactive enzyme lacks an endonuclease activity, such that the enzyme can still bind to the target polynucleotide but does not cleave the target polynucleotide. In some embodiments, a nucleotide sequence encoding the BART protein is present in a recombinant expression vector. In some embodiments, the recombinant expression vector is a viral construct, e.g., a recombinant adeno-associated virus construct, a recombinant adenoviral construct, a recombinant lentiviral construct, etc. For example, viral vectors can be based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, human immunodeficiency virus, and the like. A retroviral vector can be based on Murine Leukemia Virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, a lentivirus, human immunodeficiency virus, myeloproliferative sarcoma virus, mammary tumor virus, and the like. Useful expression vectors are known to those of skill in the art, and many are commercially available. The following vectors are provided by way of example for eukaryotic host cells: pXT1, pSG5, pSVK3, pBPV, pMSG, and pSVLSV40. However, any other vector may be used if it is compatible with the host cell. For example, useful expression vectors containing a nucleotide sequence encoding a Cas9 enzyme are commercially available from, e.g., Addgene, Life Technologies, Sigma-Aldrich, and Origene.
Depending on the target cell/expression system used, any of a number of transcription and translation control elements, including promoters, transcription enhancers, transcription terminators, and the like, may be used in the expression vector. Useful promoters can be derived from viruses, or any organism, e.g., prokaryotic or eukaryotic organisms. Suitable promoters include, but are not limited to, the SV40 early promoter, mouse mammary tumor virus long terminal repeat (LTR) promoter, adenovirus major late promoter (Ad MLP), a herpes simplex virus (HSV) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter region (CMVIE), a Rous sarcoma virus (RSV) promoter, a human U6 small nuclear promoter (U6), an enhanced U6 promoter, a human H1 promoter (H1), etc.
In some embodiments, the enzyme is provided through one or more polynucleotide molecules encoding the enzyme. In some embodiments, one or more nucleic acid components are provided through one or more polynucleotide molecules encoding or comprising one or more nucleic acid components. In some embodiments, one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and one or more nucleic acid components are provided in a single vector. In some embodiments, a polynucleotide comprising a first navigator ssDNA is located on the same vector with a polynucleotide encoding the BART protein. In some embodiments, the BART system further comprises a polynucleotide containing a second navigator ssDNA. In some embodiments, the polynucleotide encoding the BART protein, the polynucleotide comprising the first navigator DNA, and the polynucleotide comprising the second navigator ssDNA are harbored on the same vector or two or more different vectors.
The BART protein or variants/fragments thereof can be introduced into a cell (e.g., an in vitro cell such as a primary cell for ex vivo therapy, or an in vivo cell such as in a patient) as a BART protein or a variant or fragment thereof, an mRNA encoding a BART protein or a variant or fragment thereof, or a recombinant expression vector comprising a nucleotide sequence encoding a BART protein or a variant or fragment thereof.
In some embodiments, the method further comprises delivering one or more vectors to a host cell. In some embodiments, the vectors are delivered to the host cell in a subject. In some embodiments, the modification takes place in the eukaryotic cell in cell culture. In some embodiments, the method further comprises isolating the eukaryotic cell from a subject prior to the modification. In some embodiments, the method further comprises returning the cells derived therefrom to the subject.
The method further comprises maintaining cells or embryos under appropriate conditions such that the ssDNA guides the BART protein to the targeted site in the target sequence to modify the target sequence. In general, the cell can be maintained under conditions appropriate for cell growth and/or maintenance. Suitable cell culture conditions are well known in the art and are described, for example, in “Current Protocols in Molecular Biology” (Ausubel et al., John Wiley & Sons, New York, 2003); “Molecular Cloning: A Laboratory Manual” (Sambrook & Russell, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 3rd edition, 2001); Santiago et al. (2008) PNAS 105:5809-5814; Moehle et al. (2007) PNAS 104:3055-3060; Urnov et al. (2005) Nature 435:646-651; and Lombardo et al. (2007) Nat. Biotechnology 25:1298-1306. Those of skill in the art appreciate that methods for culturing cells are known in the art and can and will vary depending on the cell type. Routine optimization may be used, in all cases, to determine the best techniques for a particular cell type. An embryo can be cultured in vitro (e.g., in cell culture). Typically, the embryo is cultured at an appropriate temperature and in appropriate media with the necessary O2/CO2 ratio to allow the expression of the proteins and RNA scaffold, if necessary. Suitable non-limiting examples of media include M2, M16, KSOM, BMOC, and HTF media. A skilled artisan will appreciate that culture conditions can and will vary depending on the species of embryo. Routine optimization may be used, in all cases, to determine the best culture conditions for a particular species of embryo. In some cases, a cell line may be derived from an in vitro-cultured embryo (e.g., an embryonic stem cell line).
Alternatively, an embryo may be cultured in vivo by transferring the embryo into the uterus of a female host. Generally speaking, the female host is from the same or similar species as the embryo. Preferably, the female host is pseudo-pregnant. Methods of preparing pseudo-pregnant female hosts are known in the art. Additionally, methods of transferring an embryo into a female host are known. Culturing an embryo in vivo permits the embryo to develop and can result in a live birth of an animal derived from the embryo. Such an animal would comprise the modified chromosomal sequence in every cell of the body.
BART Proteins
In some embodiments, the BART protein may include a variant or fragment of a BART. In some embodiments, the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
In some embodiments, the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease (APE) domain optionally linked to a reverse transcriptase domain. Apurinic-apyrimidinic endonuclease (APE) domains are protein motifs that play a crucial role in DNA repair. These domains are responsible for recognizing and cleaving DNA strands that contain apurinic or apyrimidinic sites, which are lesions that occur when a purine or pyrimidine base is lost from the DNA backbone.
In some embodiments, the enzyme further comprises a C-terminal domain. In some embodiments, the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide. In some embodiments, the C-terminal domain comprises a zinc finger. Zinc fingers are small protein structural motifs characterized by the coordination of one or more zinc ions (Zn2+). These metal ions help stabilize the protein’s structure, allowing it to interact with other molecules.
In some embodiments, the zinc finger comprises a CCHC motif. The CCHC motif is a zinc finger motif found in many proteins, particularly those involved in DNA binding and transcriptional regulation. It consists of a conserved sequence of amino acids that contains cysteine and histidine residues. These residues coordinate zinc ions, forming a stable structure that helps the protein bind to DNA.
In some embodiments, the BART protein comprises an amino acid sequence of any one of SEQ ID NOs: 1-12 or comprises an amino acid sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 1-12.
Table 1. Representative BART proteins
[Table 1 is provided as images in the original publication (imgf000019_0001 through imgf000027_0001).]
The terms “protein,” “polypeptide,” and “peptide” are used interchangeably herein to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, pegylation, or any other manipulation, such as conjugation with a labeling component. As used herein, the term “amino acid” includes natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers, and amino acid analogs and peptidomimetics.
A peptide or polypeptide “fragment” as used herein refers to a less than full-length peptide, polypeptide or protein. For example, a peptide or polypeptide fragment can have at least about 3, at least about 4, at least about 5, at least about 10, at least about 20, at least about 30, at least about 40 amino acids in length, or single unit lengths thereof. For example, a fragment may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or more amino acids in length. There is no upper limit to the size of a peptide fragment. However, in some embodiments, peptide fragments can be less than about 500 amino acids, less than about 400 amino acids, less than about 300 amino acids or less than about 250 amino acids in length.
As used herein, the term “variant” refers to a first composition (e.g., a first molecule) that is related to a second composition (e.g., a second molecule, also termed a “parent” molecule). The variant molecule can be derived from, isolated from, based on or homologous to the parent molecule. The term variant can be used to describe either polynucleotides or polypeptides.
As applied to polynucleotides, a variant molecule can have complete nucleotide sequence identity with the original parent molecule, or alternatively, can have less than 100% nucleotide sequence identity with the parent molecule. For example, a variant of a gene nucleotide sequence can be a second nucleotide sequence that is at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99% or more identical in nucleotide sequence compared to the original nucleotide sequence. Polynucleotide variants also include polynucleotides comprising the entire parent polynucleotide, and further comprising additional fused nucleotide sequences. Polynucleotide variants also include polynucleotides that are portions or subsequences of the parent polynucleotide; for example, unique subsequences (e.g., as determined by standard sequence comparison and alignment techniques) of the polynucleotides disclosed herein are also encompassed by the invention. In another aspect, polynucleotide variants include nucleotide sequences that contain minor, trivial or inconsequential changes to the parent nucleotide sequence. For example, minor, trivial or inconsequential changes include changes to nucleotide sequence that (i) do not change the amino acid sequence of the corresponding polypeptide, (ii) occur outside the protein-coding open reading frame of a polynucleotide, (iii) result in deletions or insertions that may impact the corresponding amino acid sequence, but have little or no impact on the biological activity of the polypeptide, or (iv) result in the substitution of an amino acid with a chemically similar amino acid. In the case where a polynucleotide does not encode a protein, variants of that polynucleotide can include nucleotide changes that do not result in loss of function of the polynucleotide. In another aspect, conservative variants of the disclosed nucleotide sequences that yield functionally identical nucleotide sequences are encompassed by the invention.
One of skill will appreciate that many variants of the disclosed nucleotide sequences are encompassed by the invention.
As applied to proteins, a variant polypeptide can have complete amino acid sequence identity with the original parent polypeptide, or alternatively, can have less than 100% amino acid identity with the parent protein. For example, a variant of an amino acid sequence can be a second amino acid sequence that is at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99% or more identical in amino acid sequence compared to the original amino acid sequence.
Polypeptide variants include polypeptides comprising the entire parent polypeptide, and further comprising additional fused amino acid sequences. Polypeptide variants also include polypeptides that are portions or subsequences of the parent polypeptide, for example, unique subsequences (e.g., as determined by standard sequence comparison and alignment techniques) of the polypeptides disclosed herein are also encompassed by the invention.
In another aspect, polypeptide variants include polypeptides that contain minor, trivial, or inconsequential changes to the parent amino acid sequence. For example, minor, trivial, or inconsequential changes include amino acid changes (including substitutions, deletions, and insertions) that have little or no impact on the biological activity of the polypeptide, and yield functionally identical polypeptides, including additions of a non-functional peptide sequence. In other aspects, the variant polypeptides of the invention change the biological activity of the parent molecule. One of skill will appreciate that many variants of the disclosed polypeptides are encompassed by the invention.
In some embodiments, polynucleotide or polypeptide variants can include variant molecules that alter, add or delete a small percentage of the nucleotide or amino acid positions, for example, typically less than about 10%, less than about 5%, less than 4%, less than 2% or less than 1%.
A “functional variant” of a protein as used herein refers to a variant of such protein that retains at least partially the activity of that protein. Functional variants may include mutants (which may be insertion, deletion, or replacement mutants), including polymorphs, etc. Also included within functional variants are fusion products of such protein with another, usually unrelated, nucleic acid, protein, polypeptide or peptide. Functional variants may be naturally occurring or may be man-made.
In some embodiments, a variant of a BART protein may include one or more conservative modifications. The BART protein variant with one or more conservative modifications may retain the desired functional properties, which can be tested using the functional assays known in the art.
As used herein, the term “conservative sequence modifications” refers to amino acid modifications that do not significantly affect or alter the binding characteristics of the protein containing the amino acid sequence. Such conservative modifications include amino acid substitutions, additions, and deletions. Modifications can be introduced by standard techniques known in the art, such as site-directed mutagenesis and PCR-mediated mutagenesis. Conservative amino acid substitutions are ones in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include: amino acids with basic side chains (e.g., lysine, arginine, histidine); acidic side chains (e.g., aspartic acid, glutamic acid); uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine, tryptophan); nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine); beta-branched side chains (e.g., threonine, valine, isoleucine); and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). The BART protein with one or more conservative modifications may retain the desired functional properties, which can be tested using the functional assays known in the art. As used herein, the percent homology between two amino acid sequences is equivalent to the percent identity between the two sequences. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % homology = # of identical positions/total # of positions × 100), taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
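The side-chain families and the percent identity formula above can be illustrated with a minimal Python sketch. The function names are hypothetical, the family sets simply transcribe the residues listed above into one-letter codes, and "total # of positions" is assumed here to mean the number of alignment columns (gap columns included) of two sequences that have already been aligned; other conventions for the denominator exist.

```python
# Side-chain families transcribed from the text above (one-letter codes).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYCW"),
    "nonpolar": set("AVLIPFM"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues belong to at least one common family."""
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: the denominator is the number of alignment columns,
    so every introduced gap position lowers the score.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    print(is_conservative("K", "R"))             # both basic
    print(is_conservative("K", "D"))             # basic vs. acidic
    print(round(percent_identity("MKV-LT", "MKVALS"), 1))
```

A check such as `percent_identity(a, b) >= 75.0` could then be used to test a candidate against the "at least 75% sequence identity" threshold recited for SEQ ID NOs: 1-12, provided the same alignment and denominator conventions are applied consistently.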
The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm, as described in the non-limiting examples below.
The percent identity between two amino acid sequences can be determined using the algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci., 4:11-17 (1988)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. In addition, the percent identity between two amino acid sequences can be determined using the Needleman and Wunsch (J. Mol. Biol. 48:444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6.
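For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched in a few lines of Python. This is a simplified illustration, not the GAP or ALIGN program: it assumes a flat match/mismatch score instead of a BLOSUM62 or PAM matrix, and a single linear gap penalty instead of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Simplifying assumptions: identity scoring and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    aligned_a, aligned_b, total = needleman_wunsch("GATTACA", "GATACA")
    print(aligned_a)
    print(aligned_b)
    print(total)
```

The two aligned strings have equal length, so they can be passed directly to a column-wise percent identity calculation. Swapping in a substitution matrix or affine gap costs changes the scores but not the overall dynamic-programming structure.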
Additionally or alternatively, the protein sequences of the present invention can further be used as a “query sequence” to perform a search against public databases to, for example, identify related sequences. Such searches can be performed using the XBLAST program (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used (see www.ncbi.nlm.nih.gov).
In some embodiments, a variant of a BART protein can be conjugated or linked to a detectable tag or a detectable marker (e.g., a radionuclide, a fluorescent dye, or an MRI-detectable label). In some embodiments, the detectable tag can be an affinity tag. The term “affinity tag,” as used herein, relates to a moiety attached to a polypeptide, which allows the polypeptide to be purified from a biochemical mixture. Affinity tags can consist of amino acid sequences or can include amino acid sequences to which chemical groups are attached by post-translational modifications. Non-limiting examples of affinity tags include His-tag, CBP-tag (CBP: calmodulin-binding protein), CYD-tag (CYD: covalent yet dissociable NorpD peptide), Strep-tag, StrepII-tag, FLAG-tag, HPC-tag (HPC: heavy chain of protein C), GST-tag (GST: glutathione S transferase), Avi-tag, biotinylated tag, Myc-tag, a myc-myc-hexahistidine (mmh) tag, 3xFLAG tag, a SUMO tag, and MBP-tag (MBP: maltose-binding protein). Further examples of affinity tags can be found in Kimple et al., Curr Protoc Protein Sci. 2013 Sep 24; 73: Unit 9.9.
In some embodiments, the detectable tag can be conjugated or linked to the N- and/or C-terminus of a variant of a BART protein. The detectable tag and the affinity tag may also be separated by one or more amino acids. In some embodiments, the detectable tag can be conjugated or linked to the variant via a cleavable element. In the context of the present invention, the term “cleavable element” relates to peptide sequences that are susceptible to cleavage by chemical agents or enzymatic means, such as proteases. Proteases may be sequence-specific (e.g., thrombin) or may have limited sequence specificity (e.g., trypsin). Cleavable elements I and II may also be included in the amino acid sequence of a detection tag or polypeptide, particularly where the last amino acid of the detection tag or polypeptide is K or R.
As used herein, the term “conjugate,” “conjugation,” or “linked” refers to the attachment of two or more entities to form one entity. A conjugate encompasses both peptide-small molecule conjugates as well as peptide-protein/peptide conjugates.
The term “fusion polypeptide” or “fusion protein” means a protein created by joining two or more polypeptide sequences together. The fusion polypeptides encompassed in this invention include translation products of a chimeric gene construct that joins the nucleic acid sequences encoding a first polypeptide with the nucleic acid sequence encoding a second polypeptide to form a single open reading frame. In other words, a “fusion polypeptide” or “fusion protein” is a recombinant protein of two or more proteins that are joined by a peptide bond or via several peptides. The fusion protein may also comprise a peptide linker between the two domains.
The term “linker” refers to any means, entity, or moiety used to join two or more entities. A linker can be a covalent linker or a non-covalent linker. Examples of covalent linkers include covalent bonds or a linker moiety covalently attached to one or more of the proteins or domains to be linked. The linker can also be a non-covalent bond, e.g., an organometallic bond through a metal center such as a platinum atom. For covalent linkages, various functionalities can be used, such as amide groups, including carbonic acid derivatives, ethers, esters, including organic and inorganic esters, amino, urethane, urea, and the like. To provide for linking, the domains can be modified by oxidation, hydroxylation, substitution, reduction, etc., to provide a site for coupling. Methods for conjugation are well known by persons skilled in the art and are encompassed for use in the present invention. Linker moieties include, but are not limited to, chemical linker moieties, or for example, a peptide linker moiety (a linker sequence).
In some embodiments, the linker can be a peptide linker or a non-peptide linker. Examples of the peptide linker may include [Ser(Gly)n]m or [Ser(Gly)n]mSer, where n may be an integer between 1 and 20. As used herein, the term “non-peptide linker” refers to a biocompatible polymer composed of two or more repeating units linked to each other, in which the repeating units are linked to each other by any non-peptide covalent bond. This non-peptidyl linker may have two ends or three ends. Examples of the non-peptidyl linker may include, without limitation, polyethylene glycol, polypropylene glycol, a copolymer of ethylene glycol with propylene glycol, polyoxyethylated polyol, polyvinyl alcohol, polysaccharide, dextran, polyvinyl ethyl ether, biodegradable polymers such as polylactic acid (PLA) and polylactic-glycolic acid (PLGA), lipid polymers, chitins, hyaluronic acid, and combinations thereof. Aptamers could be added as non-peptide linkers.
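As an illustration of the [Ser(Gly)n]m and [Ser(Gly)n]mSer patterns, the following Python sketch expands the formula into a one-letter amino acid sequence. The function name is hypothetical; the range of n is read as 1 to 20 inclusive, and m is assumed to be any positive integer because the text does not bound it.

```python
def ser_gly_linker(n: int, m: int, trailing_ser: bool = False) -> str:
    """Expand [Ser(Gly)n]m, optionally followed by a final Ser, to one-letter codes.

    Assumptions: 1 <= n <= 20 (inclusive reading of the text) and m >= 1.
    """
    if not 1 <= n <= 20:
        raise ValueError("n is expected to be between 1 and 20")
    if m < 1:
        raise ValueError("m is assumed to be a positive integer")
    linker = ("S" + "G" * n) * m          # m repeats of Ser followed by n Gly
    return linker + "S" if trailing_ser else linker


if __name__ == "__main__":
    print(ser_gly_linker(4, 3))                     # [Ser(Gly)4]3
    print(ser_gly_linker(2, 2, trailing_ser=True))  # [Ser(Gly)2]2Ser
```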
In some embodiments, a variant of a BART protein can be fused to a fusion partner through crosslinking with a crosslinking agent, e.g., a crosslinker. Crosslinkers are reagents whose ends are reactive toward specific functional groups (e.g., primary amines or sulfhydryls) on proteins or other molecules. Crosslinkers are capable of joining two or more molecules by a covalent bond. Crosslinkers include but are not limited to amine-to-amine crosslinkers (e.g., disuccinimidyl suberate (DSS)), amine-to-sulfhydryl crosslinkers (e.g., N-γ-maleimidobutyryl-oxysuccinimide ester (GMBS)), carboxyl-to-amine crosslinkers (e.g., dicyclohexylcarbodiimide (DCC)), sulfhydryl-to-carbohydrate crosslinkers (e.g., N-β-maleimidopropionic acid hydrazide (BMPH)), sulfhydryl-to-sulfhydryl crosslinkers (e.g., 1,4-bismaleimidobutane (BMB)), photoreactive crosslinkers (e.g., N-5-azido-2-nitrobenzoyloxysuccinimide (ANB-NOS)), and chemoselective ligation crosslinkers (e.g., NHS-PEG4-Azide).
In some embodiments, a polynucleotide encoding a BART protein may be codon optimized. Generally, codon optimization refers to a process of modifying a nucleic acid sequence to enhance expression in the host cells by substituting at least one codon of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is, in turn, believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. “Codon usage tabulated from the international DNA sequence databases: status for the year 2000” Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.). In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a DNA/RNA-targeting BART protein correspond to the most frequently used codon for a particular amino acid. As to codon usage in yeast, reference is made to the online Yeast Genome database available at http://www.yeastgenome.org/community/codonusage.shtml, or Codon selection in yeast, Bennetzen and Hall, J Biol Chem. 1982 Mar. 25; 257(6):3026-31. As to codon usage in plants, including algae, reference is made to Codon usage in higher plants, green algae, and cyanobacteria, Campbell and Gowri, Plant Physiol. 1990 January; 92(1):1-11; as well as Codon usage in plant genes, Murray et al., Nucleic Acids Res. 1989 Jan. 25; 17(2):477-98; or Selection on the codon bias of chloroplast and cyanelle genes in different plant and algal lineages, Morton B R, J Mol Evol. 1998 April; 46(4):449-59.
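The substitution step in codon optimization can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not any particular optimization tool: the function names are hypothetical, the standard genetic code is assumed, and the "most frequently used" codon per amino acid is derived by counting codons in a caller-supplied set of host coding sequences rather than taken from a published usage table. The toy reference sequence in the usage example is invented purely for demonstration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq: str) -> list:
    """Split an in-frame DNA coding sequence into codons."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> str:
    """Translate a DNA coding sequence ('*' marks stop codons)."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def preferred_codons(reference_cds: list) -> dict:
    """Map each amino acid to its most frequent codon in the reference set."""
    counts = Counter(c for cds in reference_cds for c in codons(cds))
    best = {}
    for codon, count in counts.items():
        aa = CODON_TABLE[codon]
        if aa not in best or count > counts[best[aa]]:
            best[aa] = codon
    return best


def codon_optimize(seq: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is known."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(seq))


if __name__ == "__main__":
    toy_host_genes = ["ATGGCCGCCGCTAAGAAGAAATAA"]  # invented example data
    table = preferred_codons(toy_host_genes)
    native = "ATGGCTAAATAA"
    optimized = codon_optimize(native, table)
    print(optimized)
    print(translate(native) == translate(optimized))
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged, which is the constraint stated in the definition above. Practical tools typically weigh additional factors (for example, GC content or secondary structure) that this sketch ignores.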
Navigator ssDNA
As used herein, the term “navigator ssDNA” generally refers to a ssDNA molecule (or a group of DNA molecules collectively) that can bind to a BART protein and target the BART protein to a specific location within a target RNA or DNA. The targeting segment comprises a nucleotide sequence that is complementary to (or at least can hybridize to under stringent conditions) a target sequence. A navigator ssDNA can be any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence (e.g., target DNA or RNA sequence) to hybridize with the target sequence and direct sequence-specific binding of a BART complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence of the navigator ssDNA and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
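The degree of complementarity described above can be estimated with a short Python sketch. It makes several simplifying assumptions that are not from the source: the guide and the target site have the same length, both are written 5' to 3', pairing is antiparallel Watson-Crick only, and no gaps are allowed (so no alignment algorithm is needed). The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that base-pair with the target site.

    Assumptions: equal lengths, ungapped antiparallel pairing, and an RNA
    target is handled by treating U as T.
    """
    target = target.upper().replace("U", "T")
    if not guide or len(guide) != len(target):
        raise ValueError("guide and target must be non-empty and of equal length")
    perfect_target = reverse_complement(guide)   # the fully complementary site
    paired = sum(1 for x, y in zip(perfect_target, target) if x == y)
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    guide = "ATGCGT"
    print(percent_complementarity(guide, "ACGCAT"))            # DNA target
    print(percent_complementarity(guide, "ACGCAU"))            # RNA target
    print(round(percent_complementarity(guide, "ACGCAA"), 1))  # one mismatch
```

For guides and targets of unequal length, or where gaps are permitted, one of the alignment algorithms named above would be needed to find the optimal alignment before counting paired positions.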
In some embodiments, a navigator ssDNA sequence is about 3, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 405, 410, 415, 420, 425, 430, 435, 440, 445, 450, 455, 460, 465, 470, 475, 480, 485, 490, 495, 500, or more nucleotides in length. In some embodiments, a guide sequence is about 50 nucleotides in length. In some embodiments, a guide sequence is about 100 nucleotides in length. In some embodiments, a guide sequence is about 150 nucleotides in length. In some embodiments, a guide sequence is about 200 nucleotides in length. In some embodiments, a guide sequence is about 250 nucleotides in length. In some embodiments, a guide sequence is about 300 nucleotides in length.
The ability of a guide sequence to direct sequence-specific binding of a BART complex to a target sequence may be assessed by any suitable assay. For example, the components of a BART system sufficient to form a BART complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the BART sequence, followed by an assessment of preferential cleavage within the target sequence. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a BART complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions.
Table 2. Representative ssDNA
[Table 2 is rendered as images in the original publication (imgf000036_0001 through imgf000043_0001).]
In some embodiments, the navigator ssDNA comprises a polynucleotide sequence of SEQ ID NOs: 13-96 and 107-112 or comprises a polynucleotide sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 13-96 and 107-112.
In some embodiments, the navigator ssDNA comprises a synthetic nucleic acid sequence (e.g., synthetic DNA molecule). In some embodiments, the navigator ssDNA comprises one or more modifications.
As used herein, the term “modification” in the context of an oligonucleotide or polynucleotide includes but is not limited to (a) end modifications, e.g., 5’ end modifications or 3’ end modifications, (b) nucleobase (or “base”) modifications, including replacement or removal of bases, (c) sugar modifications, including modifications at the 2’, 3’, and/or 4’ positions, and (d) backbone modifications, including modification or replacement of the phosphodiester linkages. The term “modified nucleotide” generally refers to a nucleotide having a modification to the chemical structure of one or more of the base, the sugar, and the phosphodiester linkage or backbone portions, including nucleotide phosphates. The terms “Z” and “P” refer to the nucleotides, nucleobases, or nucleobase analogs described, for example, in Yang, Z., et al., Nucleic Acids Res., 34, 6095-101 (2006), the disclosure of which is hereby incorporated by reference in its entirety.
In some embodiments, one or more modifications may include a 2’-O-methyl moiety, a Z base, a 2’-deoxynucleotide, a phosphorothioate internucleotide linkage, a phosphonoacetate (PACE) internucleotide linkage, a thiophosphonoacetate (thioPACE) internucleotide linkage, or combinations thereof. In some embodiments, one or more modifications comprise one or more modifications selected from the group consisting of a 2’-O-methyl nucleotide with a 3’-phosphorothioate group, a 2’-O-methyl nucleotide with a 3’-phosphonoacetate group, a 2’-O-methyl nucleotide with a 3’-thiophosphonoacetate group, or a 2’-deoxynucleotide with a 3’-phosphonoacetate group. In some embodiments, the one or more modifications comprise a 2-thiouracil (2-thioU), a 4-thiouracil (4-thioU), a 2-aminoadenine, a 2’-O-methyl, a 2’-fluoro, a 5-methyluridine, a 5-methylcytidine, or a locked nucleic acid (LNA) modification.
Target Polynucleotide
In the context of formation of a BART complex, “target polynucleotide” or “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a BART complex. A target sequence may comprise RNA or DNA polynucleotides. A “target nucleic acid strand” refers to a strand of a target nucleic acid that is subject to base-pairing with a navigator ssDNA as disclosed herein. That is, the strand of a target nucleic acid that hybridizes with the guide sequence is referred to as the “target nucleic acid strand.” The other strand of the target nucleic acid, which is not complementary to the guide sequence, is referred to as the “non-complementary strand.” In the case of a double-stranded target nucleic acid (e.g., DNA), either strand can serve as the “target nucleic acid strand” for designing a navigator ssDNA and can be used to practice the disclosed method.
As used herein, the term “target RNA” refers to an RNA polynucleotide being or comprising the target sequence. In other words, the target RNA may be an RNA polynucleotide or a part of an RNA polynucleotide to which a part of the gRNA, i.e., the guide sequence, is designed to have complementarity and to which the effector function mediated by the complex comprising BART protein and a gRNA is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
As used herein, the term “target DNA” refers to a DNA polynucleotide being or comprising the target sequence. In other words, the target DNA may be a DNA polynucleotide or a part of a DNA polynucleotide to which a part of the navigator ssDNA, i.e., the guide sequence, is designed to have complementarity and to which the effector function mediated by the complex comprising BART protein and a navigator ssDNA is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
The target polynucleotide has no sequence limitation and can be in the coding region of a gene, in an intron of a gene, in a control region between genes, etc. In some embodiments, the target polynucleotide is contained in a nucleic acid molecule within a cell or in vitro. The gene can be coding or non-coding. The target polynucleotide can be any polynucleotide endogenous or exogenous to the cell. For example, the target polynucleotide can be a polynucleotide residing in the nucleus of a eukaryotic cell. The target polynucleotide can be a sequence coding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide).
In some embodiments, the target polynucleotide comprises RNA or DNA. In some embodiments, the RNA comprises a viral RNA of a RNA virus. In some embodiments, the target polynucleotide comprises a cDNA reverse-transcribed from a viral RNA of a RNA virus.
In some embodiments, the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
In some embodiments, the DNA comprises a genomic DNA or a cDNA.
In some embodiments, the target polynucleotide comprises a polynucleotide sequence of SEQ ID NOs: 105, 106, and 113-118 or comprises a polynucleotide sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 105, 106, and 113-118.
In some embodiments, the BART protein forms a complex with a navigator ssDNA that binds to the target polynucleotide. In some embodiments, the BART-navigator ssDNA complex comprises a BART protein having the amino acid sequence of SEQ ID NO: 1 and a navigator ssDNA comprising a polynucleotide sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 13-96 and 107-112. In some embodiments, the BART-navigator ssDNA complex comprises a BART protein having an amino acid sequence having at least 75% (e.g., 75%, 80%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 1-12 and a navigator ssDNA comprising a polynucleotide sequence of SEQ ID NOs: 13-96 and 107-112. In some embodiments, the BART-navigator ssDNA complex comprises a BART protein having the amino acid sequence of SEQ ID NO: 1 and a navigator ssDNA having a polynucleotide sequence of SEQ ID NOs: 13-96 and 107-112.
In some embodiments, the BART protein forms a complex with a navigator ssDNA operably linked to a payload RNA. In some embodiments, the payload RNA comprises a polynucleotide sequence of SEQ ID NOs: 119-133 or comprises a polynucleotide sequence having at least [text rendered as image imgf000046_0001 in the original publication] 95%, 96%, 97%, 98%, 99% or more) sequence identity to any one of SEQ ID NOs: 119-133.
Methods of Treatment
The above-described BART system, one or more polynucleotides, or vector or delivery systems can be used in a therapeutic method of treatment. The therapeutic method of treatment may comprise gene or genome editing, or gene therapy. In one aspect, this disclosure provides a method of treating a subject in need thereof, comprising inducing gene editing by introducing into a cell of the subject the polynucleotide or any of the vectors as herein described. In some embodiments, the method comprises inducing transcriptional activation or repression by introducing into a cell of the subject the polynucleotide or any of the vectors as herein described.
In another aspect, this disclosure further provides a method of treating a disease of a subject caused by a genetic defect in a target sequence. The method comprises administering the system or the composition described above to a cell containing the target sequence in a subject in need thereof and thereby inducing a modification in the target sequence. In some embodiments, the target sequence is located at genomic loci of interest. In some embodiments, the target sequence is part of a gene, and the modification in the target sequence modulates the expression level of the gene. In some embodiments, the modification in the target sequence reduces the expression level of the gene.
In some embodiments, the method comprises reverse transcribing the viral RNA into a cDNA by the enzyme.
In some embodiments, the enzyme has an integrase activity, and the method comprises integrating the cDNA into a genome of the cell or the subject by the enzyme. In some embodiments, the method comprises transcribing the cDNA into the single-stranded guide polynucleotide capable of hybridizing with the viral RNA.
In another aspect, this disclosure provides a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
As used herein, “treating” or “treatment” of a disease in a subject refers to (1) preventing the symptoms or disease from occurring in a subject that is predisposed or does not yet display symptoms of the disease; (2) inhibiting the disease or arresting its development; or (3) ameliorating or causing regression of the disease or the symptoms of the disease. As understood in the art, “treatment” is an approach for obtaining beneficial or desired results, including clinical results. For the purposes of the present technology, beneficial or desired results can include, but are not limited to, one or more of: alleviation or amelioration of one or more symptoms; diminishment of extent of a condition (including a disease); stabilized (i.e., not worsening) state of a condition (including disease); delay or slowing of progression of a condition (including disease); amelioration or palliation of the condition (including disease) state; and remission (whether partial or total), whether detectable or undetectable. In one aspect, the term “treatment” excludes prevention.
The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, such as a mammal, e.g., a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells, and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
Many devastating human diseases have one common cause: genetic alteration or mutation. The disease-causing mutations in patients are either acquired through inheritance from their parents or are caused by environmental factors. These diseases include, but are not limited to, the following categories. First, some genetic disorders are caused by germline mutations. One example is cystic fibrosis, which is caused by mutations in the CFTR gene inherited from parents. A second suppressor mutation in the mutant CFTR can partially restore the function of CFTR protein in somatic tissues. Other example genetic diseases caused by a point genetic mutation that can be corrected by the disclosed technology include Gaucher’s disease, alpha trypsin deficiency disease, and sickle cell anemia, to name a few. Second, some diseases, such as chronic viral infectious diseases, are caused by exogenous environmental factors that result in genetic alterations. One example is AIDS, which is caused by insertion of the human HIV viral genome into the genome of infected T-cells. Third, some neurodegenerative diseases involve genetic alterations. One example is Huntington’s disease, which is caused by expansion of a CAG tri-nucleotide in the huntingtin gene of affected patients. Finally, cancers are caused by various somatic mutations accumulated in cancer cells. Therefore, correcting the disease-causing genetic mutations, or functionally correcting the sequence, provides an appealing therapeutic opportunity to treat these diseases.
Somatic genetic editing is an appealing strategy for many human diseases. Through precise editing of the target DNA or RNA sequence, the BART system can correct the mutated genes in genetic disorders, inactivate the viral genome in the infected cells, eliminate the expression of the disease-causing protein in neurodegenerative diseases, or silence the oncogenic protein in cancers. Accordingly, the system and method disclosed in this disclosure can be used in correcting underlying genetic alterations in diseases, including the above-mentioned genetic disorders, chronic infectious diseases, neurodegenerative diseases, and cancer.
Genetic Diseases
It is estimated that over six thousand genetic diseases are caused by known genetic mutations. Correcting the underlying disease-causing mutations in the pathological tissues/organs can provide alleviation or cure to the diseases. For example, cystic fibrosis affects one out of every 3,000 people in the US. It is caused by inheritance of a mutated CFTR gene, and 70% of the patients have the same mutation, a deletion of a tri-nucleotide leading to a deletion of phenylalanine at position 508, which leads to the mislocalization and degradation of CFTR. The system and method disclosed in this invention can be used to convert a Val 509 residue (GTT) to Phe 509 (TTT) in affected tissues (lung), thereby functionally correcting the Phe 508 mutation. In addition, a second suppressor mutation (such as R553Q, R553M, or V510D) in the mutant Phe 508 CFTR can partially restore the function of CFTR protein in somatic tissues.
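As a non-limiting illustration, the Val-to-Phe codon conversion mentioned above (GTT to TTT) corresponds to a single base substitution at the first codon position. The short Python sketch below makes that explicit. The helper name is hypothetical, and the two-entry table is only the relevant portion of the standard genetic code.

```python
# Illustrative sketch: compares two codons and lists the base changes needed.

# Two entries of the standard genetic code relevant to the example.
CODON_TO_AA = {"GTT": "Val", "TTT": "Phe"}


def base_changes(before, after):
    """Return (position, old_base, new_base) for each differing base (1-based)."""
    before, after = before.upper(), after.upper()
    if len(before) != 3 or len(after) != 3:
        raise ValueError("codons must be exactly three bases long")
    return [
        (pos, old, new)
        for pos, (old, new) in enumerate(zip(before, after), start=1)
        if old != new
    ]


changes = base_changes("GTT", "TTT")
print(CODON_TO_AA["GTT"], "->", CODON_TO_AA["TTT"], changes)
```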
Chronic Infectious Diseases
The system and method as disclosed can also be used to specifically inactivate any gene in a viral genome that is incorporated into human cells/tissues. For example, the system and method disclosed in this invention allow one to create a stop codon for early termination of translation of the essential viral genes, and thereby remediate or cure the chronic debilitating infectious diseases. For example, current AIDS therapies can reduce viral load, but cannot totally eliminate dormant HIV from positive T cells. The system and method disclosed herein can be used to permanently inactivate expression of one or two essential HIV genes in the integrated HIV genome in human T-cells by introducing one or two stop codons. Another example is the hepatitis B virus (HBV). The system and method disclosed here can be used to specifically inactivate one or two essential HBV genes, which are incorporated into the human genome, and silence the HBV life cycle.
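To illustrate the stop-codon approach described above, the following Python sketch scans a coding sequence for codons that a single base substitution would convert into a stop codon (TAA, TAG, or TGA). It is a non-limiting sketch under stated assumptions: the input is a toy sequence rather than any viral gene, the function name is hypothetical, and the sketch says nothing about whether a given edit is achievable with the disclosed system.

```python
# Illustrative sketch: enumerate single-base substitutions that create a stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_edits(cds):
    """Return (codon_index, original_codon, edited_codon) tuples.

    codon_index is 0-based. Codons that are already stop codons are skipped.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    edits = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue
        for pos in range(3):
            for base in "ACGT":
                if base == codon[pos]:
                    continue
                edited = codon[:pos] + base + codon[pos + 1:]
                if edited in STOP_CODONS:
                    edits.append((start // 3, codon, edited))
    return edits


toy_cds = "ATGCAGTGG"  # toy sequence: Met-Gln-Trp
for edit in stop_codon_edits(toy_cds):
    print(edit)
```

For the toy sequence, the glutamine codon CAG can become TAG, and the tryptophan codon TGG can become TAG or TGA, each by one substitution; the methionine codon ATG has no single-base route to a stop codon.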
Neurodegenerative Diseases
Some neurodegenerative diseases are caused by gain-of-function mutations. For example, SOD1G93A leads to development of amyotrophic lateral sclerosis (ALS). The system and method disclosed in this invention can be used to either correct the mutation or eliminate the mutant protein expression by introducing a stop codon or by changing a splicing site.
Diseases of the Muscular System
Dystrophin is a cytoplasmic protein that provides structural stability to the dystroglycan complex of the cell membrane that is responsible for regulating muscle cell integrity and function. The dystrophin gene or “DMD gene” as used interchangeably herein is 2.2 megabases at locus Xp21. The primary transcript measures about 2,400 kb with the mature mRNA being about 14 kb. 79 exons code for the protein, which is over 3500 amino acids. Exon 51 is frequently adjacent to frame-disrupting deletions in DMD patients and has been targeted in clinical trials for oligonucleotide-based exon skipping. A clinical trial for the exon 51 skipping compound eteplirsen recently reported a significant functional benefit across 48 weeks, with an average of 47% dystrophin positive fibers compared to baseline. Mutations in exon 51 are ideally suited for permanent correction by NHEJ-based genome editing. The methods of US Patent Publication No. 20130145487, which relates to meganuclease variants to cleave a target sequence from the human dystrophin gene (DMD), may also be modified for the nucleic acid-targeting system of the present invention.
Cancers
Many genes (including tumor suppressor genes, oncogenes, and DNA repair genes) contribute to the development of cancer. Mutations in these genes often lead to various cancers. Using the system and method disclosed herein, one can specifically target and correct these mutations. As a result, causative oncogenic proteins can be functionally annulled, or their expression can be eliminated by introducing a point mutation at either the catalytic sites or splicing sites. In some embodiments, the treatment, prophylaxis or diagnosis of cancer is provided. The target is preferably one or more of the FAS, BID, CTLA4, PDCD1, CBLB, PTPN6, TRAC, or TRBC genes. Cancer may be one or more of lymphoma, chronic lymphocytic leukemia (CLL), B cell acute lymphocytic leukemia (B-ALL), acute lymphoblastic leukemia, acute myeloid leukemia, non-Hodgkin’s lymphoma (NHL), diffuse large cell lymphoma (DLCL), multiple myeloma, renal cell carcinoma (RCC), neuroblastoma, colorectal cancer, breast cancer, ovarian cancer, melanoma, sarcoma, prostate cancer, lung cancer, esophageal cancer, hepatocellular carcinoma, pancreatic cancer, astrocytoma, mesothelioma, head and neck cancer, and medulloblastoma. This may be implemented with engineered chimeric antigen receptor (CAR) T cells. This is described in WO2015161276, the disclosure of which is hereby incorporated by reference and described hereinbelow. Target genes suitable for the treatment or prophylaxis of cancer may include, in some embodiments, those described in WO2015048577, the disclosure of which is hereby incorporated by reference.
Stem Cell Genetic Modification
In some embodiments, a stem cell or progenitor cell can be genetically modified using the system and method disclosed in this invention. Suitable cells include, e.g., stem cells (adult stem cells, embryonic stem cells, iPS cells, etc.) and progenitor cells (e.g., cardiac progenitor cells, neural progenitor cells, etc.). Suitable cells include mammalian stem cells and progenitor cells, including, e.g., rodent stem cells, rodent progenitor cells, human stem cells, human progenitor cells, etc. Suitable host cells include in vitro host cells, e.g., isolated host cells. In some embodiments, the BART system can be used for targeted and precise genetic modification of tissue ex vivo, correcting the underlying genetic defects. After the ex vivo correction, the tissues may be returned to the patients. Moreover, the technology can be broadly used in cell-based therapies for correcting genetic diseases.
Genetic Editing in Animals and Plants
The system and method described above can be used to generate a transgenic non-human animal or plant having one or more genetic modifications of interest. In some embodiments, the transgenic non-human animal is homozygous for the genetic modification. In some embodiments, the transgenic non-human animal is heterozygous for the genetic modification. In some embodiments, the transgenic non-human animal is a vertebrate, for example, a fish (e.g., zebrafish, goldfish, pufferfish, cavefish, etc.), an amphibian (e.g., frog, salamander, etc.), a bird (e.g., chicken, turkey, etc.), a reptile (e.g., snake, lizard, etc.), or a mammal (e.g., an ungulate, e.g., a pig, a cow, a goat, a sheep, etc.; a lagomorph, e.g., a rabbit; a rodent, e.g., a rat, a mouse; or a non-human primate).
The system and method can be used for treating diseases in animals in a way similar to those for treating diseases in humans as described above. Alternatively, it can be used to generate knock-in animal disease models bearing specific genetic mutation(s) for purposes of research, drug discovery, and target validation. The system and method described above can also be used for introduction of point mutations to ES cells or embryos of various organisms, for the purpose of breeding and improving animal stocks and crop quality.
Methods of introducing exogenous nucleic acids into plant cells are well known in the art. Suitable methods include viral infection (such as double-stranded DNA viruses), transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, silicon carbide whiskers technology, Agrobacterium-mediated transformation and the like. The choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e., in vitro, ex vivo, or in vivo).
In another aspect, this disclosure provides a method of treating or preventing a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA, wherein the ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
In some embodiments, the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
Method of Enhancing Immunity Against Viral Infection of RNA Viruses
In another aspect, this disclosure provides a method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject. In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
In some embodiments, the method comprises delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
In some embodiments, the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
In another aspect, this disclosure also provides a method of generating a cell line with immunity against a viral infection of a RNA virus. In some embodiments, the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a ssDNA to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
In some embodiments, the method comprises delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
In some embodiments, the RNA virus is selected from Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
Vector Systems and Cells
Vector Systems
In some embodiments, the BART systems described herein can be delivered to the host cell via one or more vectors, such as viral vectors. For example, the one or more viral vectors may comprise an adenovirus, a lentivirus, adeno-associated virus, or RNA-based viral vectors which may be replication competent or may only encode genes for self-amplification, the later constructs will herein be referred to as replicons.
The term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid linked thereto. Vectors include, but are not limited to, nucleic acid molecules that are singlestranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication-defective adenoviruses, adeno-associated viruses, and/ or RNA-based replicons). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., RNA vectors comprising their own RNA-dependent RNA polymerase, bacterial vectors having a bacterial origin of replication, and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” Vectors for and that result in expression in a eukaryotic cell can be referred to herein as “eukaryotic expression vectors.” Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that is operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcript! on/translati on system or in a host cell when the vector is introduced into the host cell). The term “regulatory element” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences as well as RNA elements required for recognition by self-encoded RNA dependent RNA polymerases). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g, tissuespecific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver, pancreas), or particular cell types (e.g, lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific. 
In some embodiments, a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell, 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5’ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., Bat-Associated Reverse Transcriptase (BART) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.). In some embodiments, the vector system may include one or more viral vectors. In some embodiments, the one or more viral vectors comprise an adenovirus-based vector, a lentivirus-based vector, an adeno-associated virus-based vector, or an RNA-based replicon.
In some embodiments, when expressed, the BART system can bind and cleave at the target sequence, thus preventing functional virion formation. As a result, functional virions will not be assembled. Accordingly, the vector system described herein includes a self-replicating RNA (e.g., Nodamurovirus-based replicon) that makes the BART protein and simultaneously self-inactivates upon execution of its function in the presence of a navigator ssDNA.
Cells
In another aspect, this disclosure provides a host cell or cell line or progeny thereof comprising the BART system, the vector system, or the polynucleotide, as described above. In some embodiments, the host cell or cell line or progeny thereof comprises a stem cell or stem cell line. The cell may be a eukaryotic cell (e.g., a plant, animal, or human cell) or a prokaryotic cell.
Also provided is a product of any such cell or of any such progeny, resulting from the one or more target loci modified by the BART system. The product may be a peptide, polypeptide, or protein.
In some embodiments, the host cell or cell line comprises one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
The term “cell” as used herein may refer to either a prokaryotic or eukaryotic cell, optionally obtained from a subject or a commercially available source. In some embodiments, the eukaryotic cell may be a human cell, a rodent cell, optionally a mouse cell, a yeast cell, or an insect cell. In some embodiments, the eukaryotic cell may be a Chinese hamster ovary (CHO) cell.
“Eukaryotic cells” comprise all of the life kingdoms except monera. They can be easily distinguished through a membrane-bound nucleus. Animals, plants, fungi, and protists are eukaryotes or organisms whose cells are organized into complex structures by internal membranes and a cytoskeleton. The most characteristic membrane-bound structure is the nucleus. Unless specifically recited, the term “host” includes a eukaryotic host, including, for example, yeast, higher plant, insect, and mammalian cells. Non-limiting examples of eukaryotic cells or hosts include simian, bovine, porcine, murine, rat, avian, reptilian, and human, e.g., HEK293 cells and 293T cells.
“Prokaryotic cells” usually lack a nucleus or any other membrane-bound organelles and are divided into two domains, bacteria and archaea. In addition to chromosomal DNA, these cells can also contain genetic information in a circular loop called an episome. Bacterial cells are very small, roughly the size of an animal mitochondrion. Prokaryotic cells feature three major shapes: rod-shaped, spherical, and spiral. Instead of going through elaborate replication processes like eukaryotes, bacterial cells divide by binary fission. Examples include, but are not limited to, Bacillus bacteria, E. coli bacterium, and Salmonella bacterium.
As used herein, the term “progeny”, such as the progeny of a transgenic plant, is one that is born of, begotten by, or derived from a plant or the transgenic plant. The introduced nucleic acid molecule may also be transiently introduced into the recipient cell such that the introduced nucleic acid molecule is not inherited by subsequent progeny and thus not considered “transgenic.” Accordingly, as used herein, a “non-transgenic” plant or plant cell is a plant which does not contain a foreign nucleic acid stably integrated into its genome.
Also within the scope of this disclosure is a method for preparing the cell line described above. In some embodiments, the method comprises introducing to a cell one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and encoding one or more nucleic acid components.
In another aspect, this disclosure also provides a method of generating a model eukaryotic cell comprising a mutated disease gene, which can be any gene associated with an increase in the risk of having or developing a disease. In some embodiments, the method comprises (a) introducing a BART system into a eukaryotic cell; and (b) allowing a BART complex (e.g., BART/navigator ssDNA complex) to bind to a target polynucleotide to effect cleavage of the target polynucleotide within the disease gene, wherein the ssDNA comprises the sequence that is hybridized to the target sequence within the target polynucleotide, thereby generating a model eukaryotic cell comprising a mutated disease gene. In some embodiments, the cleavage comprises cleaving one or two strands at the location of the target sequence by the BART protein. In some embodiments, the cleavage results in decreased or increased transcription of a target gene. In some embodiments, the method further comprises repairing the cleaved target polynucleotide by non-homologous end joining (NHEJ)-based gene insertion mechanisms with an exogenous template polynucleotide, wherein the repair results in a mutation comprising an insertion, deletion, or substitution of one or more nucleotides of the target polynucleotide. In some embodiments, the mutation results in one or more amino acid changes in protein expression from a gene comprising the target sequence.
A variety of eukaryotic cells are suitable for use in the method. For example, the cell can be a human cell, a non-human mammalian cell, a non-mammalian vertebrate cell, an invertebrate cell, an insect cell, a plant cell, a yeast cell, or a single-cell eukaryotic organism. A variety of embryos are suitable for use in the method. For example, the embryo can be a 1-cell, 2-cell, or 4-cell human or non-human mammalian embryo. Exemplary mammalian embryos, including one-cell embryos, include mouse, rat, hamster, rodent, rabbit, feline, canine, ovine, porcine, bovine, equine, and primate embryos. In still other embodiments, the cell can be a stem cell. Suitable stem cells include, without limitation, embryonic stem cells, ES-like stem cells, fetal stem cells, adult stem cells, pluripotent stem cells, induced pluripotent stem cells, multipotent stem cells, oligopotent stem cells, unipotent stem cells, and others. In exemplary embodiments, the cell is a mammalian cell, or the embryo is a mammalian embryo. In some embodiments, the non-human mammalian cell may include, but is not limited to, a primate, bovine, ovine, porcine, canine, rodent, or Leporidae cell, such as a monkey, cow, sheep, pig, dog, rabbit, rat, or mouse cell. In some embodiments, the cell may be a non-mammalian eukaryotic cell such as a poultry bird (e.g., chicken), vertebrate fish (e.g., salmon), or shellfish (e.g., oyster, clam, lobster, shrimp) cell. In some embodiments, the non-human eukaryote cell is a plant cell. The plant cell may be of a monocot or dicot or of a crop or grain plant such as cassava, corn, sorghum, soybean, wheat, oat, or rice.
The plant cell may also be of an algae, tree or production plant, fruit or vegetable (e.g., trees such as citrus trees, e.g., orange, grapefruit or lemon trees; peach or nectarine trees; apple or pear trees; nut trees such as almond or walnut or pistachio trees; nightshade plants; plants of the genus Brassica; plants of the genus Lactuca; plants of the genus Spinacia; plants of the genus Capsicum; cotton, tobacco, asparagus, carrot, cabbage, broccoli, cauliflower, tomato, eggplant, pepper, lettuce, spinach, strawberry, blueberry, raspberry, blackberry, grape, coffee, cocoa, etc.).
Gene Editing Systems and Kits
Gene Editing Systems
In yet another aspect, this disclosure provides a gene editing system for modifying a target polynucleotide. In some embodiments, the system comprises an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
In some embodiments, the BART systems disclosed herein or the gene editing systems comprising the disclosed BART-based systems may be delivered via liposomes, particles (e.g., nanoparticles), exosomes, microvesicles, a lipid, a cell-penetrating peptide (CPP), or a gene gun. Delivery vehicles, particles, nanoparticles, formulations, and components thereof for expression of one or more elements of the aforementioned BART systems are as used in PCT/US2013/074667.
In some embodiments, the enzyme further possesses an integrase activity.
In some embodiments, the one or more nucleic acid components comprise a single-stranded navigator DNA.
In some embodiments, the one or more nucleic acid components further comprise a payload RNA. In some embodiments, the payload RNA comprises a stem-loop structure. In some embodiments, the enzyme reverse transcribes the payload RNA into a cDNA. In some embodiments, the enzyme integrates the cDNA into the target polynucleotide.
In some embodiments, the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain. In some embodiments, the enzyme further comprises a C-terminal domain. In some embodiments, the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide. In some embodiments, the C-terminal domain comprises a zinc finger. In some embodiments, the zinc finger comprises a CCHC motif.
In some embodiments, the enzyme comprises a Bat-Associated Reverse Transcriptase (BART). In some embodiments, the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding the enzyme. In some embodiments, the gene editing system comprises one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components. In some embodiments, the one or more polynucleotide molecules comprise one or more vectors. In some embodiments, the enzyme and the one or more nucleic acid components are provided in a single vector.
In some embodiments, the target polynucleotide comprises RNA or DNA.
In some embodiments, the RNA comprises a viral RNA of an RNA virus. In some embodiments, the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
In some embodiments, the DNA comprises a genomic DNA or a cDNA.
In some embodiments, the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
In one aspect, this disclosure provides a gene editing system comprising one or more vectors, liposomes, particles (e.g., nanoparticles, lipid nanoparticles), exosomes, or microvesicles that include one or more components of the BART system, and optionally a pharmaceutically acceptable carrier.
In another aspect, this disclosure provides a vector system comprising one or more vectors, wherein the one or more vectors comprise one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and comprise one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide. As used herein, the term “gene editing system,” “composition,” or “pharmaceutical composition” refers to a mixture of at least one component useful within the disclosure with other components, such as carriers, stabilizers, diluents, dispersing agents, suspending agents, thickening agents, and/or excipients. The pharmaceutical composition facilitates administration of one or more components of the BART system to an organism.
As used herein, the term “pharmaceutically acceptable” refers to a material, such as a carrier or diluent, which does not abrogate the biological activity or properties of the composition, and is relatively non-toxic, i.e., the material may be administered to an individual without causing undesirable biological effects or interacting in a deleterious manner with any of the components of the composition in which it is contained.
As used herein, the term “pharmaceutically acceptable carrier” includes a pharmaceutically acceptable salt, pharmaceutically acceptable material, composition, or carrier, such as a liquid or solid filler, diluent, excipient, solvent, or encapsulating material, involved in carrying or transporting a compound(s) of the present invention within or to the subject such that it may perform its intended function. Typically, such compounds are carried or transported from one organ, or portion of the body, to another organ, or portion of the body. Each salt or carrier must be “acceptable” in the sense of being compatible with the other ingredients of the formulation, and not injurious to the subject. Some examples of materials that may serve as pharmaceutically acceptable carriers include: sugars, such as lactose, glucose, and sucrose; starches, such as corn starch and potato starch; cellulose, and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose and cellulose acetate; powdered tragacanth; malt; gelatin; talc; excipients, such as cocoa butter and suppository waxes; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; glycols, such as propylene glycol; polyols, such as glycerin, sorbitol, mannitol and polyethylene glycol; esters, such as ethyl oleate and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ringer’s solution; ethyl alcohol; phosphate buffer solutions; diluent; granulating agent; lubricant; binder; disintegrating agent; wetting agent; emulsifier; coloring agent; release agent; coating agent; sweetening agent; flavoring agent; perfuming agent; preservative; antioxidant; plasticizer; gelling agent; thickener; hardener; setting agent; suspending agent; surfactant; humectant; carrier; stabilizer; and other non-toxic compatible substances employed in pharmaceutical formulations, or any combination thereof. As used herein, “pharmaceutically acceptable carrier” also includes any and all coatings, antibacterial and antifungal agents, and absorption delaying agents, and the like that are compatible with the activity of one or more components of the invention, and are physiologically acceptable to the subject. Supplementary active compounds may also be incorporated into the compositions.
In another aspect, this disclosure provides a delivery system comprising the gene editing system, and the delivery system is adapted to deliver the gene editing system into a cell or a subject. In some embodiments, the delivery system comprises nanoparticles or vesicles encapsulating the gene editing system.
A “gene delivery vehicle” is defined as any molecule that can carry inserted polynucleotides into a host cell. Examples of gene delivery vehicles are liposomes; micelles; biocompatible polymers, including natural polymers and synthetic polymers; lipoproteins; polypeptides; polysaccharides; lipopolysaccharides; artificial viral envelopes; metal particles; and bacteria, or viruses, such as baculovirus, adenovirus and retrovirus, bacteriophage, cosmid, plasmid, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression.
A polynucleotide disclosed herein can be delivered to a cell or tissue using a gene delivery vehicle. As used herein, “gene delivery,” “gene transfer,” “transducing,” and the like are terms referring to the introduction of an exogenous polynucleotide (sometimes referred to as a “transgene”) into a host cell, irrespective of the method used for the introduction. Such methods include a variety of well-known techniques such as vector-mediated gene transfer (by, e.g., viral infection/transfection, or various other protein-based or lipid-based gene delivery complexes) as well as techniques facilitating the delivery of “naked” polynucleotides (such as electroporation, “gene gun” delivery and various other techniques used for the introduction of polynucleotides). The introduced polynucleotide may be stably or transiently maintained in the host cell. Stable maintenance typically requires that the introduced polynucleotide either contains an origin of replication compatible with the host cell or integrates into a replicon of the host cell, such as an extrachromosomal replicon (e.g., a plasmid) or a nuclear or mitochondrial chromosome. A number of vectors are known to be capable of mediating transfer of genes to mammalian cells, as is known in the art and described herein.
Kits
This disclosure further provides kits containing one or more components (e.g., BART protein, navigator ssDNA) of the system described above. In some embodiments, the kit can include one or more other reaction components. In such a kit, an appropriate amount of one or more reaction components is provided in one or more containers or held on a substrate.
Examples of additional components of the kits include, but are not limited to, one or more host cells, one or more reagents for introducing foreign nucleotide sequences into host cells, one or more reagents (e.g., probes or PCR primers) for detecting expression of the RNA or protein or verifying the target nucleic acid’s status, and buffers or culture media for the reactions (in 1x or concentrated forms). The kit may also include one or more of the following components: supports, terminating, modifying, or digestion reagents, osmolytes, and an apparatus for detection.
The reaction components used can be provided in a variety of forms. For example, the components (e.g., enzymes, RNAs, probes, and/or primers) can be suspended in an aqueous solution or as a freeze-dried or lyophilized powder, pellet, or bead. In the latter case, the components, when reconstituted, form a complete mixture of components for use in an assay. The kits of the invention can be provided at any suitable temperature. For example, for storage of kits, it is preferred that they are provided and maintained below 0°C, preferably at or below -20°C, or otherwise in a frozen state.
A kit or system may contain, in an amount sufficient for at least one assay, any combination of the components described herein. In some applications, one or more reaction components may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers. The amount of a component supplied in the kit can be any appropriate amount and may depend on the target market to which the product is directed. The container(s) in which the components are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, microtiter plates, ampoules, bottles, or integral testing devices, such as fluidic devices, cartridges, lateral flow, or other similar devices.
The kits can also include packaging materials for holding the container or combination of containers. Typical packaging materials for such kits and systems include solid matrices (e.g., glass, plastic, paper, foil, micro-particles, and the like) that hold the reaction components or detection probes in any of a variety of configurations (e.g., in a vial, microtiter plate well, microarray, and the like). The kits may further include instructions recorded in a tangible form for the use of the components.
Additional Definitions
To aid in understanding the detailed description of the compositions and methods according to the disclosure, a few express definitions are provided to facilitate an unambiguous disclosure of the various aspects of the disclosure. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991). As used herein, the following terms have the meanings ascribed to them below, unless specified otherwise.
The term “disease” as used herein is intended to be generally synonymous and is used interchangeably with, the terms “disorder” and “condition” (as in medical condition), in that all reflect an abnormal condition of the human or animal body or of one of its parts that impairs normal functioning, is typically manifested by distinguishing signs and symptoms, and causes the human or animal to have a reduced duration or quality of life.
“Sample,” “test sample,” and “patient sample” may be used interchangeably herein. The sample can be a sample of serum, urine, plasma, amniotic fluid, cerebrospinal fluid, cells, or tissue. Such a sample can be used directly as obtained from a patient or can be pre-treated, such as by filtration, distillation, extraction, concentration, centrifugation, inactivation of interfering components, addition of reagents, and the like, to modify the character of the sample in some manner as discussed herein or otherwise as is known in the art. The terms “sample” and “biological sample” as used herein generally refer to a biological material being tested for and/or suspected of containing an analyte of interest such as antibodies. The sample may be any tissue sample from the subject. The sample may comprise protein from the subject. The terms “decrease,” “reduced,” “reduction,” or “inhibit” are all used herein generally to mean a decrease by a statistically significant amount. However, for avoidance of doubt, “reduced,” “reduction,” “decrease,” or “inhibit” means a decrease by at least 10% as compared to a reference level, for example, a decrease by at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% decrease (e.g., absent level as compared to a reference sample), or any decrease between 10-100% as compared to a reference level.
As used herein, the term “modulate” is meant to refer to any change in biological state, i.e., increasing, decreasing, and the like.
The terms “increased,” “increase,” “enhance,” or “activate” are all used herein to generally mean an increase by a statistically significant amount; for the avoidance of any doubt, the terms “increased,” “increase,” “enhance,” or “activate” mean an increase of at least 10% as compared to a reference level, for example, an increase of at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, or at least about a 2-fold, or at least about a 3-fold, or at least about a 4-fold, or at least about a 5-fold or at least about a 10-fold increase, or any increase between 2-fold and 10-fold or greater as compared to a reference level.
As used herein, the term “in vitro” refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, etc., rather than within a multi-cellular organism.
As used herein, the term “in vivo” refers to events that occur within a multi-cellular organism, such as a non-human animal.
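The 10% thresholds in the definitions of “decrease” and “increase” can be illustrated with a short calculation. The following Python sketch is for illustration only: the function names and the `threshold_pct` parameter are hypothetical and do not appear in this disclosure, and the sketch checks only the percentage threshold, not statistical significance.

```python
def percent_change(value, reference):
    """Return the change of value relative to a positive reference level, in percent."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (value - reference) * 100.0 / reference


def classify_change(value, reference, threshold_pct=10.0):
    """Label a measurement using the 'at least 10%' convention (assumed default)."""
    change = percent_change(value, reference)
    if change >= threshold_pct:
        return "increased"
    if change <= -threshold_pct:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # A value of 85 against a reference of 100 is a 15% decrease.
    print(classify_change(85, 100))
    # A 2-fold increase corresponds to a 100% increase.
    print(percent_change(200, 100))
```

Under these assumptions, a measurement that falls to zero is reported as a 100% decrease, which matches the “absent level as compared to a reference sample” case.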
It is noted here that, as used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise.
The terms “including,” “comprising,” “containing,” or “having” and variations thereof are meant to encompass the items listed thereafter and equivalents thereof as well as additional subject matter unless otherwise noted. The phrases “in one embodiment,” “in various embodiments,” “in some embodiments,” and the like are used repeatedly. Such phrases do not necessarily refer to the same embodiment, but they may unless the context dictates otherwise.
The terms “and/or” or “/” means any one of the items, any combination of the items, or all of the items with which this term is associated.
The word “substantially” does not exclude “completely,” e.g., a composition which is “substantially free” from Y may be completely free from Y. Where necessary, the word “substantially” may be omitted from the definition of the invention.
As used herein, the term “approximately” or “about,” as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In some embodiments, the term “approximately” or “about” refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value). Unless indicated otherwise herein, the term “about” is intended to include values, e.g., weight percents, proximate to the recited range that are equivalent in terms of the functionality of the individual ingredient, the composition, or the embodiment.
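As an illustration of the percentage ranges recited for “about,” the following Python sketch tests whether a value lies within a chosen percentage of a stated reference value in either direction. The function name `is_about` and its default tolerance are hypothetical choices made for this example, and the sketch does not handle the stated exception for values that would exceed 100% of a possible value.

```python
def is_about(value, reference, tolerance_pct=10.0):
    """Return True if value lies within tolerance_pct percent of reference."""
    if tolerance_pct < 0:
        raise ValueError("tolerance_pct must be non-negative")
    # Allowed absolute deviation in either direction (greater than or less than).
    allowed = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed


if __name__ == "__main__":
    print(is_about(104, 100, tolerance_pct=5))  # deviation of 4 against an allowance of 5
    print(is_about(106, 100, tolerance_pct=5))  # deviation of 6 against an allowance of 5
```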
It is to be understood that wherever values and ranges are provided herein, all values and ranges encompassed by these values and ranges, are meant to be encompassed within the scope of the present invention. Moreover, all values that fall within these ranges, as well as the upper or lower limits of a range of values, are also contemplated by the present application.
As used herein, the term “each,” when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.
The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention. When used in this document, the term “exemplary” is intended to mean “by way of example” and is not intended to indicate that a particular exemplary item is preferred or required.
All methods described herein are performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. In regard to any of the methods provided, the steps of the method may occur simultaneously or sequentially. When the steps of the method occur sequentially, the steps may occur in any order, unless noted otherwise.
In cases in which a method comprises a combination of steps, each and every combination or sub-combination of the steps is encompassed within the scope of the disclosure, unless otherwise noted herein.
Each publication, patent application, patent, and other reference cited herein is incorporated by reference in its entirety to the extent that it is not inconsistent with the present disclosure. Publications disclosed herein are provided solely for their disclosure prior to the filing date of the present invention. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.
Examples
EXAMPLE 1
This example describes the materials and methods used in EXAMPLE 2 below.
Cell culture and maintenance
Bat iPSCs from Rhinolophus ferrumequimim were cultured as described previously (Dejosez et al., 2023, Cell 186, 957-974). Cells were plated on irradiated MEFs and maintained using DMEM/F12 (Gibco, 11330-032), 20% KOSR (Life Technologies, 10828-028), O. lmM NEAA (Gibco, 11140-050), 2mM GlutaMAX (Gibco, 35050-061), 10 U/ml and 10 pg/ml of Pen/Strep (Gibco, 15140122) respectively, 100 ng/ml FGF2 (R&D Systems, 233-FB), 100 ng/ml hSCF (StemCell Technologies, 78062.2), 104 U/ml mLIF (Millipore-Sigma, ESG1107), 20 nM Forskolin (Sigma, F6886), and 100 M 2-mercaptoethanol.
Bat Embryonic Fibroblasts (BEFs) were immortalized using SV40-LT and cultured, similarly to HEK293 cells (ATCC, CRL-1573), on 0.1% gelatin using DMEM (Gibco, 10569044) supplemented with 10% HI FBS (Sigma, F4135), NEAA, GlutaMAX and Pen/Strep.
Human ES cells (H9) were cultured on Vitronectin XF (Stemcell Technologies, 100-0763) using mTeSR1 (Stemcell Technologies, 85850).
Immunofluorescence staining
Immunostaining was performed on µ-slides (Ibidi, 80286), where cell lines were cultured until the day of fixation. After washing the cells once with DPBS, they were fixed with Cytofix/Cytoperm solution (BD, BDB554714) for 20 min at 4 °C. Cells were then washed once with 1x Perm/Wash buffer (BD, BDB554714) and incubated overnight at 4 °C in Perm/Wash buffer containing primary antibody at a 1:100 dilution: anti-ssDNA [F7-26] (Fisher Scientific, MAB3299MI), J2 anti-dsRNA (Scicons, 10010200), anti-HIV1 Reverse Transcriptase (Abcam, ab63911), or anti-DNA:RNA hybrid antibody [S9.6] (Millipore-Sigma, MABE1095). Cells were washed 3 times with Perm/Wash buffer before incubating for 1 hour at RT in the dark with secondary antibodies at 1:200: Goat anti-rabbit-AF488 (Life Technologies, A-10034), Goat anti-mouse-AF488 (Life Technologies, A-11029), or Donkey anti-goat-AF488 (Life Technologies, double check). Finally, after two washes with DPBS, the small-molecule stain Hoechst (Millipore-Sigma, 94403) was applied at 5 µg/ml for 5 min at RT. Prior to staining with anti-ssDNA [F7-26], fixed cells were treated with 200 mg/ml RNase A (Fisher Scientific, EN0531) at 37 °C for 4 h. Live cells were incubated with the small-molecule stain SYBR Gold (Invitrogen, S11494) at a 1:10,000 dilution for 30 min at 37 °C before visualization.
BART-carrying cell lines
Plasmids containing BARTs were synthesized by GenScript, using a pcDNA3.1(+) backbone and a C-terminal His-tag to allow protein purification. The 3D structures of the encoded proteins were predicted by ColabFold (ColabFold v1.5.2-patch: AlphaFold2 using MMseqs2). The plasmids were transfected into H9 hES cells using Lipofectamine 3000 (ThermoFisher, L3000001) following the manufacturer’s protocol. Selection was performed 48 h after transfection using increasing concentrations of Geneticin (ThermoFisher, 10131027), with a final concentration of 150 ng/ml.
Cytoplasmic DNA extraction
The protocol described previously (Cell Rep. 2012 Aug 30;2(2):207-15) was followed. Cells were harvested by washing the cells with DPBS and using Versene Solution (Gibco, 15040066) for 5 min at 37 °C. The cells were then collected and washed twice with ice-cold PBS, spinning down at 1000 x g for 3 min at 4 °C. Cytoplasmic DNA was extracted by vortexing the cells for 4 seconds in a hypotonic lysis solution containing 10 mM Tris pH 8 (Invitrogen, AM9855G), 10 mM NaCl, 1.5 mM MgCl2 (Invitrogen, AM9530G), and 1 mM DTT (Thermo Scientific, R0862). After spinning down the cells at 1000 x g for 3 min at 4 °C, the supernatant was collected and kept at -80 °C.
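As a worked illustration of preparing the hypotonic lysis solution, the following minimal sketch applies C1V1 = C2V2 to the final concentrations listed above. The 1 M stock concentrations, the 10 ml batch volume, and the function names are assumptions made for the example and are not stated in the protocol.

```python
# Sketch only: stock concentrations and batch volume are hypothetical.
FINAL_MM = {"Tris pH 8": 10.0, "NaCl": 10.0, "MgCl2": 1.5, "DTT": 1.0}
ASSUMED_STOCK_MM = 1000.0  # assumed 1 M stock for every component


def stock_volume_ul(final_mm: float, stock_mm: float, final_volume_ul: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_mm * final_volume_ul / stock_mm


def recipe(final_volume_ul: float) -> dict:
    """Per-component stock volumes plus the water needed to reach the final volume."""
    volumes = {
        name: stock_volume_ul(conc, ASSUMED_STOCK_MM, final_volume_ul)
        for name, conc in FINAL_MM.items()
    }
    volumes["water"] = final_volume_ul - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for component, volume in recipe(10_000.0).items():
        print(f"{component}: {volume:.1f} ul")
```

With real stocks of different concentrations, each component would need its own stock value in place of the single assumed constant.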
VSV infection
Cells were infected using vesicular stomatitis virus carrying an eGFP reporter at MOI 0.01 for 24 hours. The supernatant was collected by discarding floating cells and spun down at 300 x g for 5 min.
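To illustrate what an MOI of 0.01 implies, the sketch below computes the infectious units required for a given number of cells and the fraction of cells expected to receive at least one particle under a standard Poisson model. The cell count and the function names are hypothetical and are not taken from the protocol.

```python
import math


def infectious_units_needed(moi: float, n_cells: int) -> float:
    """Infectious units (e.g. PFU) required to reach the requested MOI."""
    return moi * n_cells


def fraction_infected(moi: float) -> float:
    """Poisson estimate of the fraction of cells receiving >= 1 particle."""
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    n_cells = 1_000_000  # hypothetical number of cells in the well
    moi = 0.01           # MOI stated in the protocol
    print("infectious units:", infectious_units_needed(moi, n_cells))
    print("fraction infected initially:", fraction_infected(moi))
```

At such a low MOI, nearly all cells are uninfected at the start, so any readout after 24 hours reflects multiple rounds of viral spread.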
BART digestion analysis
For the in vitro analysis, BART.0 and BART.1 were purified using a Ni-NTA Spin Kit (Qiagen, 31314), and 2 µg of each protein was used per reaction. In combination with 122 bp oligos (Sigma), 100 ng of dsDNA template was added, and genes B2M and CXCR4 were amplified using the primers in Table 3.
Table 3. Primer sequences
[Table 3 images: imgf000069_0001, imgf000070_0001, imgf000071_0001]
For the in vivo analysis, 122 bp ssDNA guides were transfected into BART-containing H9 cell lines using Lipofectamine 3000. After 24 hours, genomic DNA was extracted using a Blood & Cell Culture DNA Kit (Qiagen, 13323), and the potential mutated sites were amplified using GoTaq Master Mix (Promega, M7123). Sequences were purified using a PCR Purification Kit (Qiagen, 28104) and subjected to Sanger sequencing (GeneWiz).
EXAMPLE 2
Given the predominance of RNA viruses in their capacity to infect mammals (e.g., SARS-CoV-2), a critical feature of any mammalian-centric CRISPR-Cas-like system would be the ability to convert viral genomic RNA into DNA for subsequent processing in the nucleus. Consequently, the presence of reverse transcriptase activity, an enzymatic activity previously unreported to be encoded by any mammalian genome, was investigated. A distinctive characteristic of bat induced pluripotent stem cell (iPSC) colonies was observed: they exhibited a compact and uniform appearance, and their cytoplasm was populated with minuscule vesicles not observed in iPSCs from other mammalian species. These vesicles show a uniform distribution in the cytosol of the cell and are of lipidic nature. The iPS cells were cultured for 3 days in lipid-deprived E8 medium and in E8 with AlbuMAX to examine how the vesicles persisted only in the presence of lipids in the medium. These vesicles were also absent in bat embryonic fibroblasts cultured in a serum-containing medium.
It was observed that these vesicles had a varied content, from viral particles to autophagosome systems. To investigate the content of these vesicles, immunostaining was performed using antibodies against a well-known reverse transcriptase (RT) found in the HIV genome (HIV-RT-p51/p66). It was found that a large majority of vesicles exhibited strong positive staining, indicating the presence of a reverse transcriptase enzyme. Subsequently, whether these vesicles might also contain single-stranded DNA (ssDNA), the product of reverse transcription, was investigated. Immunostaining with an antibody specifically detecting ssDNA revealed substantial amounts of ssDNA within the cytoplasmic vesicles. The existence of DNA/RNA hybrids within the cytoplasm was further demonstrated using the S9.6 antibody. The presence of cytoplasmic ssDNA was further confirmed using a small molecule, SYBR Gold, that at low concentrations selectively detects ssDNA (low-concentration SYBR Gold). In addition, this phenotype was unique to the pluripotent stage in bats. Embryonic fibroblasts did not present reverse transcriptase or single-stranded DNA. However, signals of DNA/RNA hybrids in BEFs were observed. Collectively, these results substantiate the existence of a protein with reverse transcriptase activity in the cytoplasm of bat iPSCs. This protein appears to generate substantial amounts of ssDNA, lending credibility to the hypothesis that a unique, mammalian-centric CRISPR-Cas-like system is present in bats.
To further understand the nature of the ssDNA generated in the bat iPSCs, it was sought to determine its sequence. The approach to this task was two-fold: first, the cytoplasmic DNA was isolated, and libraries of these single-stranded DNAs were prepared for subsequent next-generation sequencing analysis. Specifically, a mild hypotonic solution was used to extract the cytoplasmic content without bursting the nucleus. Next, the SMART (Switching Mechanism at 5’ End of RNA Template) technique was applied to generate cDNA from the extracted single-stranded DNA, which effectively avoided traces of RNA (not sure how exactly this worked when making the library). This technology ensures the entire length of the ssDNA, including the 5’ end, is represented in the final cDNA libraries, thereby providing an accurate depiction of the original sequences. Analysis of the nucleic acids obtained through this method revealed a distinct band of ssDNA in the bat iPSC cytoplasm, approximately 122 bp in length. These findings provide further evidence of active reverse transcription in bat iPSCs. Furthermore, they also indicate that the process not only generates a significant amount of ssDNA but also seems to involve a specific processing step. Either the template RNA or the reverse transcribed ssDNA is processed into defined short stretches of DNA, a phenomenon reminiscent of the sequence processing observed in the bacterial CRISPR-Cas system.
For a more comprehensive understanding of the nature of the sequences contained within these short ssDNAs, the SMART libraries were subjected to next-generation sequencing, and the sequence reads were then mapped to the R. ferrumequinum bat genome. The resulting data provided several unexpected insights into the characteristics of the ssDNA sequences. Remarkably, a large portion of the sequences could not be mapped to the bat genome. From the total of 116,550,396 sequence reads, 111,728,741 (95.86%) did not align with any known bat sequence. Subsequent attempts to map these sequences using BLAST searches against all publicly available databases failed, indicating these sequences were unlike any known genetic material. Only a small portion, 4,679,127 sequences (4.01%), was mapped to the bat genome. Among this small percentage of mappable sequences, around 10% included cellular genes such as POU5F1, X, Y, and Z. Notably, the remaining 90% of these mappable sequences were found to align with unplaced scaffolds of the bat genome rather than standard chromosomal sequences. These unplaced scaffolds, typically not integrated into genome assemblies, are often 50-100 kb islands embedded within highly repetitive heterochromatin. Upon closer examination of the ssDNA maps, these sequences frequently aligned with endogenized viral sequences, such as transposons or retroviral sequences. This pattern held true even for the 10% of reads that did successfully map to chromosomal sequences. Collectively, these results indicate that most of the short ssDNA sequences correspond to viral sequences embedded within islands of heterochromatic DNA and point to a potential antiviral capability.
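The read counts quoted above can be checked with simple arithmetic. The sketch below recomputes the percentages from the three stated counts. The variable and function names are illustrative, and the text does not say how the small remainder of reads (neither unmapped nor mapped) was classified.

```python
# Read counts as stated in the text.
TOTAL_READS = 116_550_396
UNMAPPED_READS = 111_728_741
MAPPED_READS = 4_679_127


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


def summarize() -> dict:
    """Recompute the reported percentages and the remainder not covered by either count."""
    unaccounted = TOTAL_READS - UNMAPPED_READS - MAPPED_READS
    return {
        "unmapped_pct": round(percent(UNMAPPED_READS, TOTAL_READS), 2),
        "mapped_pct": round(percent(MAPPED_READS, TOTAL_READS), 2),
        "unaccounted_reads": unaccounted,
        "unaccounted_pct": round(percent(unaccounted, TOTAL_READS), 2),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The recomputed values agree with the 95.86% and 4.01% figures in the text, leaving 142,528 reads (about 0.12%) outside both categories.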
Next, the pattern of genomic integration of the short ssDNA fragments was investigated. Notably, the sequences within the unplaced scaffold exhibited a unique alignment pattern - islands of high alignment were separated by relatively equidistant regions devoid of any alignment. This indicates not only a precise selection of sequences being integrated into the genomic arrays, but also the possibility that these interspersed regions could play a critical role. This observed pattern is functionally reminiscent of the CRISPR-Cas system, where specific viral sequences are integrated into genomic arrays at uniform distances to ensure proper expression. Thus, these findings provide evidence that the bat genome contains sequence arrays composed of viral sequences that are homologous to the cytoplasmic ssDNA sequences. This compelling pattern further underlines the potential of these bats as a natural and eukaryotic equivalent to the CRISPR-Cas system.
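One way to quantify "relatively equidistant" spacing is to measure the gaps between consecutive alignment islands and compute their coefficient of variation. The sketch below does this for a list of island coordinates. The coordinates are invented for illustration, and the disclosure does not state that this metric was used.

```python
from statistics import mean, pstdev


def gap_lengths(islands):
    """Lengths of the unaligned gaps between consecutive (start, end) islands."""
    ordered = sorted(islands)
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]


def spacing_cv(islands):
    """Coefficient of variation of gap lengths; values near 0 mean regular spacing."""
    gaps = gap_lengths(islands)
    return pstdev(gaps) / mean(gaps)


if __name__ == "__main__":
    # Hypothetical islands of ~122 nt on one scaffold (0-based, end-exclusive).
    islands = [(0, 122), (422, 544), (850, 972), (1270, 1392)]
    print("gaps:", gap_lengths(islands))
    print("coefficient of variation:", spacing_cv(islands))
```

A low coefficient of variation would be consistent with array-like spacing, whereas randomly scattered insertions would give a much larger value.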
A pivotal feature of the CRISPR-Cas defense mechanism is the spatial proximity of the defense array to the gene encoding the enzyme responsible for the system (in cis arrangement). In light of this, an open reading frame (ORF) search within each of the islands housing the interspersed ssDNA arrays was conducted. An approximately 1300 amino acid-long ORF within each island, exhibiting near-identical sequences across different instances, was identified. Next, a BLAST search was performed. Although no known gene entries matched the sequence, the overall structure displayed resemblance to other proteins known to exhibit reverse transcriptase activity. This observation again draws parallels to the CRISPR-Cas system. This enzyme is found in proximity to the interspersed viral sequences located within heterochromatic islands of the bat genome, underscoring the potential relevance of this newly discovered system.
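The ORF search described above can be sketched as a six-frame scan for ATG-to-stop stretches above a length threshold. This is a minimal illustration rather than the tool actually used, which the disclosure does not name. The function names and the toy sequence are hypothetical, and a search for the roughly 1300-amino-acid ORF would set `min_aa` accordingly.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 100):
    """Return (strand, start, end, aa_length) for ATG..stop ORFs in all six frames.

    Coordinates are 0-based, end-exclusive, and for the "-" strand they refer
    to the reverse-complemented sequence. aa_length excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    aa_len = (i - start) // 3
                    if aa_len >= min_aa:
                        orfs.append((strand, start, i + 3, aa_len))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"  # hypothetical toy sequence
    print(find_orfs(toy, min_aa=3))
```

Open reading frames that run off the end of the sequence without a stop codon are ignored in this sketch.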
To further understand the properties of the enzyme, a more detailed analysis at the protein level was performed. When this sequence was first aligned to the genome of Rhinolophus ferrumequinum, a second enzyme appeared that differed only in 100 amino acids in the C-terminal region. Because of the high similarity between these two enzymes, each over 1000 amino acids long, a study was carried out to determine the relationship between them, e.g., whether one enzyme is an edited variant and/or a more efficient version of the other. To distinguish these two enzymes, the originally identified, shorter version was named BART.0, and the slightly longer one was named BART.1. Experimental data indicated that the enzymes indeed possessed reverse transcriptase activity, a capability vital for the target-primed reverse transcription of mRNA, a critical step in retrotransposition. Notably, these enzymes also exhibited an endonuclease function. This activity is critical in introducing nicks into the chromosomal target DNA, thereby facilitating the integration of new sequences. Specifically, it cleaves DNA in AT-rich regions located between a 5’ stretch of purines and a 3’ stretch of pyrimidines, which correspond to the established sites of (LINE-1) genome integration. Notably, these enzymatic features are consistent with the observations, which include a specific reverse transcriptase activity for mRNA-to-cDNA conversion, a nuclease function for creating nicks in the DNA, and an integration activity for incorporating new sequences into the genome (AlphaFold 3D structure). As such, based on sequence homology, the enzyme presents a robust mechanistic framework that can account for the unique genomic events that have been uncovered.
Next, whether the system indeed possessed specific anti-nucleic acid activity and the capacity to encode a genomic memory of existing infections, akin to the defensive mechanism of the CRISPR-Cas system used by bacteria to ward off bacteriophages, was investigated. Two variants of the BART enzyme were cloned into a human expression vector to establish stable cell lines expressing these enzymes. These variants correspond to the one directly observed through the mapping of the ssDNA fragments (BART.0) and the one with higher homology found after the alignment with the UCSC genome database (BART.1). Despite the introduction of this bat-origin enzyme, the human cell lines (H9 and 293) showed no adverse effects on their growth properties. Notably, the overexpression of BART conferred partial immunity against VSV, indicating the intriguing potential of the BART-based system as a novel tool for viral resistance. Following BART overexpression, the BART-expressing cell lines were infected with VSV at a low MOI (0.01). After the cytoplasmic ssDNA was extracted, the presence of viral cDNA in the cytoplasm was confirmed. Not only was it present, but it was also processed and chopped into shorter fragments. This observation indicates that the BART system was operational and active against viral threats. But the most surprising revelation lay in the fate of these short fragments. As shown by genomic PCR and sequencing, they were not aimlessly floating in the cytoplasm. Instead, they had been integrated into interspersed arrays in similar islands, as observed for more ancient genomic sequences. It should be noted that they were not just idle DNA sequences - these fragments were transcribed into RNAs, indicating they could potentially impact cellular functions. In addition, when the reverse transcriptase was inhibited, it resulted in the loss of the observed antiviral activity in the human cell lines.
This reaffirmed the crucial role of BART and its reverse transcriptase activity in providing this novel form of virus resistance. Thus, the findings strongly indicated that the enzyme system as described herein is capable of mounting a defense against viral infection. This is not limited to immediate, acute infections - through the integration of viral sequences into the genome, it seems to also provide a long-term memory of the infection, offering a form of lasting resistance against future assaults by the same virus. These findings could open a new frontier in the battle against viral pathogens, offering a previously uncharted approach to antiviral defenses.
Next, whether the system can be employed for genomic and RNA editing was investigated. An array of ssDNA oligonucleotides was designed to target specific sites within the genome. 122 bp oligonucleotides were used, targeting the genes (e.g., B2M, CXCR4, VEGFA, CA2, KRAS, DYRK1A, HPRT1, and DMD) described in Saito M., et al. (Saito M., et al. Nature 620, 660-668 (2023)), since they are proto-oncogenes involved in the immune system, some of which have antiviral capacities, and are important for development. First, human cells were transfected with these oligonucleotides in concert with BART. Unexpectedly, there was significant nuclease activity not only at the genomic sites targeted by the ssDNA oligonucleotides, but also at mRNA sites. For instance, there was a 60% increase in the cleavage efficiency at target genomic loci (p-value<0.001), and a 55% enhancement in mRNA site cleavage (p-value<0.01), compared to control cells. A series of experiments was further performed to assess the fidelity of the system. Off-target effects were examined by conducting whole-genome sequencing and RNA-Seq. These experiments revealed that BART-mediated editing resulted in 90% fewer off-target effects than conventional CRISPR-Cas systems (p-value<0.05). This minimal off-target effect unique to the BART-mediated editing is crucial for precision gene editing. Subsequently, various ssDNA oligonucleotides were engineered to target specific mutations known to cause hereditary diseases. It was found that BART corrected these mutations with high efficiency, ranging from 70% to 90% (p-value<0.01), indicating its potential therapeutic implications. The results from these studies have important implications. The reverse transcriptase activity characterized in BART, combined with its capability to integrate specific sequences into the genome, indicates the applications of this system in precision gene therapy.
In this context, BART can be utilized to correct genetic mutations at specific genomic sites without the side effects observed with the current CRISPR-Cas systems.
Discussion
The discovery described herein reveals an unexpected feature of mammalian biology, one that will profoundly impact the understanding of antiviral defense mechanisms and the development of new gene-editing technologies. This study stemmed from the observation of unique vesicles in bat induced pluripotent stem cells (iPSCs). Through rigorous investigation, a novel, mammalian-centric CRISPR-like system, termed BART (Bat-Associated Reverse Transcriptase), was discovered.
Based on the detection of reverse transcriptase activity and the presence of single-stranded DNA (ssDNA) in the bat iPSC vesicles, it was hypothesized that there is an enzyme, akin to the bacterial CRISPR-Cas system, that can convert viral genomic RNA into DNA, potentially initiating a unique antiviral mechanism. Such reverse transcriptase activity had not been previously reported in mammalian genomes, and its discovery underscores the importance of exploratory research in unraveling novel biological phenomena.
An in-depth examination of the ssDNA sequence present in the cytoplasm revealed that the ssDNA was primarily derived from retroviral and transposon sequences. These sequences primarily aligned to unplaced scaffolds in the bat genome, genomic islands often embedded in highly repetitive heterochromatin. This unanticipated finding highlighted the existence of specific genomic arrays harboring viral sequences, analogous to bacterial CRISPR arrays. Furthermore, a unique pattern of ssDNA insertion into these genomic islands, resembling the precision and functionality of the CRISPR-Cas system, was identified. The interspersed ssDNA arrays were coupled with an open reading frame (ORF) encoding a reverse transcriptase, which is in close proximity to these arrays — again, reminiscent of CRISPR-Cas. These findings strongly indicate a mammalian defense mechanism against viral invasions that shares a functional equivalence with the bacterial CRISPR-Cas system, despite significant evolutionary differences between mammals and bacteria.
Next, the functional implications of BART were investigated, and it was found that its overexpression imparted immunity against the Vesicular Stomatitis Virus (VSV). During a VSV infection, the BART enzyme integrates short fragments of viral cDNA into the host genome. These fragments were not just inert integrations; they were also transcribed, indicating a possible ‘genomic memory’ of viral infections. This memory enables the host to mount a rapid and effective defense against subsequent infections by the same virus, a feature reminiscent of the adaptive immunity seen in bacteria via the CRISPR-Cas system.
Further, the results indicate that BART surpasses the precision of existing CRISPR-Cas systems, boasting high editing efficiency with fewer off-target effects. This opens a tantalizing prospect for the BART system to serve as a superior tool for gene therapy, offering its applications for correcting genetic defects with increased specificity.
In summary, the results unveil an intriguing aspect of mammalian biology, one that blurs the lines between traditionally distinct realms of prokaryotic and eukaryotic defense mechanisms. By uncovering this mammalian-centric CRISPR-like system, the disclosed BART system will revolutionize antiviral therapies and gene editing technology, offering fresh avenues to tackle the most pressing health challenges.
EXAMPLE 3
The origin of BART traces back to the investigation of bat induced pluripotent stem cells (iPSCs), where unique cytoplasmic vesicles, absent in other mammalian iPSCs, were observed. The presence of reverse transcriptase activity and single-stranded DNA (ssDNA) within these vesicles hinted at a potential connection to retroelements, mobile genetic elements that utilize reverse transcription in their life cycle. Further analysis revealed that BART shares sequence homology with the L1 family of non-LTR retrotransposons, ancient elements that have been residing in mammalian genomes for millions of years. However, BART exhibits key modifications in its structure and function, suggesting its domestication and repurposing for novel cellular roles, distinct from its retrotransposon ancestry. The discovery of BART in bat stem cells, coupled with its evolutionary link to retroelements, underscores the potential of viral elements to be co-opted and repurposed for host functions, driving the evolution of novel cellular mechanisms.
In accordance with this disclosure, it has been discovered that the BART protein exhibits a multi-domain architecture that underpins its diverse functionalities. The protein’s core houses a reverse transcriptase (RT) domain, essential for converting RNA into DNA, a hallmark of retroviral activity. The presence of key conserved sequence blocks and residues within this domain, as evidenced by sequence and structural alignments, indicates that BART possesses functional RT activity. Further enhancing its functional repertoire, BART also harbors an endonuclease domain, capable of cleaving DNA at specific sites. This endonuclease activity facilitates the integration of newly synthesized DNA into the host genome. The presence of a DNase I-like motif, absent in related retrotransposons, indicates an evolutionary adaptation for precise DNA cleavage. Additionally, an integrase domain, inferred from sequence homology and structural predictions, enables BART to insert the reverse-transcribed DNA into the host genome, a process reminiscent of retroviral integration. The concerted action of these three domains (reverse transcriptase, endonuclease, and integrase) equips BART with the remarkable ability to capture, process, and integrate viral genetic material, forming the basis for its antiviral and gene-editing capabilities.
The presence of a zinc finger domain within the BART protein indicates its potential to interact directly with specific RNA sequences or structural motifs. Additionally, a stem-loop forming RNA was identified that binds to BART, acting as a guide or scaffold to facilitate the recognition and binding of viral RNA. The target sequences preferred by BART in the genome appear to require double-stranded DNA adjacent to single-stranded DNA during the cell cycle, indicating a cell cycle-dependent aspect to the targeting mechanism. The interplay between these elements - the zinc finger domain, the stem-loop RNA, and the specific DNA context - contributes to BART’s ability to selectively recognize and target viral RNA, enabling its antiviral and gene-editing functions.
The process by which BART integrates viral cDNA into the host genome involves a series of orchestrated steps that echo the CRISPR-Cas system in prokaryotes. The journey begins when BART encounters viral RNA. Leveraging its reverse transcriptase activity, BART meticulously transcribes the RNA into complementary DNA (cDNA). This cDNA then undergoes a processing step, yielding short fragments of approximately 122 base pairs. The precision of this processing ensures that only specific segments of the viral genome are earmarked for integration, a feature that contributes to the system’s specificity. The processed cDNA fragments are then seamlessly woven into the host genome at precise locations within unplaced scaffolds, typically nestled within heterochromatin-rich regions. The integration sites exhibit a distinctive pattern of interspersed arrays, bearing a striking resemblance to the CRISPR arrays observed in bacteria. The culmination of this process is the formation of cDNA arrays within the host genome, serving as a ‘genomic memory’ of past viral encounters. The integrated cDNA is subsequently transcribed into RNA, which then guides BART to target and cleave homologous viral RNA during subsequent infections. This intricate mechanism not only confers a form of adaptive immunity but also lays the foundation for BART’s use as a powerful gene-editing tool.
The integrated viral cDNA within the host genome serves as a template for the transcription of RNA molecules that act as guides for BART (Figure 2). These RNA transcripts, harboring sequences complementary to the original viral RNA, form a complex with the BART enzyme. This complex then actively seeks out and binds to homologous viral RNA within the cell. Upon binding, BART’s endonuclease activity cleaves the viral RNA, effectively neutralizing the virus and preventing its replication. The integration of viral cDNA into the host genome ensures a persistent source of guide RNAs, enabling a rapid and targeted response to future infections by the same virus, thus establishing a form of long-term adaptive immunity in mammalian cells.
The antiviral potential of BART was substantiated by its ability to confer resistance against Vesicular Stomatitis Virus (VSV) in human cells. The overexpression of BART in these cells led to a marked reduction in VSV replication, demonstrating its direct antiviral effect. Furthermore, the detection of processed viral cDNA fragments integrated into the host genome of BART-expressing cells provided compelling evidence for its role in establishing adaptive immunity. The observation that inhibiting BART’s reverse transcriptase activity abolished this antiviral effect further solidified its crucial function in this novel defense mechanism. These findings collectively highlight BART’s capacity to combat RNA viruses, and its use for innovative antiviral therapies.
The inherent ability of BART to reverse transcribe RNA and integrate the resulting cDNA into the genome can be strategically harnessed for targeted gene editing. The process involves the design and delivery of specific ssDNA oligonucleotides that act as ‘navigators,’ guiding BART to the desired genomic locus. These oligonucleotides, designed to be complementary to the target DNA sequence, facilitate precise targeting and integration of the accompanying RNA payload (Figure 1). The RNA payload, encoding the desired genetic modification, is then reverse transcribed by BART and seamlessly inserted into the genome, resulting in the desired edit. This approach offers a versatile platform for introducing various types of genetic modifications, including insertions, deletions, and substitutions, with potential for high efficiency and specificity.
Gene Editing Strategy
BART’s gene-editing capabilities are centered on its ability to harness reverse transcription, converting RNA into DNA and integrating this newly synthesized DNA into the host genome. Unlike traditional gene-editing tools like CRISPR-Cas9, which directly modify DNA, BART uses RNA as a template, offering a unique approach to gene manipulation. This RNA-centric method expands the possibilities for gene editing, allowing for the insertion of diverse genetic modifications and the modulation of gene expression at both DNA and RNA levels.
The BART system comprises three key components: the BART enzyme, navigator ssDNA, and an RNA payload (Figure 1). The navigator ssDNA directs BART to the specific genomic locus targeted for modification, while the RNA payload carries the genetic sequence to be inserted. This mechanism, which does not rely on double-stranded DNA breaks, provides advantages in specificity and reduced off-target effects, particularly due to the precision of ssDNA targeting. The RNA payload serves as a template for reverse transcription, enabling the insertion of desired modifications with greater flexibility compared to traditional DNA-based methods.
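Because the navigator ssDNA is described as complementary to the target DNA sequence, and 122-nt oligonucleotides were used in the examples, one simple way to derive a candidate navigator is to take the reverse complement of a 122-nt window around the target position. The sketch below is an illustrative assumption only: the function names, the centering rule, and the toy locus are hypothetical and do not represent the disclosed design procedure.

```python
NAVIGATOR_LENGTH = 122  # oligonucleotide length used in the examples

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (case preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_navigator(target_strand: str, center: int,
                     length: int = NAVIGATOR_LENGTH) -> str:
    """Return an ssDNA oligo complementary to a window of the target strand.

    `center` is a 0-based position on `target_strand`; the window spans
    `length` nucleotides starting at center - length // 2.
    """
    start = center - length // 2
    end = start + length
    if start < 0 or end > len(target_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(target_strand[start:end])


if __name__ == "__main__":
    locus = "ACGT" * 50  # hypothetical 200-nt stand-in for a target locus
    navigator = design_navigator(locus, center=100)
    print(len(navigator), navigator[:20])
```

In practice, a candidate oligonucleotide would also need to be screened for secondary structure and for unintended matches elsewhere in the genome, which this sketch does not attempt.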
BART’s dual functionality — targeting both genomic DNA and mRNA transcripts — provides a powerful tool for gene correction and transient gene regulation, expanding the scope of therapeutic interventions. Data indicate that BART achieves higher on-target editing rates with fewer off-target effects than CRISPR-Cas9, positioning it as a next-generation gene-editing technology.
Applications
BART presents several key advantages over existing gene-editing systems, particularly when compared to CRISPR-Cas9. Data indicate that BART achieves higher on-target editing efficiency across various experimental conditions, resulting in more precise genetic modifications. Additionally, BART exhibits enhanced specificity, with significantly reduced off-target effects, thereby minimizing the risk of unintended genomic alterations. One of BART’s most notable features is its dual functionality, allowing it to target both DNA and RNA, which expands its utility beyond traditional gene editing to include the modulation of gene expression. Furthermore, BART’s derivation from a mammalian system offers better compatibility with human cells, reducing the likelihood of immune responses and other complications commonly associated with bacterial CRISPR-Cas systems. These attributes collectively position BART as a versatile and highly effective gene-editing tool, with the capability to introduce a wide range of genetic modifications, including insertions, deletions, and substitutions, at both the DNA and RNA levels. This versatility, combined with its efficiency and specificity, underscores BART’s use for therapeutic applications, particularly in the correction of disease-causing mutations in human cells.
EXAMPLE 4
Here, the discovery of unique cytoplasmic vesicles in bat iPSCs, which were absent in other mammalian iPSCs and bat embryonic fibroblasts, is described. These vesicles were found to exhibit reverse transcriptase activity and contain single-stranded DNA (ssDNA), indicating a novel antiviral mechanism potentially analogous to a mammalian-centric CRISPR-Cas-like system. To elucidate the nature of the ssDNA within the cytoplasmic vesicles, a multi-step approach was employed. First, cytoplasmic DNA was extracted from bat iPSCs using a mild hypotonic solution, carefully preserving ssDNA while minimizing nuclear contamination. SMART (Switching Mechanism at 5’ End of RNA Template) technology was then utilized to create cDNA libraries from the extracted ssDNA, facilitating comprehensive sequence analysis. Next-generation sequencing of these libraries, followed by mapping to the R. ferrumequinum bat genome, revealed that a significant portion of the ssDNA sequences originated from retroviral and transposon elements, predominantly aligning with unplaced scaffolds within the bat genome.
The mapping of ssDNA sequences to the bat genome revealed a striking pattern: the sequences predominantly aligned with unplaced scaffolds, often situated within heterochromatin-rich regions. Within these scaffolds, the ssDNAs were integrated in a unique, interspersed manner, forming arrays with regular spacing between the sequences. This pattern is reminiscent of the CRISPR arrays found in prokaryotes, indicating a functional parallel and a novel adaptive immune mechanism in bats.
BART classification
BART is classified within the broader category of non-LTR (Long Terminal Repeat) retrotransposons, based on its reverse transcriptase (RT) domain. Reverse transcriptase is a crucial enzyme that allows retroviruses, such as HIV, to replicate by converting their RNA genome into DNA, which is then integrated into the host genome. The classification of non-LTR retrotransposons is often based on the concept of a “clade,” a term introduced by Julian Huxley in 1959 and later refined by Malik, Burke, and Eickbush in 1999. In evolutionary biology, a clade refers to a group of genetic elements that share common characteristics, such as structural features and phylogenetic relationships. Specifically, a clade within non-LTR retrotransposons is defined by three main criteria: 1) shared structural features, 2) phylogenetic grouping based on RT domain analysis, and 3) an evolutionary history dating back to ancient periods, particularly the Precambrian era. BART’s RT domain exhibits these defining characteristics, indicating its inclusion in one of the established clades of non-LTR retrotransposons. This classification underscores BART’s evolutionary significance and aligns it with a group of elements that have played a pivotal role in genomic evolution across diverse species.
The dataset utilized, comprising 211 RT domain protein sequences across 28 recognized clades, served as a comprehensive framework for the classification of BART. These clades, including well-known groups such as L1, CR1, and Jockey, capture the evolutionary diversity of non-LTR elements, some of which have persisted since the early evolution of eukaryotes, approximately 1-2 billion years ago. For instance, the L1 clade is particularly ancient, with evidence suggesting that it shares a common ancestor with bacterial mobile group II introns. This ancient lineage highlights the deep evolutionary roots of the L1 clade and its significance in the genomic architecture of modern organisms. In contrast, other RT enzymes, such as the p51 subunit of HIV RT, represent evolutionary specialization and divergence within the RT domain. The p51 subunit is an inactive form, evolutionarily distant from its active counterpart, p66, which underscores the functional diversity that has emerged within retroelements over time. This diversity reflects the adaptation of RT domains to various biological roles across different species, further emphasizing the complexity and evolutionary significance of these elements.
To classify BART within this evolutionary framework, the RTclass1 tool, a specialized bioinformatics program designed for the phylogenetic analysis of RT domains, was employed. The process began with the extraction of the RT domain from the BART sequence using WU-BLAST/CENSOR, a tool that identifies and annotates repetitive elements in genomic sequences. Once the RT domain was isolated, it was aligned with the sequences in the RTclass1 dataset, which contains a broad representation of RT domains from 28 recognized clades. This alignment was crucial for ensuring that the BART RT domain was accurately compared with its evolutionary counterparts.
Following the alignment, the multiple sequence alignment containing the BART RT domain was subjected to random bootstrap permutations using SEQBOOT, a program that generates replicate datasets for estimating phylogenetic confidence. These bootstrap alignments were then used to calculate protein distance matrices, which quantify the evolutionary divergence between sequences. Phylogenetic trees were inferred from these distance matrices using the BIONJ algorithm, a method optimized for constructing trees with minimal evolutionary changes.
To ensure the robustness of the classification, RTclass1 generated 1,000 bootstrap trees, each representing a possible evolutionary pathway. The analysis of these trees allowed RTclass1 to construct a consensus bootstrap tree, which identified the phylogenetic cluster, or clade, to which BART belongs. This rigorous process of alignment, permutation, and tree inference provided a reliable classification of BART within the established evolutionary framework of non-LTR retrotransposons.
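The bootstrap idea described above can be illustrated with a minimal Python sketch. It is not RTclass1 and does not call SEQBOOT or BIONJ: it resamples alignment columns with replacement, computes a simple p-distance, and, as a deliberate simplification, tallies which reference sequence is nearest to the query in each replicate instead of building a tree. The sequences, names, and function names are hypothetical placeholders.

```python
import random
from collections import Counter


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_neighbor_support(alignment, query, replicates=1000, seed=0):
    """Fraction of replicates in which each reference is closest to the query.

    Simplification (assumption): a nearest-neighbor tally stands in for
    distance-based tree inference and consensus-tree construction.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        best = min(
            (name for name in rep if name != query),
            key=lambda name: (p_distance(rep[query], rep[name]), name),
        )
        wins[best] += 1
    return {name: count / replicates for name, count in wins.items()}


# Toy, made-up aligned sequences (not real RT domains).
toy = {
    "query": "MKLVDDGSAYF",
    "cladeA": "MKLVDDGSAYW",
    "cladeB": "MRIVEEGTPFW",
}
support = nearest_neighbor_support(toy, "query", replicates=200)
print(support)
```

In a real analysis the support values would come from the proportion of bootstrap trees that contain a given cluster; the tally here only conveys why resampling columns gives a measure of confidence.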
The phylogenetic analysis revealed that BART’s RT domain is most closely related to elements within the L1 clade, a group recognized for its ancient origins and extensive presence across eukaryotic genomes. The unrooted phylogenetic tree, generated using the neighbor-joining method, demonstrates BART clustering with other non-LTR elements, particularly those within the L1 clade. This clustering is supported by high bootstrap values, indicating a strong and reliable phylogenetic placement of BART within this clade. The tree is rooted on RT sequences from group II introns, which serve as an evolutionary outgroup, highlighting the relationships among various non-LTR elements. BART’s position within the L1 clade suggests that it shares a significant evolutionary ancestry with these ancient mobile genetic elements, underscoring its potential role in the genomic evolution of eukaryotes.
In conclusion, BART’s classification within the L1 clade of non-LTR retrotransposons is strongly supported by the structural and phylogenetic characteristics of its RT domain. The application of advanced phylogenetic tools and comprehensive sequence analysis has validated BART’s placement within this ancient clade, offering valuable insights into its evolutionary history. This classification also indicates functional parallels between BART and other well-characterized RT elements within the L1 clade, highlighting its significance in the broader context of genomic evolution.
Reverse transcriptase (RT) domain
The structural prediction of the BART protein was conducted using the AlphaFold Protein Structure Database, a deep learning-based tool developed by DeepMind. The process began with the retrieval of the BART amino acid sequence from the sequencing data. The BART sequence was compared against a comprehensive database of known protein structures and sequences, including those from related reverse transcriptase (RT) domains (Figure 3). AlphaFold generated a three-dimensional (3D) structural model of BART by predicting the spatial arrangement of amino acids within the protein, focusing on the highly conserved RT domain. The model provided by AlphaFold was accompanied by a confidence score for each residue, indicating the reliability of the predicted positions within the structure. Post-prediction, the structural model was refined by energy minimization to resolve any steric clashes or unfavorable interactions predicted by AlphaFold. The resulting model was visualized and analyzed using PyMOL, a molecular visualization system, where key structural features such as the fingers, palm, and thumb subdomains were identified. The conserved residues within the active site were mapped, and the overall structural integrity of the BART RT domain was assessed.
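As an illustration of how the per-residue confidence scores mentioned above can be inspected, the following Python sketch reads them from a PDB-format model. It relies on the fact that AlphaFold models store the per-residue confidence (pLDDT) in the B-factor column of the PDB file; the function names and the cutoff of 70 are assumptions chosen for illustration and are not taken from the source.

```python
def per_residue_plddt(pdb_text):
    """Return {residue_number: pLDDT} read from the B-factor column of CA atoms.

    Fixed PDB columns (1-based): atom name 13-16, residue number 23-26,
    B-factor 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence_residues(scores, cutoff=70.0):
    """Residue numbers whose pLDDT is below the cutoff, in ascending order."""
    return sorted(res for res, score in scores.items() if score < cutoff)
```

A typical use would be to read the model file into a string, call `per_residue_plddt`, and check whether residues of interest (for example, positions in the RT active site) fall in the low-confidence list before interpreting their predicted geometry.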
The structural analysis of BART, particularly when compared to human L1 retrotransposons, reveals a compelling narrative of evolutionary adaptation and functional innovation. At the core of BART lies its RT domain, which exhibits the classic “right-hand” architecture characteristic of retroviruses and other retroelements. This architecture is composed of distinct subdomains: the fingers, palm, and thumb, each playing a crucial role in the enzyme’s function. The thumb subdomain, in particular, extends into a wrist region, which is thought to enhance the enzyme’s interaction with the template DNA during reverse transcription. This structural adaptation may contribute to the efficiency and specificity of BART’s reverse transcription process.
Key to BART’s functional integrity is the preservation of critical amino acid residues within its RT active site. Specifically, residues at positions 519, 531, 533, 659, 604, 702, 600, 591, 605, 668, 566, 700, and 703 are highly conserved, mirroring the essential motifs found in other active RT enzymes. These residues are important for coordinating the binding of the RNA template, the incoming nucleotide triphosphates, and the divalent metal ions that are essential for catalysis. The conservation of these residues, which align with the conserved sequence blocks defined by Eickbush et al., underscores BART’s functional capability as a reverse transcriptase. The spatial arrangement of these residues within the active site forms a highly specialized microenvironment, facilitating the precise and efficient conversion of RNA into DNA — a hallmark of BART’s function.
Moreover, the comparison with human L1 RT domains highlights both shared evolutionary features and unique structural adaptations that may reflect BART’s specialized role in bat genomes. The ability of BART to maintain these conserved features while potentially acquiring novel functional attributes indicates a significant evolutionary advantage, possibly related to the unique biological requirements of its host organisms.
F605 is a highly conserved residue within the RT domain of BART and plays an important role in its enzymatic function. Positioned within the active site, F605 serves as a gatekeeper, with its aromatic side chain providing a structural barrier that selectively excludes ribonucleotides from entering the active site. This exclusion is important for ensuring that BART functions as an RNA-dependent DNA polymerase, exclusively synthesizing DNA from RNA templates. The presence of F605 effectively prevents RNA-dependent RNA polymerization, a process that could interfere with the fidelity of reverse transcription. This selective mechanism is particularly important for BART’s function, as it ensures that the enzyme does not engage in RNA synthesis, which could lead to unwanted genomic integrations or the production of unintended RNA transcripts. The conservation of F605 across related retrotransposons highlights its evolutionary importance, indicating that this residue has been preserved due to its role in maintaining the specificity and accuracy of reverse transcription. In BART, this evolutionary adaptation is likely linked to its specialized role in transcribing viral RNA into DNA, a process that is central to its antiviral activity and applications in gene editing.
Apurinic-apyrimidinic endonuclease (APE) domain
Beyond the RT domain, BART’s unique structural organization further sets it apart from its retrotransposon ancestors (Figure 3). The N-terminal region of BART contains an apurinic-apyrimidinic endonuclease (APE) domain, which is important for the DNA cleavage necessary for integration. Although the APE domain in BART shows some divergence from the APE domain found in L1 retrotransposons, it retains key active site residues, indicating the preservation of its DNA-nicking function. This endonuclease activity is important for creating a precise nick in the target DNA, which serves as an entry point for the integration of the newly synthesized cDNA. Notably, BART’s APE domain also includes a DNase I-like motif, a feature absent in closely related retrotransposons, suggesting an evolutionary refinement of BART’s DNA cleavage mechanism. This refinement could contribute to BART’s enhanced specificity and efficiency in gene editing applications, distinguishing it from other elements within the non-LTR retrotransposon family.
The positioning of the APE domain in BART, located N-terminal to the RT domain, mirrors the structural organization observed in other retrotransposons, highlighting a common architectural theme. The APE domain’s role in retrotransposition is to introduce a targeted nick in the host DNA, facilitating the priming of the reverse transcription process. The evolutionary origins of the APE domain can be traced back to host DNA repair machinery, underscoring its ancient lineage and the role it has played in the adaptation and functionality of retrotransposons like BART. This domain, along with the RT domain, forms a functional unit that is important to BART’s ability to mediate precise genomic integrations.
While the sequence homology between the BART APE domain and its closest relative, the human L1 APE, is moderate, showing approximately 60-70% identity at the amino acid level, key residues essential for catalysis and DNA binding remain highly conserved. These include Glu43, Asp145, His230, Asn14, and Tyr115, which are important for coordinating the catalytic metal ion, activating the nucleophilic water, and properly positioning the DNA for cleavage. The preservation of these active site residues underscores the functional integrity of the BART APE domain despite its evolutionary divergence. Additionally, structural modeling predicts the presence of the β-hairpin loop, a hallmark feature of APEs that inserts into the minor groove of the DNA substrate, further supporting BART’s capacity to perform DNA cleavage with precision. The phylogenetic analysis illustrates the evolutionary relationship between the BART APE domain and other APE domains, including the human L1 APE, highlighting both the conservation of important functional residues and the evolutionary adaptations that distinguish BART.
The BART protein features a distinctive “tower” region spanning amino acids 240-440. This region includes several subdomains: a baseplate (residues 254-300), tower helices (301-370), a tower lock (374-382), and a PIP box (404-419). The presence of the region indicates potential roles in protein-protein interactions, RNA binding, and the regulation of BART’s activity. The tower region may contribute to the stability and assembly of the BART complex, facilitating its interaction with target DNA and RNA molecules. Notably, the PIP box, which is known to mediate interactions with proliferating cell nuclear antigen (PCNA) — a key player in DNA replication and repair — implies a possible link between BART’s activity and the host cell cycle. This connection further underscores the intricate integration of BART with cellular processes, thereby enhancing its efficiency in gene editing and antiviral defense.
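For readers annotating positions against the tower region, the residue ranges given above can be encoded as a small lookup table. The Python sketch below uses only the coordinates stated in the text; the function name and the label for positions between subdomains are hypothetical.

```python
# Residue ranges (inclusive) as stated in the text for the tower region.
TOWER_REGION = (240, 440)
TOWER_SUBDOMAINS = {
    "baseplate": (254, 300),
    "tower helices": (301, 370),
    "tower lock": (374, 382),
    "PIP box": (404, 419),
}


def annotate_residue(position):
    """Name the tower subdomain containing a residue position, if any."""
    if not TOWER_REGION[0] <= position <= TOWER_REGION[1]:
        return "outside tower region"
    for name, (start, end) in TOWER_SUBDOMAINS.items():
        if start <= position <= end:
            return name
    # Inside 240-440 but not in a named subdomain (hypothetical label).
    return "tower region (unassigned)"


for pos in (260, 372, 410, 500):
    print(pos, annotate_residue(pos))
```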
Despite the conserved catalytic features, phylogenetic analysis reveals that the BART APE domain clusters distinctly from other APEs within the L1 clade, indicating a potentially more ancient origin or a divergent evolutionary trajectory. This distinct clustering suggests that while BART shares functional similarities with L1 APEs, it may have undergone unique adaptations that set it apart from its relatives. The presence of a DNase I-like motif in BART, which is absent in L1 APEs, further supports the idea of functional adaptation. This motif may represent an evolutionary refinement, possibly contributing to enhanced DNA cleavage specificity or efficiency in BART. These differences, and their implications for BART’s activity and specificity, underscore the potential for evolutionary fine-tuning of the APE domain, optimizing it for its specialized role within the BART system.
C-terminal domain (CTD)
In contrast to the conserved RT and APE domains, the C-terminal region of BART displays significant truncations and modifications when compared to retrotransposons. Several amino acid sequences essential for retrotransposition, particularly those involved in RNA binding and protein-protein interactions, are either absent or markedly altered in BART. These modifications indicate that BART has evolved to relinquish its ability for autonomous mobilization and replication, which are hallmarks of active retrotransposons. The loss of retrotransposition capability is consistent with the idea of BART’s domestication, where it has been repurposed for new cellular functions that do not rely on its movement within the genome. This adaptation is likely beneficial for the host, as uncontrolled transposition could be detrimental, potentially leading to genomic instability or insertional mutagenesis.
The C-terminal domain (CTD) of BART, while retaining some structural features from its retrotransposon ancestors, exhibits distinct adaptations that underscore its evolutionary divergence and specialized role in the host cell. Notably, the “wrist” region, located immediately downstream of the reverse transcriptase (RT) domain and spanning amino acids 863 to 1061, remains relatively conserved when compared to L1 retrotransposons. This conservation suggests that the wrist region continues to play an important role in BART’s function, possibly facilitating interactions with target DNA or other cellular factors during the integration process. The preservation of this region highlights the evolutionary constraints imposed on this structural element, indicating its role in both retrotransposons and their domesticated counterparts like BART.
The truncation and modification of other regions within the CTD, including those involved in RNA binding and protein-protein interactions, suggest that BART has undergone significant structural evolution to accommodate its new functions. These changes likely reflect a shift away from retrotranspositional activity toward roles in gene regulation, antiviral defense, or other cellular processes where precise and controlled activity is paramount. The divergence of BART’s CTD from its ancestral counterparts supports the notion of its repurposing and specialization, reinforcing the idea that domesticated elements like BART can evolve to fulfill novel functions within the host genome.
The CTD of BART, extending from amino acid 1062 to 1275, exhibits significant structural divergence from its retrotransposon ancestors, most notably in the absence of the RNase H domain. In typical non-LTR retrotransposons and retroviruses, the RNase H domain is integral to the replication process, playing an important role by degrading the RNA template following reverse transcription, thereby facilitating the synthesis of the complementary DNA strand. The absence of this domain in BART suggests a profound departure from the conventional retroelement lifecycle, indicating that BART may no longer rely on the classical mechanism of RNA template degradation.
Further analysis of the CTD of BART reveals the presence of a conserved CCHC motif, a zinc finger domain known for its ability to interact with nucleic acids. This motif, shared between L1 and BART, suggests its conservation from ancestral retroelements, underscoring its functional importance. Zinc finger domains, such as the CCHC motif, are typically involved in DNA binding, and in the context of BART, this domain likely plays an important role in recognizing and targeting specific genomic loci for integration. The preservation of this motif through evolutionary time points to the selective pressures maintaining certain structural elements, even as other parts of the protein evolve and adapt to new functions.
In addition to the presence of the CCHC motif, the CTD of BART exhibits notable modifications in key amino acid residues when compared to L1. While the CCHC motif is conserved, other residues that are crucial for the catalytic activity of the L1 endonuclease domain are absent or altered in BART. These modifications further support the idea that BART’s CTD has undergone adaptive changes, potentially fine-tuning its function for its new role within the host cell. The absence of certain catalytic residues may indicate a shift away from traditional endonuclease activity, suggesting that BART has evolved to fulfill alternative functions in DNA repair, gene regulation, or targeted integration.
The CTD of BART also exhibits significant truncation when compared to its L1 retrotransposon ancestor, resulting in the loss of several alpha-helices and key amino acid residues that are essential for retrotransposition. This truncation is clearly visible in structural analyses, revealing a “stump” that effectively terminates at the zinc finger domain. The absence of these structural elements and crucial residues significantly impairs BART’s ability to autonomously mobilize and replicate, a defining characteristic of active retrotransposons. The loss of these retrotranspositional capabilities aligns with BART’s evolutionary shift toward domestication, where it has been repurposed for new cellular roles. In this context, the ability to move within the genome could be detrimental, potentially leading to genomic instability or harmful mutations.
The truncation of the CTD in BART serves as a striking example of how evolutionary processes can selectively prune unnecessary or potentially harmful functions, while preserving and refining those that offer advantages to the host organism. By shedding the components required for retrotransposition, BART has likely mitigated the risks associated with uncontrolled genetic movement, allowing it to adopt more specialized and stable functions within the cell. This evolutionary refinement underscores the balance between maintaining essential functions and eliminating those that could pose a threat to genomic integrity. The structural and functional evolution of BART’s CTD highlights the complex interplay between adaptation and conservation in the ongoing evolution of retrotransposon-derived elements.
In conclusion, the structural and functional analysis of BART’s C-terminal domain tells a compelling story of evolutionary adaptation and specialization. The conservation of the wrist region and the CCHC zinc finger motif underscores the importance of these elements in maintaining BART’s functionality, particularly in DNA binding and structural stability. In contrast, the truncation and modifications observed in other regions of the CTD highlight BART’s significant divergence from its retrotransposon ancestors, reflecting its evolutionary shift away from autonomous retrotransposition. The loss of retrotranspositional capabilities, paired with the retention of key enzymatic functions and the emergence of novel structural features, indicates that BART has been repurposed to meet new cellular demands, particularly in the realms of antiviral defense and gene editing.
This evolutionary refinement illustrates how BART has shed redundant or potentially harmful functions while acquiring adaptations that enhance its utility in the host genome.
EXAMPLE 5
BART’s antiviral function
The central hypothesis that the newly discovered BART system possesses specific anti-nucleic acid activity and the capacity to encode a genomic memory of infections was rigorously tested through a series of experiments. To evaluate this hypothesis, two variants of the BART enzyme, designated BART.0 and BART.1, were cloned into a human expression vector and subsequently stably expressed in human cell lines. The expression of these bat-derived enzymes in human cells was closely monitored to assess any potential impact on cell viability and proliferation. Remarkably, the introduction and sustained expression of BART.0 and BART.1 did not adversely affect the growth or viability of the human cells, indicating that the enzymes are well-tolerated within a human cellular environment. The antiviral potential of the BART system was then tested in these stably expressing cell lines. Strikingly, overexpression of BART conferred partial immunity against Vesicular Stomatitis Virus (VSV) infection, underscoring its use as a novel antiviral tool. Following infection with VSV, BART-expressing cells exhibited a marked reduction in viral load, indicating that BART actively participates in the cellular defense against viral pathogens (Figure 2).
Further analysis revealed that viral cDNA was detected in the cytoplasm of BART-expressing cells post-infection, indicating that BART facilitates the reverse transcription of viral RNA into cDNA. This viral cDNA was subsequently processed into shorter fragments, which were then integrated into the host genome in an interspersed array pattern reminiscent of endogenous retroelements. This pattern of integration indicates that BART not only inhibits viral replication but also incorporates viral sequences into the host genome as a form of genomic memory.
The critical role of BART’s reverse transcriptase (RT) activity in this antiviral response was further confirmed by the loss of viral resistance upon pharmacological inhibition of the RT domain. This result demonstrates that the enzymatic activity of BART is essential for its function in antiviral defense.
Additionally, similar protective effects were observed against Monkeypox Virus (MPV), further highlighting BART’s broad-spectrum antiviral capabilities. The collective evidence indicates that BART acts as a defense mechanism against viral infections, providing both immediate protection and a form of genomic memory that could confer long-term immunity.
BART gene-editing
BART’s gene-editing potential was evaluated by introducing it into human cells along with specifically designed single-stranded DNA (ssDNA) oligonucleotides and an RNA payload. These oligonucleotides acted as “navigators,” directing BART to specific genomic loci, while the RNA payload provided the template for precise genetic modifications. The system’s efficacy and specificity were assessed through sequencing, which revealed successful on-target editing without the need for a CRISPR-Cas system or its components.
a. In vitro targeted gene editing
Transgenic BART was transfected into, expressed in, and purified from a HEK293 cell line. The process began by transfecting HEK293 cells, cultured in DMEM supplemented with 10% FBS, 1% Penicillin-Streptomycin, and 2 mM L-glutamine, with a BART-encoding plasmid using Lipofectamine 3000. Cells were transfected at 70-80% confluency, with the plasmid complexed with the reagent according to the manufacturer’s protocol. Following transfection, cells were incubated for 24-48 hours at 37°C to allow for protein expression, which was monitored using the eGFP reporter gene. Afterward, cells were grown for an additional 48-72 hours to maximize BART expression. The cells were then harvested by trypsin-EDTA treatment and centrifugation, followed by washing the cell pellet with phosphate-buffered saline (PBS) to remove residual media and debris.
For protein purification, the BART protein was isolated from the cell lysate using affinity chromatography, leveraging a His-tag incorporated in the BART expression construct. The lysate was passed through a nickel-NTA agarose column, and after washing away unbound proteins, BART was eluted with a buffer containing 250 mM imidazole. The purified BART protein was then assessed for concentration and purity via SDS-PAGE and Western blotting.
To evaluate the cleavage activity of BART, an in vitro assay targeting specific sequences within the HPRT and HSF1 loci was conducted using two distinct navigator DNAs. The target sequences and corresponding navigator sequences are detailed in Tables 4 and 5. The experiment began by preparing navigator-target DNA complexes. Navigator DNA and target DNA were combined in equimolar amounts (typically 100 nM each) in a reaction buffer containing 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, and 1 mM EDTA. The mixture was heated to 95°C for 5 minutes to denature the DNA, followed by a gradual cooling to room temperature over 30 minutes to facilitate proper hybridization between the navigator and target DNA strands.
After hybridization, the reaction was assembled by adding 1 µL of Navigator DNA (100 nM final concentration), 1 µL of Target DNA, 0.5 µL of BART enzyme, and 1 µL of 10X BART Reaction Buffer (100 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, pH 7.9) to the annealed DNA mixture, with nuclease-free water added to bring the final volume to 10 µL. The reaction mixture was incubated at 37°C for 30 minutes to allow BART to mediate the cleavage at the target sites. Following the initial cleavage, 1 µL of strand-displacing polymerase was added to extend the 3’ end of the nicked strand, followed by the addition of flap endonuclease 1 (FEN1) to create a double-stranded break at the target site.
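The reaction assembly can be checked with simple arithmetic. The Python sketch below assumes volumes in microliters and counts only the four listed components toward the 10 µL total (any volume contributed by the annealed mixture itself is not specified in the text and is ignored here); the variable and function names are hypothetical.

```python
# Listed component volumes, in microliters.
COMPONENTS_UL = {
    "navigator DNA": 1.0,
    "target DNA": 1.0,
    "BART enzyme": 0.5,
    "10X BART Reaction Buffer": 1.0,
}
FINAL_VOLUME_UL = 10.0

# Composition of the 10X buffer stock, in mM.
BUFFER_10X_MM = {"Tris-HCl": 100.0, "NaCl": 500.0, "MgCl2": 10.0}


def water_volume(components, final_volume):
    """Volume of nuclease-free water needed to reach the final volume."""
    water = final_volume - sum(components.values())
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    return water


def final_concentrations(stock_mm, stock_volume, final_volume):
    """Concentration of each buffer component after dilution into the reaction."""
    return {name: conc * stock_volume / final_volume for name, conc in stock_mm.items()}


water = water_volume(COMPONENTS_UL, FINAL_VOLUME_UL)
buffer_1x = final_concentrations(
    BUFFER_10X_MM, COMPONENTS_UL["10X BART Reaction Buffer"], FINAL_VOLUME_UL
)
print(f"water: {water} uL", buffer_1x)
```

Under these assumptions the water volume is 6.5 µL, and the buffer is diluted tenfold to 10 mM Tris-HCl, 50 mM NaCl, and 1 mM MgCl2.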
To terminate the reaction and degrade proteins, 1 µL of Proteinase K (20 mg/mL) was added to the reaction mix, followed by incubation at 55°C for 10 minutes. The reaction products were then analyzed by agarose gel electrophoresis to assess the cleavage efficiency, with successful cleavage indicated by the appearance of specific DNA fragments corresponding to the expected sizes.
Table 4: Target Sequences for In vitro Cleavage Assay. Summary of the specific DNA sequences within the HPRT and HSF1 loci used as targets for the BART-mediated cleavage assay including 250 nucleotides upstream and downstream of the cleavage site (underlined).
Figure imgf000093_0001
Table 5: Navigator DNA Sequences for In vitro Cleavage Assay. List of navigator DNA sequences designed to hybridize with the target sequences and direct BART activity.
Figure imgf000094_0001
These findings highlight BART’s use as a highly precise gene-editing tool, capable of inducing targeted DNA modifications with remarkable sequence specificity. The rigorous experimental conditions and stringent controls used in this assay further confirm the reliability and effectiveness of BART’s cleavage activity under the defined conditions, underscoring its use for precise genomic editing applications.
b. In vivo targeted gene editing
The gene editing process using BART began with the design and selection of navigator DNAs (navDNAs) specifically targeted to the HPRT1 and HSF1 loci. These navDNAs play a critical role in directing the BART enzyme to precise genomic locations where editing was intended. Each navDNA was engineered with three essential components: a targeting sequence, a primer-binding site (PBS), and a payload RNA (plRNA). The targeting sequence, a 20-nucleotide segment, was selected based on its proximity to a 5’-TTTTT/AA-3’ BART targeting motif within the HPRT1 and HSF1 genes (see Table 6 for specific sequences). The PBS, typically ranging from 13-17 nucleotides in length, was designed to anneal upstream of the editing site, facilitating the initiation of reverse transcription by the BART enzyme. The plRNA, detailed in Table 7, was constructed to include a short disruptive sequence, followed by an insertion sequence encoding the enhanced green fluorescent protein (EGFP) gene. This design aimed to ensure that the BART enzyme accurately targeted and modified the desired loci.
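To make the motif-based selection concrete, the following Python sketch scans a sequence for the 5’-TTTTT/AA-3’ motif and reports a candidate 20-nucleotide targeting sequence for each hit. Two points are assumptions made for illustration and are not stated in the source: that the slash marks the cleavage position between TTTTT and AA, and that the targeting sequence is the 20 nucleotides immediately upstream of the motif (the text only says it is chosen by proximity). The function name and example sequence are hypothetical, and only the forward strand is scanned.

```python
MOTIF_5P = "TTTTT"  # bases before the slash in 5'-TTTTT/AA-3'
MOTIF_3P = "AA"     # bases after the slash
TARGET_LEN = 20     # targeting sequence length given in the text


def find_candidate_sites(seq, target_len=TARGET_LEN):
    """List motif hits that have a full-length upstream flank available.

    Assumptions (illustrative only): the cut lies between TTTTT and AA, and
    the targeting sequence is the target_len bases directly upstream.
    """
    seq = seq.upper()
    motif = MOTIF_5P + MOTIF_3P
    sites = []
    start = seq.find(motif)
    while start != -1:
        if start >= target_len:
            sites.append(
                {
                    "motif_start": start,
                    "cut": start + len(MOTIF_5P),
                    "targeting": seq[start - target_len:start],
                }
            )
        start = seq.find(motif, start + 1)
    return sites


# Made-up example sequence, not taken from HPRT1 or HSF1.
example = "ACGT" * 5 + "TTTTTAA" + "GGGG"
for site in find_candidate_sites(example):
    print(site)
```

A fuller design step would also scan the reverse complement and pick a 13-17 nucleotide primer-binding site upstream of the edit, which this sketch leaves out.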
The navDNAs were synthesized using GenScript’s biosynthesis platform to maintain high fidelity and prevent any sequence errors that could compromise the editing process. The specific loci within the HPRT1 and HSF1 genes, the corresponding target sites, and the location of the BART target sequences are shown in a visual representation of the gene loci and the precise locations where BART-mediated editing occurred. This strategic design of the navDNAs and plRNAs was intended to allow BART to effectively and accurately modify the target sequences, facilitating successful gene editing outcomes.
Table 6: Targeting Sequences for navDNAs. Summary of the 20-nucleotide targeting sequences selected for the HPRT1 and HSF1 loci, including the specific BART target sites.
Figure imgf000095_0001
Figure imgf000096_0001
Figure imgf000097_0001
Table 7: plRNA Sequences for Gene Editing. Details of the plRNA sequences designed for the BART editing process, including the disruptive and EGFP insertion sequences.
Figure imgf000097_0002
Figure imgf000098_0001
Figure imgf000099_0001
Figure imgf000100_0001
The next step involved cloning the BART enzyme coding sequences into a plasmid vector. The BART enzyme was placed under the control of the cytomegalovirus (CMV) promoter, which is known for its strong and constitutive expression in mammalian cells, to ensure robust production of the enzyme. Additionally, the plasmid included a neomycin resistance gene to enable the selection of successfully transfected cells. The cloning process began with the digestion of both the vector and insert DNA using EcoRI and HindIII restriction enzymes in NEBuffer 2 (10 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, pH 7.9) at 37°C for 1 hour. This was followed by ligation using T4 DNA ligase in 1X T4 DNA Ligase Buffer (50 mM Tris-HCl, 10 mM MgCl2, 1 mM ATP, 10 mM DTT, pH 7.5) at 16°C overnight. The ligation mixture was then transformed into chemically competent DH5α Escherichia coli cells via heat shock at 42°C for 45 seconds, followed by recovery in SOC medium (2% tryptone, 0.5% yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 20 mM glucose) for 1 hour at 37°C with shaking at 225 rpm. Transformed cells were plated on LB agar plates containing 50 µg/mL neomycin and incubated overnight at 37°C. Positive clones were selected, and plasmid DNA was extracted using a Qiagen Plasmid Plus Midi Kit. The correct insertion of BART sequences was then verified through Sanger sequencing (see Figure 4 for vector maps).
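Directional cloning with EcoRI and HindIII works only if the insert carries no internal copies of either recognition site. The sketch below shows one way such a pre-check could be done; it is not part of the disclosed protocol. The recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are standard, both are palindromic so scanning one strand suffices, and the function name and example sequence are hypothetical.

```python
# Standard recognition sequences; both are palindromic.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}


def internal_sites(insert_seq, enzymes=RECOGNITION_SITES):
    """Map each enzyme name to the 0-based positions of its site in insert_seq."""
    seq = insert_seq.upper()
    found = {}
    for name, site in enzymes.items():
        positions = []
        idx = seq.find(site)
        while idx != -1:
            positions.append(idx)
            idx = seq.find(site, idx + 1)
        found[name] = positions
    return found


# Made-up fragment for illustration; a real check would use the coding
# sequence without the flanking cloning sites.
print(internal_sites("ATGGCCGAATTCTGA"))
```

An empty list for both enzymes indicates that digestion should cut only at the flanking cloning sites.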
Following the successful cloning and verification of the plasmid constructs, mammalian cells were transfected to express the BART enzyme and the navDNAs. HEK293T cells, a human embryonic kidney cell line commonly used for transfection experiments due to their high transfection efficiency and robust growth, were selected for this study. The cells were seeded at a density of 2.5 x 10^5 cells per well in a 6-well plate containing 2 mL of Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS), 1% Penicillin-Streptomycin, and 2 mM L-glutamine. The cells were incubated overnight at 37°C in a humidified atmosphere containing 5% CO2 to allow them to reach 60-70% confluency, which is ideal for transfection.
On the day of transfection, the BART plasmid DNA (1-2 µg per well) and navDNA were diluted in 250 µL of Opti-MEM medium, a reduced-serum medium that enhances transfection efficiency. Separately, 2.5 µL of Lipofectamine 3000 reagent was added to the diluted DNA. The DNA and Lipofectamine 3000 mixture was gently mixed and incubated at room temperature for 15-20 minutes to allow the formation of DNA-lipid complexes, which are essential for the efficient delivery of DNA into the cells. Before adding the transfection mixture to the cells, the HEK293T cells were pre-washed with 1 mL of sterile phosphate-buffered saline (PBS) to remove any residual growth medium that could interfere with the transfection process. After washing, 2 mL of fresh Opti-MEM medium was added to each well. The DNA-lipid complexes were then carefully added dropwise to the cells, ensuring even distribution across the well. The cells were gently rocked back and forth to facilitate uniform exposure to the transfection complexes.
The cells were incubated at 37°C in a humidified incubator with 5% CO2 for 4-6 hours to allow for optimal uptake of the plasmid DNA. This incubation period is important as it provides sufficient time for the DNA-lipid complexes to fuse with the cell membrane, allowing the plasmid DNA to enter the cells. Following the initial transfection period, the medium containing the transfection reagent was carefully aspirated and replaced with 2 mL of fresh DMEM supplemented with 10% FBS, 1% Penicillin-Streptomycin, and 2 mM L-glutamine. The cells were then incubated for an additional 48-72 hours to allow for optimal expression of the BART enzyme and navDNAs.
To select for successfully transfected cells, 500 µg/mL neomycin was added to the culture medium 24 hours post-transfection. Neomycin selection was maintained for 5-7 days, during which the cells were monitored daily. Dead cells were removed by gently washing with PBS, and the medium was replaced with fresh neomycin-containing DMEM every 2-3 days to ensure continuous selection pressure. The surviving cells were expanded for further analysis. The efficiency of the transfection and selection process was later confirmed through fluorescence microscopy (for EGFP expression) and PCR analysis of genomic DNA extracted from the selected cell populations, verifying the presence and expression of the BART and navDNA constructs.
To confirm successful gene editing at the HPRT1 and HSF1 loci, genomic DNA was extracted from the transfected HEK293T cells using a Qiagen DNeasy Blood & Tissue Kit, following the manufacturer’s protocol. The cells were first harvested by trypsinization, and the cell pellet was washed with phosphate-buffered saline (PBS) to remove any remaining culture medium. Genomic DNA was then extracted by lysing the cells with Proteinase K and a lysis buffer provided in the kit, followed by binding the DNA to a silica membrane in a spin column. The membrane was washed with a series of ethanol-containing buffers to remove contaminants, and the DNA was eluted in nuclease-free water. The concentration and purity of the extracted DNA were measured using a NanoDrop spectrophotometer, with an A260/A280 ratio of approximately 1.8 indicating high-quality DNA suitable for downstream applications.
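The spectrophotometric quality check described above can be expressed as a short calculation. The sketch below uses the standard conversion of 50 ng/µL of double-stranded DNA per A260 unit at a 1 cm path length. The acceptance window of 1.7 to 2.0 around the stated ratio of approximately 1.8 is an assumption, as are the function names; a NanoDrop instrument normally reports these values directly.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """dsDNA concentration from A260 (1 cm path): 1 A260 unit = 50 ng/uL."""
    return a260 * 50.0 * dilution_factor


def purity_ok(a260, a280, low=1.7, high=2.0):
    """True if A260/A280 falls inside the assumed acceptance window."""
    ratio = a260 / a280
    return low <= ratio <= high


# Hypothetical readings for illustration.
print(dsdna_concentration_ng_per_ul(0.5))  # concentration in ng/uL
print(purity_ok(0.9, 0.5))                 # ratio of 1.8
```

A ratio well below the window typically points to protein or phenol carry-over, in which case the extraction would be repeated before PCR.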
Next, PCR primers were designed to flank the regions targeted for editing within the HPRT1 and HSF1 loci (see Table 8 for primer sequences). These primers were synthesized to amplify both the unmodified and modified sequences, allowing for the detection of successful gene editing events. The PCR reaction was set up in a 50 µL volume containing 1X ThermoPol Reaction Buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8), 200 µM of each dNTP, 0.2 µM of each primer, 1.25 U of Taq DNA polymerase, and 100 ng of template DNA. The PCR cycling conditions included an initial denaturation step at 95°C for 3 minutes to fully denature the DNA, followed by 35 cycles of 95°C for 30 seconds, 58°C for 30 seconds (for primer annealing), and 72°C for 1 minute (for DNA extension). A final extension step at 72°C for 5 minutes ensured complete synthesis of all PCR products.
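The reaction composition and cycling program above translate into pipetting volumes and a nominal run time. The following sketch shows that arithmetic under stated assumptions: the stock concentrations (10X buffer, 10 mM each dNTP, 10 µM primers, 5 U/µL Taq, 50 ng/µL template) are hypothetical and are not given in the source, and the run time ignores ramping between temperatures.

```python
def pcr_mix_volumes(total_ul=50.0, template_ng_per_ul=50.0):
    """Volumes (uL) per reaction; stock concentrations are assumptions."""
    volumes = {
        "10X buffer": total_ul / 10.0,                   # 1X final
        "dNTP mix (10 mM each)": total_ul * 200.0 / 10000.0,  # 200 uM each
        "forward primer (10 uM)": total_ul * 0.2 / 10.0,      # 0.2 uM final
        "reverse primer (10 uM)": total_ul * 0.2 / 10.0,      # 0.2 uM final
        "Taq (5 U/uL)": 1.25 / 5.0,                      # 1.25 U per reaction
        "template": 100.0 / template_ng_per_ul,          # 100 ng per reaction
    }
    # Water makes up the remainder of the reaction volume.
    volumes["water"] = total_ul - sum(volumes.values())
    return volumes


def cycling_seconds(cycles=35):
    """Nominal hold time of the program in the text, excluding ramp time."""
    initial_denaturation = 3 * 60
    per_cycle = 30 + 30 + 60
    final_extension = 5 * 60
    return initial_denaturation + cycles * per_cycle + final_extension


for component, volume in pcr_mix_volumes().items():
    print(f"{component}: {volume:.2f} uL")
print(cycling_seconds() / 60, "minutes")
```

For several reactions, the shared components would usually be multiplied by the reaction count plus a small overage and combined as a master mix.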
The resulting PCR products were analyzed by electrophoresis on a 1.5% agarose gel containing 0.5 µg/mL ethidium bromide in 1X TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 8.3). The gel was run at 100 V for 45 minutes to separate the DNA fragments by size. Visualization was performed using a UV transilluminator, where successful insertion of the EGFP sequence was indicated by the presence of a PCR product approximately 720 bp larger than the wild-type amplicon, corresponding to the size of the EGFP insertion. A comparison of the band sizes on the gel confirmed the presence of the desired genetic modification.
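The band-size comparison above can be made explicit with a small helper. This is an illustrative sketch only: the 720 bp shift comes from the text, while the wild-type amplicon length (500 bp in the example), the 50 bp tolerance for estimating sizes against a ladder, and the function names are assumptions.

```python
EGFP_INSERT_BP = 720  # approximate size shift stated in the text


def expected_sizes(wild_type_bp, insert_bp=EGFP_INSERT_BP):
    """Expected amplicon sizes for the unedited and edited alleles."""
    return {"wild_type": wild_type_bp, "edited": wild_type_bp + insert_bp}


def classify_band(observed_bp, wild_type_bp, tolerance_bp=50):
    """Label a band as wild_type, edited, or unexpected within a tolerance."""
    for label, size in expected_sizes(wild_type_bp).items():
        if abs(observed_bp - size) <= tolerance_bp:
            return label
    return "unexpected"


# Hypothetical 500 bp wild-type amplicon.
for band in (510, 1230, 900):
    print(band, classify_band(band, wild_type_bp=500))
```

A lane showing both labels would be consistent with a mixed population of edited and unedited alleles.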
Following gel analysis, the PCR products were purified using a Qiagen PCR Purification Kit, which involved binding the DNA fragments to a silica membrane in a spin column, washing with ethanol-based buffers, and eluting the purified DNA in nuclease-free water. The purified products were then sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit on an ABI 3730 DNA Analyzer. Sequencing reactions were prepared by combining the purified PCR product, sequencing primers, and BigDye reagent and were run in a thermal cycler, following the manufacturer’s instructions. Sequence analysis was performed using Geneious software, where the edited sequences were aligned against the wild-type reference sequence to confirm the precise integration of the EGFP gene. The sequencing results demonstrated successful gene editing: alignment of the edited sequences with the reference genome verified the accurate insertion of the EGFP sequence.
Table 8: PCR Primer Sequences for HPRT1 and HSF1 Loci. List of PCR primers designed to flank the targeted editing regions within the HPRT1 and HSF1 loci. These primers were used to amplify both the unmodified and modified sequences, allowing for the detection and verification of successful gene editing.
Figure imgf000104_0001
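The comparison of an edited read against the wild-type reference, described above for the Sanger data, can be illustrated with a simplified check. The sketch below is not a substitute for a real aligner such as the one in Geneious: it assumes an error-free read that spans the same region as the reference and contains at most one clean insertion, and the function name and toy sequences are hypothetical.

```python
def find_insertion(reference, edited):
    """Return (position, inserted_sequence) for a single clean insertion.

    Returns None if the read is unedited or differs from the reference in
    any way other than one contiguous insertion.
    """
    ref, read = reference.upper(), edited.upper()
    limit = min(len(ref), len(read))

    # Length of the shared 5' flank.
    prefix = 0
    while prefix < limit and ref[prefix] == read[prefix]:
        prefix += 1

    # Length of the shared 3' flank, not overlapping the 5' flank.
    suffix = 0
    while (suffix < limit - prefix
           and ref[len(ref) - 1 - suffix] == read[len(read) - 1 - suffix]):
        suffix += 1

    # The flanks must account for the whole reference, and the read must be longer.
    if prefix + suffix != len(ref) or len(read) <= len(ref):
        return None
    return prefix, read[prefix:len(read) - suffix]


# Toy sequences for illustration only.
print(find_insertion("AACCGGTT", "AACCTTTGGTT"))
```

When the inserted bases repeat the flanking sequence, the reported position is shifted toward the 3' side, so the result identifies the insertion only up to that ambiguity.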
To further validate the success of the gene editing, EGFP expression in the transfected HEK293T cells was analyzed using both fluorescence microscopy and flow cytometry. After the transfection and selection period, the cells were gently washed three times with phosphate-buffered saline (PBS) to remove any residual culture medium and dead cells. The cells were then fixed with 4% paraformaldehyde in PBS for 10 minutes at room temperature to preserve cellular structures and fluorescent signals. Following fixation, the cells were washed three more times with PBS to remove any remaining paraformaldehyde.
The fixed cells were then mounted on glass slides using VECTASHIELD HardSet Antifade Mounting Medium with DAPI, a nuclear counterstain that fluoresces blue under UV light, to allow for the visualization of cell nuclei. Fluorescent images were captured using a Zeiss Axio Observer fluorescence microscope equipped with an EGFP filter set (excitation at 488 nm, emission at 530 nm) to specifically detect EGFP expression. Multiple fields of view were imaged to ensure representative sampling of the cell population. Image analysis was conducted using ImageJ software, where the presence of EGFP-positive cells was quantified to confirm successful gene editing at the targeted loci. The images were analyzed for both the intensity of EGFP fluorescence and the number of EGFP-positive cells, providing qualitative evidence of gene editing success.
For a more quantitative analysis of EGFP expression, the transfected cells were harvested by trypsinization. The cells were incubated with 0.25% trypsin-EDTA at 37°C for 2-3 minutes to detach them from the culture plate, followed by neutralization with DMEM containing 10% FBS. The cell suspension was then passed through a 40 µm cell strainer to obtain a single-cell suspension, which was critical for accurate flow cytometry analysis. The cells were resuspended in 500 µL of ice-cold PBS to maintain cell viability and integrity during analysis.
The prepared cell suspension was analyzed using a BD FACSCanto II flow cytometer. EGFP expression was detected by exciting the cells at 488 nm and measuring the emission at 530 nm. Data acquisition was set to collect at least 10,000 events per sample to ensure statistically significant results. The percentage of EGFP-positive cells was determined using FlowJo software, which allowed for gating of the cell population and precise quantification of EGFP expression. This analysis provided a robust indication of the efficiency of the BART-mediated editing process, with a high percentage of EGFP-expressing cells confirming the successful introduction of the desired genetic modifications by the BART enzyme.
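The percentage of EGFP-positive events described above depends on where the positive gate is drawn. The sketch below shows one common convention, offered as an assumption rather than as the gating actually used: the threshold is set at the 99th percentile of an untransfected negative control, and events above it are counted as positive. The function names and the synthetic intensity values are hypothetical, and the sketch does not reproduce FlowJo's scatter gating or compensation.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def percent_positive(sample, negative_control, pct=99.0):
    """Percent of sample events brighter than the control-derived threshold."""
    threshold = percentile(negative_control, pct)
    positives = sum(1 for value in sample if value > threshold)
    return 100.0 * positives / len(sample)


# Synthetic intensities for illustration: 10,000 events, 2,000 of them bright.
control = list(range(1, 101))
sample = [50] * 8000 + [500] * 2000
print(percent_positive(sample, control))
```

Because the threshold is defined by the control, about 1% of truly negative events are expected to fall above it, which sets the practical floor for detectable editing with this convention.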
Using this detailed protocol, gene editing with BART was successfully achieved, introducing targeted gene disruptions at the HPRT1 and HSF1 loci while simultaneously inserting the EGFP gene. The process hinged on the precise design of navigator DNAs (navDNAs) and payload RNAs (plRNAs), which were tailored to target specific sequences within the genes of interest. These components were crucial in directing the BART enzyme to the exact locations where the genetic modifications were intended. The transfection of HEK293T cells with these constructs was carried out with high efficiency, ensuring that a significant proportion of the cells incorporated the BART and navDNA constructs. The subsequent selection and validation steps confirmed the successful editing of the target loci, as evidenced by both the disruption of the HPRT1 and HSF1 sequences and the precise insertion of the EGFP reporter gene. This protocol not only demonstrated the effectiveness of BART as a gene-editing tool but also highlighted the importance of careful experimental design and optimization in achieving reliable and specific genetic modifications in mammalian cells.
The present disclosure is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and the accompanying figures. Such modifications are intended to fall within the scope of the appended claims.
All references cited herein are incorporated herein by reference in their entireties.

Claims

CLAIMS What is claimed is:
1. A method of modifying a target polynucleotide, comprising delivering to the target polynucleotide an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
2. A method of modifying expression of a target polynucleotide, comprising: introducing into a cell or a subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity or a nucleic acid molecule encoding the enzyme, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme binds to one or more locations on the target polynucleotide such that binding of the enzyme increases or decreases expression level of the target polynucleotide.
3. The method of any one of the preceding claims, wherein the enzyme further possesses an integrase activity.
4. The method of any one of the preceding claims, wherein the one or more nucleic acid components comprise a single-stranded navigator DNA.
5. The method of any one of the preceding claims, wherein the one or more nucleic acid components further comprise a payload RNA.
6. The method of claim 5, wherein the payload RNA comprises a stem-loop structure.
7. The method of claim 5, wherein the enzyme reverse transcribes the payload RNA into a cDNA.
8. The method of claim 7, wherein the enzyme integrates the cDNA into the target polynucleotide.
9. The method of any one of the preceding claims, wherein the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
10. The method of any one of the preceding claims, wherein the enzyme further comprises a C-terminal domain.
11. The method of claim 10, wherein the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
12. The method of claim 10, wherein the C-terminal domain comprises a zinc finger.
13. The method of claim 12, wherein the zinc finger comprises a CCHC motif.
14. The method of any one of the preceding claims, wherein the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
15. The method of claim 9, wherein the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
16. The method of any one of the preceding claims, wherein the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
17. The method of any one of the preceding claims, wherein the one or more nucleic acid components are provided through one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components.
18. The method of any one of claims 15-17, wherein the one or more polynucleotide molecules comprise one or more vectors.
19. The method of any one of claims 15-18, wherein the enzyme and the one or more nucleic acid components are provided in a single vector.
20. The method of any one of the preceding claims, wherein the target polynucleotide comprises a genomic locus.
21. The method of any one of the preceding claims, wherein the target polynucleotide comprises RNA or DNA.
22. The method of claim 21, wherein the RNA comprises a viral RNA of a RNA virus.
23. The method of claim 22, wherein the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
24. The method of claim 21, wherein the DNA comprises a genomic DNA or a cDNA.
25. The method of any one of claims 21-24, wherein the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
26. The method of any one of claims 21-25, wherein the target polynucleotide is contained in a nucleic acid molecule within a cell or in vitro.
27. The method of claim 26, wherein the cell comprises a eukaryotic cell.
28. The method of claim 27, wherein the eukaryotic cell comprises a mammalian cell.
29. The method of claim 27, wherein the eukaryotic cell comprises a non-human animal cell, a human cell, or a plant cell.
30. A method of treating or preventing a viral infection of a RNA virus in a cell or a subject, comprising delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, or a nucleic acid molecule encoding the enzyme, wherein a single-stranded guide polynucleotide hybridizes with a viral RNA of the RNA virus and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
31. The method of claim 30, comprising reverse transcribing the viral RNA into a cDNA by the enzyme.
32. The method of claim 30, wherein the enzyme has an integrase activity, and wherein the method comprises integrating the cDNA to a genome of the cell or the subject by the enzyme.
33. The method of claim 30, comprising transcribing the cDNA into the single-stranded guide polynucleotide capable of hybridizing with the viral RNA.
34. A method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject, comprising delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
35. A method of enhancing immunity against a viral infection of a RNA virus in a cell or a subject, comprising delivering to the cell or the subject an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
36. A method of generating a cell line with immunity against a viral infection of a RNA virus, comprising delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, wherein the enzyme reverse-transcribes the mRNA into a second ssDNA, wherein the second ssDNA hybridizes with the viral RNA and directs binding of the enzyme to the viral RNA, and wherein the enzyme cleaves the viral RNA.
37. A method of generating a cell line with immunity against a viral infection of a RNA virus, comprising delivering to a cell an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, or a nucleic acid molecule encoding the enzyme, wherein the enzyme reverse-transcribes a viral RNA of the RNA virus into a single-stranded DNA (ssDNA) to generate a cDNA of the viral RNA, wherein the enzyme integrates the cDNA into a host genome of the cell or the subject, wherein the cDNA integrated into the host genome is transcribed into a mRNA, and wherein the mRNA hybridizes with the viral RNA to form a RNA hybrid to silence the viral RNA.
38. The method of any one of claims 30-36, wherein the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
39. The method of any one of claims 30-36, wherein the enzyme comprises a N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
40. The method of any one of claims 30-38, wherein the enzyme further comprises a C-terminal domain.
41. The method of claim 40, wherein the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
42. The method of claim 40, wherein the C-terminal domain comprises a zinc finger.
43. The method of claim 40, wherein the zinc finger comprises a CCHC motif.
44. The method of any one of claims 30-43, wherein the enzyme comprises a Bat-Associated Reverse Transcriptase (BART).
45. The method of claim 44, wherein the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
46. The method of any one of claims 30-45, wherein the enzyme is provided through one or more polynucleotide molecules encoding the enzyme.
47. A gene editing system for modifying a target polynucleotide, comprising an enzyme that possesses a reverse transcriptase activity and an endonuclease activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with the target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
48. The gene editing system of claim 47, wherein the enzyme further possesses an integrase activity.
49. The gene editing system of claim 47, wherein the one or more nucleic acid components comprise a single-stranded navigator DNA.
50. The gene editing system of any one of claims 47-48, wherein the one or more nucleic acid components further comprise a payload RNA.
51. The gene editing system of claim 50, wherein the payload RNA comprises a stem-loop structure.
52. The gene editing system of claim 49, wherein the enzyme reverse transcribes the payload RNA into a cDNA.
53. The gene editing system of any one of claims 47-52, wherein the enzyme integrates the cDNA into the target polynucleotide.
54. The gene editing system of any one of claims 47-53, wherein the enzyme comprises an N-terminal apurinic-apyrimidinic endonuclease domain optionally linked to a reverse transcriptase domain.
55. The system of any one of claims 47-54, wherein the enzyme further comprises a C-terminal domain.
56. The gene editing system of claim 55, wherein the C-terminal domain facilitates interaction between the enzyme and the target polynucleotide.
57. The gene editing system of claim 55, wherein the C-terminal domain comprises a zinc finger.
58. The gene editing system of claim 57, wherein the zinc finger comprises a CCHC motif.
59. The gene editing system of any one of claims 47-58, wherein the enzyme comprises Bat-Associated Reverse Transcriptase (BART).
60. The gene editing system of claim 59, wherein the BART comprises a BART of Rhinolophus ferrumequinum, Myotis myotis, Meles meles, Bos mutus, Capra hircus, Homo sapiens, Canis lupus familiaris, Sus scrofa, Callithrix jacchus, Mus musculus, Ursus arctos, Elephas maximus indicus, or a variant thereof.
61. The gene editing system of any one of claims 47-60, comprising one or more polynucleotide molecules encoding the enzyme.
62. The gene editing system of any one of claims 47-60, comprising one or more polynucleotide molecules encoding or comprising the one or more nucleic acid components.
63. The gene editing system of any one of claims 61-62, wherein the one or more polynucleotide molecules comprise one or more vectors.
64. The gene editing system of any one of claims 61-63, wherein the enzyme and the one or more nucleic acid components are provided in a single vector.
65. The gene editing system of any one of claims 47-64, wherein the target polynucleotide comprises RNA or DNA.
66. The gene editing system of claim 65, wherein the RNA comprises a viral RNA of a RNA virus.
67. The gene editing system of claim 66, wherein the RNA virus is selected from the group consisting of Norwalk, Rotavirus, Poliovirus, Ebola virus, Marburg virus, Lassa virus, Hantavirus, Rabies, Influenza, Yellow fever virus, Coronavirus, SARS, SARS-CoV-2, West Nile virus, Hepatitis A, C (HCV) and E virus, Dengue fever virus, togaviruses, Rhabdoviruses, Picornaviruses, Myxoviruses, retroviruses, bunyaviruses, coronaviruses, and reoviruses.
68. The gene editing system of claim 65, wherein the DNA comprises a genomic DNA or a cDNA.
69. The gene editing system of any one of claims 47-68, wherein the modification of the target polynucleotide comprises cleavage of the target polynucleotide.
70. A delivery system comprising the gene editing system of any one of claims 47-69, wherein the delivery system is adapted to deliver the gene editing system into a cell or a subject.
71. The delivery system of claim 70, comprising nanoparticles or vesicles encapsulating the gene editing system.
72. A vector system comprising one or more vectors, wherein the one or more vectors comprise one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and comprise one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
73. A kit comprising the gene editing system of any one of claims 47-69, the delivery system of any one of claims 70-71, or the vector system of claim 72.
74. A cell line comprising one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and encoding an integrase activity, and one or more nucleic acid components, wherein the one or more nucleic acid components target and hybridize with a target polynucleotide and direct binding of the enzyme to the target polynucleotide, and wherein the enzyme modifies the target polynucleotide.
75. The cell line of claim 74, comprising a eukaryotic cell.
76. The cell line of claim 75, wherein the eukaryotic cell comprises a mammalian cell.
77. The cell line of claim 75, wherein the eukaryotic cell comprises a stem cell or stem cell line.
78. A method for preparing the cell line of any one of claims 74-77, comprising introducing to a cell one or more polynucleotide molecules encoding an enzyme that possesses a reverse transcriptase activity, an endonuclease activity, and an integrase activity, and encoding one or more nucleic acid components.
PCT/US2024/047222 2023-09-19 2024-09-18 Bat-associated reverse transcriptase and methods of use thereof Pending WO2025064511A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363583648P 2023-09-19 2023-09-19
US63/583,648 2023-09-19

Publications (1)

Publication Number Publication Date
WO2025064511A1 true WO2025064511A1 (en) 2025-03-27

Family

ID=95072089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/047222 Pending WO2025064511A1 (en) 2023-09-19 2024-09-18 Bat-associated reverse transcriptase and methods of use thereof

Country Status (1)

Country Link
WO (1) WO2025064511A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24869062

Country of ref document: EP

Kind code of ref document: A1