CN117616126A - Reprogrammable TNPB polypeptides and their uses - Google Patents
Reprogrammable TNPB polypeptides and their uses
- Publication number
- CN117616126A (application number CN202280024175.9A)
- Authority
- CN
- China
- Prior art keywords
- tnpb
- sequence
- target
- protein
- composition
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
Systems, methods, and compositions for targeting polynucleotides are described in detail herein. In particular, engineered DNA targeting systems comprising novel TnpB polypeptides and reprogrammable targeting nucleic acid components are provided, as well as methods of use and applications.
Description
Cross Reference to Related Applications
The present application claims the benefit of U.S. Provisional Application No. 63/141,371, filed on January 25, 2021; U.S. Provisional Application No. 63/195,610, filed on June 1, 2021; U.S. Provisional Application No. 63/210,860, filed on June 15, 2021; and U.S. Provisional Application No. 63/282,352, filed on 11/2021. The entire contents of the above-mentioned applications are hereby incorporated by reference.
Statement regarding federally sponsored research
The present invention was made with government support under grant numbers HL141201 and HG09761 awarded by the National Institutes of Health. The government has certain rights in this invention.
Sequence listing
The present application contains a sequence listing on CD-ROM submitted concurrently with the application and labeled "COPY 1", "COPY 2" and "3 of 3". Each contains an ASCII .txt file titled BROD-5345wp_st25.txt, created on January 25, 2022, with a size of 222,687,678 bytes (222.7 MB on disk). The contents of the sequence listing are incorporated herein in their entirety.
Technical Field
The subject matter disclosed herein relates generally to systems, methods, and compositions for targeted genetic modification and nucleic acid editing using a system comprising a TnpB polypeptide.
Background
While genome editing techniques are available for generating targeted genome perturbations, there remains a need for new genome engineering technologies that employ novel strategies and molecular mechanisms and that are affordable, easy to set up, scalable, and amenable to targeting multiple positions within the genome. Additional tools for genome engineering and biotechnology would further advance the field.
Citation or identification of any document in this application is not an admission that such document is available as prior art to the present invention.
Disclosure of Invention
In certain exemplary embodiments, the non-naturally occurring, engineered composition comprises a) a TnpB polypeptide comprising a RuvC-like domain, and b) a nucleic acid component molecule (omega RNA, or ωRNA) comprising a scaffold and a reprogrammable spacer sequence, said omega RNA molecule being capable of forming a complex with the TnpB polypeptide and directing the TnpB polypeptide to a target polynucleotide. In one embodiment, the TnpB polypeptide comprises from about 200 to about 500 amino acids. The omega RNA component molecule can comprise a reprogrammable spacer sequence from 10 nucleotides to 30 nucleotides in length. In embodiments, the nucleic acid component molecule comprises a scaffold about 80 to 200 nucleotides in length. In one aspect, the target sequence comprises a Target Adjacent Motif (TAM) sequence 5' of the target polynucleotide, which may comprise the sequence TCA or TTCAN.
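As an illustration of the TAM and spacer constraints described above, the following is a minimal sketch of how candidate target sites might be enumerated in a DNA string. It is not taken from the application: the function name `find_tam_sites`, the `Site` record, the default spacer length of 20, and the choice to scan only the given strand are all assumptions made for the example. It looks for a 5' TAM (TCA or TTCAN, with N meaning any base) and reports the spacer-length sequence immediately 3' of it.

```python
import re
from typing import List, NamedTuple


class Site(NamedTuple):
    tam_start: int  # 0-based index of the first TAM base (hypothetical convention)
    tam: str        # the TAM as it appears in the sequence
    target: str     # spacer-length sequence immediately 3' of the TAM


def find_tam_sites(dna: str, tam_pattern: str = "TTCAN", spacer_len: int = 20) -> List[Site]:
    """Enumerate candidate targets that sit directly 3' of a 5' TAM.

    Assumptions for illustration only: 'N' in the pattern matches any of
    A/C/G/T, only the strand given is scanned, and the spacer length is
    restricted to the 10-30 nt range stated in the text.
    """
    if not 10 <= spacer_len <= 30:
        raise ValueError("spacer_len must be between 10 and 30 nucleotides")
    dna = dna.upper()
    # A zero-width lookahead lets overlapping TAM occurrences be reported.
    regex = re.compile("(?=(" + tam_pattern.upper().replace("N", "[ACGT]") + "))")
    sites = []
    for match in regex.finditer(dna):
        tam = match.group(1)
        target_start = match.start() + len(tam)
        target = dna[target_start:target_start + spacer_len]
        if len(target) == spacer_len:  # skip TAMs too close to the 3' end
            sites.append(Site(match.start(), tam, target))
    return sites


if __name__ == "__main__":
    example = "GG" + "TTCAG" + "ACGT" * 5 + "AA"
    for site in find_tam_sites(example):
        print(site)
```

A real design tool would also scan the reverse complement and apply ortholog-specific TAM preferences; those steps are deliberately left out of this sketch.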
In embodiments, the TnpB protein is selected from table 1A, table 1B, or fig. 1, or is encoded by a polynucleotide sequence in table 1C. In embodiments, the TnpB protein is selected from table 1A, table 1B, table 1C, or fig. 1, or comprises one or more catalytic residues corresponding to 195D, 277E, or 361D of the sequence alignment in fig. 1. In one embodiment, the TnpB protein is active, i.e., has nuclease activity, over a temperature range of 45 ℃ to 60 ℃.
In embodiments, the TnpB protein is selected from the group consisting of Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, Actinoplanes lobatus strain DSM 43150 (TnpB-1 and TnpB-2), Lipingzhangella halophila strain DSM 102030, Ktedonobacter racemifer, Epsilonproteobacteria bacterium QNF01000004 _extract_ (reverse), and Alicyclobacillus macrosporangiidus strain DSM 17980.
In embodiments, the target polynucleotide is DNA. In one aspect, the nucleic acid component further comprises an aptamer. In embodiments, the omega RNA component molecule further comprises an extension that adds an RNA template.
In embodiments, the composition may comprise a functional domain associated with a TnpB protein. In one aspect, the functional domain has transposase activity, recombinase activity, methylase activity, demethylase activity, translational activation activity, translational repression activity, transcriptional activation activity, transcriptional repression activity, transcriptional release factor activity, chromatin modification or remodeling activity, histone modification activity, nuclease activity, single-stranded RNA cleavage activity, double-stranded RNA cleavage activity, single-stranded DNA cleavage activity, double-stranded DNA cleavage activity, nucleic acid binding activity, detectable activity, or any combination thereof. The composition may comprise serine or tyrosine recombinases.
In one embodiment, the composition may further comprise a homologous recombination donor template comprising a donor sequence for insertion into the target polynucleotide.
In one aspect, the composition provides site-specific modifications, which may include cleavage of a DNA polynucleotide. In one aspect, cleavage results in a 5' overhang, and the cleavage can occur distal to the target adjacent motif. In embodiments, the TnpB-mediated cleavage occurs at the spacer annealing site or 3' of the target sequence.
A vector system is also provided and may comprise one or more vectors encoding the TnpB polypeptide and omega RNA component compositions as detailed herein.
In embodiments, an engineered cell comprising a composition as detailed herein is provided.
Methods of editing nucleic acids in a target polynucleotide are provided, comprising delivering a composition, one or more polynucleotides, or one or more vectors as disclosed herein to a cell or population of cells comprising a target polynucleotide. In embodiments, the target polynucleotide is a target sequence within genomic DNA. In embodiments, the target polynucleotide is edited at one or more bases to introduce a G→A or C→T mutation.
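To make the G→A and C→T outcomes concrete, the sketch below lists every single-base edit of those two types that is possible in a given target sequence. It is an illustration only: the function name `single_base_edits` and the returned tuple layout are hypothetical, and the sketch says nothing about which positions an actual editor would reach.

```python
from typing import List, Tuple

# Substitutions named in the text: G->A and C->T.
SUBSTITUTIONS = {"G": "A", "C": "T"}


def single_base_edits(target: str) -> List[Tuple[int, str, str, str]]:
    """Return (position, original base, new base, edited sequence) tuples.

    Positions are 0-based. Each tuple describes one edit applied on its own
    to the unmodified target; combinations of edits are not enumerated.
    """
    target = target.upper()
    edits = []
    for position, base in enumerate(target):
        new_base = SUBSTITUTIONS.get(base)
        if new_base is not None:
            edited = target[:position] + new_base + target[position + 1:]
            edits.append((position, base, new_base, edited))
    return edits


if __name__ == "__main__":
    for edit in single_base_edits("ACGT"):
        print(edit)
```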
Also provided is an isolated cell or progeny thereof comprising one or more base edits using the methods as described herein.
Provided herein are methods of modifying a target polynucleotide sequence in a cell, comprising introducing into the cell any of the compositions as disclosed herein. In one aspect, the polypeptide and/or omega RNA component is provided via one or more polynucleotides encoding the polypeptide and/or one or more omega RNAs, and wherein the one or more polynucleotides are operably configured to express a TnpB polypeptide and/or omega RNA molecule. In one embodiment, the method introduces one or more mutations, including substitutions, deletions, and insertions.
In embodiments, provided herein are engineered, non-naturally occurring compositions comprising: a) A TnpB polypeptide, wherein the TnpB polypeptide is catalytically inactive; b) A nucleotide deaminase associated with or otherwise capable of forming a complex with a TnpB protein; and c) an omega RNA component molecule capable of forming a complex with the TnpB protein and directing site-specific binding at the target sequence. In embodiments, the nucleotide deaminase is an adenosine deaminase or a cytosine deaminase.
One or more polynucleotides encoding one or more components of a composition as disclosed herein are provided.
Also provided are one or more vectors comprising one or more polynucleotides as disclosed herein.
A cell or progeny thereof is provided that is genetically engineered to express one or more components of a composition as disclosed herein.
In embodiments, provided herein are engineered, non-naturally occurring compositions comprising: a) a catalytically dead TnpB polypeptide; b) a reverse transcriptase associated with or otherwise capable of forming a complex with the TnpB polypeptide; and c) an omega RNA component molecule capable of forming a complex with the TnpB protein and directing site-specific binding of the complex to a target sequence of a target polynucleotide, the omega RNA molecule further comprising a donor template encoding a donor sequence for insertion into the target polynucleotide.
One or more polynucleotides encoding one or more components of a composition as disclosed herein are provided.
One or more vectors comprising one or more polynucleotides as disclosed herein are provided.
A method of modifying a target polynucleotide is provided, comprising delivering the above composition, one or more polynucleotides, or one or more vectors to a cell or population of cells comprising the target polynucleotide, wherein the complex directs a reverse transcriptase to the target sequence, and the reverse transcriptase facilitates insertion of a donor sequence encoded by a donor template from an omega RNA component molecule into the target polynucleotide.
In embodiments, provided herein are methods wherein insertion of a donor sequence: a) introduces one or more base edits; b) corrects or introduces a premature stop codon; c) disrupts a splice site; d) inserts or restores a splice site; e) inserts a gene or gene fragment at one or both alleles of a target polynucleotide; or f) a combination thereof. In embodiments, provided herein is an isolated cell or progeny thereof comprising a modification made using a method as disclosed.
In embodiments, provided herein are engineered, non-naturally occurring compositions comprising: a) A TnpB polypeptide; b) A non-LTR retrotransposon protein that associates with or is otherwise capable of forming a complex with a TnpB polypeptide; and c) an omega RNA component molecule capable of forming a complex with a TnpB protein and directing site-specific binding of the complex to a target sequence of a target polynucleotide, said omega RNA molecule further comprising a donor template encoding a donor sequence for insertion into the target polynucleotide and being located between two binding elements capable of forming a complex with a non-LTR retrotransposon protein.
In embodiments, provided herein is a composition wherein the TnpB protein is fused to the N-terminus of a non-LTR retrotransposon protein. In an embodiment, provided herein is a composition wherein the TnpB protein is engineered to have nicking enzyme activity. In embodiments, the omega RNA component molecule directs the fusion protein to a target sequence 5' of the targeted insertion site, and wherein the TnpB protein generates a strand break at the targeted insertion site. In embodiments, the omega RNA component molecule directs the fusion protein to a target sequence 3' of the targeted insertion site, and wherein the TnpB protein generates a strand break at the targeted insertion site. In embodiments, the donor polynucleotide further comprises a polymerase processing element to facilitate processing of the 3' end of the donor polynucleotide sequence. In embodiments, the donor polynucleotide further comprises a region homologous to the target sequence on the 5 'end of the donor construct, the 3' end of the donor construct, or both. In an exemplary embodiment, the homology region is 8 to 25 base pairs.
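The homology-region constraint above (8 to 25 base pairs on the 5' end, the 3' end, or both) can be illustrated with a short sketch that assembles a donor sequence flanked on both sides by target-derived homology. The function name `build_donor`, the default arm length, and the simple left-plus-payload-plus-right layout are assumptions for the example; strand orientation, the polymerase processing element, and the binding elements mentioned in the text are not modeled.

```python
def build_donor(target: str, insertion_index: int, payload: str, arm_len: int = 15) -> str:
    """Flank a payload with homology regions copied from the target.

    insertion_index is the 0-based position in `target` before which the
    payload would be inserted (a hypothetical convention). arm_len is checked
    against the 8-25 bp range given in the text.
    """
    if not 8 <= arm_len <= 25:
        raise ValueError("homology regions must be 8 to 25 base pairs long")
    if insertion_index - arm_len < 0 or insertion_index + arm_len > len(target):
        raise ValueError("target is too short for homology regions of this length")
    target = target.upper()
    left_arm = target[insertion_index - arm_len:insertion_index]    # 5' homology
    right_arm = target[insertion_index:insertion_index + arm_len]   # 3' homology
    return left_arm + payload.upper() + right_arm


if __name__ == "__main__":
    print(build_donor("A" * 10 + "C" * 10, insertion_index=10, payload="GGG", arm_len=8))
```

Using both arms is only one of the options the text allows; a variant with a single arm would drop either `left_arm` or `right_arm`.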
One or more polynucleotides encoding one or more components of a composition as disclosed herein are provided.
One or more vectors are provided comprising one or more polynucleotides as disclosed herein.
A method of modifying a target polynucleotide is provided, comprising delivering the above composition, one or more polynucleotides, or one or more vectors to a cell or population of cells comprising the target polynucleotide, wherein the complex directs a non-LTR retrotransposon protein to the target sequence, and the non-LTR retrotransposon protein facilitates insertion of a donor polynucleotide sequence from a donor construct into the target polynucleotide.
In embodiments, provided herein are methods wherein insertion of a donor sequence: a) introduces one or more base edits; b) corrects or introduces a premature stop codon; c) disrupts a splice site; d) inserts or restores a splice site; e) inserts a gene or gene fragment at one or both alleles of a target polynucleotide; or f) a combination thereof. In embodiments, provided herein is an isolated cell or progeny thereof comprising a modification made using the methods disclosed herein.
In embodiments, provided herein are engineered, non-naturally occurring compositions comprising: a) a TnpB polypeptide; b) an integrase protein that associates with or is otherwise capable of forming a complex with the TnpB polypeptide; and c) an omega RNA component molecule capable of forming a complex with the TnpB protein and directing site-specific binding of the complex to a target sequence of a target polynucleotide, said omega RNA molecule further comprising a donor template encoding a donor sequence for insertion into the target polynucleotide and being located between two binding elements capable of forming a complex with the integrase protein. In embodiments, the TnpB protein is fused to an integrase protein and optionally to a reverse transcriptase. In an embodiment, provided herein is a composition wherein the TnpB protein is engineered to have nicking enzyme activity. In embodiments, the omega RNA component molecule directs the fusion protein to the target sequence, and the TnpB protein creates a nick at the targeted insertion site. In embodiments, the donor polynucleotide further comprises a region homologous to the target sequence on the 5' end of the donor construct, the 3' end of the donor construct, or both.
One or more polynucleotides encoding one or more components of a composition as disclosed herein are provided.
One or more vectors are provided comprising one or more polynucleotides as disclosed herein.
A method of modifying a target polynucleotide is provided, comprising delivering the above composition, one or more polynucleotides, or one or more vectors to a cell or population of cells comprising the target polynucleotide, wherein the complex directs an integrase protein to the target sequence, and the integrase protein facilitates insertion of a donor polynucleotide sequence from a donor construct into the target polynucleotide.
In embodiments, provided herein are methods wherein insertion of a donor sequence: a) introduces one or more base edits; b) corrects or introduces a premature stop codon; c) disrupts a splice site; d) inserts or restores a splice site; e) inserts a gene or gene fragment at one or both alleles of a target polynucleotide; or f) a combination thereof. In embodiments, provided herein is an isolated cell or progeny thereof comprising a modification made using the methods disclosed above.
In embodiments, provided herein are compositions for detecting the presence of a target nucleotide in a sample comprising: one or more TnpB proteins with collateral activity; at least one omega RNA component comprising a sequence capable of binding to a target polynucleotide and designed to form a complex with the one or more TnpB proteins; a detection construct comprising a polynucleotide component, wherein the TnpB protein exhibits collateral nuclease activity and cleaves the polynucleotide component of the detection construct once activated by a target sequence; and optionally, isothermal amplification reagents.
In embodiments, the TnpB protein is selected from table 1A, table 1B, or fig. 1, or is encoded by a polynucleotide sequence in table 1C. In embodiments, the TnpB protein is selected from table 1A, table 1B, or fig. 1, or comprises one or more catalytic residues corresponding to 195D, 277E, or 361D of the sequence alignment in fig. 1, or is encoded by a polynucleotide sequence in table 1C. In one embodiment, the TnpB protein is active, i.e., has nuclease activity, over a temperature range of 45 ℃ to 60 ℃.
In embodiments, the isothermal amplification reagent is a loop-mediated isothermal amplification (LAMP) reagent. In an exemplary embodiment, the LAMP reagent comprises LAMP primers.
In embodiments, provided herein are compositions further comprising one or more additives to increase reaction specificity or kinetics. In embodiments, provided herein are compositions comprising polynucleotide-binding beads.
In an embodiment, a system for detecting a target sequence (e.g., coronavirus) is provided. A system for detecting the presence of a target sequence in a sample may comprise: a TnpB protein; at least one omega RNA component molecule comprising a sequence capable of binding to a target sequence and designed to form a complex with the TnpB protein; and a detection construct comprising a polynucleotide component, wherein the TnpB protein exhibits collateral RNase activity and cleaves the polynucleotide component of the detection construct once activated by a target sequence.
In an exemplary embodiment, a composition for detecting the presence of a target polynucleotide in a sample is provided that comprises an isothermal amplification reagent for amplifying the target polynucleotide, and an extraction-free solution for isolating the polynucleotide from a cell or viral particle. Isothermal amplification reagents may include LAMP reagents comprising F3, B3, FIP, BIP, loop forward and loop reverse primers. In one aspect, the LAMP reagent may further comprise an Oligonucleotide Strand Displacement (OSD) probe.
The systems and methods may utilize one or more TnpB proteins that, in one aspect, are thermostable. The composition for detection may comprise a DNA extraction solution. The detection method may further comprise the step of treating the sample with a DNA extraction solution prior to contacting the sample with the system disclosed herein. Extraction may also include adding beads capable of concentrating the target of interest in the sample; in one aspect, the beads are magnetic.
In embodiments, detecting amplified target polynucleotides by binding of the target polynucleotides to the TnpB complex occurs at a temperature in the range of 45 ℃ to 60 ℃.
In embodiments, the TnpB protein in the TnpB complex is selected from the group consisting of Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, Actinoplanes lobatus strain DSM 43150 (TnpB-1 and TnpB-2), Lipingzhangella halophila strain DSM 102030, Epsilonproteobacteria bacterium QNF01000004 extract (reverse), and Alicyclobacillus macrosporangiidus strain DSM 17980.
An apparatus comprising the detection system is also provided. The device may comprise a lateral flow device or a cartridge. Also provided is a lateral flow device comprising a substrate comprising a first end and a second end, the first end comprising a sample loading portion; a first region comprising a detectable ligand, two or more systems of the claims provided herein, and one or more first capture regions, each first capture region comprising a first binding agent; the substrate includes two or more second capture areas between the first and second ends, each second capture area containing a different binding agent. In one aspect, the first end comprises two detection constructs, wherein each of the two detection constructs comprises an RNA or DNA oligonucleotide comprising a first molecule on the first end and a second molecule on the second end. In one aspect, the first end comprises three detection constructs, wherein each of the three detection constructs comprises an RNA or DNA oligonucleotide comprising a first molecule on the first end and a second molecule on the second end. The lateral flow device may comprise a polynucleotide encoding a TnpB, and the nucleic acid component molecules are provided as a multiplex polynucleotide configured to comprise two or more nucleic acid component molecules.
A cartridge may be provided comprising at least first and second ampoules, a lysis chamber, an amplification chamber and a sample receiving chamber, the first ampoule being fluidly connected to the sample receiving chamber, the sample receiving chamber being further connected to the lysis chamber, the lysis chamber being connected to the second ampoule and the amplification chamber via a metering channel.
The cartridge may be configured to be mounted in a system comprising heating means, optical means, means for releasing reagents on the cartridge, and means for reading out the assay results. The cartridge may comprise a first ampoule comprising a lysis buffer and/or a second ampoule comprising a TnpB system comprising one or more TnpB proteins and at least one nucleic acid component molecule.
In an embodiment, a cartridge is provided wherein the TnpB protein in a TnpB collateral detection system for amplifying and detecting a target polynucleotide is active, i.e., has nuclease activity, in a temperature range of 45 ℃ to 60 ℃.
In an embodiment, a cartridge is provided wherein the TnpB protein is a TnpB from Table 1A from one of the following: Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, Actinoplanes lobatus strain DSM 43150 (TnpB-1 and TnpB-2), Lipingzhangella halophila strain DSM 102030, Ktedonobacter racemifer, Epsilonproteobacteria bacterium QNF01000004 extract_ (reverse), and Alicyclobacillus macrosporangiidus strain DSM 17980.
Methods for detecting polynucleotides in a sample are provided, comprising contacting one or more target sequences with a TnpB, at least one omega RNA component capable of forming a complex with the TnpB and directing sequence-specific binding to one or more target polynucleotides, and a detection construct, wherein the TnpB exhibits collateral activity and cleaves the detection construct upon activation by the one or more target sequences; and detecting a signal from cleavage of the detection construct, thereby detecting the one or more target polynucleotides. Methods for detecting polynucleotides in a sample are provided, which further comprise amplifying a target polynucleotide using isothermal amplification prior to the contacting step. In an embodiment, a method is provided wherein detecting amplified target polynucleotides by binding of the target polynucleotides to a TnpB complex occurs at a temperature in the range of 45 ℃ to 60 ℃. In exemplary embodiments, the target polynucleotide is detected in one hour or less.
Also provided are methods for detecting a target nucleic acid in a sample comprising contacting the sample with a device described herein.
These and other aspects, objects, features and advantages of the exemplary embodiments will become apparent to those of ordinary skill in the art in view of the following detailed description of the exemplary embodiments.
Drawings
An appreciation of the features and advantages of the present invention can be obtained by reference to the following detailed description, which sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings thereof:
FIG. 1-shows a sequence alignment of exemplary TnpB polypeptides. RuvC catalytic amino acid residues are underlined.
FIG. 2-alignment of exemplary TnpB sequences.
FIG. 3-alignment of the 3' ends of exemplary TnpB loci.
FIG. 4-depicts the 5' Inverted Terminal Repeat (ITR) sequence of an exemplary TnpB.
FIG. 5-5' ITR of IscB shows similarity to the exemplary TnpB at the sequence level.
FIG. 6-depicts an exemplary Ktedonobacter racemifer TnpB gene, RNA conserved regions, and guides (i.e., spacers).
FIG. 7-depicts annotated sequences from an exemplary TnpB locus of Ktedonobacter racemifer, including the 5' ITR and 3' ITR.
FIG. 8-shows the experimental setup for interrogating the TAM requirements of exemplary TnpB proteins from Actinoplanes lobatus strain DSM 43150 and Epsilonproteobacteria bacterium isolate B11.
FIG. 9-shows 5' TAM WebLogos for Actinoplanes lobatus strain DSM 43150 (TCAG) and Epsilonproteobacteria bacterium isolate B11 (TCAT).
FIG. 10-shows a schematic diagram of an exemplary plasmid cleavage assay for evaluating target cleavage.
FIGS. 11A-11D-show a summary of TnpB Rd1 ortholog 5. Rd1 is a selection of 10 orthologs that appear to be associated with ncRNAs having sequence similarity to IscB ncRNAs. (11A) RNA-seq analysis confirmed that TnpB is associated with an ncRNA. (11B) WebLogo of TnpBRd1_5_fn30_tam shows the enriched TAM (30 bp guide of Fn spacer) among all captured TAMs. (11C) WebLogo of TnpBRd1_5_644PSP1_30 shows the enriched TAM (30 bp guide of 644PSP1 spacer) in the top 15% of captured TAMs. (11D) WebLogo of the top 10% of depleted TAMs (30 bp guide of Fn spacer), in which a TCAG TAM is observed.
FIGS. 12A-12B-show validation of TnpB Rd1 orthologs 1 and 4. (12A) A TXTL cleavage assay of plasmid substrates verified TAM-specific cleavage by Actinoplanes lobatus TnpB-1 and Actinomadura cellulosilytica TnpB. Each condition included two separate plasmids at the same concentration for direct comparison of different substrates. (12B) The adaptor ligation sites were similar between TAM screening and validation (results for ortholog 4 shown). The positions of adaptor ligation using the 8N TAM library plasmids (top bar) for the non-target (NT) and target (T) strands show that the adaptor ligation sites for each strand are slightly different. When a single-TAM plasmid is used (bottom bar), the adaptor ligation sites on the non-target (NT) strand are similar to those on the NT strand when the TAM library is used.
FIGS. 13A-13B-illustrate the identification of TAM sequences in seven exemplary TnpB orthologs. (13A) Two methods were used to determine TAM sequences in orthologs with a 5' TAM: (i) sequencing of intact pTarget and (ii) sequencing of TAMs enriched after cleavage and adaptor ligation. (13B) WebLogos show the TAM sequences of TnpB orthologs from seven bacteria: Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, two TnpBs from Actinoplanes lobatus strain DSM 43150, Alicyclobacillus macrosporangiidus strain DSM 17980, Lipingzhangella halophila strain DSM 102030, Epsilonproteobacteria bacterium, and Ktedonobacter racemifer.
FIG. 14-shows the validation of TAMs for two TnpB orthologs. TAMs were determined after adaptor ligation of the cleavage products of Actinoplanes lobatus TnpB-1 and Actinomadura cellulosilytica TnpB.
FIGS. 15A-15B-show the identification of DNA cleavage sites on the target strand (TS) and the non-target strand (NTS) for Actinoplanes lobatus TnpB-2. (15A) The black triangles represent specific cleavage sites on both strands identified after sequencing. (15B) Sequencing and analysis of 8N TAM library plasmid cleavage products from Actinoplanes lobatus TnpB-2. Both the non-target (NT) and target (T) strands are cleaved and show read abundance at different positions (bp) in the spacer.
FIGS. 16A-16B-show characterization of the non-coding RNA (ncRNA) associated with Actinoplanes lobatus TnpB. (16A) Expression and RNA pulldown of Actinoplanes lobatus TnpB-2 in Escherichia coli (E. coli) shows a 173 nt scaffold immediately downstream of the TnpB-2 ORF, followed by a guide sequence. (16B) RNA scaffolds truncated from the 173 nt full length down to 102 nt maintained TAM-specific enrichment.
FIG. 17-shows non-coding RNA regions associated with the Ktedonobacter racemifer TnpB.
FIGS. 18A-18F-illustrate the exploration of IS200/605 superfamily nuclease diversity. (18A) Evolution between the IS200/605 transposon superfamily encoded nucleases and associated RNAs. The dashed lines reflect tentative/unknown relationships. (18B) Native expression of the TnpB omega RNA in Ktedonobacter racemifer. (18C) Comparison of omega RNAs from the IscB and TnpB loci of Ktedonobacter racemifer. (18D) Secondary structure prediction of the KraTnpB-associated omega RNA. (18E) WebLogos of TAMs cleaved by Actinoplanes lobatus and Actinomadura cellulosilytica TnpB in the IVTT TAM screen using reprogrammed guides. (18F) Plasmid competition assays using Actinoplanes lobatus and Actinomadura cellulosilytica TnpB (*p < 0.05).
FIG. 19-shows naturally occurring RNA-guided DNA targeting systems. Comparison of OMEGA (obligate mobile element guided activity) systems with other known RNA-guided systems. Unlike CRISPR systems, which capture spacer sequences and store them in CRISPR arrays in the locus, OMEGA systems transpose their loci (or trans-acting loci) into target sequences, apparently converting targets into omega RNA guides in a process that can be referred to as guide conscription.
FIG. 20-shows the results of TnpB conservation analysis. The 3' end of the TnpB locus shares conservation with the end of the KraIscB-1 transposon. The conserved region at the 3' end of the TnpB locus corresponds to the 5' region of the omega RNA of IscB. Conservation of the TnpB locus outside the ORF at the 3' end suggests the presence of a non-coding RNA that may function similarly to the omega RNA of IscB.
FIGS. 21A-21C-show characterization of TnpB omega RNA-guided cleavage. (21A) Small RNA-seq of recombinantly purified Actinoplanes lobatus TnpB-2 in the presence of the downstream predicted omega RNA and guide. The predicted omega RNA scaffold and the downstream region constituting the putative guide co-purify with the Actinoplanes lobatus TnpB-2 protein, indicating an interaction of the protein with the omega RNA transcript. (21B) TAM screening of additional TnpB loci. (21C) Plasmid competition assay positive control using SpCas9. As expected, SpCas9 cleaves only the plasmid containing the TAM and target, as indicated by the presence of cleavage-specific adapter ligation products. Statistical significance was assessed using a two-tailed t-test comparing the number of adapter-ligated reads of the first plasmid listed under each condition, normalized to the average of adapter-ligated reads of the second plasmid listed, in the + protein and - protein conditions.
FIG. 22-shows Target Adjacent Motif (TAM)- and target-dependent dsDNA cleavage by TnpB protein at different temperatures (range 37 ℃ to 80 ℃). Cleavage of the dsDNA substrate required both the TAM and the target and was most robust between 45 ℃ and 60 ℃. In this experiment, the TnpB protein was from ortholog 6, which corresponds to Alicyclobacillus macrosporangiidus strain DSM 17980, and was added at a final concentration of 1 μM protein with 100 ng of dsDNA substrate.
FIG. 23-is a photograph showing collateral cleavage of a collateral substrate (collateral substrate 1) at a final concentration of 1 μM using the TnpB protein from Alicyclobacillus macrosporangiidus strain DSM 17980 at 60 ℃. The "ssDNA substrate" and "dsDNA substrate" contain the target sequence to which the omega RNA is designed to bind. "Collateral substrate 1" does not contain a target sequence. The photograph shows that cleavage of collateral substrate 1 is induced in the presence of a dsDNA substrate that has both a target sequence and a TAM. Cleavage of collateral substrate 1 was also induced in the presence of an ssDNA substrate with a target sequence and without a TAM.
The figures herein are for illustration purposes only and are not necessarily drawn to scale.
Detailed Description
General definition
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Definitions of common terms and techniques in molecular biology can be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F.M. Ausubel et al., eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M.J. MacPherson, B.D. Hames and G.R. Taylor, eds.); Antibodies: A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies: A Laboratory Manual, 2nd edition (2013) (E.A. Greenfield, ed.); Animal Cell Culture (1987) (R.I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: A Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology, 2nd edition, J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry: Reactions, Mechanisms and Structure, 4th edition, John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
As used herein, the singular forms "a," "an," and "the" include both single and plural referents unless the context clearly dictates otherwise.
The term "optional" or "optionally" means that the subsequently described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
The term "about" or "approximately" as used herein, when referring to a measurable value such as a parameter, an amount, a time interval, and the like, is intended to encompass variations of and from the specified value, such as variations of +/-10% or less, +/-5% or less, +/-1% or less, and +/-0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It should be understood that the value to which the modifier "about" or "approximately" refers is itself also specifically, and preferably, disclosed.
As used herein, a "biological sample" may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a "bodily fluid". The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humor, vitreous humor, bile, serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, inflammatory secretions, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretions, vomit, and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, and cell cultures derived from bodily fluids. Bodily fluids may be obtained from a mammalian organism, for example by puncture or by other collection or sampling procedures.
The terms "subject," "individual," and "patient" are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells, and their progeny of a biological entity, obtained in vivo or cultured in vitro, are also encompassed.
Various embodiments are described below. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation on the broader aspects discussed herein. An aspect described in connection with a particular embodiment is not necessarily limited to that embodiment and may be practiced with any other embodiment. Throughout this specification, reference to "one embodiment", "an embodiment", or "an example embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment", "in an embodiment", or "an example embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may be. Furthermore, as will be apparent to one skilled in the art from this disclosure, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are intended to fall within the scope of the invention. For example, in the appended claims, any of the claimed embodiments may be used in any combination.
All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as if each individual publication, published patent document, or patent application was specifically and individually indicated to be incorporated by reference.
Overview
Embodiments disclosed herein provide an engineered TnpB system. The TnpB system comprises a TnpB polypeptide and a nucleic acid component capable of forming a complex with the TnpB polypeptide and directing the complex to a target polynucleotide. The TnpB system and the TnpB/nucleic acid component complex may also be referred to herein as OMEGA (Obligate Mobile Element Guided Activity) systems or complexes, or simply Ω systems or complexes. The TnpB system is a distinct type of Ω system; Ω systems also include IscB, IsrB and IshB systems. The nucleic acid component of an Ω system is structurally distinct from that of other RNA-guided nucleases (such as CRISPR-Cas systems) and may also be referred to as omega RNA (ωRNA). In certain exemplary embodiments, the TnpB system is RNA-based, i.e., the nucleic acid component contributes more to the overall size of the TnpB complex than it does in other RNA-guided nuclease systems (such as CRISPR-Cas). Furthermore, given the smaller structural features of TnpB relative to other known programmable nucleases (such as CRISPR-Cas), the polynucleotide binding pocket is open and more accessible, which can give greater access to nucleotides at the target region of a bound polynucleotide and a greater ability to manipulate, modify, edit, remove, or delete those nucleotides. Disclosed herein are TnpB systems that can act as nucleases, nickases, or catalytically inactive polynucleotide-binding proteins that can be coupled to other functional domains.
In one embodiment, the TnpB system and related compositions can specifically target single-stranded or double-stranded DNA. In one embodiment, the TnpB system can bind and cleave double-stranded DNA. In one embodiment, the TnpB system can bind double-stranded DNA without introducing a break in either strand. In one embodiment, the TnpB polypeptide or nuclease/nucleic acid component complex can disrupt the continuity of one of the two DNA strands, thereby introducing a nick in the double-stranded DNA. In embodiments, and without being bound by theory, the size and configuration of the TnpB system expose the non-target strand (which may be in single-stranded form), allowing polynucleotides on the non-target strand to be modified, edited, deleted, or inserted. In embodiments, such accessibility also allows for improved editing outcomes on the target and/or non-target strand, e.g., increased specificity or enhanced editing efficiency.
In another aspect, embodiments disclosed herein include uses of the compositions herein, including therapeutic and diagnostic compositions and uses. Delivery of the disclosed proteins and systems is also provided, including delivery to a variety of cells and delivery via a variety of delivery vehicles.
TnpB composition
In one aspect, embodiments disclosed herein relate to a composition comprising TnpB and an omega RNA capable of forming a complex with TnpB and directing site-specific binding of TnpB to a target sequence on a target polynucleotide.
TnpB polypeptides
The TnpB polypeptides of the invention may comprise a RuvC-like domain. Exemplary TnpB sequences are shown in FIG. 1, Table 1A, Table 1B, Table 1C, and Table 5. The RuvC domain may be a split RuvC domain comprising RuvC-I, RuvC-II and RuvC-III subdomains. TnpB may also comprise one or more of an HTH domain, a bridge helix domain, and a zinc finger domain. The TnpB polypeptide does not comprise an HNH domain. In one exemplary embodiment, the TnpB protein comprises, starting from the N-terminus, an HTH domain, a RuvC-I subdomain, a bridge helix domain, a RuvC-II subdomain, a zinc finger domain, and a RuvC-III subdomain. In one exemplary embodiment, the RuvC-III subdomain forms the C-terminus of the TnpB polypeptide.
In certain exemplary embodiments, the TnpB polypeptide is between 200 and 770 amino acids in size, between 200 and 630 amino acids, between 200 and 620 amino acids, between 200 and 610 amino acids, between 200 and 600 amino acids, between 200 and 590 amino acids, between 200 and 580 amino acids, between 200 and 570 amino acids, between 200 and 560 amino acids, between 200 and 540 amino acids, between 200 and 530 amino acids, between 200 and 520 amino acids, between 200 and 510 amino acids, between 200 and 500 amino acids, between 200 and 490 amino acids, between 200 and 480 amino acids, between 200 and 470 amino acids, between 200 and 460 amino acids, between 200 and 450 amino acids, between 200 and 440 amino acids, between 200 and 430 amino acids, between 200 and 420 amino acids, between 200 and 410 amino acids, between 210 and 500 amino acids, between 220 and 500 amino acids, between 230 and 500 amino acids, between 240 and 500 amino acids, between 250 and 500 amino acids, between 260 and 500 amino acids, between 270 and 500 amino acids, between 280 and 500 amino acids, between 290 and 500 amino acids, between 300 and 500 amino acids, between 250 and 490 amino acids, between 250 and 480 amino acids, or between 250 and 600 amino acids. In one embodiment, the TnpB polypeptide is between 300 and 500 amino acids, or between 350 and 450 amino acids.
In one embodiment, the TnpB polypeptide may comprise a modified naturally occurring protein, a functional fragment or truncated version thereof, or a non-naturally occurring protein. In one embodiment, the TnpB polypeptide comprises one or more domains derived from other TnpB polypeptides, more particularly from different organisms. In one embodiment, the TnpB polypeptide may be designed by in silico methods. Examples of in silico protein design have been described in the art and are therefore known to the skilled artisan.
In one embodiment, the TnpB polypeptide is from Epsilonproteobacteria bacterium, Actinoplanes lobatus strain DSM 43150, Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, Alicyclobacillus macrosporangiidus strain DSM 17980, Lipingzhangella halophila strain DSM 102030, or Ktedonobacter racemifer. In one embodiment, the TnpB polypeptide is from Ktedonobacter racemifer or comprises a conserved RNA region similar to the 5' ITR of the K. racemifer TnpB locus. See, e.g., Table 5 and FIG. 2. In one aspect, the TnpB locus encodes a 5' ITR/RNA (RNA on the 3' strand), TnpB (3' strand), and finally a 3' ITR. In one exemplary embodiment, the TnpB may comprise a Fanzor protein, a TnpB homolog found in eukaryotic genomes.
The TnpB polypeptides also encompass homologs or orthologs of the TnpB polypeptides, the sequences of which are specifically described herein. The terms "ortholog" and "homolog" are well known in the art. By way of further guidance, a "homolog" of a protein as used herein is a protein of the same species that performs the same or similar function as the protein that is the homolog thereof. Homologous proteins may be, but need not be, structurally related, or only partially related. An "ortholog" of a protein as used herein is a protein of a different species that performs the same or similar function as the protein that is an ortholog thereof. An orthologous protein may be, but may not always be, structurally related, or may be only partially structurally related. In particular embodiments, a homolog or ortholog of a TnpB polypeptide (such as mentioned herein) has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence homology or identity to a TnpB polypeptide, more specifically to a TnpB sequence identified in table 1A, 1B, 1C, 5 or fig. 1. In particular embodiments, the homolog or ortholog is identified based on its domain structure and/or function. In embodiments, the homolog or ortholog comprises a catalytic residue and/or domain as defined herein, including those identified in fig. 1 and 18. Sequence alignment performed as described herein and folding studies and domain prediction as taught herein can help identify homologs or orthologs that have structural and functional characteristics that identify the TnpB polypeptide, particularly those with conserved residues (including catalytic residues) and domains of the TnpB polypeptide.
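As an illustration of the percent-identity thresholds recited above, the following minimal Python sketch computes identity over a pre-computed pairwise alignment and compares it with a chosen cut-off. The function names, the gap character "-", and the choice of denominator (all columns in which at least one sequence has a residue) are assumptions made for illustration only; other identity conventions exist, and the specification does not prescribe one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an existing pairwise alignment.

    Assumption: both strings are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored; a residue opposite a
    gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned peptides (hypothetical, not taken from Table 1A).
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
    print(meets_identity_threshold("MKT-LV", "MKTALV", threshold=80.0))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only shows how a threshold such as "at least 80%" could be applied once an alignment is available.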
In one embodiment, the TnpB locus comprises an inverted terminal repeat (ITR). The inverted terminal repeat may be present at the 5' or 3' end of the TnpB sequence. In one aspect, the inverted terminal repeat can comprise about 20 to about 40 nucleotides, such as about 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 nucleotides. In embodiments, the ITR comprises about 25 to 35 nucleotides, or about 28 to 32 nucleotides. In one aspect, the ITR shares similarity with one or more inverted terminal repeats of a sequence encoding an IscB polypeptide. In one embodiment, the 5' ITR or 3' ITR of TnpB has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence homology or identity to an IscB 5' ITR or 3' ITR. In embodiments, the 5' ITR of TnpB is homologous to the 5' ITR of IscB. Exemplary IscB ITRs are disclosed in Altae-Tran et al., Science, September 2021, 374(6563):57-65; doi: 10.1126/science.abj6856, which is expressly incorporated herein by reference in its entirety, including Supplementary Materials Data S1 through S4 and Tables S1 through S6.
In one embodiment, the TnpB locus comprises a highly conserved region extending beyond the polypeptide-coding sequence, which indicates the presence of an RNA at the 5' end of the TnpB locus. In one aspect, the region upstream of the 5' ITR of TnpB comprises a region encoding an RNA species that comprises a guide sequence.
The chimeric enzyme may comprise a first fragment and a second fragment, and the fragments may be from TnpB polypeptide orthologs of organisms of a given genus or species; for example, the fragments are from TnpB polypeptide orthologs of different species.
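The ITR length window described above (about 20 to about 40 nucleotides) can be illustrated with a small Python sketch that tests whether the two ends of a candidate locus are reverse complements of each other. The function names, the default window, and the optional mismatch tolerance are hypothetical choices for illustration; real ITRs are often imperfect, and this sketch is not the method used to identify the loci described herein.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq: str, min_len: int = 20,
                                    max_len: int = 40,
                                    max_mismatches: int = 0) -> Optional[int]:
    """Longest n in [min_len, max_len] for which the first n nucleotides match
    the reverse complement of the last n nucleotides, allowing up to
    max_mismatches mismatches. Returns None if no such n exists."""
    seq = seq.upper()
    longest = min(max_len, len(seq) // 2)
    for n in range(longest, min_len - 1, -1):
        left = seq[:n]
        right_rc = reverse_complement(seq[-n:])
        mismatches = sum(1 for x, y in zip(left, right_rc) if x != y)
        if mismatches <= max_mismatches:
            return n
    return None


if __name__ == "__main__":
    # Toy locus: a hypothetical 24-nt repeat flanking a poly-T placeholder body.
    itr = "ACGTTGCAAGGCTTAACCGGATCG"
    locus = itr + "T" * 50 + reverse_complement(itr)
    print(terminal_inverted_repeat_length(locus))
```

Scanning from the longest length downward returns the longest terminal repeat within the window; raising `max_mismatches` would be one way to tolerate imperfect repeats.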
RuvC domain
In one embodiment, the TnpB polypeptide comprises at least one RuvC-like nuclease domain. The RuvC domain may comprise conserved catalytic amino acids indicative of RuvC catalytic residues. In an exemplary embodiment, the RuvC catalytic residues may be referenced relative to the following: 186D, 270E or 354D of TnpB polypeptide 488601079 of Table 1A; 172D, 254E or 337D of TnpB polypeptide 297565028 of Table 1A; or 179D, 268E or 351D of TnpB polypeptide 257060308 of Table 1A. The catalytic residues may also be referenced relative to 195D, 277E or 361D of the sequence alignment in FIG. 1. In one aspect, the RuvC domain may comprise multiple subdomains, such as RuvC-I, RuvC-II and RuvC-III. The subdomains may be separated by intervening amino acid sequences of the protein. An exemplary domain architecture of an exemplary TnpB polypeptide is shown in FIG. 18A.
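A simple way to check whether a given sequence carries the expected residues at referenced positions is sketched below in Python. The function name and the toy sequence are hypothetical; only the position/residue pairs (186D, 270E, 354D) come from the text, and the actual Table 1A sequence is not reproduced here. Positions are treated as 1-based, which is an assumption about the numbering convention.

```python
from typing import Dict


def residues_match(sequence: str, expected: Dict[int, str]) -> Dict[int, bool]:
    """For each 1-based position in `expected`, report whether `sequence`
    carries the expected one-letter residue there. Positions beyond the end
    of the sequence are reported as False."""
    results = {}
    for position, residue in expected.items():
        in_range = 1 <= position <= len(sequence)
        results[position] = (
            in_range and sequence[position - 1].upper() == residue.upper()
        )
    return results


if __name__ == "__main__":
    # Toy 400-residue sequence (poly-Ala placeholder, NOT a real TnpB) with
    # D/E/D placed at the positions referenced for polypeptide 488601079.
    residues = ["A"] * 400
    residues[185], residues[269], residues[353] = "D", "E", "D"
    toy = "".join(residues)
    print(residues_match(toy, {186: "D", 270: "E", 354: "D"}))
```

For an ortholog whose numbering differs, the positions would first need to be mapped through an alignment such as the one in FIG. 1 before a check of this kind is meaningful.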
In one embodiment, examples of RuvC domains include any polypeptide having structural similarity and/or sequence similarity to RuvC domains described in the art. In some examples, a RuvC domain can have an amino acid sequence that has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence identity to a RuvC domain known in the art.
In some examples, the RuvC domain comprises a RuvC-I subdomain, a RuvC-II subdomain, and a RuvC-III subdomain. Examples of RuvC-I subdomains also include any polypeptide having structural similarity and/or sequence similarity to RuvC-I domains described in the art. For example, the RuvC-I domain may have structural and/or sequence similarity to RuvC-I found in bacterial or archaeal species. In some examples, the RuvC domain can have an amino acid sequence that has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence identity to the RuvC-I domain. RuvC-II domains also include any polypeptide having structural and/or sequence similarity to RuvC-II domains described in the art. In some examples, the RuvC domain can have an amino acid sequence that has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence identity to the RuvC-II domain. RuvC-III domains also include any polypeptide having structural and/or sequence similarity to RuvC-III domains described in the art. In some examples, the RuvC domain can have an amino acid sequence that has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% sequence identity to the RuvC-III domain.
For example, and as described in the art (e.g., Crystal structure of Cas9 in complex with guide RNA and target DNA, Nishimasu et al., Cell, 2014), the RuvC domain may consist of a six-stranded mixed β sheet (β1, β2, β5, β11, β14, and β17) flanked by α helices (α33, α34, and α39 to α45) and two additional two-stranded antiparallel β sheets (β3/β4 and β15/β16). The RuvC domain is described as sharing structural similarity with members of the retroviral integrase superfamily characterized by an RNase H fold, such as Escherichia coli RuvC (PDB code 1HJR, 14% identity, root mean square deviation (rmsd) for 126 equivalent Cα atoms) and Thermus thermophilus RuvC (PDB code 4LD0, 12% identity, rmsd for 131 equivalent Cα atoms). E. coli RuvC is a 3-layer α-β sandwich containing a 5-stranded β sheet sandwiched between 5 α helices. RuvC nucleases have four catalytic residues (e.g., Asp7, Glu70, His143, and Asp146 in T. thermophilus RuvC) and cleave Holliday junctions (or similar cruciform junctions) through a two-metal mechanism. Asp10 (Ala), Glu762, His983 and Asp986 of the Cas9 RuvC domain are located at positions similar to those of the catalytic residues of T. thermophilus RuvC. The RuvC-like domain of the TnpB polypeptide may comprise 1, 2, 3 or 4 catalytic residues.
In embodiments, the TnpB polypeptide is a nuclease. In one embodiment, the TnpB polypeptide and the nucleic acid component may direct sequence-specific nuclease activity. Cleavage may produce a 5' overhang. Cleavage may occur distal to the target adjacent motif (TAM), and may occur within the spacer (guide) annealing site or 3' of the target sequence. In one aspect, TnpB cleaves at multiple positions within and outside the annealing site of the nucleic acid component. In one aspect, DNA cleavage occurs 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more base pairs distal to the TAM and produces a 5' overhang.
In embodiments, the TnpB polypeptide has activity, i.e., nuclease activity, at a temperature ranging from about 37 ℃ to about 80 ℃. In embodiments, the TnpB polypeptide is active at a temperature of from about 37 ℃ to about 75 ℃, from about 37 ℃ to about 70 ℃, from about 37 ℃ to about 65 ℃, from about 37 ℃ to about 60 ℃, from about 37 ℃ to about 55 ℃, from about 37 ℃ to about 50 ℃, or from about 37 ℃ to about 45 ℃. In exemplary embodiments, the TnpB polypeptide has activity in the range of 37 ℃ to 65 ℃. In exemplary embodiments, the TnpB polypeptide has activity in the range of 45 ℃ to 65 ℃. In exemplary embodiments, the TnpB polypeptide has activity in the range of 45 ℃ to 60 ℃. In another exemplary embodiment, the TnpB polypeptide is a TnpB polypeptide selected from the group consisting of: Actinomadura cellulosilytica strain DSM 45823, Actinomadura namibiensis strain DSM 44197, Actinoplanes lobatus strain DSM 43150 (TnpB-1 and TnpB-2), Lipingzhangella halophila strain DSM 102030, Ktedonobacter racemifer, and Epsilonproteobacteria bacterium QNFX01000004 extraction (reverse). In another exemplary embodiment, the TnpB polypeptide is from Alicyclobacillus macrosporangiidus strain DSM 17980. In an exemplary embodiment, the Alicyclobacillus macrosporangiidus strain DSM 17980 TnpB protein is most active in the range of 45 ℃ to 60 ℃ (FIG. 22).
In one embodiment, the TnpB polypeptide exhibits collateral activity, also known as trans-cleavage, in which non-specific cleavage of non-cognate nucleic acids occurs once the polypeptide has been activated by, and has cleaved, its cognate target. In one aspect, the TnpB polypeptide has collateral activity once triggered by target recognition. In one aspect, upon binding to a target sequence, the TnpB polypeptide will non-specifically cleave a polynucleotide sequence, such as DNA. The target-activated non-specific nuclease activity of TnpB is also referred to herein as collateral activity.
In embodiments, the TnpB protein exhibits nuclease activity against both ssDNA and dsDNA target sequences. In embodiments, the TnpB protein exhibits nuclease activity on ssDNA and dsDNA, wherein TAM may not be necessary to cleave ssDNA targets (fig. 23).
In embodiments, the TnpB polypeptide is a nuclease. In one embodiment, the TnpB and nucleic acid component molecules may direct sequence-specific nuclease activity. The TnpB polypeptides provided herein may also exhibit RNA-guided recombinase activity. Homology to the RuvC domain and correlation to the DDE recombinase family indicate potential recombinase activity. In embodiments, some of the TnpB polypeptides detailed herein may naturally exhibit or be engineered to exhibit deleted or reduced nuclease activity and have functional domains as detailed herein, e.g., nucleotide deaminase, reverse transcriptase, transposable elements, e.g., transposase, integrase, recombinase, thereby allowing RNA-guided target-specific modification.
Exemplary TnpB Polypeptides
In certain exemplary embodiments, the TnpB protein may comprise the sequences listed in table 1A, table 1B, or table 1C. In Table 1A, the reverse natural TnpB amino acid sequences are provided for actinomycetes, actinoplanes TnpB-1, actinomycetes halophilus (H.alba), actinomycetes Namilbemyces, actinomycetes ochromogenes (A. Umbrina) and bacteria of the class epsilon Proteus 10_QNFX01000004, all starting with valine (GTG) (position +1), but translated to methionine due to the special properties of initiator tRNA, as is well known in the art.
TABLE 1A exemplary TnpB sequence
TABLE 1B TnpB Polypeptides corresponding to SEQ ID NO 38-64,263
Table 1B exemplary TnpB sequence
TABLE 1C exemplary TnpB Polynucleotide and direct repeat sequences
The TnpB polypeptide may comprise one or more modifications. As used herein, the term "modified" with respect to a TnpB polypeptide generally refers to the TnpB polypeptide having one or more modifications or mutations (including point mutations, truncations, insertions, deletions, chimeras, fusion proteins, etc.) as compared to the wild type counterpart from which the TnpB polypeptide was derived. Derived means that the derived enzyme is largely based on the wild-type enzyme in the sense of having a high degree of sequence or structural homology with the wild-type enzyme, but it has been mutated (modified) in some way known in the art or as described herein.
The modified protein (e.g., modified TnpB polypeptide) may be catalytically inactive (also known as dead). As used herein, a catalytically inactivated or dead nuclease may have reduced nuclease activity or no nuclease activity compared to the wild-type corresponding nuclease. In some cases, the catalytically inactive or dead nuclease may have nickase activity. In some cases, the catalytically inactive or dead nuclease may not have nickase activity. Such catalytically inactive or dead nucleases may not produce double-or single-stranded breaks on the target polynucleotide, but may still bind to or otherwise form complexes with the target polynucleotide.
In embodiments, eukaryotic homologs of bacterial TnpB may be used in the present invention. These TnpB-like proteins, Fanzor1 and Fanzor2, share amino acid motifs in their C-terminal halves but are variable in their N-terminal regions. See Bao et al., Homologues of bacterial TnpB_IS605 are widespread in diverse eukaryotic transposable elements. Mobile DNA 4, 12 (2013). doi:10.1186/1759-8753-4-12. In one aspect, the conserved sequence between TnpB and Fanzor comprises D-X(125,275)-[TS]-[TS]-X-X-[C4 zinc finger]-X(5,50)-RD. In addition to differing from TnpB in their N-terminal regions, Fanzor proteins are also more diverse, being associated with different transposons and compositions. The similarity of the Fanzor system may allow for similar uses and applications, given Applicants' discovery of the nucleic acid component and of mechanisms for reprogramming the activity of the TnpB polypeptide.
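The conserved pattern recited above can be approximated as a regular expression, as in the following Python sketch. The text does not spell out the "[C4 zinc finger]" element, so the sub-pattern used here (C-X2-C-X(5,30)-C-X2-C) is purely an assumption for illustration, as are the function name and the toy test sequence; a real search would need a zinc finger definition taken from the cited literature.

```python
import re

# D-X(125,275)-[TS]-[TS]-X-X-[C4 zinc finger]-X(5,50)-RD
# ASSUMPTION: the C4 zinc finger is modelled as C-X2-C-X(5,30)-C-X2-C.
C4_ZINC_FINGER = r"C.{2}C.{5,30}C.{2}C"
TNPB_FANZOR_MOTIF = re.compile(
    r"D.{125,275}[TS][TS].." + C4_ZINC_FINGER + r".{5,50}RD"
)


def has_conserved_motif(protein: str) -> bool:
    """True if the protein sequence contains a match to the approximated motif."""
    return TNPB_FANZOR_MOTIF.search(protein.upper()) is not None


if __name__ == "__main__":
    # Synthetic placeholder sequence built to satisfy the pattern (not a real protein).
    toy = ("M" + "D" + "A" * 130 + "TS" + "AA"
           + "C" + "AA" + "C" + "A" * 10 + "C" + "AA" + "C"
           + "A" * 10 + "RD" + "K")
    print(has_conserved_motif(toy))
```

A pattern this permissive will produce false positives on its own; it would more plausibly serve as a first-pass filter ahead of alignment-based or domain-based confirmation.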
In one embodiment, modification of the TnpB polypeptide may or may not result in altered functionality. For example, modifications that do not result in altered functionality include, for example, codon optimization for expression into a particular host, or providing a nuclease with a particular marker (e.g., for visualization). Modifications that can result in altered functionality can also include mutations, including point mutations, insertions, deletions, truncations (including split nucleases), and the like, as well as chimeric nucleases (e.g., comprising domains from different orthologs or homologs) or fusion proteins. Fusion proteins may include, for example, without limitation, fusions with heterologous domains or functional domains (e.g., localization signals, catalytic domains, etc.). In one embodiment, various modifications (e.g., mutant nucleases that are catalytically inactive and further fused to functional domains) can be combined, such as, for example, to induce DNA methylation or another nucleic acid modification, such as including, but not limited to, fragmentation (e.g., by a different nuclease (domain)), mutation, deletion, insertion, substitution, ligation, digestion, fragmentation, or recombination. As used herein, "altered functionality" includes, but is not limited to, altered specificity (e.g., altered target recognition, increased (e.g., an "enhanced" TnpB polypeptide) or decreased specificity, or altered TAM recognition), altered activity (e.g., increased or decreased catalytic activity, including catalytically inactivated nucleases or nickases), and/or altered stability (e.g., fusion with destabilizing domains). Examples of all such modifications are known in the art. 
It will be appreciated that reference herein to a "modified" nuclease, particularly a "modified" TnpB polypeptide or system or complex, preferably still has the ability to interact or bind with a polynucleic acid (e.g., form a complex with a nucleic acid component molecule). Such modified TnpB polypeptides may be combined with deaminase proteins or active domains thereof as described herein.
In one embodiment, the unmodified TnpB polypeptide may have cleavage activity. In one embodiment, the TnpB polypeptide may direct cleavage of one or both nucleic acid (DNA or RNA) strands at or near the location of the target sequence, such as within the target sequence and/or within the complement of the target sequence or at a sequence associated with the target sequence. In one embodiment, the TnpB polypeptide may direct cleavage of one or two DNA or RNA strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs or nucleotides from the first or last nucleotide of the target sequence. In one embodiment, the cuts may be staggered, i.e., produce sticky ends. In one embodiment, the cut is a staggered cut with 5' overhangs. In one embodiment, the cleavage is a staggered cleavage with a 5' overhang of 1 to 5 nucleotides, preferably 4 or 5 nucleotides. In particular embodiments, the TnpB polypeptide cleaves a DNA strand.
In one embodiment, the TnpB polypeptide may be mutated relative to the corresponding wild type enzyme such that the mutated TnpB lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. As another example, two or more catalytic domains of a TnpB polypeptide (e.g., ruvC) may be mutated to produce a mutated TnpB polypeptide that lacks substantially all DNA cleavage activity. In one embodiment, a TnpB polypeptide may be considered to lack substantially all polynucleotide cleavage activity when the polynucleotide cleavage activity of the mutated enzyme is no more than 25%, no more than 10%, no more than 5%, no more than 1%, no more than 0.1%, no more than 0.01% of the nucleic acid cleavage activity of the non-mutated form of the enzyme; an example may be a mutant form with zero or negligible nucleic acid cleavage activity compared to a non-mutant form.
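The "lacks substantially all cleavage activity" criterion above is a ratio test, which the following Python sketch makes explicit. The function names and the idea of passing activities as plain numbers in matching units are illustrative assumptions; how cleavage activity is actually measured is outside the scope of the sketch.

```python
def residual_activity_fraction(mutant_activity: float,
                               wild_type_activity: float) -> float:
    """Mutant cleavage activity as a fraction of the non-mutated enzyme's
    activity. Both values must be in the same (arbitrary) units."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    if mutant_activity < 0:
        raise ValueError("mutant activity cannot be negative")
    return mutant_activity / wild_type_activity


def lacks_substantially_all_activity(mutant_activity: float,
                                     wild_type_activity: float,
                                     max_fraction: float = 0.25) -> bool:
    """True if the mutant retains no more than `max_fraction` of wild-type
    activity; the text recites cut-offs of 25%, 10%, 5%, 1%, 0.1% and 0.01%."""
    fraction = residual_activity_fraction(mutant_activity, wild_type_activity)
    return fraction <= max_fraction


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(lacks_substantially_all_activity(0.5, 100.0, max_fraction=0.01))
    print(lacks_substantially_all_activity(30.0, 100.0))
```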
In one embodiment, the TnpB polypeptide may comprise one or more modifications that result in enhanced activity and/or specificity, such as mutation of residues that stabilize the targeted or non-targeted strand. In one embodiment, the altered or modified activity of the engineered TnpB polypeptide comprises increased targeting efficiency or reduced off-target binding. In one embodiment, the altered activity of the engineered TnpB polypeptide comprises altered cleavage activity. In one embodiment, the altered activity comprises increased cleavage activity at the target polynucleotide locus. In one embodiment, the altered activity comprises reduced cleavage activity at the target polynucleotide locus. In one embodiment, the altered activity comprises reduced cleavage activity at an off-target polynucleotide locus. In one embodiment, the modified nuclease comprises a modification that alters the association of the protein with a nucleic acid molecule comprising RNA, or with a strand of the target polynucleotide locus, or with a strand of an off-target polynucleotide locus. In one aspect of the invention, the engineered TnpB polypeptide comprises modifications that alter formation of the TnpB polypeptide and related complexes. In one embodiment, the altered activity comprises increased cleavage activity at an off-target polynucleotide locus. Thus, in one embodiment, the specificity for the target polynucleotide locus is increased compared to the off-target polynucleotide locus. In other embodiments, the specificity for the target polynucleotide locus is reduced compared to the off-target polynucleotide locus. In one embodiment, the mutation results in reduced off-target effects (e.g., cleavage or binding properties, activity, or kinetics), for example, in the case of a TnpB polypeptide, in lower tolerance for mismatches between the target and the omega RNA.
Other mutations may lead to increased off-target effects (e.g., cleavage or binding properties, activity, or kinetics). Other mutations may result in increased or decreased on-target effects (e.g., cleavage or binding properties, activity, or kinetics). In one embodiment, the mutation results in altered (e.g., increased or decreased) activity, association, or formation of a functional nuclease complex. Exemplary mutations include mutation of a negatively charged or neutral residue to a positively charged residue, of a positively charged residue to a neutral residue, or of a neutral residue to a negatively charged residue, and/or mutation of (evolutionarily) conserved residues, such as conserved positively charged residues, in order to enhance specificity. See, e.g., Zhou et al., Chem. Rev. 2018 Feb 28; 118(4):1691-1741, doi:10.1021/acs.chemrev.7b00305 (discussing electrostatic interactions in protein binding and the effects of amino acid mutations on such electrostatic interactions). In one embodiment, such residues may be mutated to uncharged residues, such as alanine. Because the TnpB polypeptide interacts with the guide or bound DNA across the length of the TnpB polypeptide, mutations in residues throughout the TnpB polypeptide can be utilized to alter activity. In one aspect, the TnpB polypeptide residues selected for mutation are identified based on the amino acid sequence positions of Deinococcus radiodurans ISDra2, see, e.g., Karvelis et al., Nature 599, 692-696 (2021). The ISDra2 amino acid sequence may comprise the sequence:
In embodiments, one or more residues are mutated to alter the TAM specificity of the TnpB polypeptide. In one aspect, the one or more mutations correspond to one or more of 52TYR, 53GLY, 56SER, 57SER, 60THR, 72SER, 75ASP, 76LYS, 77PHE, 80GLN, 84LYS, 119ARG, 121GLN, 122PHE, 123THR, 124ASN, 125ASN, 126ASN, 137PRO, 138LYS, 153LYS, 155LEU, and 172LEU, based on the amino acid sequence positions of ISDra2.
In one embodiment, one or more residues are mutated to alter the specificity and/or activity of TnpB, the mutations being selected from one or more of the following: 6ALA, 7PHE, 8VAL, 9VAL, 10ARG, 11LEU, 12TYR, 35PHE, 36LEU, 39ARG, 40ILE, 42ALA, 43TYR, 46SER, 47GLY, 48LYS, 49GLY, 50LEU, 51THR, 52TYR, 95ARG, 96THR, 97VAL, 98LYS, 99GLN, 100SER, 101GLY, 102LYS, 103LYS, 104VAL, 105GLY, 106PHE, 107PRO, 108ARG, 109PHE, 110ARG, 111LYS, 112LYS, 113ARG, 114THR, 115GLY, 116GLU, 117SER, 118TYR, 119ARG, 120THR, 121GLN, 154ILE, 155LEU, 156ASN, 157VAL, 158THR, 159VAL, 160ARG, 161ARG, 162ILE, 163HIS, 164GLU, 165GLY, 166HIS, 167TYR, 168GLU, 169ALA, 170SER, 171VAL, 172LEU, 173CYS, 174GLU, 215TYR, 216ARG, 217SER, 218THR, 219LEU, 220LYS, 221ARG, 222LEU, 223ARG, 224LYS, 225ALA, 226GLN, 227GLN, 228THR, 229LEU, 230SER, 231ARG, 232ARG, 233LYS, 234LYS, 235GLY, 236SER, 237ALA, 238ARG, 239YR, 240GLY, 241LYS, 242ALA, 243LYS, 244THR, 245LYS, 246LEU, 247ALA, 248ARG, 249ILE, 250S, 251LYS, 252ARG, 253ILE, 254VAL, 283, 284, 285MET, 286ARG, LYS, 288, 289ARG, 290ARG, 291, 292 LEU, 293LEU, 292, 296, 292 LEU, 296, 295 and 297ASP.
Without being bound by a particular scientific theory, it is believed that type V CRISPR-Cas systems evolved from TnpB systems. Type V systems are known to have in vitro collateral activity against single-stranded DNA, see, for example, Chen et al., Science. 2018 Apr 27; 360(6387):436-439.
Specialized TnpB systems
In one embodiment, the system is a TnpB-based system capable of performing a specialized function or activity. For example, the TnpB protein may be fused, operably coupled, or otherwise associated with one or more heterologous functional domains. In certain exemplary embodiments, the TnpB protein may be a catalytically dead TnpB protein and/or have nickase activity. A nickase is a TnpB protein that cleaves only one strand of a double-stranded target. In such embodiments, the catalytically dead TnpB or nickase provides sequence-specific targeting functionality via the omega RNA, delivering the functional domain to the target sequence or adjacent to the target sequence.
It is also contemplated that the TnpB complex as a whole may be associated with two or more functional domains. For example, there may be two or more functional domains associated with the TnpB polypeptide, or there may be two or more functional domains associated with the nucleic acid component (via one or more adapter proteins or aptamers), or there may be one or more functional domains associated with the TnpB polypeptide and one or more functional domains associated with the nucleic acid component.
In one embodiment, one or more functional domains are associated with the TnpB polypeptide via an adapter protein, for example for use with the modified guides of Konermann et al. (Nature 517, 583-588, 29 January 2015). In one embodiment, one or more functional domains are attached to the adapter protein such that, upon binding of the TnpB polypeptide to the RNA molecule and target, the functional domain is in a spatial orientation that allows the functional domain to exert its attributed function.
In one embodiment, one or more functional domains are associated with a dead nucleic acid component. In one embodiment, complexes with active TnpB polypeptides direct gene regulation through a functional domain at one locus, while functional domains associated with a nucleic acid component direct DNA cleavage through active TnpB polypeptides at another locus. In one embodiment, the nucleic acid component is selected to maximize the selectivity of modulation of the locus of interest compared to off-target modulation. In one embodiment, the nucleic acid component is selected to maximize target gene regulation and minimize target cleavage. The loops of the nucleic acid component may be extended without collision with the TnpB polypeptide by insertion of different loops or different sequences that recruit adaptor proteins that may bind to the different loops or different sequences. The adaptor proteins may include, but are not limited to, orthogonal polynucleotide binding protein/aptamer combinations that are present in phage coat protein diversity. A list of such coat proteins includes, but is not limited to: qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID, NL95, TW19, AP205, Φcb5, Φcb8R, Φcb12R, Φcb23R, 7s, and PRR1. These adaptor proteins or orthogonal RNA binding proteins may further recruit effector proteins or fusions comprising one or more functional domains.
Exemplary functional domains that can be fused, operably coupled or otherwise associated with a TnpB protein can be or include, but are not limited to, Nuclear Localization Signal (NLS) domains, Nuclear Export Signal (NES) domains, translational activation domains, transcriptional activation domains (e.g., VP64, p65, MyoD1, HSF1, RTA, and SET7/9), translational initiation domains, transcriptional repression domains (e.g., KRAB domains, NuE domains, NcoR domains, and SID domains, such as SID4X domains), nuclease domains (e.g., FokI), histone modification domains (e.g., histone acetyltransferase), light-inducible/controllable domains, chemically-inducible/controllable domains, transposase domains, homologous recombination machinery domains, recombinase domains, ligase domains, topoisomerase domains, integrase domains, and combinations thereof. In embodiments, the functional domain is an HNH domain and may be used with a naturally catalytically inactive TnpB protein to engineer a nickase. Methods for producing catalytically dead TnpB or nickase TnpB can be adapted from the methods used for Cas9 proteins, see, e.g., WO 2014/204725 and Ran et al., Cell. 2013 Sep 12; 154(6):1380-1389, which are known in the art and incorporated herein by reference. Briefly, one or more mutations that reduce or eliminate nuclease activity may be introduced in the RuvC domain of the TnpB protein and/or the catalytic domain of the HNH domain. In one aspect, at least one mutation in the RuvC domain and at least one mutation in the HNH domain are provided. In embodiments, the TnpB polypeptide comprises mutations at D191 and/or E278 based on the amino acid sequence positions of Deinococcus radiodurans ISDra2. In one aspect, the amino acid mutation comprises D191A and/or E278A based on the amino acid sequence positions of Deinococcus radiodurans ISDra2.
In one embodiment, the functional domain may have one or more of the following activities: nucleobase deaminase activity, reverse transcriptase activity, transposase activity, integrase activity, recombinase activity, topoisomerase activity, ligase activity, polymerase activity, helicase activity, methylase activity, demethylase activity, translational activation activity, translational initiation activity, translational repression activity, transcriptional activation activity, transcriptional repression activity, transcriptional release factor activity, histone modification activity, nuclease activity (e.g., VirD2), single-stranded RNA cleavage activity, double-stranded RNA cleavage activity, single-stranded DNA cleavage activity, double-stranded DNA cleavage activity, molecular switch activity, chemical inducibility, light inducibility, and nucleic acid binding activity. In one embodiment, one or more functional domains may comprise an epitope tag or a reporter. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter molecules include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, Green Fluorescent Protein (GFP), HcRed, DsRed, Cyan Fluorescent Protein (CFP), Yellow Fluorescent Protein (YFP), and autofluorescent proteins, including Blue Fluorescent Protein (BFP).
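As a non-limiting illustration of how substitutions such as D191A and E278A may be applied in silico, the following Python sketch parses a substitution written in the conventional form (wild-type residue, 1-based position, replacement residue) and verifies the wild-type residue before substituting. The helper name and the placeholder sequence are assumptions introduced for illustration only; the placeholder is not the ISDra2 sequence, which would need to be supplied from the sequence listing.

```python
from typing import Iterable


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Apply point substitutions such as 'D191A' (1-based positions) to a protein sequence.

    Raises ValueError if a position is out of range or the expected
    wild-type residue is not found at that position.
    """
    residues = list(sequence)
    for sub in substitutions:
        wild_type, position, mutant = sub[0], int(sub[1:-1]), sub[-1]
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: found {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


# Placeholder only: a synthetic string with D at position 191 and E at position 278.
# This is NOT the ISDra2 TnpB sequence.
toy_sequence = "G" * 190 + "D" + "G" * 86 + "E" + "G" * 10

dead_variant = apply_substitutions(toy_sequence, ["D191A", "E278A"])
```

Checking the wild-type residue before substitution guards against numbering offsets, for example when an N-terminal tag or NLS has shifted positions relative to the ISDra2 reference numbering.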
One or more functional domains may be located at, near, and/or adjacent to the end of an effector protein (e.g., a TnpB protein). In embodiments having two or more functional domains, each of the two functional domains may be positioned at or near or adjacent to the end of an effector protein (e.g., a TnpB protein). In one embodiment, such as those wherein the functional domains are operably coupled to an effector protein, one or more of the functional domains may be tethered or linked to the effector protein (e.g., a TnpB protein) via a suitable linker, including but not limited to a GlySer linker. When there is more than one functional domain, the functional domains may be the same or different. In one embodiment, all functional domains are identical. In one embodiment, all functional domains are different from each other. In one embodiment, at least two of the functional domains are different from each other. In one embodiment, at least two of the functional domains are identical to each other.
In one embodiment, histone modification domains are also preferred. Exemplary histone modification domains are discussed below. Transposase domains, HR (homologous recombination) machinery domains, recombinase domains and/or integrase domains are also preferred as functional domains of the invention. In one embodiment, the DNA integration activity comprises an HR machinery domain, an integrase domain, a recombinase domain, and/or a transposase domain.
In one embodiment, the DNA cleavage activity is due to a nuclease. In one embodiment, the nuclease comprises a FokI nuclease. See "Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing", Shengdar Q. Tsai, Nicolas Wyvekens, Cyd Khayter, Jennifer A. Foden, Vishal Thapar, Deepak Reyon, Mathew J. Goodwin, Martin J. Aryee, J. Keith Joung, Nature Biotechnology 32(6): 569-77 (2014), which relates to dimeric RNA-guided FokI nucleases that recognize extended sequences and allow efficient editing of endogenous genes in human cells.
Functional domains can be used to regulate transcription, e.g., transcriptional repression. Transcriptional repression is typically mediated by chromatin modifying enzymes such as histone methyltransferases (HMTs) and deacetylases (HDACs). Repressive histone effector domains are known and an exemplary list is provided below. Preferred are proteins and functional truncations of small size that facilitate efficient viral packaging (e.g., via AAV). However, in general, the domains may include HDAC, histone methyltransferase (HMT) and histone acetyltransferase (HAT) inhibitors, as well as HDAC and HMT recruitment proteins. In one embodiment, the functional domain may be or include an HDAC effector domain, an HDAC recruitment effector domain, a histone methyltransferase (HMT) effector domain, a histone methyltransferase (HMT) recruitment effector domain, or a histone acetyltransferase inhibitor effector domain.
In one embodiment, the functional domain may be a histone methyltransferase (HMT) effector domain. Preferred examples include NUE, vSET, EHMT/G9A, SUV H1, dim-5, KYP, SUVR4, SET1, SETD8 and TgSET8. NUE is exemplified in embodiments of the invention and, although it is preferred, it is contemplated that other examples in this class will also be useful.
In one embodiment, the functional domain may be a Histone Methyltransferase (HMT) recruitment effector domain. Preferred examples include Hp1a, PHF19 and NIPP1.
In one embodiment, the functional domain may be a histone acetyltransferase inhibitor effector domain. Preferred examples include SET/TAF-1β.
In some cases, endogenous (regulatory) control elements (such as enhancers and silencers) are targeted in addition to the promoter or promoter proximal element. Thus, in addition to targeting promoters, the invention can also be used to target endogenous control elements (including enhancers and silencers). These control elements can be located upstream and downstream of the Transcription Start Site (TSS), from 200bp to 100kb away from the TSS. Targeting known control elements may be used to activate or repress genes of interest. In some cases, a single control element may affect transcription of multiple target genes. Thus, targeting a single control element can be used to control transcription of multiple genes simultaneously.
In another aspect, targeting of putative control elements (e.g., by tiling the region of the putative control element as well as 200 bp up to 100 kb around the element) can be used as a means to verify such elements (by measuring transcription of the gene of interest) or to detect novel control elements (e.g., by tiling 100 kb upstream and downstream of the TSS of the gene of interest). Furthermore, targeting of putative control elements may be useful in the context of understanding the genetic causes of a disease. Many mutations and common SNP variants associated with disease phenotypes are located outside coding regions. After targeting such regions with the activation or repression systems described herein, transcription can be read out for either a) a set of putative targets (e.g., a set of genes located in closest proximity to the control element) or b) the whole transcriptome, e.g., by RNAseq or microarray. This would allow the identification of likely candidate genes involved in the disease phenotype. Such candidate genes may be used as novel drug targets.
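The tiling approach described above can be sketched, purely for illustration, as the generation of genomic windows around a coordinate of interest. In the following Python sketch the function name, the tile size, and the step are assumptions; the text specifies only the flanking distance (from 200 bp up to 100 kb), not how individual tiles are sized or spaced.

```python
from typing import List, Tuple


def tile_windows(center: int, flank: int, tile_size: int, step: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) windows covering [center - flank, center + flank).

    center    : coordinate of the TSS or putative control element (0-based)
    flank     : distance tiled on each side, e.g. 200 to 100_000 bp
    tile_size : width of each tile (assumed parameter, not given in the text)
    step      : offset between consecutive tile starts (assumed parameter)
    """
    if flank <= 0 or tile_size <= 0 or step <= 0:
        raise ValueError("flank, tile_size and step must be positive")
    region_start = max(0, center - flank)
    region_end = center + flank
    windows = []
    position = region_start
    while position < region_end:
        windows.append((position, min(position + tile_size, region_end)))
        position += step
    return windows


# Tiling 100 kb upstream and downstream of a hypothetical TSS at coordinate 500,000
tss_tiles = tile_windows(center=500_000, flank=100_000, tile_size=200, step=200)
```

Each window would then be scanned for candidate target sequences with a suitable TAM, and the transcriptional readout compared across windows.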
In one embodiment, one or more of the functional domains comprises an acetyltransferase, preferably a histone acetyltransferase. These are useful in the field of epigenomics, for example in methods of interrogating the epigenomic region. Methods of interrogating an epigenomic region can include, for example, targeting the epigenomic sequence. Targeting the epigenomic sequence can include omega RNA directed against the epigenomic target sequence. In one embodiment, the epigenomic target sequence can include a promoter, silencer, or enhancer sequence.
The functional domain may be an acetyltransferase domain. Examples of acetyltransferases are known, but in one embodiment may include histone acetyltransferases. In one embodiment, the histone acetyltransferase may comprise the catalytic core of human acetyltransferase p300 (Gersbach & Reddy, Nature Biotech, 6 April 2015).
Nuclear localization sequences
In one embodiment, the TnpB polypeptide is fused to one or more Nuclear Localization Sequences (NLS), such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLS. In one embodiment, the TnpB polypeptide comprises about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the amino terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the carboxy terminus, or a combination of these (e.g., zero or at least one or more NLSs at the amino terminus and zero or at least one or more NLSs at the carboxy terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or combined with one or more other NLSs present in one or more copies. In a preferred embodiment of the invention, the TnpB polypeptide comprises up to 6 NLS. In one embodiment, an NLS is considered near the N-terminus or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 or more amino acids along the polypeptide chain from the N-terminus or C-terminus. 
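The "near the N-terminus or C-terminus" criterion above lends itself to a short worked check. The following Python sketch is offered only as an illustration; the function name is hypothetical, and the distance is assumed to be the number of residues between the terminus and the nearest residue of the NLS, which is one reasonable reading of the text.

```python
from typing import Tuple


def nls_near_termini(protein: str, nls: str, max_distance: int) -> Tuple[bool, bool]:
    """Return (near_n_terminus, near_c_terminus) for any copy of `nls` in `protein`.

    A copy counts as near a terminus when the number of residues separating
    its nearest residue from that terminus is at most `max_distance`.
    """
    near_n = near_c = False
    start = protein.find(nls)
    while start != -1:
        end = start + len(nls)
        if start <= max_distance:                 # residues preceding the NLS
            near_n = True
        if len(protein) - end <= max_distance:    # residues following the NLS
            near_c = True
        start = protein.find(nls, start + 1)
    return near_n, near_c


# Illustrative fusions using the SV40 large T antigen NLS and a placeholder core.
SV40_NLS = "PKKKRKV"
n_terminal_fusion = "M" + SV40_NLS + "A" * 100
c_terminal_fusion = "A" * 100 + SV40_NLS
```

With a cutoff of 5 residues, the first construct is flagged as carrying an N-terminal NLS only, and the second as carrying a C-terminal NLS only.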
Non-limiting examples of NLSs include NLS sequences derived from: the NLS of the SV40 virus large T antigen, having the amino acid sequence PKKKRKV (SEQ ID NO:64,264); the NLS of nucleoplasmin (e.g., the bipartite nucleoplasmin NLS having the sequence KRPAATKKAGQAKKKK (SEQ ID NO:64,265)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO:64,266) or RQRRNELKRSP (SEQ ID NO:64,267); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO:64,268); the sequence RMRIZFKGKDTARRRVELRKAKKKRNV (SEQ ID NO:64,269) of the IBB domain of importin-alpha; the sequences VSRKRPRP (SEQ ID NO:64,270) and PPKKARED (SEQ ID NO:64,271) of the myoma T protein; the sequence PQPKKPL (SEQ ID NO:64,272) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO:64,273) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO:64,274) and PKQKKRK (SEQ ID NO:64,275) of influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO:64,276) of the hepatitis D virus antigen; the sequence REKKKFLKRR (SEQ ID NO:64,277) of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:64,278) of human poly(ADP-ribose) polymerase; and the sequence of the steroid hormone receptor (human glucocorticoid) (SEQ ID NO:64,272). Accumulation of the TnpB polypeptide in the nucleus of a eukaryotic cell may be detected by any suitable assay. For example, a detectable marker may be fused to the polypeptide such that its location within the cell can be visualized, such as in combination with means for detecting the location of the cell nucleus (e.g., a stain specific to the cell nucleus, such as DAPI). The nuclei may also be isolated from the cells and the contents of the nuclei may then be analyzed by any suitable method for detecting proteins, such as immunohistochemistry, western blotting or enzymatic activity assays.
Accumulation in the nucleus can also be determined indirectly, such as by determining the effect of complex formation (e.g., determining DNA cleavage or mutation at a target sequence, or determining altered gene expression activity affected by complex formation and/or TnpB polypeptide activity) as compared to a control not exposed to the TnpB polypeptide or complex, or exposed to a TnpB polypeptide lacking one or more NLSs. In one embodiment of the TnpB polypeptide complexes and systems described herein, the codon-optimized TnpB polypeptide comprises an NLS attached to the C-terminus of the protein. In one embodiment, other localization tags may be fused to the TnpB polypeptide, such as, but not limited to, tags for localizing the TnpB polypeptide to specific sites in a cell, such as organelles, e.g., mitochondria, plastids, chloroplasts, vesicles, Golgi, (nuclear or cellular) membranes, ribosomes, nucleoli, ER, cytoskeleton, vacuoles, centrosomes, nucleosomes, granules, centrioles, and the like.
In one embodiment of the invention, at least one Nuclear Localization Signal (NLS) is attached to a nucleic acid sequence encoding a TnpB polypeptide. In preferred embodiments, at least one or more C-terminal or N-terminal NLSs are attached (and hence a nucleic acid molecule encoding a TnpB polypeptide may include coding for the NLS(s) such that the expressed product has the NLS(s) attached or linked). In a preferred embodiment, a C-terminal NLS is attached to achieve optimal expression and nuclear targeting in eukaryotic cells, preferably human cells. The invention also encompasses methods for delivering a plurality of nucleic acid components, wherein each nucleic acid component is specific for a different target locus of interest, thereby modifying a plurality of target loci of interest. The nucleic acid component of the complex may comprise one or more protein-binding RNA aptamers. One or more aptamers may be capable of binding to phage coat proteins.
Linkers
In some preferred embodiments, the functional domain is linked to a TnpB polypeptide (e.g., an active or dead TnpB polypeptide) to target and activate an epigenomic sequence, such as a promoter or enhancer. One or more omega RNAs directed against such promoters or enhancers may also be provided to direct binding of the TnpB polypeptide to such promoters or enhancers.
The term "associate with" as used herein relates to the association of a functional domain with a TnpB polypeptide or an adaptor protein. It is used in respect of how one molecule "associates" with another, for example between an adaptor protein and a functional domain, or between a TnpB polypeptide and a functional domain. In the case of such protein-protein interactions, the association may be viewed in terms of recognition, in the way an antibody recognizes an epitope. Alternatively, one protein may be associated with another protein via fusion of the two, e.g., one subunit fused to the other. Fusion typically occurs by adding the amino acid sequence of one protein to the amino acid sequence of another protein, for example, by splicing together the nucleotide sequences encoding each protein or subunit. Alternatively, this may be regarded as essentially a binding or direct connection between two molecules, such as a fusion protein. In any case, the fusion protein may comprise a linker between the two subunits of interest (i.e., between the enzyme and the functional domain or between the adaptor protein and the functional domain). Thus, in one embodiment, the TnpB polypeptide or adaptor protein associates with a functional domain by binding to the functional domain. In other embodiments, the TnpB polypeptide or adaptor protein is associated with a functional domain because the two are fused together, optionally via an intermediate linker.
The term "linker" as used in reference to a fusion protein refers to a molecule that joins the proteins to form a fusion protein. Generally, such molecules have no specific biological activity other than to join the proteins or to maintain some minimal distance or other spatial relationship between the proteins. However, in one embodiment, the linker may be selected to affect some property of the linker and/or fusion protein, such as folding, net charge, or hydrophobicity of the linker.
Suitable linkers for use in the methods of the invention are well known to those skilled in the art and include, but are not limited to, straight or branched chain carbon linkers, heterocyclic carbon linkers, or peptide linkers. However, as used herein, the linker may also be a covalent bond (carbon-carbon bond or carbon-heteroatom bond). In particular embodiments, the linker is used to separate the TnpB polypeptide from the nucleotide deaminase by a distance sufficient to ensure that each protein retains its desired functional properties. Preferred peptide linker sequences adopt a flexible extended conformation and do not exhibit a propensity to form ordered secondary structures. In one embodiment, the linker may be a chemical moiety, which may be a monomer, dimer, multimer, or polymer. Preferably, the linker comprises amino acids. Typical amino acids in flexible linkers include Gly, Asn and Ser. Thus, in particular embodiments, the linker comprises a combination of one or more of Gly, Asn, and Ser amino acids. Other near neutral amino acids, such as Thr and Ala, may also be used in the linker sequence. Exemplary linkers are disclosed in Maratea et al. (1985), Gene 40:39-46; Murphy et al. (1986) Proc. Nat'l. Acad. Sci. USA 83:8258-62; U.S. Patent No. 4,935,233; and U.S. Patent No. 4,751,180. For example, the GlySer linkers GGS, GGGS (SEQ ID NO:64,280) or GSG may be used. GGS, GSG, GGGS (SEQ ID NO:64,280) or GGGGS (SEQ ID NO:64,281) linkers may be used in repeats of 3 (such as (GGS)3 (SEQ ID NO:64,282), (GGGGS)3 (SEQ ID NO:64,283)) or 5, 6, 7, 9 or even 12 or more to provide a suitable length. In some cases, the linker may be (GGGGS)3-15 (SEQ ID NOS:64,283-64,295).
For example, in some cases, the linker may be (GGGGS)3-11 (SEQ ID NOS:64,283-64,291), e.g., GGGGS (SEQ ID NO:64,281), (GGGGS)2 (SEQ ID NO:64,296), (GGGGS)3 (SEQ ID NO:64,283), (GGGGS)4 (SEQ ID NO:64,284), (GGGGS)5 (SEQ ID NO:64,285), (GGGGS)6 (SEQ ID NO:64,286), (GGGGS)7 (SEQ ID NO:64,287), (GGGGS)8 (SEQ ID NO:64,288), (GGGGS)9 (SEQ ID NO:64,289), (GGGGS)10 (SEQ ID NO:64,290) or (GGGGS)11 (SEQ ID NO:64,291).
In particular embodiments, linkers such as (GGGGS)3 (SEQ ID NO:64,283) are preferred herein. (GGGGS)6 (SEQ ID NO:64,286), (GGGGS)9 (SEQ ID NO:64,289) or (GGGGS)12 (SEQ ID NO:64,292) may preferably be used as alternatives. Other preferred alternatives are (GGGGS)1 (SEQ ID NO:64,281), (GGGGS)2 (SEQ ID NO:64,296), (GGGGS)4 (SEQ ID NO:64,284), (GGGGS)5 (SEQ ID NO:64,285), (GGGGS)7 (SEQ ID NO:64,287), (GGGGS)8 (SEQ ID NO:64,288), (GGGGS)10 (SEQ ID NO:64,290) or (GGGGS)11 (SEQ ID NO:64,291). In yet another embodiment, LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID NO:64,297) is used as a linker. In yet another embodiment, the linker is an XTEN linker. In a particular embodiment, the TnpB polypeptide is linked to the deaminase protein or catalytic domain thereof via a LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID NO:64,297) linker. In other specific embodiments, the C-terminus of the TnpB polypeptide is linked to the N-terminus of the deaminase protein or catalytic domain thereof via a LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID NO:64,297) linker. In addition, N-terminal and C-terminal NLSs can also act as linkers (e.g., PKKKRKVEASSPKKRKVEAS (SEQ ID NO:64,298)).
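The repeat notation used above, in which (GGGGS)3 denotes three consecutive copies of the GGGGS unit, can be made concrete with a short sketch. The following Python example is illustrative only; the function names are hypothetical, and the restriction of repeat units to glycine and serine is an assumption chosen to match the GlySer linkers named in the text.

```python
def repeat_linker(unit: str, repeats: int) -> str:
    """Expand a GlySer repeat such as (GGGGS)3 into its amino acid sequence."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if not unit or set(unit) - set("GS"):
        raise ValueError("unit must consist only of G and S residues")
    return unit * repeats


def fuse(n_terminal_part: str, linker: str, c_terminal_part: str) -> str:
    """Join two protein sequences N-to-C through a peptide linker."""
    return n_terminal_part + linker + c_terminal_part


ggggs3 = repeat_linker("GGGGS", 3)    # 15 residues
ggggs12 = repeat_linker("GGGGS", 12)  # 60 residues

# Placeholder sequences stand in for the TnpB polypeptide and the functional domain.
fusion = fuse("MAAAA", ggggs3, "KLLLL")
```

Longer repeats increase the separation between the fused domains, which is the property the text relies on when selecting among (GGGGS)1 through (GGGGS)15.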
Examples of linkers are shown in Table 2 below.
TABLE 2.
The linker can be used between the omega RNA molecule and a functional domain (activator or repressor) or between the TnpB polypeptide and the functional domain. The linker may be used to engineer an appropriate amount of "mechanical flexibility".
In one embodiment, one or more functional domains are controllable, e.g., inducible.
Other suitable functional domains can be found, for example, in International application publication No. WO2019/018423, e.g., at [0678] - [0692], which is incorporated herein by reference. Exemplary functional domains are further detailed elsewhere herein.
Omega RNA molecules
The TnpB system herein may also comprise one or more nucleic acid components, also referred to herein as omega RNA (ωRNA). Such nucleic acid components may comprise RNA, DNA, or combinations thereof, and include modified and non-canonical nucleotides as further described below. Omega RNAs can contain a reprogrammable spacer sequence and a scaffold that interacts with a TnpB polypeptide. Omega RNAs can form complexes with TnpB polypeptides (Ω complexes) and direct sequence-specific binding of the complexes to target sequences of target polynucleotides. In one exemplary embodiment, the ωRNA is a single molecule comprising a scaffold sequence and a spacer sequence. In certain exemplary embodiments, the spacer is 5' of the scaffold sequence. In one exemplary embodiment, the ωRNA may further comprise a conserved nucleic acid sequence between the scaffold and the spacer portion.
In embodiments, the omega RNA comprises a spacer sequence and a scaffold sequence, e.g., a conserved nucleotide sequence. In embodiments, the omega RNA comprises from about 45 to about 250 nucleotides, or about 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, or 250 nucleotides.
In embodiments, the omega RNA comprises a scaffold sequence, e.g., a conserved nucleotide sequence. Thus, the scaffold sequence typically comprises a conserved region, wherein the scaffold comprises about 30 to 200 nucleotides, about 50 to 180 nucleotides, about 80 to 175 nucleotides, or about 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, or more nt. In one aspect, the nucleic acid component scaffold comprises a conserved nucleotide sequence. In embodiments, the conserved nucleotide sequence is located at or near the 5' end of the scaffold.
Omega RNAs may also contain a spacer, which can be reprogrammed to direct site-specific binding to a target sequence of a target polynucleotide. The spacer may also be referred to herein as part of the omega RNA scaffold or as a portion of the omega RNA, and may comprise an engineered heterologous sequence. In embodiments, the scaffold may comprise the sequences of Table 5. In embodiments, the scaffold comprises one or more conserved sequences of the RNA conserved regions in Table 5 and depicted in FIG. 2. In one embodiment, the secondary structure of the ωRNA comprises the multiple pinch region shown in FIG. 18D. In one aspect, the RNA species comprises an RNA conserved region + guide sequence that is different from, but generally related to, the DR + spacer configuration of the CRISPR-Cas system.
In one embodiment, the spacer length of the ωRNA is 10 to 50 nt. In one embodiment, the spacer length of the ωRNA is at least 10, 11, 12, 13, 14 or 15 nucleotides. In one embodiment, the spacer is 10 to 40 nucleotides, 15 to 30 nt, 15 to 17 nt, e.g., 15, 16, or 17 nt, 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, 23 to 25 nt, e.g., 23, 24, or 25 nt, 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or longer. In exemplary embodiments, the spacer sequence is 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nt.
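The length ranges recited above for the spacer (10 to 50 nt), the scaffold (about 30 to 200 nt) and the complete ωRNA (about 45 to about 250 nt) can be checked together, as in the following illustrative Python sketch. The function name and the returned dictionary are assumptions made for illustration, and the total length is assumed to be the sum of the spacer and scaffold lengths, ignoring any additional elements an ωRNA may carry.

```python
# Inclusive length ranges taken from the embodiments above (nucleotides).
SPACER_RANGE = (10, 50)
SCAFFOLD_RANGE = (30, 200)
TOTAL_RANGE = (45, 250)


def _within(length: int, bounds: tuple) -> bool:
    """True when `length` falls inside the inclusive (low, high) range."""
    low, high = bounds
    return low <= length <= high


def check_omega_rna_lengths(spacer: str, scaffold: str) -> dict:
    """Report whether spacer, scaffold and combined lengths fall in the recited ranges.

    Assumption: the omega RNA consists only of the scaffold and the spacer.
    """
    return {
        "spacer": _within(len(spacer), SPACER_RANGE),
        "scaffold": _within(len(scaffold), SCAFFOLD_RANGE),
        "total": _within(len(spacer) + len(scaffold), TOTAL_RANGE),
    }


# Placeholder sequences of the stated lengths; not real scaffold or spacer sequences.
report = check_omega_rna_lengths(spacer="A" * 16, scaffold="G" * 100)
```

A 16 nt spacer combined with a 100 nt scaffold satisfies all three ranges, whereas a 5 nt spacer combined with a 30 nt scaffold would fail both the spacer and the total-length criteria.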
In one embodiment, the sequence of the omega RNA is selected to reduce the degree of secondary structure within the omega RNA. In one embodiment, when optimally folded, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1% or less of the nucleotides of the nucleic acid targeting omega RNA component participate in self-complementary base pairing. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimum Gibbs free energy. An example of such an algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example of a folding algorithm is the online web server RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, which uses a centroid structure prediction algorithm (see, e.g., A.R. Gruber et al., 2008, Cell 106(1): 23-24; and P.A. Carr and G.M. Church, 2009, Nature Biotechnology 27(12): 1151-62).
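By way of illustration only, the fraction of nucleotides participating in self-complementary base pairing can be read off a predicted secondary structure. The sketch below assumes that the folding program reports the structure in dot-bracket notation (as RNAfold does), where "(" and ")" mark paired positions and "." marks unpaired positions; the function name paired_fraction is hypothetical and does not form part of any folding package.

```python
def paired_fraction(dot_bracket: str) -> float:
    """Return the fraction of positions that are base-paired in a
    dot-bracket secondary structure string (assumed input format)."""
    if not dot_bracket:
        raise ValueError("structure must not be empty")
    open_pairs = 0  # number of '(' not yet closed
    paired = 0      # number of positions involved in a base pair
    for symbol in dot_bracket:
        if symbol == "(":
            open_pairs += 1
            paired += 1
        elif symbol == ")":
            if open_pairs == 0:
                raise ValueError("unbalanced structure: extra ')'")
            open_pairs -= 1
            paired += 1
        elif symbol != ".":
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if open_pairs != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return paired / len(dot_bracket)


if __name__ == "__main__":
    # Hypothetical 12-nt hairpin: 4 base pairs closing a 4-nt loop.
    structure = "((((....))))"
    print(f"{100 * paired_fraction(structure):.1f}% of nucleotides are paired")
```

A candidate omega RNA sequence could then be compared against one of the thresholds recited above, for example by requiring paired_fraction(structure) <= 0.5.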
As used herein, a heterologous ωRNA is an ωRNA that is not derived from the same species as the TnpB polypeptide, or that comprises a portion of the molecule, such as a spacer, that is not derived from the same species as the TnpB polypeptide. For example, a heterologous ωRNA of a TnpB polypeptide derived from species A comprises a polynucleotide derived from a species other than species A, or an artificial polynucleotide.
In particular embodiments, the omega RNA comprises a spacer sequence linked to a conserved nucleotide sequence, wherein the conserved nucleotide sequence may comprise one or more stem loops or an optimized secondary structure. In a particular embodiment, the conserved nucleotide sequence has a minimum length of 16nt and a single stem loop. In other embodiments, the conserved nucleotide sequence is greater than 16nt, preferably greater than 17nt in length, and has more than one stem loop or optimized secondary structure. In particular embodiments, the spacer sequence may be linked to all or part of the naturally occurring conserved nucleotide sequence. In particular embodiments, certain aspects of the omega RNA architecture can be modified, for example by adding, subtracting, or substituting features, while maintaining certain other aspects of the architecture. Preferred positions for engineered omega RNA modifications (including but not limited to insertions, deletions, and substitutions) include the ends of the omega RNA and the regions of the omega RNA that are exposed when complexed with the TnpB polypeptide and/or target.
In one embodiment, the omega RNA forms a stem loop with a separate non-covalently linked sequence, which may be DNA or RNA. In a particular embodiment, the sequences forming the omega RNA are first synthesized using standard phosphoramidite synthesis protocols (Herdewijn, P., ed., Methods in Molecular Biology Vol. 288, Oligonucleotide Synthesis: Methods and Applications, Humana Press, New Jersey (2012)). In one embodiment, these sequences may be functionalized to contain appropriate functional groups for ligation using standard protocols known in the art (Hermanson, G.T., Bioconjugate Techniques, Academic Press (2013)). Examples of functional groups include, but are not limited to, hydroxyl, amine, carboxylic acid, carboxylic acid halide, carboxylic acid active ester, aldehyde, carbonyl, chlorocarbonyl, imidazolylcarbonyl, hydrazide, semicarbazide, thiosemicarbazide, thiol, maleimide, haloalkyl, sulfonyl, allyl, propargyl, diene, alkyne, and azide. Once the sequence is functionalized, a covalent chemical bond or linkage may be formed between the sequence and the conserved nucleotide sequence. Examples of chemical bonds include, but are not limited to, those based on: carbamates, ethers, esters, amides, imines, amidines, aminotriazines, hydrazones, disulfides, thioethers, thioesters, phosphorothioates, phosphorodithioates, sulfonamides, sulfonates, sulfones, sulfoxides, ureas, thioureas, hydrazides, oximes, triazoles, photolabile linkages, C-C bond forming groups such as Diels-Alder cycloaddition pairs or ring-closing metathesis pairs, and Michael reaction pairs.
In one embodiment, these stem loop forming sequences may be chemically synthesized. In one embodiment, the chemical synthesis uses an automated solid-phase oligonucleotide synthesis machine with 2'-acetoxyethyl orthoester (2'-ACE) chemistry (Scaringe et al., J. Am. Chem. Soc. (1998) 120:11820-11821; Scaringe, Methods Enzymol. (2000) 317:3-18) or 2'-thionocarbamate (2'-TC) chemistry (Dellinger et al., J. Am. Chem. Soc. (2011) 133:11540-11546; Hendel et al., Nat. Biotechnol. (2015) 33:985-989).
The repeat:anti-repeat duplex will be apparent from the secondary structure of the omega RNA component. It may typically be a first complementary stretch after the poly-U tract (in the 5' to 3' direction) and before the tetraloop, and a second complementary stretch after the tetraloop (in the 5' to 3' direction) and before the poly-A tract. The first complementary stretch (the "repeat") is complementary to the second complementary stretch (the "anti-repeat"). Thus, when folded back on one another, they undergo Watson-Crick base pairing to form a duplex of dsRNA. As such, the anti-repeat sequence is the complement of the repeat in terms of A-U or C-G base pairing, and also in terms of the fact that the anti-repeat is in the reverse orientation due to the tetraloop.
In an embodiment of the invention, modification of the molecular architecture of the omega RNA component comprises substitution of bases in stem loop 2. For example, in one embodiment, the "actt" (in RNA, "acuu") and "aagt" (in RNA, "aagu") bases in stem loop 2 are replaced with "cgcc" and "gcgg". In one embodiment, the "actt" and "aagt" bases in stem loop 2 are replaced with complementary GC-rich regions of 4 nucleotides. In one embodiment, the complementary GC-rich regions of 4 nucleotides are "cgcc" and "gcgg" (both in the 5 'to 3' direction). In one embodiment, the complementary GC-rich regions of 4 nucleotides are "gcgg" and "cgcc" (both in the 5 'to 3' direction). Other combinations of C and G in complementary GC-rich regions of 4 nucleotides will be apparent, including CCCC and GGGG.
In one aspect, stem loop 2, e.g., "ACTTgtttAAGT" (SEQ ID NO:64,304), may be replaced by any "XXXXgtttYYYY" (SEQ ID NO:64,305), e.g., where XXXX and YYYY represent any complementary sets of nucleotides that together will base pair with one another to create a stem.
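As a non-limiting sketch of the "XXXX-loop-YYYY" pattern described above, the 3' arm of a stem can be derived from the 5' arm as its reverse complement, so that the two arms pair in antiparallel orientation. The code assumes the DNA alphabet used in the sequence listing (T rather than U), and the helper names reverse_complement and build_stem_loop are hypothetical.

```python
# Translation table for Watson-Crick complements (DNA alphabet, case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_stem_loop(stem_5p: str, loop: str = "gttt") -> str:
    """Join a 5' stem arm, a loop, and the reverse-complementary 3' stem arm.

    The default loop "gttt" is taken from the example stem loop in the text;
    any other loop sequence is an assumption of the caller.
    """
    return stem_5p + loop + reverse_complement(stem_5p)


if __name__ == "__main__":
    # Reproduces the stem loop 2 example from the text.
    print(build_stem_loop("ACTT"))  # ACTTgtttAAGT
```

Supplying a different 4-nucleotide 5' arm yields a stem of the same length in which every position of the two arms is Watson-Crick paired.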
As used herein, the term "spacer" may also be referred to as a "guide sequence". In one embodiment, the spacer sequence is about or greater than 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99% or more complementary to a given target sequence when optimally aligned using a suitable alignment algorithm. In certain exemplary embodiments, the omega RNA molecule comprises a spacer sequence that can be designed to have at least one mismatch with the target sequence such that an RNA duplex is formed between the sequence and the target sequence. Thus, the degree of complementarity is less than 99%. For example, in the case where the spacer sequence consists of 24 nucleotides, the degree of complementarity is more particularly about 96% or less. In particular embodiments, the spacer sequence is designed to have a stretch of two or more adjacent mismatched nucleotides such that the degree of complementarity of the entire sequence is further reduced. For example, where the spacer sequence consists of 24 nucleotides, the degree of complementarity is more specifically about 96% or less, more specifically about 92% or less, more specifically about 88% or less, more specifically about 84% or less, more specifically about 80% or less, more specifically about 76% or less, more specifically about 72% or less, depending on whether a stretch of two or more mismatched nucleotides encompasses 2, 3, 4, 5, 6, or 7 nucleotides, and the like. In one embodiment, the degree of complementarity, in addition to a stretch of one or more mismatched nucleotides, is about or greater than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99% or more when optimally aligned using a suitable alignment algorithm. 
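The relationship between spacer length, number of mismatches, and degree of complementarity can be illustrated with a short calculation. The sketch below assumes that the spacer and the target site are already aligned position by position and written in the same 5' to 3' sense, and that U and T are treated as equivalent; both function names are hypothetical. For a 24-nucleotide spacer with a single mismatch, the calculation gives 23/24, i.e., about 95.8%, and values computed this way for other lengths may differ slightly from rounded figures.

```python
def count_mismatches(spacer: str, protospacer: str) -> int:
    """Count positions at which two pre-aligned, equal-length sequences differ.

    Assumption: both sequences are written in the same 5'->3' sense, and
    U (RNA) is treated as equivalent to T (DNA).
    """
    if len(spacer) != len(protospacer):
        raise ValueError("sequences must have the same length")
    a = spacer.upper().replace("U", "T")
    b = protospacer.upper().replace("U", "T")
    return sum(x != y for x, y in zip(a, b))


def percent_complementarity(length: int, mismatches: int) -> float:
    """Return the percentage of matched positions for a spacer of given length."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("need length > 0 and 0 <= mismatches <= length")
    return 100.0 * (length - mismatches) / length


if __name__ == "__main__":
    # Hypothetical 24-nt spacer and target site differing at the last position.
    spacer = "ACGUACGUACGUACGUACGUACGU"
    site = "ACGTACGTACGTACGTACGTACGA"
    n = count_mismatches(spacer, site)
    print(n, f"{percent_complementarity(len(spacer), n):.1f}%")
```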
The optimal alignment may be determined using any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, ClustalX, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). The ability of a sequence (within a nucleic acid targeting ωRNA molecule) to direct sequence-specific binding of a nucleic acid targeting complex to a target nucleic acid sequence can be assessed by any suitable assay. For example, components of the omega RNA system sufficient to form a TnpB targeting complex (including the omega RNA molecule sequence to be tested) can be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the TnpB targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by a Surveyor assay, as described herein. Similarly, cleavage of a target nucleic acid sequence (or a sequence in the vicinity thereof) can be evaluated in a test tube by providing the target nucleic acid sequence and components of the TnpB targeting complex (including the sequence to be tested and a control sequence different from the test ωRNA), and comparing the binding or rate of cleavage at or near the target sequence between the test and control ωRNA molecule sequences. Other assays are possible and will occur to those of skill in the art. The spacer sequence, and thus the nucleic acid targeting ωRNA, can be selected to target any target nucleic acid sequence.
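As one illustration of the alignment algorithms listed above, the following is a minimal sketch of Smith-Waterman local alignment that returns only the best local alignment score. The scoring values (match +2, mismatch -1, linear gap -2) are illustrative assumptions and are not taken from the present disclosure; production tools use more elaborate scoring schemes and also report the alignment itself.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best Smith-Waterman local alignment score of a and b.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming matrix, since only the score (not the traceback) is needed.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]  # first column is always 0 for local alignment
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                              # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


if __name__ == "__main__":
    # Two hypothetical sequences sharing the 4-nt core "ACGT".
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

To score spacer-target pairing rather than sequence identity, one of the two inputs would first be converted to its reverse complement.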
Omega RNAs, and thus nucleic acid targeting spacers, can be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In one embodiment, the target sequence may be a sequence within an RNA molecule selected from the group consisting of: messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double-stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of: mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
Omega RNA chemical modification
In one embodiment, the omega RNA component molecule comprises non-naturally occurring nucleic acids and/or non-naturally occurring nucleotides and/or nucleotide analogs and/or chemical modifications. Preferably, these non-naturally occurring nucleic acids and non-naturally occurring nucleotides are located outside the omega RNA sequence. Non-naturally occurring nucleic acids may include, for example, a mixture of naturally and non-naturally occurring nucleotides. Non-naturally occurring nucleotides and/or nucleotide analogs can be modified at the ribose, phosphate, and/or base moiety. In an embodiment of the invention, the omega RNA component nucleic acids comprise ribonucleotides and non-ribonucleotides. In one such embodiment, the omega RNA component comprises one or more ribonucleotides and one or more deoxyribonucleotides. In embodiments of the invention, the omega RNA component comprises one or more non-naturally occurring nucleotides or nucleotide analogs, such as nucleotides with phosphorothioate linkages, locked nucleic acid (LNA) nucleotides comprising a methylene bridge between the 2' and 4' carbons of the ribose ring, or bridged nucleic acids (BNAs). Other examples of modified nucleotides include 2'-O-methyl analogs, 2'-deoxy analogs, or 2'-fluoro analogs. Other examples of modified bases include, but are not limited to, 2-aminopurine, 5-bromo-uridine, pseudouridine, inosine, and 7-methylguanosine. Examples of chemical modifications of ωRNAs include, but are not limited to, incorporation of 2'-O-methyl (M), 2'-O-methyl 3' phosphorothioate (MS), S-constrained ethyl (cEt), or 2'-O-methyl 3' thioPACE (MSP) at one or more terminal nucleotides. Such chemically modified omega RNA components can exhibit increased stability and increased activity compared to unmodified omega RNA components, although on-target versus off-target specificity is not predictable.
(See Hendel, 2015, Nat Biotechnol. 33(9):985-9, doi: 10.1038/nbt.3290, published online 29 June 2015; Ragdarm et al., 0215, PNAS, E7110-E7111; Allerson et al., J. Med. Chem. 2005, 48:901-904; Bramsen et al., Front. Genet., 2012, 3:154; Deng et al., PNAS, 2015, 112:11870-11875; Sharma et al., MedChemComm., 2014, 5:1454-1471; Hendel et al., Nat. Biotechnol. (2015) 33(9):985-989; Li et al., Nature Biomedical Engineering, 2017, 1, 0066, DOI: 10.1038/s41551-017-0066.) In one embodiment, the 5' and/or 3' end of the omega RNA component is modified with a variety of functional moieties, including fluorescent dyes, polyethylene glycol, cholesterol, proteins, or detection tags (see Kelly et al., 2016, J. Biotech. 233:74-83). In one embodiment, the omega RNA component comprises ribonucleotides in the region that binds to the target sequence and one or more deoxyribonucleotides and/or nucleotide analogs in the region that binds to the TnpB polypeptide. In embodiments, deoxyribonucleotides and/or nucleotide analogs are incorporated into engineered omega RNA component structures. In one embodiment, 3-5 nucleotides at the 3' or 5' end of the omega RNA component are chemically modified. In one embodiment, only minor modifications, such as 2'-F modifications, are introduced in the seed region. In one embodiment, a 2'-F modification is introduced at the 3' end of the omega RNA component. In one embodiment, three to five nucleotides at the 5' and/or 3' end of the omega RNA component are chemically modified with 2'-O-methyl (M), 2'-O-methyl 3' phosphorothioate (MS), S-constrained ethyl (cEt), or 2'-O-methyl 3' thioPACE (MSP). Such modifications may enhance genome editing efficiency (see Hendel et al., Nat. Biotechnol. (2015) 33(9):985-989). In one embodiment, all of the phosphodiester linkages of the omega RNA component are replaced with phosphorothioates (PS) to enhance the level of gene disruption.
In one embodiment, more than five nucleotides at the 5' and/or 3' end of the omega RNA component are chemically modified with 2'-O-Me, 2'-F, or S-constrained ethyl (cEt). Such chemically modified omega RNA components can mediate enhanced levels of gene disruption (see Ragdarm et al., 0215, PNAS, E7110-E7111). In embodiments of the invention, the omega RNA component is modified to comprise a chemical moiety at its 3' and/or 5' end. Such moieties include, but are not limited to, amine, azide, alkyne, thio, dibenzocyclooctyne (DBCO), or rhodamine. In certain embodiments, the chemical moiety is conjugated to the omega RNA component through a linker (such as an alkyl chain). In one embodiment, the chemical moiety of the modified nucleic acid component can be used to attach the omega RNA component to another molecule, such as DNA, RNA, protein, or a nanoparticle. Such chemically modified omega RNA components can be used to identify or enrich cells normally edited by TnpB polypeptides and related systems (see Lee et al., eLife, 2017, 6:e25312, DOI: 10.7554).
In particular embodiments, the conserved nucleotide sequence may be modified to comprise one or more protein-binding RNA aptamers. In particular embodiments, one or more aptamers, such as a portion of an optimized secondary structure, may be included. Such aptamers may be capable of binding to phage coat proteins as further detailed herein.
In embodiments, the TnpB polypeptide utilizes an omega RNA component scaffold comprising a polynucleotide sequence that facilitates interaction with the TnpB protein, thereby allowing sequence-specific binding and/or targeting of the nucleic acid component molecule to the target polynucleotide. Chemical synthesis of omega RNA component scaffolds is contemplated using covalent linkage by various bioconjugation reactions, loops, bridges, and non-nucleotide links, via modifications of sugar, internucleotide phosphodiester bonds, and purine and pyrimidine residues (Sletten et al., Angew. Chem. Int. Ed. (2009) 48:6974-6998; Manoharan, M., Curr. Opin. Chem. Biol. (2004) 8:570-9; Behlke et al., Oligonucleotides (2008) 18:305-19; Watts et al., Drug Discov. Today (2008) 13:842-55; Shukla et al., ChemMedChem (2010) 5:328-49). The chemical synthesis may use automated solid-phase oligonucleotide synthesis machines with 2'-acetoxyethyl orthoester (2'-ACE) chemistry (Scaringe et al., J. Am. Chem. Soc. (1998) 120:11820-11821; Scaringe, Methods Enzymol. (2000) 317:3-18) or 2'-thionocarbamate (2'-TC) chemistry (Dellinger et al., J. Am. Chem. Soc. (2011) 133:11540-11546; Hendel et al., Nat. Biotechnol. (2015) 33:985-989).
In certain exemplary embodiments, the scaffold and spacer may be designed as two separate molecules that may hybridize or covalently join into a single molecule. Covalent attachment may be via a linker (e.g., a non-nucleotide loop) comprising a moiety such as a spacer, an attachment, a bioconjugate, a chromophore, a reporter group, dye-labeled RNA, and a non-naturally occurring nucleotide analog. More specifically, suitable spacers for the purposes of the present invention include, but are not limited to, polyethers (e.g., polyethylene glycol, polyols, polypropylene glycol or mixtures of ethylene glycol and propylene glycol), polyamine groups (e.g., spermine, spermidine and polymer derivatives thereof), polyesters (e.g., poly (ethyl acrylate)), polyphosphoric acid diesters, alkylene groups, and combinations thereof. Suitable attachments include any moiety that can be added to the linker to add additional properties to the linker, such as, but not limited to, fluorescent labels. Suitable bioconjugates include, but are not limited to, peptides, glycosides, lipids, cholesterol, phospholipids, diacylglycerols and dialkylglycerols, fatty acids, hydrocarbons, enzyme substrates, steroids, biotin, digoxigenin, carbohydrates, polysaccharides. Suitable luminophores, reporter groups and dye-labeled RNAs include, but are not limited to, fluorescent dyes such as fluorescein and rhodamine, chemiluminescent, electrochemiluminescent and bioluminescent labeling compounds. Also described in WO 2004/015075 is the design of an exemplary linker for conjugating two nucleic acid components, which may be suitable for use with omega RNAs.
The linker (e.g., a non-nucleotide loop) may be of any length. In one embodiment, the length of the linker corresponds to about 0-16 nucleotides. In one embodiment, the length of the linker corresponds to about 0-8 nucleotides. In one embodiment, the length of the linker corresponds to about 0-4 nucleotides. In one embodiment, the length of the linker corresponds to about 2 nucleotides. An exemplary linker design is also described in International Patent Publication No. WO 2011/008730.
Escort omega RNA component
In particular embodiments, the composition or complex has an omega RNA component molecule having a functional structure designed to improve the omega RNA component molecular structure, architecture, stability, genetic expression, or any combination thereof. Such structures may include an aptamer.
Aptamers are biomolecules that can be designed or selected to bind tightly to other ligands, for example using a technique known as systematic evolution of ligands by exponential enrichment (SELEX; Tuerk C, Gold L: "Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase." Science 1990, 249:505-510). For example, nucleic acid aptamers may be selected from pools of random-sequence oligonucleotides, with high binding affinities and specificities for a wide range of biomedically relevant targets, suggesting a wide range of therapeutic utilities for aptamers (Keefe, Anthony D., Supriya Pai, and Andrew Ellington. "Aptamers as therapeutics." Nature Reviews Drug Discovery 9.7 (2010): 537-550). These properties also suggest a wide range of uses for aptamers as drug delivery vehicles (Levy-Nissenbaum, Etgar, et al. "Nanotechnology and aptamers: applications in drug delivery." Trends in Biotechnology 26.8 (2008): 442-449; and Hicke BJ, Stephens AW. "Escort aptamers: a delivery service for diagnosis and therapy." J Clin Invest 2000, 106:923-928). Aptamers can also be constructed to function as molecular switches that respond to a cue by changing properties, such as RNA aptamers that bind fluorophores to mimic the activity of green fluorescent protein (Paige, Jeremy S., Karen Y. Wu, and Samie R. Jaffrey. "RNA mimics of green fluorescent protein." Science 333.6042 (2011): 642-646). It has also been suggested that aptamers can be used as components of targeted siRNA therapeutic delivery systems, for example targeting cell surface proteins (Zhou, Jiehua, and John J. Rossi. "Aptamer-targeted cell-specific RNA interference." Silence 1.1 (2010): 4).
Thus, in particular embodiments, the omega RNA component molecules are modified, for example, by one or more aptamers designed to improve delivery of the omega RNA component molecules, including delivery across a cell membrane, into an intracellular compartment, or into a nucleus. In addition to, or in the absence of, one or more aptamers, such structures may include moieties that render the nucleic acid component molecule deliverable, inducible, or responsive to a selected effector. Thus, the invention includes molecules of omega RNA components that are responsive to normal or pathophysiological conditions, including, but not limited to, pH, hypoxia, O2 concentration, temperature, protein concentration, enzyme concentration, lipid structure, light exposure, mechanical disruption (e.g., ultrasound), magnetic field, electric field, or electromagnetic radiation.
Photoresponsiveness of the inducible system can be achieved via activation and binding of cryptochrome-2 and CIB1. Blue light stimulation induces an activating conformational change in cryptochrome-2, resulting in recruitment of its binding partner CIB1. This binding is rapid and reversible, reaching saturation in <15 seconds after pulsed stimulation and returning to baseline <15 minutes after the end of stimulation. These rapid binding kinetics result in a system that is temporally limited only by the rates of transcription/translation and transcript/protein degradation, and not by the uptake and clearance of an inducer. Cryptochrome-2 activation is also highly sensitive, allowing the use of low light intensity stimulation and reducing the risk of phototoxicity. Furthermore, in a context such as the intact mammalian brain, variable light intensity may be used to control the size of the stimulated region, allowing greater precision than vector delivery alone may offer.
An energy source (such as electromagnetic radiation, sonic energy, or thermal energy) can induce the nucleic acid component molecules. Advantageously, the electromagnetic radiation is a component of visible light. In a preferred embodiment, the light is blue light having a wavelength of about 450 to about 495 nm. In a particularly preferred embodiment, the wavelength is about 488 nm. In another preferred embodiment, the light stimulation is via pulses. The light power may be in the range of about 0-9 mW/cm2. In a preferred embodiment, a stimulation paradigm of as low as 0.25 seconds every 15 seconds should result in maximal activation.
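For orientation, the pulsed stimulation pattern described above can be expressed as a duty cycle and a time-averaged light power. The sketch below is a simple arithmetic illustration; the function names are hypothetical, and pairing the 0.25 s per 15 s pattern with the upper end of the stated power range (9 mW/cm2) is an assumption made only for the example.

```python
def duty_cycle(pulse_s: float, period_s: float) -> float:
    """Fraction of each period during which the light is on."""
    if period_s <= 0 or not 0 <= pulse_s <= period_s:
        raise ValueError("need period > 0 and 0 <= pulse <= period")
    return pulse_s / period_s


def mean_power_mw_per_cm2(peak_mw_per_cm2: float,
                          pulse_s: float,
                          period_s: float) -> float:
    """Time-averaged light power for a rectangular on/off pulse train."""
    # Validate the timing first, then scale the peak power by on-time.
    duty_cycle(pulse_s, period_s)
    return peak_mw_per_cm2 * pulse_s / period_s


if __name__ == "__main__":
    # 0.25 s of light every 15 s, at an assumed peak of 9 mW/cm2.
    print(f"duty cycle: {100 * duty_cycle(0.25, 15.0):.2f}%")
    print(f"mean power: {mean_power_mw_per_cm2(9.0, 0.25, 15.0):.2f} mW/cm2")
```

With these example values the light is on for about 1.67% of the time, which corresponds to a time-averaged power of about 0.15 mW/cm2.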
Chemical or energy sensitive omega RNA components can undergo a conformational change upon binding of a chemical source or induction by energy, allowing them to act as omega RNAs and allowing the TnpB polypeptide system or complex to function. The invention may involve applying the chemical source or energy so that the omega RNA functions and the TnpB polypeptide system or complex functions; and optionally further determining that expression of the genomic locus is altered.
There are several different designs of this chemically inducible system: 1. an ABI-PYL-based system inducible by abscisic acid (ABA) (see, e.g., stke.sciencemag.org/cgi/content/abstract/sigtrans;4/164/rs2); 2. an FKBP-FRB-based system inducible by rapamycin (or related chemicals based on rapamycin) (see, e.g., nature.com/nmeth/journal/v2/n6/full/nmeth763.html); 3. a GID1-GAI-based system inducible by gibberellin (GA) (see, e.g., nature.com/nchembio/journal/v8/n5/full/nchembio.922.html).
The chemically inducible system may be an estrogen receptor (ER) based system inducible by 4-hydroxytamoxifen (4OHT) (see, e.g., pnas.org/content/104/3/1027.abstract). Upon binding of 4-hydroxytamoxifen, a mutated ligand-binding domain of the estrogen receptor (called ERT2) translocates into the nucleus of the cell. In other embodiments of the invention, any naturally occurring or engineered derivative of any nuclear receptor, thyroid hormone receptor, retinoic acid receptor, estrogen receptor, estrogen-related receptor, glucocorticoid receptor, progesterone receptor, or androgen receptor can be used in inducible systems analogous to the ER-based inducible system.
Another inducible system is based on a design using Transient Receptor Potential (TRP) ion channels that are inducible by energy, heat, or radio waves (see, e.g., sciencemag.org/content/336/6081/604). These TRP family proteins respond to different stimuli, including light and heat. When such a protein is activated by light or heat, the ion channel opens and allows ions (such as calcium) to enter across the plasma membrane. These incoming ions bind to intracellular ion-interacting partners linked to a polypeptide, including the nucleic acid component and other components of the TnpB polypeptide/ωRNA molecule complex or system, and the binding induces a change in the subcellular localization of the polypeptide, resulting in the whole polypeptide entering the nucleus of the cell. Once inside the nucleus, the nucleic acid component protein and the other components of the TnpB polypeptide/omega RNA molecule complex will be active and will modulate target gene expression in the cell.
While photoactivation may be an advantageous embodiment, it may sometimes be particularly disadvantageous for in vivo applications where light may not penetrate the skin or other organs. In this case, other energy activation methods with similar effects are considered, in particular electric field energy and/or ultrasound.
Preferably, the electric field energy is applied as described in the art using one or more electric pulses of about 1 volt/cm to about 10 kilovolts/cm under in vivo conditions. Instead of or in addition to the pulses, the electric field may be delivered in a continuous manner. The electrical pulse may be applied for between 1 microsecond and 500 milliseconds, preferably between 1 microsecond and 100 milliseconds. The electric field may be applied continuously or in pulses for about 5 minutes.
As used herein, "electric field energy" is the electrical energy to which a cell is exposed. Under in vivo conditions, the electric field preferably has an intensity of about 1 volt/cm to about 10 kilovolts/cm or more (see WO 97/49450).
As used herein, the term "electric field" includes one or more pulses at variable capacitance and voltage, and includes exponential and/or square wave and/or modulated wave and/or modulated square wave forms. References to electric fields and electricity should be taken to include reference to the presence of a potential difference in the environment of a cell. Such an environment may be established by static electricity, alternating current (AC), direct current (DC), etc., as is known in the art. The electric field may be uniform, non-uniform, or otherwise, and may vary in strength and/or direction in a time-dependent manner.
Single or multiple applications of the electric field and single or multiple applications of ultrasound in any order and in any combination are also possible. Ultrasound and/or electric fields may be delivered as single or multiple sequential applications or as pulses (pulsed delivery).
Electroporation has been used in both in vitro and in vivo procedures to introduce foreign material into living cells. In in vitro applications, a sample of live cells is first mixed with the agent of interest and placed between electrodes (such as parallel plates). The electrodes then apply an electric field to the cell/implant mixture. Examples of systems for performing in vitro electroporation include the Electro Cell Manipulator ECM product and the Electro Square Porator T820, both manufactured by the BTX Division of Genetronics, Inc. (see U.S. Pat. No. 5,869,326).
Known electroporation techniques (both in vitro and in vivo) work by applying brief high-voltage pulses to electrodes positioned around the treatment region. The electric field generated between the electrodes causes the cell membranes to become temporarily porous, whereupon molecules of the agent of interest enter the cells. In known electroporation applications, this electric field comprises a single square wave pulse of about 1000 V/cm lasting about 100 μs. Such a pulse may be generated, for example, in known applications of the Electro Square Porator T820.
Under in vitro conditions, the electric field preferably has a strength of about 1V/cm to about 10 kV/cm. Thus, the electric field may have a strength of 1V/cm, 2V/cm, 3V/cm, 4V/cm, 5V/cm, 6V/cm, 7V/cm, 8V/cm, 9V/cm, 10V/cm, 20V/cm, 50V/cm, 100V/cm, 200V/cm, 300V/cm, 400V/cm, 500V/cm, 600V/cm, 700V/cm, 800V/cm, 900V/cm, 1kV/cm, 2kV/cm, 5kV/cm, 10kV/cm, 20kV/cm, 50kV/cm or more. More preferably from about 0.5kV/cm to about 4.0kV/cm under in vitro conditions. Under in vivo conditions, the electric field preferably has a strength of about 1V/cm to about 10 kV/cm. However, in the case where the number of pulses delivered to the target site increases, the electric field strength may decrease. Thus, it is envisaged to pulse the electric field at a lower field strength.
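As a simple illustration of the field strengths recited above, the nominal field between two parallel plate electrodes can be approximated as the applied voltage divided by the electrode gap, and compared against a chosen range. This is a sketch under the assumption of a uniform field; the function names and the example values (500 V across a 0.5 cm gap) are hypothetical, while the default limits reflect the range of about 1 V/cm to about 10 kV/cm stated in the text.

```python
def field_strength_v_per_cm(voltage_v: float, gap_cm: float) -> float:
    """Nominal field strength (V/cm) between parallel plates.

    Assumes a uniform field, i.e. E = V / d.
    """
    if gap_cm <= 0:
        raise ValueError("electrode gap must be positive")
    return voltage_v / gap_cm


def within_range(field_v_per_cm: float,
                 low_v_per_cm: float = 1.0,
                 high_v_per_cm: float = 10_000.0) -> bool:
    """Check a field strength against a range (default 1 V/cm to 10 kV/cm)."""
    return low_v_per_cm <= field_v_per_cm <= high_v_per_cm


if __name__ == "__main__":
    # Hypothetical setup: 500 V applied across a 0.5 cm electrode gap.
    field = field_strength_v_per_cm(500.0, 0.5)
    print(f"{field:.0f} V/cm, within 1 V/cm to 10 kV/cm: {within_range(field)}")
    # Narrower in vitro range of about 0.5 kV/cm to about 4.0 kV/cm.
    print(within_range(field, 500.0, 4_000.0))
```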
Preferably, the electric field is applied in the form of a plurality of pulses, such as double pulses having the same intensity and capacitance or sequential pulses having different intensities and/or capacitances. As used herein, the term "pulse" includes one or more pulses at variable capacitance and voltage, and includes exponential and/or square wave and/or modulated wave/square wave forms.
Preferably, the electrical pulse is delivered as a waveform selected from the group consisting of an exponential waveform, a square waveform, a modulated waveform, and a modulated square waveform.
The preferred embodiment uses low voltage direct current. The applicant thus discloses the use of an electric field applied to cells, tissue or tissue mass with a field strength of between 1V/cm and 20V/cm for a period of 100 milliseconds or more, preferably 15 minutes or more.
Ultrasound is advantageously applied at a power level of about 0.05 W/cm2 to about 100 W/cm2. Diagnostic or therapeutic ultrasound or a combination thereof may be used.
As used herein, the term "ultrasound" refers to a form of energy consisting of mechanical vibrations whose frequencies are so high that they lie above the range of human hearing. The lower frequency limit of the ultrasound spectrum may generally be taken as about 20 kHz. Most diagnostic applications of ultrasound employ frequencies in the range of 1 to 15 MHz (from Ultrasonics in Clinical Diagnosis, P.N.T. Wells, ed., 2nd edition, publ. Churchill Livingstone [Edinburgh, London & N.Y., 1977]).
Ultrasound has been used for diagnostic and therapeutic applications. When used as a diagnostic tool ("diagnostic ultrasound"), ultrasound is typically used in an energy density range of up to about 100 mW/cm2 (FDA recommendation), but energy densities of up to 750 mW/cm2 have also been used. In physiotherapy, ultrasound is commonly used as an energy source in a range of up to about 3 to 4 W/cm2 (WHO recommendation). In other therapeutic applications, higher intensity ultrasound, such as HIFU at 100 W/cm2 to 1 kW/cm2 (or even higher), may be employed for short periods of time. The term "ultrasound" as used in this specification is intended to encompass diagnostic ultrasound, therapeutic ultrasound and focused ultrasound.
Focused ultrasound (FUS) allows the delivery of thermal energy without the use of invasive probes (see Morocz et al., 1998, Journal of Magnetic Resonance Imaging, issue 1, pages 136-142). Another form of focused ultrasound is high intensity focused ultrasound (HIFU), reviewed by Moussatov et al. in Ultrasonics (1998), volume 36, issue 8, pages 893-900, and by TranHuuHue et al. in Acustica (1997), volume 83, issue 6, pages 1103-1106.
Preferably, a combination of diagnostic ultrasound and therapeutic ultrasound is employed. However, this combination is not intended to be limiting, and one skilled in the art will appreciate that any number of combinations of ultrasound may be used. In addition, the energy density, ultrasonic frequency, and exposure time period may vary.
Preferably, the power density exposed to the ultrasonic energy source is from about 0.05 to about 100Wcm-2. Even more preferably, the power density exposed to the ultrasonic energy source is from about 1 to about 15Wcm-2.
Preferably, the frequency of exposure to the ultrasonic energy source is about 0.015 to about 10.0MHz. More preferably, the frequency of exposure to the ultrasonic energy source is about 0.02 to about 5.0MHz or about 6.0MHz. Most preferably, ultrasound is applied at a frequency of 3 MHz.
Preferably, the exposure is for a period of time of about 10 milliseconds to about 60 minutes. Preferably, the exposure is for a period of time of about 1 second to about 5 minutes. More preferably, the ultrasound is applied for about 2 minutes. However, depending on the particular target cell to be destroyed, the exposure may last for a longer duration, for example for 15 minutes.
Advantageously, the target tissue is exposed to a source of ultrasonic energy having an acoustic power density of about 0.05Wcm-2 to about 10Wcm-2 and a frequency in the range of about 0.015 to about 10MHz (see WO 98/52609). However, alternatives are also possible, e.g. exposure to an ultrasonic energy source with an acoustic power density higher than 100Wcm-2, but for a shortened period of time, e.g. exposure to 1000Wcm-2, for a period of time in the millisecond range or less.
Preferably, the ultrasound is applied in the form of a plurality of pulses; thus, continuous waves and pulsed waves (pulsed ultrasound delivery) may be employed in any combination. For example, continuous wave ultrasound may be applied followed by pulsed wave ultrasound, or vice versa. This may be repeated any number of times in any order and combination. Pulsed wave ultrasound may be applied against a background of continuous wave ultrasound, and any number of pulses may be used in any number of groups.
Preferably, the ultrasound may comprise pulsed wave ultrasound. In highly preferred embodiments, ultrasound is applied as a continuous wave at a power density of 0.7Wcm-2 or 1.25 Wcm-2. If pulsed wave ultrasound is used, a higher power density may be employed.
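As a worked illustration of how the power densities and exposure times above combine, the delivered energy per unit area can be estimated as power density multiplied by exposure time and, for pulsed delivery, by the duty cycle. This is a simplified sketch; the function name and the `duty_cycle` parameter are assumptions for the example, and tissue attenuation is ignored.

```python
def energy_density_j_per_cm2(
    power_density_w_per_cm2: float,
    exposure_s: float,
    duty_cycle: float = 1.0,
) -> float:
    """Estimate delivered energy per unit area (J/cm2).

    duty_cycle is the fraction of time the transducer is on:
    1.0 for continuous wave, below 1.0 for pulsed wave delivery.
    """
    if power_density_w_per_cm2 < 0 or exposure_s < 0:
        raise ValueError("power density and exposure time must be non-negative")
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    return power_density_w_per_cm2 * exposure_s * duty_cycle
```

For instance, a continuous wave at 0.7 Wcm-2 applied for about 2 minutes corresponds to `energy_density_j_per_cm2(0.7, 120.0)`. A pulsed wave at a higher power density and a lower duty cycle can deliver a comparable energy, which is consistent with the statement that higher power densities may be employed with pulsed wave ultrasound.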
The use of ultrasound is advantageous because, like light, ultrasound can be focused precisely on the target. Furthermore, ultrasound is advantageous because, unlike light, ultrasound can be focused deeper into tissue. Ultrasound is therefore more suitable for whole tissue penetration (such as but not limited to liver lobes) or whole organ (such as but not limited to whole liver or whole muscle, such as the heart) therapy. Another important advantage is that ultrasound is a non-invasive stimulus for a variety of diagnostic and therapeutic applications. For example, ultrasound is well known in medical imaging techniques and in addition in orthopedics. Furthermore, instruments suitable for applying ultrasound to a subject vertebrate are widely available and their use is well known in the art.
In particular embodiments, the omega RNA molecules are modified by secondary structures to increase the specificity of the TnpB polypeptides and related systems, and the secondary structures can protect from exonuclease activity and allow for 5' addition to nucleic acid component sequences, also referred to herein as protected nucleic acid component molecules.
In one aspect, the invention provides for hybridizing a "protector RNA" to the sequence of a nucleic acid component molecule, wherein the "protector RNA" is an RNA strand complementary to the 3' end of the nucleic acid component molecule, thereby producing a partially double stranded nucleic acid component. In embodiments of the invention, protecting mismatched bases (i.e., bases of a nucleic acid component molecule that do not form part of a nucleic acid component sequence) with a perfectly complementary protector sequence reduces the likelihood that target DNA will bind to mismatched base pairs at the 3' end. In certain embodiments of the invention, additional sequences comprising an extended length may also be present within the nucleic acid component molecule such that the nucleic acid component comprises a protector sequence within the nucleic acid component molecule. This "protector sequence" ensures that the nucleic acid component molecule comprises a "protected sequence" in addition to the "exposed sequence" (comprising the portion of the nucleic acid component sequence that hybridizes to the target sequence). In particular embodiments, the nucleic acid component molecule is modified by the presence of the protector nucleic acid component to comprise a secondary structure, such as a hairpin. Advantageously, there are three or four to thirty or more (e.g., about 10 or more) consecutive base pairs that have complementarity to the protected sequence, the nucleic acid component sequence, or both. Advantageously, the protected portion does not interfere with the thermodynamics of the TnpB polypeptide and related systems in interacting with its target. By providing such an extension comprising a partially double stranded nucleic acid component molecule, the nucleic acid component molecule is considered protected and results in improved specific binding of the TnpB polypeptide/nucleic acid component molecule complex while maintaining specific activity.
In a particular embodiment, a truncated omega RNA component (tru-nucleic acid component) is used, i.e., a nucleic acid component molecule comprising a nucleic acid component sequence that is truncated in length relative to the canonical nucleic acid component sequence length. Such nucleic acid component molecules may allow a catalytically active TnpB polypeptide to bind to its target without cleaving the target DNA, as described in Nowak et al. (Nucleic Acids Res (2016) 44 (20): 9555-9564). In particular embodiments, a truncated nucleic acid component is used that allows binding of the target but retains only the nicking enzyme activity of the TnpB polypeptide.
In one embodiment, conjugation of triantennary N-acetylgalactosamine (GalNAc) to the oligonucleotide component can be used to improve delivery, e.g., to selected cell types such as hepatocytes (see International patent publication No. WO 2014/118272; Nair, JK et al., 2014, Journal of the American Chemical Society 136 (49), 16958-16961). This is considered to be a sugar-based particle, and further detailed information about other particle delivery systems and/or formulations is provided herein. Thus, GalNAc may be considered to be a particle in the sense of the other particles described herein, such that general uses and other considerations, such as delivery of the particle, also apply to GalNAc particles. For example, a solution phase conjugation strategy can be used to attach a triantennary GalNAc cluster (molecular weight of about 2000) activated as a PFP (pentafluorophenyl) ester to a 5'-hexylamino modified oligonucleotide (5'-HA ASO, molecular weight of about 8000 Da; see Bioconjugate Chem., 2015, 26 (8), pages 1451-1455). Similarly, poly(acrylate) polymers for in vivo nucleic acid delivery have been described (see WO2013158141, incorporated herein by reference). In a further alternative embodiment, premixing of the TnpB polypeptide nanoparticles (or protein complexes) with naturally occurring serum proteins may be used to improve delivery (Akinc A. et al., 2010, Molecular Therapy, volume 18, issue 7, pages 1357-1364).
Screening techniques may be used to identify delivery enhancers, for example, by screening chemical libraries (Gilleron J. Et al, 2015,Nucl.Acids Res.43 (16): 7984-8001). Methods for assessing the efficiency of delivery vehicles, such as lipid nanoparticles, have also been described, which can be used to identify effective delivery vehicles for components (see Sahay g. Et al, 2013,Nature Biotechnology 31,653-658).
Target adjacent motifs
The TnpB system disclosed herein can recognize a Target Adjacent Motif (TAM) in order to recognize and bind to a target sequence on a target polynucleotide. In one embodiment, the nucleic acid-guided nuclease and related compositions have no TAM requirement. The exact sequence and length requirements of the TAM will vary depending on the nucleic acid-guided nuclease used. In some examples, a TAM is typically a 2-5 base pair sequence adjacent to the protospacer (i.e., the target sequence). In one exemplary embodiment, the TAM is 3' adjacent to the target sequence of the target polynucleotide. In another exemplary embodiment, the TAM is 5' adjacent to the target sequence of the target polynucleotide.
In one embodiment, the cleavage site is remote from the TAM, e.g., cleavage occurs after the nth nucleotide on the non-target strand and after the nucleotide on the target strand. In one embodiment, the cleavage site occurs after an identified nucleotide on the non-target strand (counted from TAM) and after a further identified nucleotide on the target strand (counted from TAM). In one embodiment, the vector encodes a nucleic acid targeting effector protein, which may be mutated relative to the corresponding wild-type enzyme such that the mutated nucleic acid targeting effector protein lacks the ability to cleave one or both DNA and RNA strands of a target polynucleotide containing a target sequence.
In one exemplary embodiment, the TAM sequence is TCAG. In another exemplary embodiment, the TAM sequence is TCAA. TAM recognition and specificity can be identified, for example, using the methods disclosed in the examples section below.
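For illustration, candidate target sequences flanked by one of the exemplary TAM sequences can be located by a simple scan of both strands. The sketch below assumes the embodiment in which the TAM is 5' adjacent to the target sequence; the 20-nucleotide target length, the `Site` record and the function names are hypothetical choices made for the example rather than requirements of the system.

```python
from typing import List, NamedTuple, Sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


class Site(NamedTuple):
    strand: str      # "+" for the given strand, "-" for its reverse complement
    tam_start: int   # 0-based index of the TAM on the scanned strand
    tam: str
    target: str      # sequence immediately 3' of the TAM on the scanned strand


def find_tam_sites(
    seq: str,
    tams: Sequence[str] = ("TCAG", "TCAA"),
    target_len: int = 20,  # assumed length, for illustration only
) -> List[Site]:
    """List every TAM occurrence followed by a full-length target."""
    seq = seq.upper()
    sites: List[Site] = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s)):
            for tam in tams:
                start = i + len(tam)
                end = start + target_len
                if s.startswith(tam, i) and end <= len(s):
                    sites.append(Site(strand, i, tam, s[start:end]))
    return sites
```

A site is reported only when a complete target of the requested length follows the TAM, so motifs too close to the end of the input are skipped.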
HDR donor templates
In one embodiment, the compositions and systems herein may also comprise one or more HDR donor templates for homology directed repair-mediated editing. In some cases, the HDR donor template may comprise one or more polynucleotides. In some cases, the HDR donor template may comprise a coding sequence of one or more polynucleotides. The HDR donor template may be a DNA template.
HDR donor templates can be used to edit target polynucleotides. In some cases, the donor polynucleotide comprises one or more mutations to be introduced into the target polynucleotide. Examples of such mutations include substitutions, deletions, insertions, or combinations thereof. Mutations can result in a shift of the open reading frame on the target polynucleotide. In some cases, the HDR donor template alters a stop codon in the target polynucleotide. For example, the HDR donor template may correct a premature stop codon. Correction can be achieved by deleting the stop codon or by introducing one or more mutations into the stop codon. In other exemplary embodiments, the HDR donor template addresses loss-of-function mutations, deletions, or translocations that may occur, for example, in certain disease contexts, by inserting or restoring a functional copy of a gene, or a functional fragment thereof, or a functional regulatory sequence or a functional fragment of a regulatory sequence. A functional fragment refers to less than a complete copy of a gene that nevertheless provides sufficient nucleotide sequence to restore the functionality of a wild-type gene or non-coding regulatory sequence (e.g., a sequence encoding a long non-coding RNA). In certain exemplary embodiments, the systems disclosed herein can be used to replace a single allele of a defective gene or a defective fragment thereof. In another exemplary embodiment, the systems disclosed herein can be used to replace both alleles of a defective gene or defective gene fragment. A "defective gene" or "defective gene fragment" is a gene or gene portion that, when expressed, fails to produce a functional protein or non-coding RNA that has the function of the corresponding wild-type gene. In certain exemplary embodiments, these defective genes may be associated with one or more disease phenotypes.
In certain exemplary embodiments, the defective gene or gene fragment is not replaced; instead, the systems described herein are used to insert an HDR donor template encoding a gene or gene fragment that compensates for or overrides defective gene expression, such that the cellular phenotype associated with defective gene expression is eliminated or altered to a different or desired cellular phenotype.
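As an illustrative aid to the stop codon correction described above, a coding sequence can be checked for in-frame stop codons that occur before its final codon. This is a minimal sketch that assumes the input is an in-frame DNA coding sequence using the standard stop codons; the function name is hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def premature_stop_codons(cds: str) -> List[int]:
    """Return 0-based codon indices of in-frame stops before the last codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # The final codon is expected to be the natural stop, so it is excluded.
    return [idx for idx, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
```

Comparing the result before and after applying a candidate donor sequence indicates whether the intended edit removes the premature stop, for example by changing a TAG codon to CAG.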
In embodiments of the invention, HDR donor templates may include, but are not limited to, genes or gene fragments encoding proteins or RNA transcripts to be expressed, regulatory elements, repair templates, and the like. According to the invention, the HDR donor template may comprise left and right end sequence elements that function together with a transposition component mediating insertion.
In some cases, the HDR donor template manipulates splice sites on the target polynucleotide. In some examples, the HDR donor template disrupts splice sites. Disruption may be achieved by inserting the polynucleotide into a splice site and/or introducing one or more mutations into a splice site. In certain examples, the HDR donor template may restore splice sites. For example, a polynucleotide may comprise a splice site sequence.
The HDR donor template to be inserted may have a size of 10 base pairs or nucleotides to 50 kb in length, for example 50 to 40k, 100 to 30k, 100 to 10000, 100 to 300, 200 to 400, 300 to 500, 400 to 600, 500 to 700, 600 to 800, 700 to 900, 800 to 1000, 900 to 1100, 1000 to 1200, 1100 to 1300, 1200 to 1400, 1300 to 1500, 1400 to 1600, 1500 to 1700, 1600 to 1800, 1700 to 1900, or 1800 to 2000 base pairs (bp) or nucleotides in length.
System and complex
In one aspect, the present disclosure provides a nucleic acid targeting system. Such systems can be used to target, modify, and otherwise manipulate target polynucleotides. In one embodiment, the system comprises a TnpB polypeptide and one or more omega RNAs. The TnpB polypeptide may have nuclease activity, e.g., can cleave DNA. In some embodiments, the TnpB polypeptide may have or may be engineered to have nicking enzyme activity, e.g., to produce a single strand break on a double stranded nucleic acid (such as dsDNA or dsRNA).
In some examples, two or more components in the systems herein may form a complex. For example, the components are separate molecules but interact directly or indirectly with each other. In certain examples, two or more components of the systems herein may be included in a fusion protein.
As used herein, "target sequence" refers to a sequence to which an omega RNA is designed to have complementarity, wherein hybridization between the target sequence and the omega RNA promotes the formation of a polynucleotide targeting complex. Complete complementarity is not necessarily required, so long as there is sufficient complementarity to cause hybridization and promote formation of the TnpB targeting complex. The target sequence may comprise a DNA polynucleotide. In one embodiment, the target sequence is located in the nucleus or cytoplasm of the cell. In one embodiment, the target sequence can be located within an organelle (e.g., a mitochondrion or chloroplast) of a eukaryotic cell. Sequences or templates that can be used for recombination into a targeted locus comprising a target sequence are referred to as "editing templates" or "editing sequences". In aspects of the invention, the exogenous template may be referred to as an editing template. In one aspect, the recombination is homologous recombination.
In one embodiment, formation of the TnpB targeting complex (comprising omega RNA hybridized to the target sequence and complexed with one or more nucleic acid targeting effector proteins) results in cleavage of one or both nucleic acid strands in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs from the target sequence). In one embodiment, one or more vectors driving expression of one or more elements of the TnpB system are introduced into a host cell such that expression of the elements of the TnpB system directs formation of the TnpB complex at one or more target sites. For example, the TnpB polypeptide and omega RNA can each be operably linked to separate regulatory elements on separate vectors. Alternatively, two or more of the elements expressed from the same or different regulatory elements may be combined in a single vector, with one or more additional vectors providing any components of the TnpB system that are not contained in the first vector. The TnpB system elements combined in a single vector may be arranged in any suitable orientation, such as one element being located 5' (upstream) or 3' (downstream) with respect to a second element. The coding sequence of one element may be located on the same or opposite strand of the coding sequence of a second element, and oriented in the same or opposite direction. In one embodiment, a single promoter drives expression of a transcript encoding TnpB and the omega RNAs embedded in one or more intron sequences (e.g., each in a different intron, two or more in at least one intron, or all in a single intron). In one embodiment, the TnpB polypeptide and the omega RNA are operably linked to and expressed from the same promoter.
The present disclosure encompasses computational methods and algorithms for predicting novel TnpB polypeptides and for identifying their components and the novel TnpB systems in which they occur. In some examples, computational analysis of novel TnpB polypeptide loci to identify candidates can be performed by searching for additional homologs in metagenomic databases.
In one aspect, all predicted protein-coding genes are compared against TnpB polypeptide-specific profiles and annotated according to the NCBI Conserved Domain Database (CDD), a protein annotation resource consisting of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins. These models are available as position-specific scoring matrices (PSSMs) for rapid identification of conserved domains in protein sequences via RPS-BLAST. CDD content includes NCBI-curated domains, which use 3D structural information to explicitly define domain boundaries and provide insight into sequence/structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM).
In another aspect, case-by-case analysis is performed using PSI-BLAST (Position-Specific Iterative Basic Local Alignment Search Tool). PSI-BLAST derives a position-specific scoring matrix (PSSM), or profile, from a multiple sequence alignment of the sequences detected above a given score threshold using protein-protein BLAST. This PSSM is used to search the database further for new matches and is updated in subsequent iterations with the newly detected sequences. Thus, PSI-BLAST provides a means of detecting distant relationships between proteins.
In another aspect, case-by-case analysis is performed using HHpred, a method for sequence database searching and structure prediction that is as easy to use as BLAST or PSI-BLAST, while being more sensitive in finding remote homologs. In fact, the sensitivity of HHpred is competitive with the most powerful servers currently available for structure prediction. HHpred is the first server based on the pairwise comparison of profile hidden Markov models (HMMs). Whereas most conventional sequence search methods search sequence databases, such as UniProt or NR, HHpred searches alignment databases, such as Pfam or SMART. This greatly simplifies the list of hits to a number of sequence families rather than a clutter of single sequences. All major publicly available profile and alignment databases are available through HHpred. HHpred accepts a single query sequence or a multiple alignment as input. Within only a few minutes, it returns the search results in an easy-to-read format similar to that of PSI-BLAST. The search options include local or global alignment and scoring of secondary structure similarity. HHpred can generate pairwise query-template sequence alignments, merged query-template multiple alignments (e.g., for transitive searches), and 3D structural models calculated by the MODELLER software from the HHpred alignments.
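To make the notion of a position-specific scoring matrix concrete, the following simplified sketch derives log-odds scores from a small multiple sequence alignment and uses them to score a candidate sequence. It is not the PSI-BLAST implementation: it assumes a uniform background frequency and a flat pseudocount, whereas PSI-BLAST applies sequence weighting and substitution-matrix-based pseudocounts. All names in the example are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment: Sequence[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build a log2-odds PSSM from equal-length aligned protein sequences."""
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # simplifying assumption: uniform background
    pssm: List[Dict[str, float]] = []
    for col in range(length):
        # Gap characters and non-standard residues are ignored in the counts.
        counts = Counter(s[col] for s in alignment if s[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm: List[Dict[str, float]], seq: str) -> float:
    """Sum per-position scores for an ungapped sequence aligned to the PSSM."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, seq))
```

In an iterative search, sequences scoring above a chosen threshold would be added to the alignment and the matrix rebuilt, which is the step that lets profile methods reach more distant homologs than a single-sequence query.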
Special systems
The TnpB polypeptide may be in a dead form, e.g., without nuclease or nicking enzyme activity. In one embodiment, the system further comprises one or more functional domains, such as nucleotide deaminase, reverse transcriptase, non-LTR retrotransposons (and encoded proteins), polymerase, diversity generating elements (and encoded proteins), and integrase. In some examples, the system further comprises one or more donor polynucleotides. The donor polynucleotide may be inserted into the target polynucleotide by the system. The donor polynucleotide may be contained in or encoded by a nucleic acid template.
TnpB base editing system
The present disclosure also provides a base editing system. Generally, such systems can comprise a nucleobase deaminase (e.g., an adenosine deaminase or a cytidine deaminase) associated (e.g., fused) with a TnpB polypeptide. The TnpB polypeptide may be a catalytically inactive, or dead, TnpB polypeptide (dTnpB). In certain examples, the nucleobase deaminase is a mutant form of adenosine deaminase. Mutant forms of adenosine deaminase may have both adenosine deaminase and cytidine deaminase activity.
In some examples, the present disclosure provides an engineered, non-naturally occurring composition comprising: a dTnpB; a nucleobase deaminase that is associated with, or otherwise capable of forming a complex with, the dTnpB; and an omega RNA that is capable of forming a complex with the TnpB protein and directing site-specific binding at a target sequence at or adjacent to the single nucleotide or nucleotide base pair to be edited. In one aspect, a nucleotide deaminase or other editing enzyme flips a target base within DNA. See, e.g., Hong and Cheng, DNA Base Flipping: A General Mechanism for Writing, Reading, and Erasing DNA Modifications, Adv. Exp. Med. Biol., 2016;945:321-341, doi:10.1007/978-3-316-43624-1_14. Without being bound by theory, the TnpB-omega RNA complex bound to the target provides a more open pocket relative to, for example, CRISPR-Cas proteins (e.g., Cas9, Cas12), which advantageously gives the deaminase or other base editing enzyme greater access to the complex and to the flipped target nucleotide, reduces steric hindrance, and allows for enhanced specificity of the base editing system.
In one aspect, the base editing may target about 2 to 100 base pairs from the TAM terminus, or about 4 to 100, 5 to 100, 6 to 100, 7 to 100, 8 to 100, 9 to 100, 10 to 100, 11 to 100, 12 to 100, 13 to 100, 14 to 100, 15 to 100, 16 to 100, 17 to 100, 18 to 100, 19 to 100, 20 to 100, 25 to 100, 3 to 90, 3 to 80, 3 to 70, 3 to 60, 3 to 50, 3 to 40, 3 to 30, or about 3 to 30 base pairs from the TAM terminus. In one aspect, when the base editor is fused to TnpB, the linker length can be configured to allow more precise base editing at the desired location. For example, as detailed elsewhere herein, the linker length can be adjusted to facilitate base editing nearer to or farther from the TAM, and the linker can be configured to have increased or decreased rigidity and other properties to produce a desired configuration or presentation at the binding site. The more open configuration at the binding pocket of the TnpB complex may allow for greater flexibility in the configuration of the TnpB editing system and greater specificity in contacting the target site.
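The distance ranges above can be illustrated with a small helper that lists which occurrences of an editable base in a target sequence fall within a chosen window measured from the TAM. The sketch assumes that position 1 is the target nucleotide immediately adjacent to the TAM and that the string is written starting from that TAM-proximal end; the function name and defaults are hypothetical.

```python
from typing import List, Tuple


def editable_positions(
    target: str,
    base: str = "A",
    window: Tuple[int, int] = (3, 30),
) -> List[int]:
    """Return 1-based distances from the TAM at which `base` occurs in the window."""
    lo, hi = window
    if lo < 1 or hi < lo:
        raise ValueError("window must satisfy 1 <= lo <= hi")
    return [
        pos
        for pos, nt in enumerate(target.upper(), start=1)
        if nt == base and lo <= pos <= hi
    ]
```

Narrowing or shifting `window` mimics, in a purely bookkeeping sense, the effect attributed above to adjusting linker length; it does not model the structural behavior of any particular fusion.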
In one aspect, the present disclosure provides an engineered adenosine deaminase. The engineered adenosine deaminase may comprise one or more of the mutations described herein. In one embodiment, the engineered adenosine deaminase has cytidine deaminase activity. In certain examples, the engineered adenosine deaminase has both cytidine deaminase activity and adenosine deaminase activity. In some cases, modifications of the base editors herein can be used to target post-translational signaling or catalysis. In one embodiment, the compositions herein comprise a nucleotide sequence comprising a coding sequence for one or more components of a base editing system. The base editing system may comprise a deaminase (e.g., an adenosine deaminase or a cytidine deaminase) fused to a TnpB polypeptide or variant thereof. In some cases, the target polynucleotide is edited at one or more bases to introduce a G→A or C→T mutation.
In some cases, the adenosine deaminase is a double-stranded RNA-specific adenosine deaminase (ADAR). Examples of ADARs include those described in Yiannis A. Savva et al., The ADAR protein family, Genome Biol. 2012;13(12):252, which is incorporated by reference in its entirety. In some examples, the ADAR may be hADAR1. In certain examples, the ADAR can be hADAR2. The sequence of hADAR2 may be the sequence described under accession number AF525422.1.
In some cases, the deaminase may be a deaminase domain, such as the deaminase domain of an ADAR ("ADAR-D"). In one example, the deaminase may be the deaminase domain of hADAR2 ("hADAR2-D"), e.g., as described in Phelps KJ et al., Recognition of duplex RNA by the deaminase domain of the RNA editing enzyme ADAR2, Nucleic Acids Res. 2015 Jan;43:1123-32, which is incorporated herein by reference in its entirety. In a particular example, hADAR2-D has a sequence comprising amino acids 299-701 of hADAR2, such as amino acids 299-701 of the sequence under accession number AF525422.1.
In certain examples, the system comprises a mutant form of adenosine deaminase fused to dTnpB. Mutant forms of adenosine deaminase may have both adenosine deaminase and cytidine deaminase activity. In one embodiment, the adenosine deaminase may comprise the mutation E488Q (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins.
In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins.
In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In one embodiment, the adenosine deaminase may comprise one or more of the following mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T (based on amino acid sequence positions of hADAR2-D), and corresponding mutations in homologous ADAR proteins. In some examples, provided herein is a mutant adenosine deaminase, e.g., an adenosine deaminase comprising one or more of E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T, fused to a dead TnpB polypeptide or a TnpB polypeptide nickase. In some examples, provided herein is a mutant adenosine deaminase, e.g., an adenosine deaminase comprising E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E and S661T, fused to a dead TnpB polypeptide or a TnpB polypeptide nickase.
In some examples, provided herein is a mutant adenosine deaminase, e.g., an adenosine deaminase comprising E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T and S375N, fused to a dead TnpB polypeptide or a TnpB polypeptide nickase.
In one embodiment, the adenosine deaminase may be a tRNA-specific adenosine deaminase or a variant thereof. In one embodiment, the adenosine deaminase may comprise one or more of the mutations W23L, W23R, R26G, H36L, N37S, P48S, P48T, P48A, I49V, R51L, N72D, L84F, S97C, A106V, D108N, H123Y, G125A, A142N, S146C, D147Y, R152H, R152P, E155V, I156F, K157N, K161T, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise the mutation D108N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above.
In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above.
In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, R152P, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In one embodiment, the adenosine deaminase may comprise one or more of the mutations A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, R152P, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above.
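By way of illustration only, the point-mutation labels used above (wild-type residue, position, replacement residue, e.g. E488Q or D108N) can be parsed and applied to a protein sequence as in the following minimal Python sketch. The function names and the ten-residue toy sequence are hypothetical and are not taken from the sequence listing; positions are assumed to be 1-based relative to the sequence supplied.

```python
import re

# A label such as "E488Q": wild-type residue, 1-based position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label like 'E488Q' into ('E', 488, 'Q')."""
    match = MUTATION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_mutations(sequence, labels):
    """Return a copy of `sequence` carrying every mutation in `labels`.

    Raises ValueError if a position is out of range or if the residue
    found at that position is not the stated wild-type residue.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, mutant = parse_mutation(label)
        index = position - 1  # convert 1-based position to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type}, found {residues[index]}"
            )
        residues[index] = mutant
    return "".join(residues)


# Hypothetical toy sequence, used only to exercise the functions.
toy_sequence = "MADEVKLSTG"
toy_variant = apply_mutations(toy_sequence, ["D3N", "V5A"])
```

The wild-type check is a deliberate design choice: it catches numbering mismatches when a mutation defined on one reference protein (e.g., hADAR2-D or E. coli TadA) is transferred to a homologous protein whose positions differ.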
In some examples, the base editing system may comprise an intein-mediated trans-splicing system capable of delivering a base editor in vivo, such as a split-intein Cytidine Base Editor (CBE) or Adenine Base Editor (ABE) engineered to trans-splice. Examples of such base editing systems include those described in Colin K.W. Lim et al., Treatment of a Mouse Model of ALS by In Vivo Base Editing, Mol Ther. 2020 Jan 14, pii: S1525-0016(20)30011-3, doi: 10.1016/j.ymthe.2020.01.005; and Jonathan M. Levy et al., Cytosine and adenine base editing of the brain, liver, retina, heart and skeletal muscle of mice via adeno-associated viruses, Nature Biomedical Engineering, volume 4, pages 97-110 (2020), which are incorporated herein by reference in their entirety.
Examples of base editing systems include those described in the following: International Patent Publication Nos. WO 2019/071048 (e.g., paragraphs [0933]-[0938]), WO 2019/084063 (e.g., paragraphs [0173]-[0186], [0323]-[0475], [0893]-[1094]), WO 2019/126716 (e.g., paragraphs [0290]-[0425], [1077]-[1084]), WO 2019/126709 (e.g., paragraphs [0294]-[0453]), WO 2019/126762 (e.g., paragraphs [0309]-[0438]), WO 2019/126774 (e.g., paragraphs [0511]-[0670]); Cox DBT et al., RNA editing with CRISPR-Cas13, Science. 2017 Nov 24;358(6366):1019-1027; Abudayyeh OO et al., A cytosine deaminase for programmable single-base RNA editing, Science. 2019 Jul 26;365(6451):382-386; Gaudelli NM et al., Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage, Nature, volume 551, pages 464-471 (2017 Nov 23); Komor AC et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage, Nature. 2016 May 19;533(7603):420-4; Jordan L. Doman et al., Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors, Nat Biotechnol (2020), doi.org/10.1038/s41587-020-0414-6; and Richter MF et al., Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity, Nat Biotechnol (2020), doi.org/10.1038/s41587-020-0453-z, which are incorporated herein by reference in their entirety and may be adapted for use with TnpB polypeptides.
TnpB prime editing system
In one embodiment, the compositions and systems provided by the present disclosure may comprise TnpB or dTnpB, one or more nucleic acid components, and a reverse transcriptase. The system may be used to insert a donor polynucleotide into a target polynucleotide. In some examples, the composition or system comprises a catalytically inactive TnpB polypeptide, a reverse transcriptase associated with or otherwise capable of forming a complex with the TnpB polypeptide, and a nucleic acid component molecule capable of forming a complex with the TnpB polypeptide and directing site-specific binding of the complex to a target sequence of a target polynucleotide, the nucleic acid component molecule further comprising a donor template that serves as a template for insertion of the donor sequence into the target polynucleotide by the reverse transcriptase.
In some cases, the TnpB or dTnpB may be a nickase, such as a DNA nickase. The TnpB nickase may comprise one or more mutations. In some examples, the TnpB comprises a mutation corresponding to a mutation in a RuvC nuclease domain. In some examples, the TnpB is naturally catalytically inactive and comprises a fusion with a nuclease domain (e.g., an HNH or FokI domain).
The reverse transcriptase domain may be a reverse transcriptase or a fragment thereof. In certain aspects, the reverse transcriptase is a Human Immunodeficiency Virus (HIV) RT, an Avian Myeloblastosis Virus (AMV) RT, a Moloney Murine Leukemia Virus (M-MLV) RT, a group II intron-like RT, or a chimeric RT. In certain embodiments, the RT comprises a modified form of one of these RTs, such as an engineered variant of AMV RT, M-MLV RT, or HIV RT (see, e.g., Anzalone et al., Search-and-replace genome editing without double-strand breaks or donor DNA, Nature. 2019 Dec;576(7785):149-157).
In some examples, the compositions and systems can comprise a TnpB protein herein; a Reverse Transcriptase (RT) polypeptide linked to or otherwise capable of forming a complex with a TnpB protein; and an omega RNA molecule capable of forming a complex with a TnpB protein and comprising: omega RNA sequences capable of directing site-specific binding of the TnpB complex to the target sequence of the target polynucleotide; a 3' binding site region capable of binding to the cleaved upstream strand of the target polynucleotide; and an RT template sequence encoding an extension sequence, wherein the extension sequence comprises a variant region and a 3' homologous sequence capable of hybridizing to a downstream cleavage strand of the target polynucleotide.
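As a non-limiting illustration of the binding site region and RT template described above, the following Python sketch derives both RNA segments from a nicked DNA strand and a desired extension sequence. All names and sequences are hypothetical. The sketch assumes that the nicked strand is given 5' to 3', that the binding site pairs with the bases immediately upstream of the nick, and that the RT template lies 5' of the binding site within the RNA (because a polymerase extends the annealed 3' end while reading the template 3' to 5'). Where this extension sits relative to the rest of the ωRNA is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def to_rna(dna):
    """Write a DNA string in RNA letters (T becomes U)."""
    return dna.replace("T", "U")


def design_extension(nicked_strand, nick_index, extension_dna, binding_len):
    """Derive binding-site and RT-template RNA segments (all 5' to 3').

    nicked_strand : DNA strand that is nicked, 5' to 3'
    nick_index    : the nick lies between nick_index - 1 and nick_index
    extension_dna : DNA to be synthesized onto the nicked 3' end, i.e. the
                    variant region followed by 3' homology
    binding_len   : number of upstream bases the binding site pairs with
    """
    upstream = nicked_strand[:nick_index]
    if not 0 < binding_len <= len(upstream):
        raise ValueError("binding_len must fit inside the upstream strand")
    binding_site = to_rna(reverse_complement(upstream[-binding_len:]))
    rt_template = to_rna(reverse_complement(extension_dna))
    return {
        "binding_site": binding_site,
        "rt_template": rt_template,
        # RT template is placed 5' of the binding site in the RNA.
        "extension": rt_template + binding_site,
    }


# Hypothetical example: a single T-to-A change two bases after the nick.
design = design_extension(
    nicked_strand="GATTACAGGCTTAACCGT",
    nick_index=8,
    extension_dna="GCATAACCGT",
    binding_len=6,
)
```

In this toy case the binding site is the RNA reverse complement of the six bases preceding the nick, and reverse transcription of the RT template regenerates the requested extension sequence on the nicked strand.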
A variety of Reverse Transcriptases (RTs) may be used in alternative embodiments of the invention, including prokaryotic and eukaryotic RTs, provided that the RT functions within the host to produce the donor polynucleotide sequence from the RNA template. If desired, the nucleotide sequence of the native RT may be modified, for example using known codon optimization techniques, so as to optimize expression in the desired host. Reverse Transcriptase (RT) is an enzyme used to produce complementary DNA (cDNA) from an RNA template, a process known as reverse transcription. Reverse transcriptases are used by retroviruses to replicate their genomes, by retrotransposon mobile genetic elements to proliferate within the host genome, by eukaryotic cells to lengthen telomeres at the ends of their linear chromosomes, and by some non-retroviruses such as hepatitis B virus (a member of the Hepadnaviridae family, which are dsDNA-RT viruses). A retroviral RT has three sequential biochemical activities: RNA-dependent DNA polymerase activity, ribonuclease H activity, and DNA-dependent DNA polymerase activity. Collectively, these activities enable the enzyme to convert single-stranded RNA into double-stranded cDNA. In one embodiment, the RT domain of a reverse transcriptase is used in the present invention. The domain may comprise only RNA-dependent DNA polymerase activity. In some examples, the RT domain is non-mutagenic, i.e., it does not cause mutation of the donor polynucleotide (e.g., during the reverse transcription process). In some examples, the RT domain may be a non-retroviral RT, such as a viral RT or a human endogenous RT. In some examples, the RT domain may be a retron RT or a diversity-generating retroelement (DGR) RT. In some examples, the RT may be less mutagenic than the corresponding wild-type RT. In one embodiment, the RT herein is non-mutagenic.
The reverse transcriptase may be fused to the C-terminus of TnpB. Alternatively or additionally, a reverse transcriptase may be fused to the N-terminus of TnpB. The fusion may be via a linker and/or an adapter protein. In some examples, the reverse transcriptase may be an M-MLV reverse transcriptase or a variant thereof. The M-MLV reverse transcriptase variant may comprise one or more mutations. For example, the M-MLV reverse transcriptase may comprise D200N, L603W and T330P. In another example, the M-MLV reverse transcriptase may comprise D200N, L603W, T330P, T306K and W313F. In a specific example, the fusion of a TnpB polypeptide to a reverse transcriptase is a TnpB polypeptide having a mutation, fused to an M-MLV reverse transcriptase (D200N+L603W+T330P+T306K+W313F).
The small size of the TnpB polypeptides herein may allow for easier packaging and delivery of prime editing systems, e.g., with viral vectors such as AAV or lentiviral vectors. See, for example, Lino et al., Drug Deliv. 2018;25(1):1234-1257, doi: 10.1080/10717544.2018.1474964, which is incorporated herein by reference, in particular Table 1 and the CRISPR delivery methods described therein.
The TnpB polypeptide can create a single-strand break (nick) at the target site on the target DNA to expose a 3'-hydroxyl, thereby priming reverse transcription of the edit-encoding extension on the nucleic acid component molecule directly into the target site. These steps can result in a branched intermediate with two redundant single-stranded DNA flaps: a 5' flap containing the unedited DNA sequence and a 3' flap containing the edited sequence copied from the nucleic acid component. The 5' flap may be removed by a structure-specific endonuclease (e.g., FEN1), which excises 5' flaps generated during lagging-strand DNA synthesis and long-patch base excision repair. The unedited DNA strand may be nicked to bias DNA repair toward replacement of the unedited strand. Examples of prime editing systems and methods include those described in Anzalone AV et al., Search-and-replace genome editing without double-strand breaks or donor DNA, Nature. 2019 Oct 21, doi: 10.1038/s41586-019-1711-4, which is incorporated herein by reference in its entirety.
TnpB (e.g., in the form of a nickase) can be used to prime edit a single nucleotide on the target DNA. Alternatively or additionally, the TnpB polypeptide can be used to prime edit at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 nucleotides on a target DNA.
In yet another embodiment, prime editing is first used to create a longer 3' region (e.g., 20 nucleotides). Examples of prime editing systems and methods include those described in Anzalone AV et al., Search-and-replace genome editing without double-strand breaks or donor DNA, Nature. 2019 Oct 21, doi: 10.1038/s41586-019-1711-4, which is incorporated herein by reference in its entirety. In such cases, the system comprises a TnpB protein having nickase activity, a reverse transcriptase domain and a DNA polymerase, and an ωRNA molecule comprising a binding sequence and an editing sequence capable of hybridizing to the target polynucleotide. The resulting region may be further extended on a DNA template as described herein. The latter may allow the generation of target-independent sequences compatible with universal donor sequences.
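The nick, flap synthesis, and flap resolution steps described above can be traced on strings, as in the following minimal Python sketch. It is a simplified model under stated assumptions, not a description of the actual enzymology: a single strand is tracked 5' to 3', the 3' flap is the reverse complement of a hypothetical RT template, the final `homology_len` bases of the 3' flap are assumed to anneal at their first exact match downstream of the nick, and everything between the nick and the end of that match is treated as the displaced 5' flap. All names and sequences are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def simulate_prime_edit(strand, nick_index, rt_template_rna, homology_len):
    """Trace nick -> 3' flap synthesis -> 5' flap removal on one strand."""
    upstream = strand[:nick_index]        # ends in the exposed 3'-hydroxyl
    downstream = strand[nick_index:]      # begins at the 5' side of the nick

    # Reverse transcription copies the RNA template into DNA (the 3' flap).
    three_prime_flap = reverse_complement(rt_template_rna.replace("U", "T"))

    # The 3' end of the new flap anneals to matching downstream sequence.
    homology = three_prime_flap[-homology_len:]
    match_start = downstream.find(homology)
    if match_start == -1:
        raise ValueError("3' homology not found downstream of the nick")

    # Unedited sequence up to the end of the match is the displaced 5' flap.
    five_prime_flap = downstream[: match_start + homology_len]
    remainder = downstream[match_start + homology_len:]

    return {
        "three_prime_flap": three_prime_flap,
        "five_prime_flap": five_prime_flap,
        "edited_strand": upstream + three_prime_flap + remainder,
    }


# Hypothetical example: one base change two positions after the nick.
result = simulate_prime_edit(
    strand="GATTACAGGCTTAACCGTTTGA",
    nick_index=8,
    rt_template_rna="ACGGUUAUGC",
    homology_len=4,
)
```

Here the edited strand equals the original strand with the 5' flap replaced by the 3' flap; the complementary strand, which in practice is resolved by DNA repair, is not modeled.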
On the target polynucleotide, the TnpB protein is capable of producing a first cleavage in the target sequence and a second cleavage outside the target sequence. In some variants, a second TnpB-mediated cleavage may be performed near the target site, which may enable more efficient invasion of the extended DNA.
In some examples, the compositions and systems comprise: a TnpB protein herein; a Reverse Transcriptase (RT) polypeptide linked to or otherwise capable of forming a complex with the TnpB protein; a first ωRNA molecule capable of forming a first TnpB-reverse transcriptase complex with the TnpB protein and comprising: an ωRNA sequence capable of directing site-specific binding of the first TnpB-reverse transcriptase complex to a first target sequence of a target polynucleotide; a first binding site region capable of binding to a cleaved strand or a nicked strand of the target polynucleotide; and an RT template sequence encoding a first extension sequence; and a second ωRNA molecule capable of forming a second TnpB-reverse transcriptase complex with the TnpB protein and comprising: an ωRNA sequence capable of directing site-specific binding of the second TnpB-reverse transcriptase complex to a second target sequence of the target polynucleotide; a second binding site region capable of binding to a cleaved strand or a nicked strand of the target polynucleotide; and an RT template sequence encoding a second extension sequence.
In some cases, the compositions and systems may further comprise: a donor template; a third ωRNA sequence capable of forming a TnpB-reverse transcriptase-ωRNA complex with a TnpB protein, and comprising: an ωRNA sequence capable of directing site-specific binding to a first target sequence on the donor template; a third binding region capable of binding to a nicked or cleaved strand of the donor template; and an RT template encoding a third extension region complementary to the first extension region generated on the target polynucleotide; and a fourth ωRNA sequence capable of forming a TnpB-reverse transcriptase complex with a TnpB protein and comprising: an ωRNA sequence capable of directing site-specific binding to a second target sequence on the donor template; a fourth binding region capable of binding to a nicked or cleaved strand of the donor template; and an RT template encoding a fourth extension region complementary to the second extension region generated on the target polynucleotide.
Advantageously, the more open configuration of the TnpB complex relative to the CRISPR-Cas enzyme may allow for improved accessibility not only of sequences complementary to the target site, but also of RT template sequences and reverse transcriptase, which may increase editing efficiency at the target site. In addition, the smaller size of TnpB allows for improved packaging efficiency of reverse transcriptase delivery, accessibility of delivery of additional omega RNA sequences, all of which can further improve editing efficiency as described in further detail below.
In some cases, the compositions and systems may further comprise a site-specific recombinase, wherein the first extension region and the second extension region are complementary to each other and introduce a serine integrase recombination site; and a donor molecule comprising a donor sequence for insertion into the target polynucleotide and a recombination site complementary to the serine integrase recombination site.
In some examples, the compositions and systems may also comprise a recombinase. The recombinase is linked to or otherwise capable of forming a complex with the TnpB protein. In certain embodiments, the complex is capable of inserting a recombination site into a DNA locus of interest through reverse transcriptase extension of an RT template that encodes the recombination site and is carried on a 3' extension of an ωRNA sequence. In certain embodiments, donor templates comprising compatible recombination sites are provided that can unidirectionally recombine with the inserted recombination sites when a recombinase specific for the recombination sites is also provided. In certain embodiments, the donor template is a plasmid comprising a complementary recombination site and any sequence for insertion at a DNA locus of interest. In certain embodiments, the recombinase is linked to or capable of forming a complex with the TnpB enzyme such that all enzyme proteins are brought into contact at the locus of interest. In certain embodiments, the recombinase is codon optimized for eukaryotic cells (described further herein). In certain embodiments, the recombinase comprises an NLS (described further herein). In certain embodiments, the recombinase is provided as a separate protein. The separately provided recombinase may form a dimer and bind to the donor template recombination site. Since the inserted compatible recombination site is also recognized by the recombinase, the recombinase can target the locus of interest. Thus, the recombinase can recognize recombination sites inserted at the DNA locus of interest and on the donor, and target the DNA locus of interest without any additional modification to the recombinase.
In certain embodiments, the second TnpB complex linked to a recombinase targets a DNA locus of interest. In certain embodiments, the second TnpB complex comprises a dead TnpB protein (dTnpB, further described herein) such that the recombinase targets the DNA locus of interest, but the target sequence is not further cleaved. In certain embodiments, dTnpB targets sequences that are only produced after insertion of the recombination site. In certain embodiments, the recombinase recognizes and binds the donor template recombination site and the inserted recombination site. In certain embodiments, the recombinase forms a dimer with a recombinase provided as a separate protein.
As used herein, the term "recombinase" refers to an enzyme that catalyzes recombination between two or more recombination sites (e.g., acceptor and donor sites). The recombinases useful in the present invention catalyze recombination at specific recombination sites, which are specific polynucleotide sequences recognized by a particular recombinase. The term "integrase" refers to a type of recombinase. A "unidirectional recombinase" or "integrase" refers to a recombinase whose recognition site is destroyed after recombination has taken place. In other words, the sequence recognized by the recombinase becomes, after recombination, a sequence that the recombinase no longer recognizes. Thus, once a sequence has been recombined by a unidirectional recombinase, the continued presence of the recombinase cannot reverse the previous recombination event.
A "recombination site" is a particular polynucleotide sequence that is recognized by a recombinase described herein. Typically, two different sites (referred to as "complementary sites" in terms of recombination) are involved, one present in the target nucleic acid (e.g., a chromosome or episome of a eukaryote) and the other present on the nucleic acid to be integrated at the target recombination site. The terms "attB" and "attP" as used herein refer to attachment (or recombination) sites originally derived from a bacterial target (attachment site of a bacterium) and a phage donor (attachment site of a phage), respectively, although the recombination sites of a particular enzyme may have different names. Two attachment sites may share as little sequence identity as a few base pairs. Recombination sites typically include left and right arms separated by a core region or spacer region. Thus, an attB recombination site consists of BOB', where B and B' are the left and right arms, respectively, and O is the core region. Similarly, attP is POP', where P and P' are the arms, and O is again the core region. When recombination occurs between the attB and attP sites and nucleic acid is concomitantly integrated into the target, the recombination sites flanking the integrated DNA are called "attL" and "attR". Using the above terminology, the attL and attR sites thus consist of BOP' and POB', respectively. In some of the representations herein, "O" is omitted, and attB and attP are designated, for example, as BB' and PP', respectively.
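The arm-and-core notation above lends itself to a short worked example. The following Python sketch is purely illustrative: it models each attachment site as a (left arm, core, right arm) triple and shows how exchange at the shared core turns attB (BOB') and attP (POP') into attL (BOP') and attR (POB'). The class and function names are hypothetical, and the arms are symbolic labels rather than real sequences.

```python
from typing import NamedTuple


class AttSite(NamedTuple):
    """An attachment site written as left arm, core region, right arm."""
    left: str
    core: str
    right: str

    def label(self):
        return f"{self.left}{self.core}{self.right}"


def recombine(att_b, att_p):
    """Exchange arms at the shared core; returns (attL, attR)."""
    if att_b.core != att_p.core:
        raise ValueError("attB and attP must share the same core region")
    att_l = AttSite(att_b.left, att_b.core, att_p.right)   # B O P'
    att_r = AttSite(att_p.left, att_p.core, att_b.right)   # P O B'
    return att_l, att_r


att_b = AttSite("B", "O", "B'")
att_p = AttSite("P", "O", "P'")
att_l, att_r = recombine(att_b, att_p)
labels = (att_l.label(), att_r.label())
```

Because neither product has the arm pairing of attB or attP, a recombinase that recognizes only attB and attP does not act on attL or attR, which is the string-level picture of the unidirectional behavior described above.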
TnpB related transposase system
The systems and compositions herein may comprise TnpB, one or more nucleic acid components, and one or more components of a transposase. In one exemplary embodiment, the TnpB mediates RNA-guided TnpA-catalyzed transposition. In one exemplary embodiment, the TnpB mediates RNA-guided Tn7 catalyzed transposition.
In exemplary embodiments, the transposase may comprise TnpA. The transposase may be a Y1 transposase of the IS200/IS605 family encoded by the insertion sequence (IS) IS608 from Helicobacter pylori (e.g., TnpA IS608), by an IS from Deinococcus radiodurans (e.g., ISDra2), by an IS from a hydrogen-producing halophilic anaerobic bacterium, or by an IS from Sulfolobus solfataricus. Examples of transposases include those described in Barabas, O., Ronning, D.R., Guynet, C., Hickman, A.B., Ton-Hoang, B., Chandler, M. and Dyda, F. (2008) Mechanism of IS200/IS605 family DNA transposases: activation and transposon-directed target site selection. Cell, 132, 208-220; Sadler et al., Genes 2020, 11, 484, doi: 10.3390/genes11050484; and He et al., (2013) NAR, 41:5, 3302-3313. In certain exemplary embodiments, the transposase is a single-stranded DNA transposase. In certain exemplary embodiments, the single-stranded DNA transposase is TnpA or a functional fragment thereof. In one aspect, the TnpA motif for homing to the insertion site is at least 50%, 75% or 100% complementary to the TAM of TnpB, such that TnpA-catalyzed transposition can occur at or near the TAM portion of the sequence.
In some examples, the one or more transposases or transposase subunits are Tn7 transposases or are derived from Tn7 transposases. In particular embodiments, the Tn7 or Tn7-like transposase may be a Tn5053 transposase. For example, Tn5053 transposases include those described in Minakhina S et al., Tn5053 family transposons are res site hunters sensing plasmidal res sites occupied by cognate resolvases. Mol Microbiol. 1999 Sep;33(5):1059-68; and in Fig. 4 and the related text of Partridge SR et al., Mobile Genetic Elements Associated with Antimicrobial Resistance, Clin Microbiol Rev. 2018 Aug 1;31(4), both of which are incorporated herein by reference in their entirety. In some cases, the one or more Tn5053 transposases may comprise one or more of TniA, TniB, and TniQ. TniA is also known as TnsB. TniB is also known as TnsC. TniQ is also known as TnsD. Thus, in one embodiment, these Tn5053 transposase subunits may be referred to as TnsB, TnsC, and TnsD, respectively. In certain instances, the one or more transposases can comprise TnsB, TnsC, and TnsD.
In one embodiment, the transposase may be one or more Vibrio cholerae Tn6677 transposases. In one example, the transposon can include a terminal operon comprising the tnsA, tnsB, and tnsC genes. The transposon may also comprise a tniQ gene. In one embodiment, tnsE may not be present in the transposon.
In certain examples, the transposase comprises one or more of a Mu transposase, TniQ, TniB, or functional domains thereof. In certain examples, the transposase comprises one or more of TniQ, TniB, TnpB, or functional domains thereof. In certain examples, the transposase comprises one or more of an rve integrase, TniQ, TniB, or functional domains thereof.
In one embodiment, the system (more particularly, the transposase) does not include an rve integrase. In one embodiment, the system (more particularly, the transposase) does not include one or more of a Mu transposase, TniQ, TniB, TnpB, an IstB domain, or functional domains thereof.
In certain examples, the transposase comprises one or more of a Mu transposase, TniQ, TniB, or functional domains thereof. In certain examples, the transposase comprises one or more of TniQ, TniB, TnpB, or functional domains thereof. In certain examples, the transposase comprises one or more of an rve integrase, a TniQ, TniB, or TnpB domain, or functional domains thereof.
References to right-end or left-end sequence elements are made with respect to the exemplary Tn7 transposon. The general structure of the left end (LE) and right end (RE) sequence elements of canonical Tn7 has been established. The Tn7 ends contain a series of 22-bp TnsB binding sites. Flanking the most distal TnsB binding site is an 8-bp terminal sequence ending with 5'-TGT-3'/3'-ACA-5'. The right end of Tn7 contains four overlapping TnsB binding sites within the right end element of about 90 bp. The left end contains three TnsB binding sites dispersed within the left end element of about 150 bp. The number and distribution of TnsB binding sites may vary between Tn7-like elements. The terminal sequences of Tn7-related elements can be determined by identifying a directly repeated 5-bp target site duplication, a terminal 8-bp sequence, and 22-bp TnsB binding sites (Peters JE et al., 2017). Exemplary Tn7 elements (including right and left end sequence elements) include those described in Parks AR, Plasmid, 2009 Jan;61(1):1-14.
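One simple way to read the 50%, 75% or 100% complementarity thresholds above is as the fraction of positions at which the TnpA motif can base-pair with the TAM when the two are aligned antiparallel. The following Python sketch computes that fraction. The equal-length, gap-free alignment is an assumption of this sketch, and the sequences are hypothetical placeholders rather than motifs taken from this disclosure.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(motif, tam):
    """Percent of motif positions that pair with the antiparallel TAM.

    Both sequences are written 5' to 3' and must have equal length.
    """
    if not motif or len(motif) != len(tam):
        raise ValueError("motif and TAM must be non-empty and equal length")
    antiparallel_tam = tam[::-1]  # read the TAM 3' to 5' against the motif
    paired = sum(
        1 for base, partner in zip(motif, antiparallel_tam)
        if PAIRS.get(base) == partner
    )
    return 100.0 * paired / len(motif)


def meets_threshold(motif, tam, threshold):
    """True if complementarity is at least `threshold` percent."""
    return percent_complementarity(motif, tam) >= threshold


# Hypothetical sequences used only to exercise the functions.
full_match = percent_complementarity("TTGAT", "ATCAA")
partial_match = percent_complementarity("TTGAT", "ATCGA")
```

With these placeholder inputs the first pair is fully complementary and the second has a single mismatch, so the second would satisfy a 75% threshold but not a 100% threshold.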
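As an illustrative sketch of the end-identification criteria above, the following Python function checks two of the stated features for a candidate element within a longer sequence: the terminal 5'-TGT ... ACA-3' trinucleotides and a directly repeated 5-bp target site duplication flanking the element. The function name, coordinate convention, and test sequence are hypothetical. The sketch assumes that the element is given on the top strand, so that it begins with TGT and ends with ACA, and it does not search for the 22-bp TnsB binding sites because no consensus sequence is given here.

```python
def check_tn7_like_ends(sequence, start, end, tsd_len=5):
    """Check terminal trinucleotides and the flanking target site duplication.

    sequence : top-strand DNA containing the candidate element, 5' to 3'
    start    : 0-based index of the first base of the element
    end      : 0-based index one past the last base of the element
    tsd_len  : expected length of the target site duplication
    """
    if not 0 <= start < end <= len(sequence):
        raise ValueError("element coordinates fall outside the sequence")
    element = sequence[start:end]
    left_flank = sequence[max(0, start - tsd_len):start]
    right_flank = sequence[end:end + tsd_len]
    has_duplication = (
        len(left_flank) == tsd_len and left_flank == right_flank
    )
    return {
        "starts_with_TGT": element.startswith("TGT"),
        "ends_with_ACA": element.endswith("ACA"),
        "target_site_duplication": left_flank if has_duplication else None,
    }


# Hypothetical construct: flank + duplicated target site + element + repeat.
toy_element = "TGTAAAACCCCGGGGTTTTACA"
toy_sequence = "GG" + "CATGC" + toy_element + "CATGC" + "TT"
toy_start = len("GG" + "CATGC")
toy_report = check_tn7_like_ends(
    toy_sequence, toy_start, toy_start + len(toy_element)
)
```

A candidate that passes both checks would still need its TnsB binding sites located before the ends could be considered defined in the sense described above.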
TnpB recombinase/integrase system
The systems and compositions herein may comprise a TnpB system and one or more components of a recombinase or integrase. In one aspect, the TnpB is naturally catalytically inactive and utilized with one or more nucleic acid components to provide site-specific targeting and with one or more components of a recombinase to introduce modification. In one aspect, the TnpB polypeptide may be catalytically inactivated via mutation of one or more residues of a catalytic domain (e.g., ruvC) or via truncation, and utilized with one or more nucleic acid components to provide site-specific targeting, and utilized with one or more components of a recombinase to introduce modification. In one embodiment, the naturally inactivated TnpB is provided with a recombinase, such as an integrase, and optionally a reverse transcriptase. The systems and compositions herein may comprise a TnpB polypeptide, one or more nucleic acid components, and one or more components of an integrase. In one aspect, the TnpB polypeptide is a nicking enzyme and is utilized with one or more nucleic acid components to provide site-specific targeting, wherein one or more components of the integrase introduce modification. The systems and compositions can be used to insert a donor polynucleotide into a target polynucleotide. The systems and compositions may also comprise a donor polynucleotide.
In a preferred embodiment, the recombinase mediates unidirectional site-specific recombination. In one embodiment, the recombinase IS a Serine Recombinase (SR), also known as serine integrase, encoded by, for example, the IS607 family, tn4451, and phage phiC 31. See generally, smith MC, thorpe HM Diversity in the serine recombinases. Mol microbiol 2002,44:299-307.10.1046/j.1365-2958.2002.02891.X; li et al, (2018) J.mol.biol.430:21,4401-4418.
In embodiments, the recombinase IS a tyrosine recombinase (YR) encoded by IS91, a helix (helix), IS200/IS605, a Crypton or a DIRS-retrotransposon family. See generally, goodwin TJ, butler MI, poulter T: cryptons: a group of tyrosine-recombinase-encoding DNA transposons from pathogenic fungi. Microbiology.2003,149:3099-3109.Doi:10.1099/mic.0.26529-0; cappello J, handelsman K, loish HF: sequence of Dictyostelium DIRS-1:an apparent retrotransposon with inverted terminal repeats and an internal circle junction sequence.Cell.1985,43:105-115.10.1016/0092-8674 (85) 90016-9.
In one aspect, the recombinase provides site-specific integration of a template (e.g., a donor oligonucleotide) that can be provided with the composition. Without being bound by theory, the recombinant enzyme allows integration independent of payload size and can coordinate strand exchange and reconnection between multiple cell types, allowing integration of long-length polynucleotides. In an exemplary embodiment, the serine recombinase is PhiC31 and the target is DNA. In one aspect, phiC31 allows for integration of a target site comprising an attP or pseudo attP recognition site. See, for example, system. Com/wp-content/uploads/phiC31_product sheet-1.Pdf. In embodiments utilizing phiC231, the donor oligonucleotide will have an attB at the sequence that promotes attachment at the attP site of the target genome. Similar methods of designing donor oligonucleotides having sequences complementary to the attachment site of the recombinase can be designed for use in the present invention. See, for example, li et al, (2018) J.mol.biol.430:21,4401-4418.
In a preferred embodiment, the integrase mediates gene integration at different loci by directing insertion with a TnpB nickase fused to both the reverse transcriptase and integrase. In one embodiment, the integrase is a serine integrase encoding, for example, bxbINT. See generally Ioanidi et al, "Drag-and-drop genome insertion without DNA cleavage with CRISPR-directed integrases"; doi 10.1101/2021.11.01.466786m, incorporated herein by reference in its entirety. Integration using CRISPR-Cas9 nickase fused to reverse transcriptase and serine integrase, termed programmable addition via site specific targeting elements (PASTE), and delivery via a single dose of non-dividing and primary cell functional plasmid containing AttB landing sites, termed guide RNA containing attachment sites for insertion of sequences, including different cargo sequences that can be inserted into different loci, is shown at Ioannidi, gootenberg, abudayyeh and colleagues, varying in size up to about 36kb. Other uses of the PASTE system include gene labeling, gene replacement, gene delivery, and protein production and secretion, which are contemplated for use with the TnpB nicking enzyme and integrase methods. In one aspect, the ωrna can comprise an AttB landing site. In one aspect, the recombinase provides site-specific integration of a template (e.g., a donor oligonucleotide) that can be provided with the composition.
Other serine integrases can be used with the TnpB polypeptides, for example as identified and described in Durrant et al., "Large-scale discovery of recombinases for integrating DNA into the human genome", doi:10.1101/2021.11.05.467528, which is incorporated herein by reference. Other integrases include BceINT, SscINT, and SacINT. See, e.g., Ioannidi et al., 2021, Fig. 6d and Fig. 10a.
Without being bound by theory, the recombinase allows integration independent of payload size and can coordinate strand exchange and religation across multiple cell types, allowing integration of long polynucleotides. In exemplary embodiments, the integrase is BxbINT and the target is DNA. In one aspect, BxbINT allows for integration at target sites comprising attP or pseudo attP recognition sites. In embodiments utilizing BxbINT, the donor oligonucleotide will have an attB sequence that promotes attachment at the attP site of the target genome. Donor polynucleotides having sequences complementary to the integrase attachment sites can be designed similarly for use in the present invention, e.g., circular double-stranded DNA templates containing AttP attachment sites, or large cargo delivered via adenovirus or other viral vectors, as described elsewhere herein. See, for example, FIGS. 1a, 1b and 5b of Ioannidi et al., 2021.
TnpB topoisomerase system
The one or more functional domains may be one or more topoisomerase domains. Topoisomerases are a class of enzymes that alter the topological state of DNA via cleavage and re-ligation of nucleic acid strands. In some cases, the topoisomerase may be a DNA topoisomerase, an enzyme that controls and alters DNA topology during transcription and catalyzes the transient cleavage and rejoining of single strands of DNA to allow the strands to pass through each other, thereby altering the topology of the DNA.
In one embodiment, the topoisomerase domain is capable of ligating a donor polynucleotide with a target polynucleotide. The ligation may be sticky-end or blunt-end ligation. In one example, the donor polynucleotide may comprise an overhang comprising a sequence complementary to a region of the target polynucleotide. Examples of ligating donor polynucleotides to target polynucleotides include those of TOPO cloning, for example those described in "The Technology Behind TOPO Cloning" at www.thermofisher.com/us/en/home/life-science/cloning/TOPO/TOPO-resources/the-technology-bond-TOPO-cloning.
In one embodiment, the topoisomerase domain can be associated with a donor polynucleotide. For example, a topoisomerase domain is covalently linked to a donor polynucleotide. In one embodiment, the topoisomerase domain can be provided with, e.g., associated with (e.g., fused to) TnpB or a variant thereof.
Alternatively or additionally, the topoisomerase domain may be located on a different molecule than the TnpB polypeptide. In some cases, the topoisomerase domain can be associated with the donor polynucleotide. For example, the topoisomerase domain may be covalently pre-loaded with the donor DNA molecule. Such a design may allow efficient ligation of only the specific cargo. The topoisomerase domain can ligate a donor polynucleotide (e.g., a DNA molecule) to a target site (e.g., a free double-stranded DNA end) on a target polynucleotide. In one embodiment, the donor polynucleotide may have an overhang comprising a sequence complementary to a region of the target polynucleotide. For example, an overhang can invade a target polynucleotide at a cleavage site created by a TnpB polypeptide.
Examples of topoisomerases include type I topoisomerases, including type IA and type IB topoisomerases, which cleave a single strand of a double-stranded nucleic acid molecule, and type II topoisomerases (e.g., gyrase), which cleave both strands of a double-stranded nucleic acid molecule.
Type IA and type IB topoisomerases cleave one strand of a double-stranded nucleic acid molecule. In some examples, cleavage of the double-stranded nucleic acid molecule by a type IA topoisomerase produces a 5' phosphate and a 3' hydroxyl group at the cleavage site, wherein the type IA topoisomerase is covalently bound to the 5' end of the cleaved strand. Cleavage of the double-stranded nucleic acid molecule by a type IB topoisomerase can produce a 3' phosphate and a 5' hydroxyl group at the cleavage site, wherein the type IB topoisomerase is covalently bound to the 3' end of the cleaved strand.
Examples of type IA topoisomerases include Escherichia coli topoisomerase I, Escherichia coli topoisomerase III, eukaryotic topoisomerase II, archaeal reverse gyrase, yeast topoisomerase III, Drosophila topoisomerase III, human topoisomerase III, Streptococcus pneumoniae topoisomerase III, and the like, including other type IA topoisomerases. The enzyme is covalently bound to a 5'-thymidine residue, forming a DNA-protein adduct, wherein cleavage occurs between the two thymidine residues.
Examples of type IB topoisomerases include the nuclear type I topoisomerases present in all eukaryotic cells and those encoded by vaccinia and other cellular poxviruses. Examples of eukaryotic type IB topoisomerases are those expressed in yeast, Drosophila and mammalian cells (including human cells). Examples of viral type IB topoisomerases are those produced by vertebrate poxviruses (vaccinia, Shope fibroma virus, ORF virus, fowlpox virus and molluscum contagiosum virus) and by the insect poxvirus Amsacta moorei entomopoxvirus.
Examples of type II topoisomerases include bacterial gyrase, bacterial DNA topoisomerase IV, eukaryotic DNA topoisomerase II, and the DNA topoisomerases encoded by T-even phages. Type II topoisomerases may have cleavage and ligation activities. A substrate double-stranded nucleic acid molecule of a type II topoisomerase can be prepared such that the type II topoisomerase can form a covalent linkage with one strand at a cleavage site. For example, calf thymus type II topoisomerase can cleave a substrate double-stranded nucleic acid molecule containing a 5' recessed topoisomerase recognition site positioned three nucleotides from the 5' end, resulting in dissociation of the three-nucleotide sequence located 5' of the cleavage site and covalent binding of the topoisomerase to the 5' end of the double-stranded nucleic acid molecule. In addition, when such a double-stranded nucleic acid molecule carrying a type II topoisomerase is contacted with a second nucleic acid molecule containing a 3' hydroxyl group, the type II topoisomerase can ligate the sequences together and then be released from the recombinant nucleic acid molecule.
Structural analysis of topoisomerases has shown that members of each particular topoisomerase family, including type IA, IB and II topoisomerases, share common structural features with other members of the family. Furthermore, sequence analysis of various type IB topoisomerases shows that the structure is highly conserved, especially in the catalytic domain. For example, the domain comprising amino acids 81 to 314 of the 314-amino-acid vaccinia topoisomerase has significant homology to other type IB topoisomerases, and the isolated domain has substantially the same activity as the full-length topoisomerase, although the isolated domain has a slower turnover rate and lower binding affinity for the recognition site. In addition, a vaccinia topoisomerase mutated in the amino-terminal domain (e.g., at amino acid residues 70 and 72) can exhibit the same properties as the full-length topoisomerase. Mutational analysis of vaccinia type IB topoisomerase has revealed a large number of amino acid residues that can be mutated without affecting topoisomerase activity and has identified several amino acids required for activity. Given the high degree of homology between the catalytic domains of vaccinia topoisomerase and other type IB topoisomerases, and the detailed mutational analysis of vaccinia topoisomerase, it will be appreciated that isolated catalytic domains of type IB topoisomerases and type IB topoisomerases having various amino acid mutations can be used in the methods of the present invention and are therefore considered to be topoisomerases for the purposes of the present invention.
Various topoisomerase enzymes exhibit a range of sequence specificities. For example, a type II topoisomerase can bind to a variety of sequences, but cleave at highly specific recognition sites. Type IB topoisomerase can include a site-specific topoisomerase that binds to and cleaves specific nucleotide sequences ("topoisomerase recognition sites"). When a double-stranded nucleic acid molecule is cleaved by a topoisomerase (e.g., type IB topoisomerase), the energy of the phosphodiester linkage is conserved via the formation of a phosphotyrosyl linkage between a specific tyrosine residue in the topoisomerase and the 3' nucleotide of the topoisomerase recognition site. In the case where the topoisomerase cleavage site is near the 3' end of the nucleic acid molecule, the downstream sequence (3 ' of the cleavage site) can be dissociated, leaving a nucleic acid molecule with a topoisomerase covalently bound to the newly generated 3' end.
Covalently bound topoisomerase enzymes can also catalyze reverse reactions, such as covalent attachment of a 3' nucleotide of a recognition sequence to a nucleic acid molecule containing a free 5' hydroxyl group, with type IB topoisomerase enzymes attached to the 3' nucleotide via a phosphotyrosyl linkage. Thus, methods have been developed for producing recombinant nucleic acid molecules using type IB topoisomerase. For example, a nucleic acid molecule to be cloned into such a vector, such as a nucleic acid molecule comprising a cDNA library, or a restriction fragment, or a sheared genomic DNA sequence, is treated with a phosphatase to generate a 5' hydroxyl end, and then added to a linearized vector under conditions that allow the topoisomerase to ligate the nucleic acid molecule at the 5' end containing the hydroxyl group and the 3' end containing the covalently bound topoisomerase.
For example, vaccinia virus encodes a 314 amino acid type I topoisomerase that is capable of site-specific single-stranded nicking of double-stranded DNA and 5' hydroxyl-driven religation. Site-specific type I topoisomerases include, but are not limited to, viral topoisomerases such as poxvirus topoisomerases. Examples of poxvirus topoisomerases include those of Shope fibroma virus and ORF virus. Other site-specific topoisomerases are well known to those skilled in the art and can be used in the practice of the present invention.
Vaccinia topoisomerase, for example, binds duplex DNA and cleaves the phosphodiester backbone of one strand while exhibiting a high level of sequence specificity. Cleavage can occur at the consensus pentapyrimidine element 5'-(C/T)CCTT↓, or at related sequences, in the scissile strand. In one embodiment, the scissile bond is located in the range of 2 to 12 bp from the 3' end of the duplex DNA. In another embodiment, the vaccinia topoisomerase forms a cleavable complex requiring six duplex nucleotides upstream and two nucleotides downstream of the cleavage site.
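The recognition rule described above can be illustrated with a short sketch. The following Python snippet is a minimal illustration only, not part of the disclosed methods: it scans one strand for the consensus pentapyrimidine element (C/T)CCTT and keeps sites whose scissile bond lies 2 to 12 nucleotides from the 3' end. The function name `find_cleavage_sites`, the placement of the cut immediately 3' of the final T, and the way the distance is counted are assumptions made for illustration.

```python
import re

# Consensus pentapyrimidine element from the text: 5'-(C/T)CCTT.
# A lookahead is used so that overlapping candidate sites are all reported.
CONSENSUS = re.compile(r"(?=([CT]CCTT))")

def find_cleavage_sites(scissile_strand, min_dist=2, max_dist=12):
    """Return (site_start, cut_index, distance_to_3prime_end) tuples.

    Assumption: the scissile bond lies immediately 3' of the final T of the
    element, and distance is counted as nucleotides remaining 3' of the cut.
    """
    seq = scissile_strand.upper()
    hits = []
    for match in CONSENSUS.finditer(seq):
        start = match.start()
        cut = start + 5            # index just after the CCTT
        dist = len(seq) - cut      # nucleotides between the cut and the 3' end
        if min_dist <= dist <= max_dist:
            hits.append((start, cut, dist))
    return hits

if __name__ == "__main__":
    print(find_cleavage_sites("GGATCCCTTAAGG"))
```

A site whose cut falls at the very 3' end (distance 0) or too far from it is filtered out under these assumed distance bounds.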
In some examples, the topoisomerase is DNA topoisomerase I, e.g., vaccinia virus topoisomerase I. The topoisomerase may be preloaded with the donor polynucleotide. Vaccinia virus topoisomerase may require a target comprising a 5' -OH group.
TnpB-directed excision-transposition system
Embodiments disclosed herein provide an engineered or non-naturally occurring guided excision-transposition system. An engineered or non-naturally occurring guided excision-transposition system may comprise one or more components of the ωRNA-TnpB system, such as an ωRNA scaffold and a spacer and/or a TnpB polypeptide, and one or more components of a class II transposon. Components of the ωRNA-TnpB system can direct a class II transposon component to a retrotransposon of a target nucleic acid sequence and direct its transposition into a receptor polynucleotide.
For example, an engineered or non-naturally occurring guided excision-transposition system may comprise (a) a first TnpB protein; (b) a first class II transposon polypeptide coupled to or otherwise capable of forming a complex with the first TnpB protein; (c) a first guide molecule capable of forming a first ωRNA-TnpB complex with the first TnpB protein and directing site-specific binding to a first target sequence of a first target polynucleotide; (d) a second TnpB protein; (e) a second class II transposon polypeptide coupled to or otherwise capable of forming a complex with the second TnpB protein; (f) a second guide molecule capable of forming a second ωRNA-TnpB complex with the second TnpB protein and directing site-specific binding to a second target sequence of the first target polynucleotide; and (g) a class II transposon polynucleotide comprising the first target polynucleotide and capable of forming a complex with the first and second TnpB proteins, the first and second guide molecules, and the first and second class II transposon polypeptides.
In some embodiments, the engineered or non-naturally directed excision-transposition system may comprise (h) a third guide molecule capable of complexing with the first TnpB protein and directing site-specific binding to the first target sequence of the second target polynucleotide, wherein the third guide molecule is optionally coupled to the first TnpB protein; (i) Optionally, a first guide molecule polynucleotide encoding a third guide molecule; (j) A fourth guide molecule capable of complexing with a second TnpB protein and directing site-specific binding to a second target sequence of a second target polynucleotide, wherein the fourth guide molecule is optionally coupled to the second TnpB protein; and (k) optionally, a second guide molecule polynucleotide encoding a fourth guide molecule.
In some embodiments, the first and second class II transposon polypeptides are capable of cleaving a first target polynucleotide from the class II transposon polynucleotide. In some embodiments, the first and second class II transposon polypeptides are capable of transposing a first target polynucleotide into a second target polynucleotide. In some embodiments, the first target polynucleotide does not include one or more class II transposon long terminal repeats.
The engineered or non-naturally occurring guided excision-transposition systems described herein may be based on class II transposons or class II transposon systems. An engineered or non-naturally occurring guided excision-transposition system may comprise a first target polynucleotide (also referred to herein as a donor polynucleotide or transposon) and a second target polynucleotide (also referred to herein as an acceptor polynucleotide). As used herein, "transposon" (also referred to as a transposable element) refers to a polynucleotide sequence that is capable of moving from one location in the genome to another. There are several classes of transposons. Transposons include retrotransposons (class I transposons) and DNA transposons (class II transposons). Retrotransposons are those that require reverse transcription of the polynucleotide that is moved (or transposed) in order to transpose the polynucleotide to a new genome or polynucleotide. DNA transposons are those that do not require reverse transcription of the polynucleotide that is moved (or transposed) in order to transpose the polynucleotide to a new genome or polynucleotide.
Any suitable transposon system may be used. Suitable transposons and systems thereof may include, but are not limited to, the Sleeping Beauty transposon system (Tc1/mariner superfamily) (see, e.g., Ivics et al. 1997. Cell. 91(4):501-510), piggyBac (piggyBac superfamily) (see, e.g., Li et al. 2013. PNAS. 110(25):E2279-E2287 and Yusa et al. 2011. PNAS. 108(4):1531-1536), Tol2 (hAT superfamily), Frog Prince (Tc1/mariner superfamily) (see, e.g., Miskey et al. 2003. Nucleic Acids Res. 31(23):6873-6881), and variants thereof.
In some embodiments, the first and/or second class II transposon polypeptides are DD[E/D] transposons or transposon polypeptides. In some embodiments, the first and/or second class II transposon polynucleotides are Tc1/mariner, piggyBac, Frog Prince, Tn3, Tn5, hAT, CACTA, P, Mutator, PIF/Harbinger, Transib, or Merlin/IS1016 transposon polynucleotides. In some embodiments, the first and/or second class II transposon polypeptides are Tc1/mariner, piggyBac, Frog Prince, Tn3, Tn5, hAT, CACTA, P, Mutator, PIF/Harbinger, Transib, or Merlin/IS1016 transposon polypeptides.
Suitable class II transposon systems and components that may be utilized also include, without limitation, those described in Han et al. 2013. BMC Genomics. 14:71, doi:10.1186/1471-2164-14-71; Lopez and Garcia-Perez. 2010. Curr. Genomics. 11(2):115-128; Wessler. 2006. PNAS. 103(47):176000-17401; Gao et al. 2017. Marine Genomics. 34:67-77; Bradic et al. 2014. Mobile DNA. 5(12), doi:10.1186/1759-8753-5-12; Li et al. 2013. PNAS. 110(25):E2279-E2287; Kebriaei et al. 2017. Trends in Genetics. 33(11):852-870; Miskey et al. 2003. Nucleic Acids Res. 31(23):6873-6881; Nicolas et al. 2015. Microbiol. Spectr. 3(4), doi:10.1128/microbiolspec.MDNA3-0060-2014; W.S. Reznikoff. 1993. Annu. Rev. Microbiol. 47:945-963; Rubin et al. 2001. Genetics. 158(3):949-957; Wicker et al. 2003. Plant Physiol. 132(1):52-63; Majumdar and Rio. 2015. Microbiol. Spectr. 3(2), doi:10.1128/microbiolspec.MDNA3-0004-2014; Lisch. 2002. Trends in Plant Sci. 7(11):498-504; Sinzelle et al. 2007. PNAS. 105(12):4715-4720; Han et al. 2014. Genome Biol. Evol. 6(7):1748-1757; Grzebelus et al. 2006. Mol. Genet. Genomics. 275(5):450-459; Zhang et al. 2004. Genetics. 166(2):971-986; Chen and Li. 2008. Gene. 408(1-2):51-63; C. Feschotte. 2004. Mol. Biol. Evol. 21(9):1769-1780.
TnpB retrotransposon system
The systems and compositions herein may comprise TnpB, one or more nucleic acid components, and one or more components of a retrotransposon (e.g., a non-LTR retrotransposon). One or more components of the retrotransposon include retrotransposon proteins and retrotransposon RNAs. The systems and compositions can be used to insert a donor polynucleotide into a target polynucleotide. The systems and compositions may also comprise a donor polynucleotide.
In some examples, the present disclosure provides an engineered, non-naturally occurring composition comprising: a TnpB polypeptide; a non-LTR retrotransposon protein associated with or otherwise capable of forming a complex with a TnpB polypeptide; a single nucleic acid component capable of forming a complex with a TnpB polypeptide and directing site-specific binding to a target sequence of a target polynucleotide. The composition may further comprise a donor construct comprising a donor polynucleotide for insertion into the target polynucleotide and positioned between two binding elements capable of forming a complex with a non-LTR retrotransposon protein. In some cases, the TnpB polypeptide is engineered to have nicking enzyme activity.
In some examples, the TnpB polypeptide is fused to the N-terminus of a non-LTR retrotransposon protein. In some examples, the TnpB polypeptide is fused to the C-terminus of a non-LTR retrotransposon protein.
The nucleic acid component molecule can direct the fusion protein to a target sequence located 5' of the targeted insertion site, wherein the TnpB polypeptide produces a double-strand break at the targeted insertion site. Alternatively, the nucleic acid component molecule can direct the fusion protein to a target sequence located 3' of the targeted insertion site, wherein the TnpB polypeptide produces a double-strand break at the targeted insertion site.
The donor polynucleotide may also comprise a polymerase processing element to facilitate processing of the 3' end of the donor polynucleotide sequence. The polymerase may be a DNA polymerase, such as DNA polymerase I. In some examples, the polymerase may be an RNA polymerase.
In some examples, the donor polynucleotide further comprises a region homologous to a target sequence on the 5 'end of the donor construct, the 3' end of the donor construct, or both. In some examples, the length of the homologous region is 1 to 50, 5 to 30, 8 to 25, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 base pairs.
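As an illustration of the homology-region design described above, the following Python sketch flanks a donor sequence with regions copied from the target on either side of an insertion point and enforces the stated upper length limit of 50 base pairs. The function name `flank_donor_with_homology`, its parameters, and the choice to copy the homology regions directly from the sequence adjacent to the insertion index are assumptions for illustration, not a method taken from the disclosure.

```python
def flank_donor_with_homology(donor, target, insertion_index,
                              left_len=0, right_len=0):
    """Return the donor flanked by target-derived homology regions.

    Hypothetical helper: the 5' region is copied from the target sequence
    immediately upstream of insertion_index, the 3' region from the sequence
    immediately downstream. A length of 0 omits that region.
    """
    for n in (left_len, right_len):
        if not 0 <= n <= 50:
            raise ValueError("homology region length must be between 0 and 50")
    if left_len == 0 and right_len == 0:
        raise ValueError("at least one homology region is required")
    if insertion_index - left_len < 0 or insertion_index + right_len > len(target):
        raise ValueError("homology region extends beyond the target sequence")

    left = target[insertion_index - left_len:insertion_index]
    right = target[insertion_index:insertion_index + right_len]
    return left + donor + right

if __name__ == "__main__":
    # Homology on both ends, 4 bp each.
    print(flank_donor_with_homology("ATG", "AAAACCCCGGGGTTTT", 8, 4, 4))
```

Passing a nonzero length for only one side models a donor with homology at only its 5' end or only its 3' end.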
Natural or wild-type non-LTR retrotransposons encode the protein machinery necessary for their self-mobilization. The non-LTR retrotransposon element comprises a DNA element integrated into the host genome. The DNA element may encode one or two open reading frames (ORFs). For example, the R2 element of Bombyx mori encodes a single ORF containing reverse transcriptase (RT) activity and a restriction enzyme-like (REL) domain. The L1 element encodes two ORFs, ORF1 and ORF2. ORF1 contains a leucine zipper domain involved in protein-protein interactions and a C-terminal nucleic acid binding domain. ORF2 has an N-terminal apurinic/apyrimidinic endonuclease (APE), a central RT domain and a C-terminal cysteine-histidine-rich domain. An exemplary replication cycle for a non-LTR retrotransposon may involve transcription of the full-length retrotransposon element to produce an active element mRNA (retrotransposon RNA). The active element mRNA is translated to produce the encoded retrotransposon protein or polypeptide. A ribonucleoprotein (RNP) complex is formed that comprises the active element mRNA and the retrotransposon protein or polypeptide, and this RNP facilitates integration of the active element into the genome. The RNA-transposase complex nicks the genome. The 3' end of the nicked DNA is used as a primer to allow reverse transcription of the transposon RNA into cDNA. Finally, the transposase proteins integrate the cDNA into the genome.
The elements of these systems may be engineered to function in the context of the present invention. For example, a non-LTR retrotransposon polypeptide can be fused to a site-specific nuclease. Binding elements that allow binding of non-LTR retrotransposon polypeptides to native retrotransposon DNA elements can be engineered into the donor construct to facilitate entry of the donor polynucleotide sequence into the target polynucleotide.
In the present invention, the protein component of the non-LTR retrotransposon may be linked to a site-specific nuclease (e.g., a TnpB polypeptide), or otherwise engineered to form a complex with the site-specific nuclease. Retrotransposon RNAs can be engineered to encode donor polynucleotide sequences. Thus, in certain exemplary embodiments, a TnpB polypeptide complex is formed via the TnpB polypeptide and nucleic acid component molecular sequence that directs a retrotransposon complex (e.g., a retrotransposon polypeptide and a retrotransposon RNA) to a target sequence in a target polynucleotide, wherein the retrotransposon RNP complex facilitates integration of a donor polynucleotide sequence into the target polynucleotide. Thus, the one or more non-LTR retrotransposon components may comprise a retrotransposon polypeptide or functional domain thereof that facilitates binding of a retrotransposon RNA, reverse transcription of the retrotransposon RNA into cDNA, and/or integration of a donor polynucleotide into a target polynucleotide, and a retrotransposon RNA element modified to encode a donor polynucleotide sequence.
Examples of non-LTR retrotransposons include CRE, R2, R4, L1, RTE, Tad, R1, LOA, I, Jockey, and CR1. In one example, the non-LTR retrotransposon is R2. In another example, the non-LTR retrotransposon is L1. Examples of non-LTR retrotransposons may include those described in the following: Christensen SM et al., RNA from the 5' end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site, Proc Natl Acad Sci U S A. 2006 Nov 21;103(47):17602-7; Eickbush TH et al., Integration, Regulation, and Long-Term Stability of R2 Retrotransposons, Microbiol Spectr. 2015 Apr;3(2):MDNA3-0011-2014. doi:10.1128/microbiolspec.MDNA3-0011-2014; Han JS, Non-long terminal repeat (non-LTR) retrotransposons: mechanisms, recent developments, and unanswered questions, Mob DNA. 2010 May 12;1(1):15. doi:10.1186/1759-8753-1-15; Malik HS et al., The age and evolution of non-LTR retrotransposable elements, Mol Biol Evol. 1999 Jun;16(6):793-805, each of which is incorporated herein by reference in its entirety.
Examples of non-LTR retrotransposon polypeptides also include R2 from Clonorchis sinensis or Zonotrichia albicollis.
The non-LTR retrotransposon may comprise a plurality of retrotransposon polypeptides or polynucleotides encoding the polypeptides. In one embodiment, the retrotransposon polypeptide may form a complex. For example, a non-LTR retrotransposon is a dimer, e.g., comprises two retrotransposon polypeptides that form a dimer. The dimeric subunits may be linked or form a tandem fusion. The TnpB polypeptide may be associated (e.g., linked) with one or more subunits of such a complex. In some examples, the non-LTR retrotransposon is a dimer of two retrotransposon polypeptides; one of the retrotransposon polypeptides comprises nuclease or nickase activity and is linked to a TnpB polypeptide.
Retrotransposon polypeptides may comprise one or more modifications, for example to enhance the specificity or efficiency of donor polynucleotide recognition or of target-primed template recognition (TPTR). Retrotransposon polypeptides may also comprise one or more truncations or excisions that remove domains or regions of the wild-type protein to obtain a minimal polypeptide that retains donor polynucleotide recognition and TPTR. In some exemplary embodiments, the native endonuclease domain may be mutated to eliminate the endonuclease activity.
In certain exemplary embodiments, the modification or truncation of the non-LTR retrotransposon peptide may be in a zinc finger region, myb region, basic region, reverse transcriptase domain, cysteine-histidine-rich motif or endonuclease domain.
The non-LTR retrotransposon may comprise a polynucleotide encoding one or more retrotransposon RNA molecules. The polynucleotide may comprise one or more regulatory elements. The regulatory element may be a promoter. Regulatory elements and promoters on polynucleotides include those described throughout this application. For example, the polynucleotide may comprise a pol2 promoter, a pol3 promoter, or a T7 promoter.
In some cases, the polynucleotide encodes a retrotransposon RNA, at least a portion of the sequence of which is complementary to the target sequence. For example, the 3' end of the retrotransposon RNA may be complementary to the target sequence. The RNA can be complementary to a portion of the nicked target sequence. In one embodiment, the retrotransposon RNA can comprise one or more donor polynucleotides. In some cases, a retrotransposon RNA can encode one or more donor polynucleotides.
Retrotransposon RNAs may be able to bind retrotransposon polypeptides. Such retrotransposon RNAs may comprise one or more elements for binding to a retrotransposon polypeptide. Examples of binding elements include hairpin structures, pseudoknots (e.g., nucleic acid secondary structures containing at least two stem-loop structures, in which half of one stem is intercalated between the two halves of another stem), stem-loops, and bulges (e.g., unpaired nucleotide segments located within one strand of a nucleic acid duplex). In certain examples, the retrotransposon RNA comprises one or more hairpin structures. In some examples, the retrotransposon RNA comprises one or more pseudoknots. In certain examples, the retrotransposon RNA comprises a sequence encoding a donor polynucleotide and one or more binding elements for forming a complex with a retrotransposon polypeptide. The binding element may be located at the 5' end or the 3' end.
In one embodiment, the retrotransposon RNA comprises a region capable of hybridizing to an overhang of a target polynucleotide at a target site. The overhang may be a single strand of DNA. The overhang may serve as a primer for reverse transcription of at least a portion of the retrotransposon RNA into cDNA. In some cases, a region of the cDNA may be capable of hybridizing to a second overhang of the target polynucleotide. The second overhang may serve as a primer for synthesizing the second strand to produce double-stranded cDNA. The cDNA may comprise a donor polynucleotide sequence. The two overhangs may be from different strands of the target polynucleotide.
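The priming step described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the disclosed mechanism in full: the single-stranded DNA overhang is assumed to pair perfectly with the 3' end of the retrotransposon RNA, and first-strand cDNA is then extended from the 3' end of the overhang by copying the remaining RNA. The function names `reverse_transcribe` and `target_primed_cdna` are hypothetical, and second-strand synthesis from the second overhang is not modeled.

```python
# Complement of each RNA base as a DNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """Return the cDNA (written 5'->3') complementary to an RNA written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna.upper()))

def target_primed_cdna(overhang_dna, rna):
    """Extend a DNA overhang along an RNA template (first strand only).

    Assumption: the overhang must be exactly complementary to the 3' end of
    the RNA. Returns the overhang plus newly synthesized cDNA (5'->3'), or
    None if the overhang cannot hybridize.
    """
    overhang = overhang_dna.upper()
    rna = rna.upper()
    n = len(overhang)
    if n == 0 or n > len(rna):
        return None
    primer_binding_site = rna[-n:]          # 3' end of the RNA
    if reverse_transcribe(primer_binding_site) != overhang:
        return None
    # The primer's 3' end is extended toward the 5' end of the RNA template.
    return overhang + reverse_transcribe(rna[:-n])

if __name__ == "__main__":
    print(target_primed_cdna("GCTA", "AUGGCCUUAGC"))
```

In this model the extended product equals the full reverse complement of the RNA, which is the expected outcome when the primer pairs with the very 3' end of the template.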
Reverse transcriptase domain
The one or more functional domains may be one or more reverse transcriptase domains. In some embodiments, the system comprises an engineered system for modifying a target polynucleotide comprising: a TnpB protein or variant thereof (e.g., dTnpB); a reverse transcriptase (RT) domain; an RNA template comprising or encoding a donor polynucleotide to be inserted into a target sequence of a target polynucleotide; and an ωRNA molecule (i.e., a natural single guide RNA molecule comprising a scaffold for reprogramming).
Reverse transcriptase can produce single-stranded DNA based on RNA templates. Single-stranded DNA may be produced by a non-retron, a retron, or a diversity-generating retroelement (DGR). In some examples, single-stranded DNA may be produced from a self-priming RNA template. Self-priming RNA templates can be used to generate DNA without the need for separate primers.
The reverse transcriptase domain may be a reverse transcriptase or a fragment thereof. A variety of reverse transcriptases (RTs) may be used in alternative embodiments of the invention, including prokaryotic and eukaryotic RTs, provided that the RT functions within the host to produce the donor polynucleotide sequence from the RNA template. If desired, the nucleotide sequence of the native RT may be modified, for example using known codon optimization techniques, so as to optimize expression in the desired host. Reverse transcriptase (RT) is an enzyme used to produce complementary DNA (cDNA) from an RNA template, a process known as reverse transcription. Reverse transcriptase is used by retroviruses to replicate their genomes, by retrotransposon mobile genetic elements to proliferate within the host genome, by eukaryotic cells to lengthen telomeres at the ends of their linear chromosomes, and by some non-retroviruses such as hepatitis B virus (a member of the Hepadnaviridae family, which are dsDNA-RT viruses). Retroviral RT has three sequential biochemical activities: RNA-dependent DNA polymerase activity, ribonuclease H activity, and DNA-dependent DNA polymerase activity. Together, these activities enable the enzyme to convert single-stranded RNA into double-stranded cDNA. In certain embodiments, the RT domain of a reverse transcriptase is used in the present invention. The domain may comprise only RNA-dependent DNA polymerase activity. In some examples, the RT domain is non-mutagenic, i.e., does not cause mutation of the donor polynucleotide (e.g., during the reverse transcription process). In some examples, the RT domain may be a non-retron RT, such as a viral RT or a human endogenous RT. In some examples, the RT domain may be a retron RT or a DGR RT. In some examples, the RT may be less mutagenic than the corresponding wild-type RT. In some embodiments, the RT herein is not mutagenic.
Retrons
In certain embodiments, the donor template for homologous recombination is generated by using a self-priming RNA template for reverse transcription. A non-limiting example of a self-priming reverse transcription system is a retron system. The term "retron" means a genetic element encoding components capable of synthesizing branched RNA-linked multicopy single-stranded DNA (msDNA) and a reverse transcriptase. Retrons encoding msDNA are known in the art, such as, but not limited to, those of U.S. patent No. 6,017,737; U.S. patent No. 5,849,563; U.S. patent No. 5,780,269; U.S. patent No. 5,436,141; U.S. patent No. 5,405,775; U.S. patent No. 5,320,958; and CA 2,075,515, each of which is incorporated herein by reference in its entirety.
In certain embodiments, the reverse transcriptase domain is a retron RT domain. In certain embodiments, the RNA template encodes a retron RNA template that is recognized and reverse transcribed by the retron reverse transcriptase domain. Retrons are conserved across many bacterial species and are highly efficient reverse transcription systems whose function is relatively unknown. The retron system consists of the retron RT protein and the msr and msd transcripts, which act as primer and template sequences, respectively. All components of the retron system are expressed from a single open reading frame as a single transcript that includes msr-msd and encodes the retron RT protein (Lampson et al., 2005, Retrons, msDNA, and the bacterial genome. Cytogenet Genome Res 110:491-499). The msr element of the retron provides the RNA portion of the msDNA molecule, while the msd element provides the DNA portion of the msDNA molecule. The primary transcript of the msr-msd region is thought to serve as both template and primer to produce msDNA. The synthesis of msDNA is primed from an internal rG residue of the RNA transcript, using its 2'-OH group. The msd or msr can also be modified to allow insertion of an RNA template encoding a donor polynucleotide into msd without altering the function or production of msDNA. The RNA template encoding the donor polynucleotide sequence may be of any length, but is preferably less than about 5 kb, or less than about 2 kb, or less than 500 bases, provided that an msDNA product is produced.
TnpB diversity-generating retroelement system
In certain embodiments, one or more of the functional domains may be a diversity-generating retroelement (DGR) (e.g., a DGR described in US20100041033 A1). In some embodiments, the DGR can use its homing mechanism to insert donor polynucleotides. For example, the DGR can associate with a catalytically inactive TnpB protein (e.g., dead TnpB) and use the homing mechanism to integrate single-stranded DNA. In some examples, the DGR may be less mutagenic than the corresponding wild-type DGR. In some examples, the DGR is not error-prone. In some embodiments, the DGR herein is not mutagenic. The non-mutagenic DGR may be a mutant of the wild-type DGR. As used herein, the term "DGR" encompasses both diversity-generating retroelement polynucleotides and the proteins encoded by them. In some examples, the DGR may be a protein encoded by a diversity-generating retroelement polynucleotide and having reverse transcriptase activity. In some examples, the DGR may be a protein encoded by a diversity-generating retroelement polynucleotide and having reverse transcriptase activity and integrase activity. In some cases, the template or donor polynucleotide may be encoded by the diversity-generating retroelement polynucleotide. In some cases, the template may be a polynucleotide that is different from the diversity-generating retroelement polynucleotide, e.g., provided as a separate construct or molecule.
In some embodiments, the DGR herein may also include group II introns (as well as any proteins and polynucleotides encoded thereby), which are mobile ribozymes that self-splice from precursor RNAs to produce excised intron lariat RNAs that then invade new genomic DNA sites by reverse splicing. Examples of group II introns include those described in Lambowitz AM et al., Group II Introns: Mobile Ribozymes that Invade DNA, Cold Spring Harb Perspect Biol. 2011 Aug; 3(8): a003616.
In some embodiments, diversity-generating retroelements (DGRs) are genetic elements that can generate large amounts of targeted variation in the genomes that carry them. In some embodiments, the DGR system relies on an error-prone reverse transcriptase to generate mutagenized cDNA (containing A-to-N mutations) from a template region (TR); the cDNA then replaces a region similar to the TR, known as the variable region (VR), in a process known as mutagenic retrohoming (see, e.g., Sharifi and Ye, MyDGR: a server for identification and characterization of diversity-generating retroelements. Nucleic Acids Res. 2019 Jul 2; 47(W1): W289-W294). DGRs may include a family of unique retroelements that generate DNA sequence diversity. They are widely found in bacteria, archaea, phages and plasmids, and benefit their hosts by introducing variation and accelerating the evolution of target proteins (see, e.g., Yan et al., Discovery and characterization of the evolution, variation and functions of diversity-generating retroelements using thousands of genomes and metagenomes. BMC Genomics. 2019; 20:595). The first DGR was found in the Bordetella phage BPP-1. Bordetella species cause respiratory tract infections in humans and many other mammals, controlled by the BvgAS signaling system. The surface of Bordetella varies greatly due to dynamic gene expression during the infection cycle. Invasion of Bordetella by BPP-1 depends on the phage tail fiber protein Mtd. Through mutagenic reverse transcription and cDNA integration, the DGR can introduce multiple nucleotide substitutions into the Mtd gene and create different receptor-binding molecules, thus conferring on BPP-1 the ability to invade Bordetella with different cell surfaces.
The system can be used to generate an ssDNA donor using a retron RT or a DGR RT, and then integrate the donor by homologous recombination upon target cleavage or nicking by the TnpB polypeptide. In some embodiments, the system may comprise a DGR and/or a group II intron reverse transcriptase. The homing mechanism of the DGR or group II intron can be used to modify the target polynucleotide. The DGR or group II intron reverse transcriptase may be directed to the target polynucleotide by tethering to a nuclease-dead TnpB polypeptide, TALE or ZF protein. In another embodiment, a non-retron/DGR reverse transcriptase (e.g., a viral RT) may be used to generate cDNA from a self-priming RNA. In some embodiments, the ssDNA may be produced by an RT but integrated using a dead TnpB enzyme, which produces an accessible R-loop instead of a nick or cut.
TnpB topoisomerase system
The one or more functional domains may be one or more topoisomerase domains. In some embodiments, an engineered system for modifying a target polynucleotide comprises: a TnpB protein; a topoisomerase domain; and a nucleic acid template comprising or encoding a donor polynucleotide to be inserted into a target sequence of a target polynucleotide. In some examples, two or more of the TnpB protein, the topoisomerase domain, and the nucleic acid template may form a complex. In some examples, the TnpB protein and the topoisomerase domain may be comprised in a fusion protein.
Topoisomerases are a class of enzymes that alter the topological state of DNA via cleavage and re-ligation of nucleic acid strands. In some cases, the topoisomerase may be a DNA topoisomerase, an enzyme that controls and alters DNA topology during transcription and catalyzes the transient cleavage and rejoining of a single strand of DNA to allow the strands to pass through each other, thereby altering the topology of the DNA.
In some embodiments, the topoisomerase domain is capable of ligating a donor polynucleotide to a target polynucleotide. The ligation may be a sticky-end or a blunt-end ligation. In one example, the donor polynucleotide may comprise an overhang comprising a sequence complementary to a region of the target polynucleotide. Examples of ligating donor polynucleotides to target polynucleotides include those used in TOPO cloning, for example those described in "The Technology Behind TOPO Cloning" at www.thermofisher.com/us/en/home/life-science/cloning/TOPO/TOPO-resources/the-technology-bond-TOPO-cloning.
In some embodiments, the topoisomerase domain can be associated with a donor polynucleotide. For example, a topoisomerase domain is covalently linked to a donor polynucleotide.
In some embodiments, the topoisomerase domain can be associated with (e.g., fused to) a TnpB protein (e.g., a TnpB protein or a variant thereof, such as dead TnpB or a TnpB nickase). Alternatively or additionally, the topoisomerase domain may be located on a different molecule than the TnpB protein. In some cases, the topoisomerase domain can be associated with the donor polynucleotide. For example, the topoisomerase domain may be covalently pre-loaded with the donor DNA molecule. Such a design may allow efficient ligation of only the specific cargo. The topoisomerase domain can ligate a donor polynucleotide (e.g., a DNA molecule) to a target site (e.g., a free double-stranded DNA end) on a target polynucleotide. In some embodiments, the donor polynucleotide may have an overhang comprising a sequence complementary to a region of the target polynucleotide. For example, the overhang may invade the target polynucleotide at the cleavage site created by the TnpB protein.
Examples of topoisomerases include type I topoisomerases, including type IA and type IB topoisomerases, which cleave a single strand of a double-stranded nucleic acid molecule, and type II topoisomerases (e.g., gyrase), which cleave both strands of a double-stranded nucleic acid molecule.
Type IA and type IB topoisomerases cleave one strand of a double-stranded nucleic acid molecule. In some examples, cleavage of the double-stranded nucleic acid molecule by a type IA topoisomerase produces a 5' phosphate and a 3' hydroxyl group at the cleavage site, wherein the type IA topoisomerase is covalently bound to the 5' end of the cleaved strand. Cleavage of the double-stranded nucleic acid molecule by a type IB topoisomerase can produce a 3' phosphate and a 5' hydroxyl group at the cleavage site, wherein the type IB topoisomerase is covalently bound to the 3' end of the cleaved strand.
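The end chemistry described above for type IA and type IB topoisomerases can be summarized as a small lookup, sketched here in Python for illustration. The class and function names (`CleavageProduct`, `describe_cleavage`) are hypothetical; the values simply restate the preceding paragraph and make no claim about any particular enzyme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleavageProduct:
    phosphate_end: str      # strand end carrying the phosphate
    hydroxyl_end: str       # strand end carrying the free hydroxyl
    enzyme_bound_end: str   # strand end covalently bound to the enzyme


# Values restate the text: type IA binds the 5' end, type IB the 3' end.
TYPE_I_CLEAVAGE = {
    "IA": CleavageProduct(phosphate_end="5'", hydroxyl_end="3'",
                          enzyme_bound_end="5'"),
    "IB": CleavageProduct(phosphate_end="3'", hydroxyl_end="5'",
                          enzyme_bound_end="3'"),
}


def describe_cleavage(subtype: str) -> str:
    """Return a one-line description for subtype 'IA' or 'IB'."""
    product = TYPE_I_CLEAVAGE[subtype.upper()]
    return (f"Type {subtype.upper()}: {product.phosphate_end} phosphate, "
            f"{product.hydroxyl_end} hydroxyl, enzyme bound to the "
            f"{product.enzyme_bound_end} end of the cleaved strand")


if __name__ == "__main__":
    for subtype in ("IA", "IB"):
        print(describe_cleavage(subtype))
```

A lookup of this kind may be convenient when checking whether a given cleaved end (for example, a 5' hydroxyl) is compatible with a chosen topoisomerase.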
Examples of type IA topoisomerases include Escherichia coli topoisomerase I, Escherichia coli topoisomerase III, eukaryotic topoisomerase II, archaeal reverse gyrase, yeast topoisomerase III, Drosophila topoisomerase III, human topoisomerase III, Streptococcus pneumoniae topoisomerase III, and the like, including other type IA topoisomerases. The enzyme is covalently bound to a 5'-thymidine residue, forming a DNA-protein adduct, wherein cleavage occurs between two thymidine residues.
Examples of type IB topoisomerases include the nuclear type I topoisomerases present in all eukaryotic cells and those encoded by vaccinia and other cellular poxviruses. Examples of eukaryotic type IB topoisomerases are those expressed in yeast, Drosophila and mammalian cells (including human cells). Examples of viral type IB topoisomerases are those produced by vertebrate poxviruses (vaccinia, Shope fibroma virus, ORF virus, fowlpox virus and molluscum contagiosum virus) and entomopoxviruses (Amsacta moorei entomopoxvirus).
Examples of type II topoisomerases include bacterial gyrase, bacterial DNA topoisomerase IV, eukaryotic DNA topoisomerase II and the DNA topoisomerases encoded by T-even phages. Type II topoisomerases may have cleavage and ligation activities. A substrate double-stranded nucleic acid molecule of a type II topoisomerase can be prepared such that the type II topoisomerase can form a covalent linkage with one strand at a cleavage site. For example, calf thymus type II topoisomerase can cleave a substrate double-stranded nucleic acid molecule containing a 5' recessed topoisomerase recognition site positioned three nucleotides from the 5' end, resulting in dissociation of the three-nucleotide sequence located 5' of the cleavage site and covalent binding of the topoisomerase to the 5' end of the double-stranded nucleic acid molecule. In addition, when such a double-stranded nucleic acid molecule carrying a type II topoisomerase is contacted with a second nucleic acid molecule containing a 3' hydroxyl group, the type II topoisomerase can ligate the sequences together and then be released from the recombinant nucleic acid molecule.
In some examples, the topoisomerase is DNA topoisomerase I, e.g., vaccinia virus topoisomerase I. The topoisomerase may be preloaded with the donor polynucleotide. Vaccinia virus topoisomerase may require a target comprising a 5' -OH group.
TnpB phosphatase system
The systems herein may also comprise a phosphatase domain. Phosphatases are enzymes that are capable of removing phosphate groups from molecules (e.g., nucleic acids, such as DNA). Examples of phosphatases include calf intestinal phosphatase, shrimp alkaline phosphatase, antarctic phosphatase and APEX alkaline phosphatase.
In some examples, the 5'-OH group in the target polynucleotide may be produced by a phosphatase. Topoisomerases compatible with 5' phosphate targets can be used to generate stably loaded intermediates. In some cases, a TnpB polypeptide that leaves a 5'-OH after cleavage of a target polynucleotide may be used. In some cases, the phosphatase domain can be associated with (e.g., fused to) the TnpB protein. The phosphatase domain may be capable of generating an -OH group at the 5' end of the target polynucleotide. The phosphatase may be delivered separately from the other components of the system, for example as a separate protein or on a separate vector.
TnpB polymerase system
The systems herein may also comprise a polymerase domain. Polymerase refers to an enzyme that synthesizes a nucleic acid strand. The polymerase may be a DNA polymerase or an RNA polymerase.
In some embodiments, the system comprises an engineered system for modifying a target polynucleotide, the engineered system comprising: a TnpB protein; a DNA polymerase domain; and a DNA template comprising a donor polynucleotide to be inserted into a target sequence of a target polynucleotide. In some examples, two or more of the TnpB protein, the DNA polymerase domain, and the DNA template may form a complex. In some examples, the TnpB protein and the DNA polymerase domain may be comprised in a fusion protein.
In some embodiments, the system may comprise a TnpB enzyme (or variant thereof, such as dTnpB or TnpB nickase) and a DNA polymerase (e.g., phi29, T4, T7 DNA polymerase). The system may also comprise single-stranded DNA or double-stranded DNA templates. The DNA template may comprise i) a first sequence homologous to a target site of a TnpB protein on the target polynucleotide, and/or ii) a second sequence homologous to another region of the target polynucleotide. In some embodiments, the template may be a synthetic single stranded DNA molecule or a PCR-generated DNA molecule (optionally end-protected by modified nucleotides) or a viral genome (e.g., AAV). In another embodiment, a reverse transcriptase is used to produce a template. When the system is delivered into a cell, endogenous DNA polymerase in the cell may be used. Alternatively or additionally, exogenous DNA polymerase may be expressed in the cell.
The DNA template may be end-protected by one or more modified nucleotides or comprise a portion of a viral genome. In some embodiments, the DNA template comprises an LNA or other modification (e.g., at the 3' end). The presence of the LNA and/or modification may result in more efficient annealing of the 3' flap produced by TnpB protein cleavage.
Examples of the DNA polymerase include Taq, Tne (exo-), Tma (exo-), Pfu (exo-), Pwo (exo-), Thermoanaerobacter thermohydrosulfuricus DNA polymerase, Thermococcus litoralis DNA polymerase I, Escherichia coli DNA polymerase I, Taq DNA polymerase I, Tth DNA polymerase I, Bacillus stearothermophilus (Bst) DNA polymerase I, Escherichia coli DNA polymerase III, phage T5 DNA polymerase, phage M2 DNA polymerase, phage T4 DNA polymerase, phage T7 DNA polymerase, phage phi29 DNA polymerase, phage PRD1 DNA polymerase, phage phi15 DNA polymerase, phage phi21 DNA polymerase, phage PZE DNA polymerase, phage PZA DNA polymerase, phage Nf DNA polymerase, phage M2Y DNA polymerase, phage B103 DNA polymerase, phage SF5 DNA polymerase, phage GA-1 DNA polymerase, phage Cp-5 DNA polymerase, phage Cp-7 DNA polymerase, phage PR4 DNA polymerase, phage PR5 DNA polymerase, phage PR722 DNA polymerase, and phage L17 DNA polymerase.
TnpB ligase system
Generally, the system comprises a TnpB protein and a ligase associated with the TnpB protein. The TnpB protein may be recruited to the target sequence by an omega RNA comprising a spacer capable of binding to the target sequence, and may create a break in the target sequence. The omega RNA can also contain a template sequence with desired mutations or other sequence elements. The template sequence may be ligated to the target sequence to introduce the mutations or other sequence elements into the nucleic acid molecule. The TnpB protein may be a nickase that creates a single-strand break in a nucleic acid molecule, and the ligase may be a single-strand DNA ligase. In some embodiments, the system comprises a pair of TnpB-ligase complexes having two different omega RNA sequences. Each TnpB-ligase complex may target one strand of a double-stranded polynucleotide, and the two complexes may act together to effectively modify the sequence of the double-stranded polynucleotide.
In some examples, TnpB is associated with a ligase or a functional fragment thereof. The ligase may ligate single-strand breaks (nicks) generated by TnpB. In some cases, the ligase may ligate a double-strand break generated by TnpB. In certain examples, TnpB is associated with a reverse transcriptase or a functional fragment thereof.
The invention also provides systems and methods for modifying a nucleic acid sequence using a pair of different TnpB-ligase-omega RNA complexes, the systems and methods comprising: (a) an engineered TnpB protein linked or complexed to a ligase; (b) two different omega RNA sequences that complex with such a TnpB-ligase protein complex to form first and second different TnpB-ligase-omega RNA complexes; (c) the first TnpB-ligase-omega RNA complex binding to one strand of the target double-stranded polynucleotide sequence and the second TnpB-ligase-omega RNA complex binding to the other strand of the target double-stranded polynucleotide sequence; and (d) when the complexes bind to a locus of interest, the effector protein inducing modification of a sequence associated with or at the target locus of interest, whereby the two TnpB-ligase-omega RNA complexes interact and modify the sequences on different strands of the double-stranded target sequence.
One of the advantages of using such a "pair" of TnpB-ligase-omega RNA complexes includes the high efficiency of modifying sequences associated with or at a locus of interest of a target double-stranded polynucleotide.
In some embodiments, the TnpB protein may be a nicking enzyme. In a preferred embodiment, the ligase is linked to the TnpB protein. The ligase may ligate the donor sequence to the target sequence. The ligase may be a single-stranded DNA ligase or a double-stranded DNA ligase. The ligase may be fused to the carboxy terminus of the TnpB protein or to the amino terminus of the TnpB protein.
As used herein, the term "ligase" refers to an enzyme that catalyzes the joining of breaks (e.g., double-strand breaks or single-strand breaks ("nicks")) between adjacent bases of a nucleic acid. For example, the ligase may be an enzyme capable of forming an intramolecular or intermolecular covalent bond between a 5' phosphate group and a 3' hydroxyl group. The term "ligation" refers to a reaction that covalently bonds adjacent oligonucleotides by forming internucleotide linkages.
DNA ligases fall into two main categories: ATP-dependent DNA ligases (EC 6.5.1.1) and NAD(+)-dependent DNA ligases (EC 6.5.1.2). NAD(+)-dependent DNA ligases are present only in bacteria (and some viruses), whereas ATP-dependent DNA ligases are ubiquitous. ATP-dependent DNA ligases can be divided into four classes: DNA ligases I, II, III and IV. DNA ligase I ligates Okazaki fragments to form a continuous DNA strand; DNA ligase II is an alternatively spliced form of DNA ligase III, present only in non-dividing cells; DNA ligase III is involved in base excision repair; and DNA ligase IV is involved in repair of DNA double-strand breaks by non-homologous end joining (NHEJ). Among all ligases, two types of prokaryotic ligase and one type of eukaryotic ligase are particularly suitable for facilitating blunt-ended double-stranded DNA ligation: the prokaryotic DNA ligases (T3 and T4) and the eukaryotic DNA ligase (ligase 1).
In some cases, the ligase is specific for double-stranded nucleic acids (e.g., dsDNA, dsRNA, RNA/DNA duplex). An example of a ligase that is specific for double stranded DNA and DNA/RNA hybrids is T4 DNA ligase. In some cases, the ligase is specific for single stranded nucleic acids (e.g., ssDNA, ssRNA). An example of such a ligase is CircLigase II. In some cases, the ligase is specific for an RNA/DNA duplex. In some cases, the ligase is capable of acting on any combination of single stranded nucleic acids, double stranded nucleic acids, and/or RNA/DNA nucleic acids.
In some cases, the ligase may be a pan-ligase, which is a single ligase that has the ability to ligate both DNA and RNA targets. The ligase may be specific for the target (e.g., DNA-specific or RNA-specific). In some cases, the ligase may be a dual ligase system comprising DNA-specific ligases, RNA-specific ligases, and/or pan ligases in any combination.
Examples of ligases that may be used with the present disclosure include T4 DNA ligase, T3 DNA ligase, T7 DNA ligase, Escherichia coli DNA ligase, HiFi Taq DNA ligase, 9°N™ DNA ligase, Taq DNA ligase, a ligase also known as PBCV-1 DNA ligase or Chlorella virus DNA ligase, thermostable 5' App DNA/RNA ligase, T4 RNA ligase 2, T4 RNA ligase 2 truncated, T4 RNA ligase 2 truncated K227Q, T4 RNA ligase 2 truncated KQ, RtcB ligase (which ligates single-stranded RNA having a 3'-phosphate or 2',3'-cyclic phosphate to another RNA), CircLigase II, CircLigase ssDNA ligase, CircLigase RNA ligase, or a thermostable DNA ligase; NAD-dependent ligases, including Taq DNA ligase, Thermus filiformis DNA ligase, Escherichia coli DNA ligase, Tth DNA ligase, Thermus aquaticus DNA ligase (I and II), thermostable ligases, Ampligase thermostable DNA ligase, VanC-type ligases, 9°N DNA ligase, Tsp DNA ligase, and novel ligases discovered by bioprospecting; and ATP-dependent ligases, including T4 RNA ligase, T4 DNA ligase, T3 DNA ligase, T7 DNA ligase, Pfu DNA ligase, DNA ligase I, DNA ligase III, DNA ligase IV, and novel ligases discovered by bioprospecting, as well as wild-type, mutant isoforms and genetically engineered variants thereof. In a particular example, the ligase is
In some embodiments, examples of ligases include those for sequencing by synthesis or sequencing by ligation.
TnpB Helitron system
The systems and compositions herein may comprise a TnpB polypeptide, one or more nucleic acid components, and one or more components of a Helitron. The systems and compositions can be used to insert a donor polynucleotide into a target polynucleotide. The systems and compositions may also comprise a donor polynucleotide.
As used herein, the term "Helitron" refers to a polynucleotide (or nucleic acid segment) that is recognized as a transposon that captures and mobilizes gene fragments in eukaryotes. The term "Helitron" as used herein also refers to a transposase comprising an endonuclease domain and a C-terminal helicase domain. The Helitron is a rolling-circle RNA transposon. In particular embodiments, the Helitron encodes a multi-domain transposase of 1400 to about 2000 amino acids, or about 1800 amino acids. In embodiments, the Helitron comprises a hairpin near the 3' end that acts as a transposon terminator. In embodiments, the transposon comprises a RepHel motif comprising a replication initiator (Rep) domain and a DNA helicase (Hel) domain. See Thomas J. & Pritham E.J., Helitrons, the eukaryotic rolling-circle transposable elements. Microbiol. Spectr. 3, 893-926 (2015). In embodiments, the Helitron comprises an insert between a Rep nuclease domain and a C-terminal helicase domain and an AT dinucleotide in single-stranded DNA. In one aspect, the C-terminal helicase unwinds DNA in the 5' to 3' direction. The HUH nuclease domain may comprise one or two active-site tyrosine residues; in embodiments, it is a two-tyrosine (Y2) HUH endonuclease domain. Helitrons may encompass Helentron, Proto-Helentron, and Helitron2-type proteins, the structures of which may be as described in Figures 1 and 3 of Thomas et al., 2015, which is expressly incorporated by reference. Specific organisms in which Helitrons or Helentrons have been found may include those described in Table 1 of Thomas J. & Pritham E.J., Helitrons, the eukaryotic rolling-circle transposable elements. Microbiol. Spectr. 3, 893-926 (2015), which is incorporated herein by reference. Similarly, Helitrons may be identified based at least in part on the Rep motif and conserved residues in the Helitron, and based on the aligned sequences of Figure 2 of Thomas J. & Pritham E.J., Helitrons, the eukaryotic rolling-circle transposable elements. Microbiol. Spectr. 3, 893-926 (2015), which is expressly incorporated herein by reference.
The expression "Helitron reaction" as used herein refers to a reaction in which a transposase inserts a donor polynucleotide sequence into or near an insertion site on a target polynucleotide. The insertion site may contain sequences or secondary structures recognized by the Helitron and/or insertion motif sequences in the target polynucleotide into which the donor polynucleotide sequence may be inserted.
As described in Grabundzija 2018, the Helitron terminal sequences comprise unique sequences about 150 base pairs (bp) long, with an absolutely conserved dinucleotide at the end of the left terminal sequence (LTS) and a tetranucleotide at the end of the right terminal sequence (RTS), preceded by a palindromic sequence that can form a hairpin structure. Grabundzija et al., Nat. Commun. 2018; 9:1278; doi: 10.1038/s41467-018-03688-w.
The Helitron terminal sequences may be responsible for identifying the donor polynucleotide for transposition. The Helitron terminal sequences may be DNA sequences used to perform a transposition reaction, and may be referred to herein as right and left terminal sequences. The donor polynucleotide may be configured to comprise first and second Helitron recognition sequences that are at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementary to the left and/or right terminal sequences of the polynucleotide encoding the Helitron polypeptide.
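For illustration only, the percent complementarity between a candidate recognition sequence and a terminal sequence can be estimated with a short Python sketch. The function names are hypothetical, and the comparison assumes two ungapped DNA sequences of equal length, both written 5' to 3'; a real design workflow would likely use an alignment tool that handles gaps and unequal lengths.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(recognition_seq: str, terminal_seq: str) -> float:
    """Percentage of positions at which recognition_seq pairs with
    terminal_seq (ungapped, equal length; an assumption of this sketch)."""
    rec = recognition_seq.upper()
    if not rec or len(rec) != len(terminal_seq):
        raise ValueError("sequences must be non-empty and of equal length")
    # A fully complementary sequence equals the reverse complement.
    perfect_partner = reverse_complement(terminal_seq)
    matches = sum(1 for a, b in zip(rec, perfect_partner) if a == b)
    return 100.0 * matches / len(rec)


def meets_threshold(recognition_seq: str, terminal_seq: str,
                    threshold: float = 80.0) -> bool:
    """True if complementarity is at least `threshold` percent."""
    return percent_complementarity(recognition_seq, terminal_seq) >= threshold


if __name__ == "__main__":
    print(percent_complementarity("CGTA", "AACG"))
```

The default threshold of 80% corresponds to the lowest value recited above; any of the higher recited values could be passed instead.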
In one aspect, the palindromic sequence may be located upstream of the end of the right terminal sequence, e.g., about 5, 10, 15, 20, 25, 30, or 35 nucleotides upstream of the end of the right terminal sequence, or about 10 to 15 nucleotides, about 10 to 12 nucleotides, or about 11 nucleotides upstream of the end of the right terminal sequence. Grabundzija et al., Nat. Commun. 2016; 7:10716; doi: 10.1038/ncomms10716, incorporated herein by reference.
Exemplary Helitrons can be identified using software, such as EAHelitron, which has been used to identify Helitrons in a wide range of plant genomes. See Hu, K., Xu, K., Wen, J. et al., Helitron distribution in Brassicaceae and whole Genome Helitron density as a character for distinguishing plant species. BMC Bioinformatics 20, 354 (2019); doi: 10.1186/s12859-019-2945-8, incorporated herein by reference.
The Helitron may be derived from a eukaryotic organism. In one aspect, the Helitron is derived from a mammalian genome, in one aspect a vespertilionid bat genome, such as Helibat. In embodiments, the Helitron is derived from the Helibat1 transposon. In embodiments, the Helitron is Helraiser; the complete DNA sequence of the consensus transposon, including the left and right terminal sequences and the identified hairpin, is provided in Supplementary Figure 1 of Grabundzija, 2016, which is expressly incorporated herein by reference. In one aspect, the Helitron is flanked by the left and right terminal sequences of the transposon. In one aspect, the left and right terminal sequences terminate with a conserved 5'-TC/CTAG-3' motif. In one embodiment, the Helitron may comprise a palindromic sequence of about 10 to about 35 bp, or about 5 to 25 bp, or about 19 bp in length, that may form a hairpin structure.
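The terminal features described above (a 5'-TC start, a CTAG-3' end, and a palindromic sequence capable of forming a hairpin upstream of the right end) can be checked on a candidate sequence with a simplified Python sketch. All names, the default stem and loop lengths, and the example sequence are hypothetical; reading the 5'-TC/CTAG-3' motif as "element begins with TC and ends with CTAG" is an assumption of this sketch, and only perfect inverted repeats are detected.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_tc_ctag_termini(element: str) -> bool:
    """Assumed reading of the motif: starts with TC, ends with CTAG."""
    seq = element.upper()
    return seq.startswith("TC") and seq.endswith("CTAG")


def find_hairpin(seq: str, stem_len: int = 6,
                 min_loop: int = 3, max_loop: int = 8):
    """Return (stem_start, loop_length) for the first perfect inverted
    repeat that could form a hairpin, or None if none is found."""
    s = seq.upper()
    for start in range(len(s) - 2 * stem_len - min_loop + 1):
        left_arm = s[start:start + stem_len]
        wanted_right_arm = reverse_complement(left_arm)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            right_arm = s[right_start:right_start + stem_len]
            if len(right_arm) < stem_len:
                break  # ran off the end of the sequence
            if right_arm == wanted_right_arm:
                return start, loop
    return None


if __name__ == "__main__":
    # Made-up example sequence, not taken from any real transposon.
    example = "TCAAAAGCGCGATTTTTCGCGCAAAAAAAAAACTAG"
    print(has_tc_ctag_termini(example), find_hairpin(example))
```

In the made-up example, the distance from the end of the hairpin to the end of the sequence can be computed as `len(example) - (stem_start + 2 * stem_len + loop_length)`, which allows a candidate to be compared against the recited spacing of the palindromic sequence upstream of the end of the right terminal sequence.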
The elements of these systems may be engineered to function in the context of the present invention. For example, a Helitron polypeptide may be fused to a polypeptide capable of producing an R-loop. The fusion can be made by any suitable linker, in an exemplary embodiment XTEN16. Binding elements that allow binding of the Helitron polypeptide, e.g., using sequences complementary to the right and left terminal sequences of the Helitron, can be engineered into the donor construct to facilitate entry of the donor polynucleotide sequence into the target polynucleotide.
In certain exemplary embodiments, the Isc polypeptide directs a Helitron polypeptide to a target sequence in a target polynucleotide via formation of a complex with a nucleic acid component sequence, wherein the Helitron facilitates integration of a donor polynucleotide sequence into the target polynucleotide.
The Helitron polypeptide may also comprise one or more truncations or excisions that remove a domain or region of the wild-type protein to obtain a minimal polypeptide, alterations of functionality according to the system in which the Helitron is used, or mutations that enhance or attenuate specific activities associated with the Helitron, i.e., nuclease activity or helicase activity.
Multiplexing
In one embodiment, the TnpB polypeptides are useful in multiplex (tandem) targeting methods. For example, a TnpB polypeptide herein may employ more than one nucleic acid component molecule without loss of activity. This may enable a single enzyme, system or complex as defined herein to target multiple DNA targets, genes or loci. The nucleic acid component molecules may be arranged in tandem, optionally separated by a nucleotide sequence (such as a conserved nucleotide sequence as defined herein). The position of the different nucleic acid component molecules within the tandem arrangement does not affect activity.
In one aspect, the TnpB polypeptides can be used for tandem or multiplex targeting. It will be appreciated that any TnpB polypeptide, complex or composition elsewhere herein may be used in such a method. Any of the methods, products, compositions and uses as described elsewhere herein are equally applicable to the multiplex or tandem targeting methods described in further detail below. By way of further guidance, the following specific aspects and embodiments are provided.
In one aspect, the invention provides the use of a TnpB polypeptide, complex or system as defined herein for targeting a plurality of loci. In one embodiment, this can be achieved by using multiple (tandem or multiplex) nucleic acid component molecule sequences.
In one aspect, the invention provides a method for tandem or multiplex targeting using one or more elements of a TnpB polypeptide, complex or system as defined herein, wherein the system herein comprises a plurality of nucleic acid component molecule sequences. The sequences are separated by nucleotide sequences, such as conserved nucleotide sequences as defined elsewhere herein.
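By way of illustration, a tandem arrangement of nucleic acid component sequences separated by a common nucleotide sequence can be sketched in Python as follows. The function names and the example sequences are hypothetical placeholders and do not correspond to any sequence of the disclosure; the sketch only checks that the array can be split back unambiguously into its components.

```python
def build_tandem_array(component_seqs, separator):
    """Join component sequences in tandem, separated by `separator`.

    Raises ValueError if the array could not be split back into exactly
    the same components (e.g., the separator also occurs inside a
    component or across a junction)."""
    components = [seq.upper() for seq in component_seqs]
    separator = separator.upper()
    if not components or not separator:
        raise ValueError("need at least one component and a separator")
    array = separator.join(components)
    if array.split(separator) != components:
        raise ValueError("separator is ambiguous for these components")
    return array


def split_tandem_array(array, separator):
    """Recover the individual component sequences from a tandem array."""
    return array.upper().split(separator.upper())


if __name__ == "__main__":
    # Placeholder RNA sequences chosen only for this example.
    spacers = ["AUGCAUGC", "CCGGUUAA", "UUAACCAU"]
    array = build_tandem_array(spacers, separator="GGAACC")
    print(array)
    print(split_tandem_array(array, "GGAACC"))
```

Because the components are joined in list order, reordering the input list changes only the order of the components in the array.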
The TnpB polypeptide, composition, system or complex as defined herein provides an effective means for modifying a plurality of target polynucleotides. The TnpB polypeptides, systems or complexes as defined herein have broad utility, including modification (e.g., deletion, insertion, translocation, deactivation, activation) of one or more target polynucleotides in a variety of cell types. Thus, the TnpB polypeptides, systems or complexes of the invention as defined herein have broad application in, for example, gene therapy, drug screening, disease diagnosis and prognosis, including targeting multiple loci within a single system.
In one aspect, the present disclosure provides a TnpB polypeptide, system or complex as defined herein having a TnpB polypeptide with at least one destabilizing domain associated therewith, and a plurality of nucleic acid component molecules that target a plurality of nucleic acid molecules (such as DNA molecules), whereby each of the plurality of nucleic acid component molecules specifically targets its corresponding nucleic acid molecule, e.g., DNA molecule. Each nucleic acid molecule target (e.g., a DNA molecule) may encode a gene product or encompass a locus. Thus, the use of multiple nucleic acid component molecules is capable of targeting multiple loci or multiple genes. In one embodiment, the TnpB polypeptide may cleave a DNA molecule encoding a gene product. In one embodiment, expression of the gene product is altered. The TnpB polypeptide and nucleic acid component molecules do not naturally occur together. The present disclosure encompasses nucleic acid component molecules comprising nucleic acid component molecules arranged in tandem. The present disclosure also encompasses coding sequences for TnpB polypeptides that are codon optimized for expression in eukaryotic cells. In one embodiment, the eukaryotic cell is a mammalian cell, a plant cell, or a yeast cell, and in a more preferred embodiment, the mammalian cell is a human cell. Expression of the gene product may be reduced. The TnpB polypeptide may form part of a system or complex, further comprising a series of nucleic acid component molecules arranged in tandem, the nucleic acid component molecules comprising a series of 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 25, 30 or more than 30 nucleic acid component molecules, each nucleic acid component molecule capable of specifically hybridizing to a target sequence in a genomic locus of interest in a cell. In one embodiment, the functional system or complex binds to multiple target sequences. 
In one embodiment, the functional system or complex can edit multiple target sequences, e.g., the target sequences can comprise genomic loci, and in one embodiment, there can be alterations in gene expression. In one embodiment, the functional system or complex may comprise additional functional domains. In one embodiment, the invention provides a method for altering or modifying the expression of a plurality of gene products. The method can include introducing the system into a cell containing the target nucleic acid (e.g., a DNA molecule), or into a cell containing and expressing the target nucleic acid (e.g., a DNA molecule); for example, the target nucleic acid may encode a gene product or provide for expression of a gene product (e.g., a regulatory sequence).
In one embodiment, the TnpB polypeptide for multiplex targeting is associated with one or more functional domains. In some more specific embodiments, the TnpB polypeptide used for multiplex targeting is a dead TnpB polypeptide. The inventors have found that a TnpB polypeptide as described herein may enable improved and/or direct access to one or more nucleotides involved in a DNA:RNA duplex.
Inducible system
In one embodiment, the TnpB polypeptide may form a component of an inducible system. The inducible nature of the system allows spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include, but is not limited to, electromagnetic radiation, sound energy, chemical energy, and thermal energy. Examples of inducible systems include tetracycline-inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcription activation systems (FKBP, ABA, etc.), or light-inducible systems (phytochrome, LOV domains, or cryptochrome). In one embodiment, the TnpB polypeptide may be part of a Light Inducible Transcriptional Effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. The components of a LITE may include a TnpB polypeptide, a light-responsive cytochrome heterodimer (e.g., from Arabidopsis thaliana), and a transcriptional activation/repression domain. Further examples of inducible DNA binding proteins and methods of their use are provided in U.S. provisional application Ser. Nos. 61/736,465 and 61/721,283 and International patent publication No. WO 2014/018423 A2, which are hereby incorporated by reference in their entirety.
Self-inactivating system
Once all copies of the gene in the genome of the cell have been edited, continued expression of the system in that cell is no longer necessary. Indeed, sustained expression is undesirable because of the risk of off-target effects at unintended genomic sites. Time-limited expression is therefore useful. Inducible expression offers one approach, but in addition applicants have engineered a self-inactivating system that relies on the use of non-coding nucleic acid component molecule target sequences within the vector itself. Thus, after expression begins, the system will lead to its own destruction, but before destruction is complete it will have time to edit the genomic copies of the target gene (which, for a normal point mutation in a diploid cell, requires at most two edits). Briefly, the self-inactivating system comprises additional RNAs (e.g., nucleic acid component molecules) that target the coding sequence of the TnpB polypeptide itself, or that target one or more non-coding nucleic acid component molecule target sequences complementary to unique sequences present in one or more of the following: (a) within a promoter that drives expression of a non-coding RNA element, (b) within a promoter that drives expression of the TnpB polypeptide gene, (c) within 100 bp of the ATG translation initiation codon in the coding sequence of the TnpB polypeptide, (d) within the inverted terminal repeat (iTR) of a viral delivery vector, e.g., in the AAV genome.
In some aspects, single nucleic acid component molecules are provided that are capable of hybridizing to sequences downstream of the TnpB polypeptide start codon, thereby losing TnpB polypeptide expression over a period of time. In some aspects, one or more nucleic acid component molecules are provided that are capable of hybridizing to one or more coding or non-coding regions of a polynucleotide encoding the system, whereby after a period of time, one or more, or in some cases all, of the system is inactivated. In some aspects of the system, and without being limited by theory, the cell may comprise a plurality of complexes, wherein a first subset of the complexes comprises a first nucleic acid component molecule capable of targeting one or more genomic loci to be edited, and a second subset of the complexes comprises at least one second nucleic acid component molecule capable of targeting a polynucleotide encoding the system, wherein the first subset of the complexes mediates editing of the one or more targeted genomic loci, and the second subset of the complexes eventually inactivates the system, thereby inactivating further expression in the cell.
The various coding sequences (TnpB polypeptide and nucleic acid component molecules) may be included on a single vector or on multiple vectors. For example, the enzyme may be encoded on one vector and the various RNA sequences encoded on another vector, or the enzyme and one nucleic acid component molecule may be encoded on one vector while the remaining nucleic acid component molecules are encoded on another vector, or any other arrangement. Generally, systems employing a total of one or two different vectors are preferred.
When multiple vectors are used, they can be delivered in unequal amounts, and it is desirable to use an excess of vector encoding the first nucleic acid component molecule relative to the second nucleic acid component molecule, thereby helping to delay the final inactivation of the system until genome editing has an opportunity to occur.
The first nucleic acid component molecule can target any target sequence of interest within the genome, as described elsewhere herein. The second nucleic acid component molecule targets a sequence within the vector encoding the TnpB polypeptide, thereby inactivating expression of the enzyme from that vector. Thus, the target sequence in the vector must be capable of inactivating expression. Suitable target sequences may be, for example, near or within the translation initiation codon of the TnpB polypeptide coding sequence, in a non-coding sequence in a promoter that drives expression of a non-coding RNA element, within a promoter that drives expression of the TnpB polypeptide gene, within 100 bp of the ATG translation initiation codon in the TnpB polypeptide coding sequence, and/or within the inverted terminal repeat (iTR) of a viral delivery vector, e.g., in the AAV genome. A double-strand break near this region can induce a frameshift in the TnpB polypeptide coding sequence, resulting in loss of protein expression. An alternative target sequence for a "self-inactivating" nucleic acid component molecule is one that edits or inactivates regulatory regions or sequences required for expression of the system or for vector stability. For example, transcription can be inhibited or prevented if the promoter of the TnpB polypeptide coding sequence is disrupted. Similarly, if the vector includes sequences for replication, maintenance or stability, these sequences may be targeted. For example, in AAV vectors, useful target sequences are located within the iTR. Other useful target sequences may be promoter sequences, polyadenylation sites, and the like.
Furthermore, if the nucleic acid component molecules are expressed in an array, "self-inactivating" nucleic acid component molecules that target both promoters simultaneously will result in excision of the intervening nucleotides from the TnpB polypeptide expression construct, effectively leading to its complete inactivation. Similarly, where nucleic acid component molecules target both iTRs, or two or more other components, simultaneously, excision of the intervening nucleotides will result. Generally, self-inactivation as explained herein is applicable to a system in order to provide regulation of the system. For example, self-inactivation as explained herein may be applied to the repair of mutations, such as expansion disorders as explained herein. As a result of this self-inactivation, the repair activity is only transient.
The addition of non-targeting nucleotides to the 5' end (e.g., 1-10 nucleotides, preferably 1-5 nucleotides) of a "self-inactivating" nucleic acid component molecule can be used to delay its processing and/or modify its efficiency as a way of ensuring editing at the targeted genomic locus prior to shutdown.
In one aspect of a self-inactivating AAV system, plasmids that co-express one or more nucleic acid component molecules (e.g., 1-2, 1-5, 1-10, 1-15, 1-20, 1-30) that target a genomic sequence of interest can be created together with a "self-inactivating" nucleic acid component molecule that targets a TnpB polypeptide sequence at or near the engineered ATG start site (e.g., within 5 nucleotides, within 15 nucleotides, within 30 nucleotides, within 50 nucleotides, within 100 nucleotides). Regulatory sequences in the U6 promoter region may also be targeted with nucleic acid component molecules. The U6-driven nucleic acid component molecules can be designed in an array format such that multiple nucleic acid component molecule sequences can be released simultaneously. When first delivered into the target tissue/cell (left cell), the nucleic acid component molecules begin to accumulate while the TnpB polypeptide level in the nucleus increases. The TnpB polypeptide complexes with all of the nucleic acid component molecules to mediate genome editing and self-inactivation of the TnpB polypeptide plasmid.
One aspect of the self-inactivating system is the expression of 1 to 4 or more different nucleic acid component sequences, either singly or in tandem arrays; for example, up to about 20 or about 30 sequences. Each individual self-inactivating nucleic acid component molecule sequence may target a different target. These may be processed from, for example, a chimeric Pol3 transcript. A Pol3 promoter, such as the U6 or H1 promoter, may be used. Pol2 promoters, such as those mentioned throughout this document, may also be used. Inverted terminal repeat (iTR) sequences may flank the Pol3 promoter-nucleic acid component molecule-Pol2 promoter-TnpB polypeptide.
One aspect of tandem array transcripts is that one or more nucleic acid component molecules edit one or more targets while one or more self-inactivating nucleic acid component molecules inactivate the system. Thus, for example, the system for repairing an expansion disorder may be directly combined with the self-inactivating system described herein. Such a system may, for example, have two nucleic acid component molecules directed to the target region for repair and at least a third nucleic acid component molecule for self-inactivation of the TnpB polypeptide or system.
The nucleic acid component molecule may be a control molecule. For example, it may be engineered to target a nucleic acid sequence encoding the TnpB polypeptide itself, as described in U.S. patent publication No. US2015232881A1, the disclosure of which is hereby incorporated by reference. In one embodiment, the system or composition may have only nucleic acid component molecules engineered to target the nucleic acid sequence encoding the TnpB polypeptide. In addition, the system or composition can have a nucleic acid component molecule engineered to target a nucleic acid sequence encoding a TnpB polypeptide, as well as a nucleic acid sequence encoding a TnpB polypeptide, and optionally a second nucleic acid component molecule, and further optionally a repair template. The second nucleic acid component molecule may be directed to the primary target of the system or composition (such as a therapeutic target, diagnostic target, knockout target, etc., as defined herein). In this way, the system or composition is self-inactivating. This is illustrated in US2015232881A1 (also published as WO2015070083 (A1)) for Cas and can be extrapolated to the TnpB polypeptides disclosed herein.
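As a non-limiting sketch, the selection of candidate "self-inactivating" target windows lying within a chosen distance of the ATG start codon can be expressed in Python as follows. The construct sequence, the 20-nt target length, and the function name are assumptions made for illustration; the sketch deliberately ignores any sequence-motif or strand requirements that a particular TnpB polypeptide may impose on its targets.

```python
def candidate_windows(construct, atg_index, window=100, target_len=20):
    """Return (start, sequence) pairs for every target_len-nt stretch that
    lies entirely within `window` nt upstream or downstream of the ATG."""
    construct = construct.upper()
    if construct[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    lo = max(0, atg_index - window)
    hi = min(len(construct), atg_index + 3 + window)
    return [
        (start, construct[start:start + target_len])
        for start in range(lo, hi - target_len + 1)
    ]


# Hypothetical expression construct: 50 nt of upstream sequence, then the CDS.
construct = "C" * 50 + "ATG" + "GCA" * 60
sites = candidate_windows(construct, atg_index=50, window=30, target_len=20)
print(len(sites), sites[0][0], sites[-1][0])
```

Narrowing `window` to 5, 15, 30, 50 or 100 reproduces the distance ranges recited above; in practice the candidate list would be filtered further by whatever targeting rules apply to the enzyme in use.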
Polynucleotides encoding TnpB systems and vectors
The systems herein may comprise one or more polynucleotides. Polynucleotides may comprise coding sequences for components of the systems herein, such as TnpB polypeptides, nucleic acid components, functional domains, donor polynucleotides, and/or other components in the systems. The present disclosure also provides vectors or vector systems comprising one or more polynucleotides herein. The vectors or vector systems include those described in the delivery section herein.
The terms "polynucleotide", "nucleotide sequence", "nucleic acid" and "oligonucleotide" are used interchangeably. They refer to polymeric forms of nucleotides of any length (deoxyribonucleotides or ribonucleotides or analogs thereof). Polynucleotides may have any three-dimensional structure and may perform any known or unknown function. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, one or more loci defined by linkage analysis, exons, introns, messenger RNAs (mRNA), transfer RNAs, ribosomal RNAs, short interfering RNAs (siRNA), short hairpin RNAs (shRNA), microRNAs (miRNA), ribozymes, cDNAs, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers. The term also encompasses nucleic acid-like structures having synthetic backbones, see e.g., Eckstein, 1991; Baserga et al., 1992; Milligan, 1993; WO 97/03111; WO 96/39154; Mata, 1997; Strauss-Soukup, 1997; and Samstag, 1996. Polynucleotides may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. Modification of the nucleotide structure, if present, may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. The polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. As used herein, the term "wild-type" is a term understood by those skilled in the art and means the typical form of an organism, strain, gene or trait as it occurs in nature, as distinguished from mutant or variant forms. The "wild type" may be the baseline. As used herein, the term "variant" should be understood to mean exhibiting qualities that deviate from the pattern that occurs in nature.
The terms "non-naturally occurring" or "engineered" are used interchangeably and refer to human involvement. When referring to a nucleic acid molecule or polypeptide, the term means that the nucleic acid molecule or polypeptide is at least substantially free of at least one other component with which it is naturally associated in nature and as found in nature. "Complementarity" refers to the ability of a nucleic acid to form hydrogen bonds with another nucleic acid sequence by conventional Watson-Crick base pairing or other non-conventional types. Percent complementarity means the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 are 50%, 60%, 70%, 80%, 90% and 100% complementary). By "fully complementary" is meant that all consecutive residues of a nucleic acid sequence will form hydrogen bonds with the same number of consecutive residues in the second nucleic acid sequence. "Substantially complementary" as used herein means that the degree of complementarity is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides, or that two nucleic acids hybridize under stringent conditions. As used herein, "stringent conditions" of hybridization refer to conditions under which nucleic acids having complementarity to a target sequence hybridize predominantly to the target sequence and do not substantially hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and will vary depending on a number of factors. Generally, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence.
Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes Part I, Second Chapter "Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, N.Y. When referring to polynucleotide sequences, complementary or partially complementary sequences are also contemplated. These are preferably capable of hybridizing to the reference sequence under highly stringent conditions. "Hybridization" refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. Hydrogen bonding may occur through Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-strand complex, a single self-hybridizing strand, or any combination of these. Hybridization reactions may constitute a step in a broader process, such as the initiation of PCR, or enzymatic cleavage of polynucleotides. Sequences capable of hybridizing to a given sequence are referred to as the "complement" of the given sequence. As used herein, the term "genomic locus" or "locus" (plural "loci") is the specific location of a gene or DNA sequence on a chromosome. "Gene" refers to a segment of DNA or RNA that encodes a polypeptide or an RNA strand that functions in an organism, and is thus a molecular unit of heredity in living organisms. For the purposes of the present invention, a gene may be considered to include regions that regulate the production of a gene product, whether or not such regulatory sequences are adjacent to a coding sequence and/or a transcribed sequence.
Thus, genes include, but are not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, origins of replication, matrix attachment sites, and locus control regions. As used herein, "expression of a genomic locus" or "gene expression" is the process by which information from a gene is used to synthesize a functional gene product. The product of gene expression is typically a protein, but in non-protein-coding genes such as rRNA genes or tRNA genes, the product is a functional RNA. All known living organisms, namely eukaryotes (including multicellular organisms), prokaryotes (bacteria and archaea) and viruses, use gene expression processes to produce the functional products they need to survive. As used herein, "expression" of a gene or nucleic acid encompasses not only cellular gene expression, but also transcription and translation of the nucleic acid in a cloning system and in any other context. As used herein, "expression" also refers to the process of transcription of a polynucleotide from a DNA template (such as into mRNA or other RNA transcript) and/or subsequent translation of the transcribed mRNA into a peptide, polypeptide, or protein. Transcripts and encoded polypeptides may be collectively referred to as "gene products". If the polynucleotide is derived from genomic DNA, expression may include splicing of mRNA in eukaryotic cells. The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The term also covers amino acid polymers that have been modified, for example by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation or any other manipulation, such as conjugation to a labeling component.
The term "amino acid" as used herein includes natural and/or unnatural or synthetic amino acids, including glycine and D or L optical isomers, as well as amino acid analogs and peptidomimetics. As used herein, the term "domain" or "protein domain" refers to a portion of a protein sequence that can exist and function independently of the rest of the protein chain. As described in aspects of the invention, sequence identity is related to sequence homology. Homology comparisons can be made by eye or, more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate percent (%) homology between two or more sequences and can also calculate sequence identity of two or more amino acid or nucleic acid sequences.
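For illustration, the percent complementarity defined above (e.g., 8 out of 10 paired residues being 80% complementary) can be computed with a short Python sketch. The sketch counts only conventional Watson-Crick pairs, assumes two sequences of equal length written 5' to 3' and paired in antiparallel orientation, and does not model gaps, bulges or non-conventional pairing; the function name and example sequences are hypothetical.

```python
# Conventional Watson-Crick pairs, allowing DNA (T) or RNA (U) on either strand.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(first, second):
    """Percentage of residues in `first` that pair with `second` when the two
    equal-length strands (both written 5'->3') are aligned antiparallel."""
    if not first or len(first) != len(second):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(first.upper(), reversed(second.upper()))
    paired = sum(pair in WATSON_CRICK for pair in pairs)
    return 100.0 * paired / len(first)


rna = "ACGUACGUAC"            # hypothetical 10-nt RNA
dna_full = "GTACGTACGT"       # fully complementary DNA strand
dna_two_off = "CAACGTACGT"    # same strand with two mismatched residues

print(percent_complementarity(rna, dna_full))
print(percent_complementarity(rna, dna_two_off))
```

A region would qualify as "substantially complementary" in the sense used above when the returned value is at least the chosen threshold (for example 60%) over a region of the recited length.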
In one embodiment, the polynucleotide sequence is recombinant DNA. In other embodiments, the polynucleotide sequence further comprises additional sequences as described elsewhere herein. In one embodiment, the nucleic acid sequence is synthesized in vitro.
The present disclosure provides polynucleotide molecules encoding one or more components of a system or TnpB polypeptide as mentioned in any embodiment herein. In one embodiment, the polynucleotide molecule may comprise additional regulatory sequences. By way of guidance and not limitation, the polynucleotide sequence may be part of the following: expression plasmids, minicircles, lentiviral vectors, retroviral vectors, adenoviral or adeno-associated viral vectors, piggyBac vectors or Tol2 vectors. In one embodiment, the polynucleotide sequence may be a bicistronic expression construct. In other embodiments, the isolated polynucleotide sequence may be incorporated into the genome of a cell. In other embodiments, the isolated polynucleotide sequence may be part of the genome of the cell. In other embodiments, the isolated polynucleotide sequence may be contained in an artificial chromosome. In one embodiment, the 5' and/or 3' ends of the isolated polynucleotide sequence may be modified to improve the stability of the sequence and to actively avoid degradation. In one embodiment, the isolated polynucleotide sequence may be contained in a phage. In other embodiments, the isolated polynucleotide sequence may be comprised in an Agrobacterium species. In one embodiment, the isolated polynucleotide sequence is lyophilized.
Codon optimization
Aspects of the invention relate to polynucleotide molecules encoding one or more components of one or more systems as described in any of the embodiments herein, wherein at least one or more regions of the polynucleotide molecule can be codon optimized for expression in eukaryotic cells. In one embodiment, a polynucleotide molecule encoding one or more components of one or more systems as described in any of the embodiments herein is optimized for expression in a mammalian cell or a plant cell.
In this context, an example of a codon-optimized sequence is a sequence optimized for expression in a eukaryote, such as a human (i.e., optimized for expression in humans), or a sequence optimized for expression in another eukaryote, animal or mammal as discussed herein. In one embodiment, the enzyme coding sequence encoding a DNA/RNA targeting TnpB polypeptide is codon optimized for expression in a particular cell (such as a eukaryotic cell). Eukaryotic cells may be those of, or may be derived from, specific organisms such as plants or mammals, including but not limited to human or non-human eukaryotes or animals or mammals as discussed herein, e.g., mice, rats, rabbits, dogs, livestock or non-human mammals or primates. In one embodiment, methods for modifying the germ line genetic identity of humans, and/or methods for modifying the genetic identity of animals that are likely to cause them suffering without any substantial medical benefit to humans or animals, as well as animals resulting from such methods, may be excluded. Generally, codon optimization refers to the process of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with a codon that is more frequently or most frequently used in the genes of the host cell of interest, while maintaining the native amino acid sequence. Various species exhibit specific preferences for certain codons for a particular amino acid. Codon bias (the difference in codon usage between organisms) generally correlates with the efficiency of translation of messenger RNA (mRNA), which in turn is believed to depend on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons most frequently used in peptide synthesis.
Thus, genes can be tailored based on codon optimization to achieve optimal gene expression in a given organism. Codon usage tables are readily available (e.g., at the "Codon Usage Database" available at www.kazusa.orjp/codon/) and these tables can be adapted in a variety of ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimization of specific sequences for expression in specific host cells are also available, such as Gene Forge (Aptagen; Jacobus, PA). In one embodiment, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more or all codons) in the sequence encoding the TnpB polypeptide correspond to the most frequently used codon for a particular amino acid.
Delivery
The present disclosure also provides delivery systems for introducing the components of the systems and compositions herein into cells, tissues, organs or organisms. The delivery system may include one or more delivery vehicles and/or cargo. Exemplary delivery systems and methods include those described in paragraphs [00117] to [00278] of Feng Zhang et al. (WO2016106236A1) and in Lino CA et al., Delivering CRISPR: a review of the challenges and approaches, Drug Delivery, 2018, volume 25, 1234-1257, at pages 1241-1251 and Table 1, which are incorporated herein by reference in their entirety, and which may be adapted for use with the TnpB proteins disclosed herein.
In one embodiment, the delivery system may be used to introduce components of the system and composition into plant cells. For example, electroporation, microinjection, aerosol beam injection of plant cell protoplasts, gene gun methods, DNA particle bombardment, and/or Agrobacterium-mediated transformation can be used to deliver the components to the plants. Examples of methods and delivery systems for plants include Fu et al., Transgenic Res. 2000 Feb; 9(1):11-9; Klein RM et al., Biotechnology. 1992; 24:384-6; Casas AM et al., Proc Natl Acad Sci U S A. 1993 Dec 1; 90:11212-11216; U.S. Pat. No. 5,563,055; and Davey MR et al., Plant Mol Biol. 1989; 13(3):273-85, which are incorporated herein by reference in their entirety.
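The codon replacement process described above can be sketched, by way of example only, in Python. The standard genetic code is built in; the `preferred` mapping of amino acids to host-preferred codons is an illustrative placeholder supplied by the user and is not taken from any codon usage table referred to herein. The function and variable names are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, preferred):
    """Replace each codon with the host-preferred synonymous codon, if one is
    listed for that amino acid, leaving the amino acid sequence unchanged."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        replacement = preferred.get(amino_acid, codon)
        if CODON_TABLE[replacement] != amino_acid:
            raise ValueError(f"{replacement} does not encode {amino_acid}")
        optimized.append(replacement)
    return "".join(optimized)


# Placeholder preferences for a hypothetical host (illustrative values only).
preferred = {"L": "CTG", "S": "AGC", "R": "AGA"}
native = "ATGTTATCGCGTTAA"
optimized = codon_optimize(native, preferred)
print(optimized, translate(optimized))
```

Choosing a single most frequent codon per amino acid is the simplest reading of the passage; frequency-weighted or context-aware schemes are equally consistent with it and are not shown.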
The exemplary delivery compositions, systems, and methods described herein in relation to compositions or TnpB polypeptides are also applicable to functional domains and other components (e.g., other proteins and polynucleotides in relation to the TnpB polypeptides, such as reverse transcriptase, nucleotide deaminase, retrotransposons, donor polynucleotides, etc.). In preferred embodiments, the composition comprises delivery of the polypeptide via mRNA.
RNA delivery
In one embodiment, the TnpB system may be delivered as mRNA encoding a TnpB polypeptide. Omega RNAs can be delivered together with or separately from the mRNA encoding the TnpB polypeptide. The in vivo translation efficiency of mRNA molecules can be further improved by RNA engineering. To achieve efficient translation, mRNA requires five structural elements, namely a 5' cap, a 3' poly(A) tail, a protein coding sequence, and 5' and 3' untranslated regions (UTRs), one or more of which may be subjected to sequence engineering to improve in vivo translation.
In some embodiments, the isolated mRNA does not self-replicate.
In some embodiments, the isolated mRNA comprises and/or encodes one or more 5' end caps (or cap structures), 3' end caps, 5' untranslated regions, 3' untranslated regions, tailing regions, or any combination thereof.
In some embodiments, the capping region of the isolated mRNA may be 1 to 10, e.g., 2-9, 3-8, 4-7, 1-5, 5-10, or at least 2, or 10 or fewer nucleotides in length. In some embodiments, the cap is absent.
In exemplary embodiments, mRNA can be synthesized in vitro and transferred directly into target cells, and can be further modified. For example, like endogenous mRNA, the mRNA may comprise a 5' end modified with a 7-methylguanosine cap structure and a polyadenylated 3' end, which may facilitate protein production. Pyrimidine residues may also be modified to enhance transgene expression from the delivered mRNA, as such modification may reduce stimulation of the host cell's innate immune system. In exemplary embodiments, the mRNA comprises an anti-reverse cap analog and a 120-nt poly(A) tail, and optionally may comprise cytosine and uridine residues replaced with 5-methylcytosine and pseudouridine. See U.S. patent publication 2019/0151474, which is incorporated herein by reference.
In some embodiments, the 5'-cap structure is cap0, cap1, ARCA, inosine, N1-methyl-guanosine, 2'-fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-guanosine, LNA-guanosine, or 2-azido-guanosine.
In some embodiments, the 5' end cap is 7mG(5')ppp(5')NlmpNp, an m7GpppG cap, or N7-methylguanine. In some embodiments, the 3' end cap is 3'-O-methyl-m7GpppG.
In some embodiments, the 3'-UTR is an α-globin 3'-UTR. In some embodiments, the 5'-UTR comprises a Kozak sequence.
In some embodiments, the tailing sequence may range from absent to 500 nucleotides in length (e.g., at least 60, 70, 80, 90, 120, 140, 160, 180, 200, 250, 300, 350, 400, 450, or 500 nucleotides). In some embodiments, the tailing region is or includes a polyA tail. Where the tailing region is a polyA tail, the length may be determined in units of, or as a function of, polyA binding protein binding. In this embodiment, the length of the polyA tail is sufficient to bind at least 4 monomers of the polyA binding protein. PolyA binding protein monomers bind to a stretch of about 38 nucleotides. It has been observed that polyA tails of about 80 nucleotides and about 160 nucleotides are functional. In some embodiments, the polyA tail is at least 160 nucleotides in length.
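The relationship between polyA tail length and polyA binding protein occupancy described above can be illustrated with a short calculation. The following Python sketch is illustrative only: it assumes a simple model in which monomers tile the tail without overlap, each occupying the approximately 38-nucleotide stretch stated above, and the function names are hypothetical.

```python
# Footprint of one polyA binding protein monomer, taken from the text (about 38 nt).
PABP_FOOTPRINT_NT = 38


def min_tail_length(monomers: int, footprint_nt: int = PABP_FOOTPRINT_NT) -> int:
    """Shortest polyA tail (in nt) that can accommodate `monomers` monomers,
    assuming non-overlapping, gap-free binding (a simplifying assumption)."""
    if monomers < 0:
        raise ValueError("monomers must be non-negative")
    return monomers * footprint_nt


def monomers_bound(tail_nt: int, footprint_nt: int = PABP_FOOTPRINT_NT) -> int:
    """Number of whole monomers that fit on a tail of `tail_nt` nucleotides."""
    if tail_nt < 0:
        raise ValueError("tail_nt must be non-negative")
    return tail_nt // footprint_nt


if __name__ == "__main__":
    print("nt needed for 4 monomers:", min_tail_length(4))
    for tail in (80, 120, 160):
        print(f"{tail}-nt tail fits {monomers_bound(tail)} monomer(s)")
```

Under this simplified model, binding four monomers requires 4 × 38 = 152 nucleotides, which is consistent with the embodiment in which the polyA tail is at least 160 nucleotides in length.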
In some embodiments, the mRNA polynucleotide comprises a stabilizing element. In some embodiments, the stabilizing element is a histone stem loop. In some embodiments, the stabilizing element is a nucleic acid sequence having an increased GC content relative to the wild-type sequence.
In embodiments, it is desirable to reduce the immunogenic sequence motifs of mRNA for delivery. Exemplary techniques are known in the art, see, e.g., international patent publication WO/2020/033720, which discusses exemplary immunogenic sequence motifs for removal, including those that can bind to human TLR8, which is incorporated herein by reference.
The isolated mRNA may be prepared in part or entirely by in vitro transcription. Methods of preparing polynucleotides by in vitro transcription are known in the art and are described in U.S. provisional patent application Ser. Nos. 61/618,862, 61/681,645, 61/737,130, 61/618,866, 61/681,647, 61/737,134, 61/618,868, 61/681,648, 61/737,135, 61/618,873, 61/681,650, 61/737,147, 61/618,878, 61/681,654, 61/737,152, 61/618,885, 61/681,658, 61/737,155, 61/618,896, 61/668,157, 61/681,661, 61/737,160, 61/618,911, 61/681,667, 61/737,168, 61/618,922, 61/737,174, 61/618,935, 61/681,935, 61/681,687, 61/687,184, 61/681,618, 61/618,61/736,61/956, 61/733,945, 61/618,61/681,61,360, and in International publication Nos. WO2013151666, WO2013151668, WO2013151663, WO2013151669, WO2013151670, WO2013151664, WO2013151665, WO2013151736, WO2013151672, WO2013151671, WO2013151667 and WO/2020/205793A1; the contents of each of these documents are incorporated by reference in their entirety. Cell-free production methods for preparing ribonucleic acids, including large-scale synthesis, are described, for example, in U.S. patent 10,954,541, which is incorporated herein by reference in its entirety.
Targeted delivery of mRNA and endosomal escape are often requirements for effective mRNA use. As detailed elsewhere herein, lipids (including lipid nanoparticles, lipid-like materials, polymers) are particularly preferred delivery vehicles.
Cargo
The delivery system may comprise one or more cargos. The cargo may comprise one or more components of the systems and compositions herein. The cargo may comprise one or more of the following: i) plasmids encoding one or more protein components (such as TnpB polypeptides and/or functional domains) in the compositions and systems; ii) plasmids encoding one or more nucleic acid components; iii) mRNA for one or more protein components (such as TnpB polypeptides and/or functional domains) in the compositions and systems; iv) one or more nucleic acid component molecules; v) one or more protein components in the compositions and systems, such as TnpB polypeptides and/or functional domains; vi) any combination thereof. The one or more protein components can include a nucleic acid-guided nuclease (e.g., Cas), a reverse transcriptase, a nucleotide deaminase, a retrotransposon protein, other functional domains, or any combination thereof.
In some examples, the cargo may comprise a plasmid encoding one or more protein components (such as a TnpB polypeptide and/or functional domain) and one or more (e.g., a plurality of) nucleic acid component molecules in the compositions and systems. In some cases, the plasmid may also encode a recombinant template (e.g., for HDR). In one embodiment, the cargo may comprise mRNA encoding one or more protein components and one or more nucleic acid component molecules.
In some examples, the cargo may comprise one or more protein components and one or more nucleic acid component molecules, e.g., in the form of ribonucleoprotein complexes (RNPs). Ribonucleoprotein complexes may be delivered by the methods and systems herein. In some cases, ribonucleoproteins may be delivered by a polypeptide-based shuttle agent. In one example, ribonucleoproteins may be delivered using a synthetic peptide comprising an Endosomal Leakage Domain (ELD) operably linked to a Cell Penetrating Domain (CPD), a histidine-rich domain, and a CPD, e.g., as described in WO 2016161516. RNPs can also be used to deliver compositions and systems to plant cells, e.g., as described in Woo JW et al., Nat Biotechnol. 2015 Nov;33(11):1162-4.
Physical delivery
In one embodiment, the cargo may be introduced into the cells by a physical delivery method. Examples of physical methods include microinjection, electroporation, and hydrodynamic delivery. Both nucleic acids and proteins can be delivered using such methods. For example, one or more protein components may be prepared in vitro, isolated (refolded, purified if desired), and introduced into cells.
Microinjection
Direct microinjection of cargo into cells can achieve high efficiency, for example, greater than 90% or about 100%. In one embodiment, microinjection can be performed using a microscope and needle (e.g., 0.5-5.0 μm in diameter) to pierce the cell membrane and deliver the cargo directly to the target site within the cell. Microinjection can be used for in vitro and ex vivo delivery.
Plasmids comprising coding sequences for one or more protein components and/or nucleic acid components, mRNA and/or nucleic acid component molecules may be microinjected. In some cases, microinjection can be used to i) deliver DNA directly to the nucleus, and/or ii) deliver mRNA (e.g., transcribed in vitro) to the nucleus or cytoplasm. In certain examples, microinjection can be used to deliver nucleic acid components directly to the nucleus and mRNA directly to the cytoplasm, for example, to facilitate translation and shuttling of one or more protein components to the nucleus.
Microinjection can be used to produce genetically modified animals. For example, gene editing cargo may be injected into fertilized eggs to allow for efficient germline modification. This method can produce normal embryos and full-term mouse pups with the desired modifications. Microinjection can also be used to provide transient up- or down-regulation of specific genes within the cell genome, for example using TnpB.
Electroporation
In one embodiment, the cargo and/or delivery vehicle may be delivered by electroporation. Electroporation can use pulsed high voltage current to transiently open nano-sized pores within the cell membrane of cells suspended in a buffer, allowing components with hydrodynamic diameters of tens of nanometers to flow into the cells. In some cases, electroporation can be used for a variety of cell types and effectively transfer cargo into cells. Electroporation may be used for in vitro and ex vivo delivery.
Electroporation may also be used to deliver cargo into the nucleus of mammalian cells by applying specific voltages and reagents, for example, by nucleofection. Such methods include those described in Wu Y et al. (2015), Cell Res 25:67-79; Ye L et al. (2014), Proc Natl Acad Sci USA 111:9591-6; Choi PS, Meyerson M. (2014), Nat Commun 5:3728; and Wang J, Quake SR. (2014), Proc Natl Acad Sci 111:13157-62. Electroporation may also be used to deliver cargo in vivo, for example, by using the method described in Zuckermann M et al. (2015), Nat Commun 6:7391.
Hydrodynamic delivery
Hydrodynamic delivery may also be used to deliver cargo, for example for in vivo delivery. In some examples, hydrodynamic delivery may be performed by rapidly pushing a large volume (8-10% of body weight) of solution containing the gene editing cargo into the bloodstream of a subject (e.g., an animal or human), for example, via the tail vein in the case of mice. Since blood is incompressible, the large volume of liquid may result in an increase in hydrodynamic pressure that temporarily enhances the permeability of endothelial cells and parenchymal cells, allowing cargo that is not normally able to pass through the cell membrane to enter the cells. This method can be used to deliver naked DNA plasmids and proteins. The delivered cargo may be enriched in the liver, kidneys, lungs, muscles and/or heart.
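As a worked illustration of the 8-10% body weight figure, the following Python sketch converts a body weight into an injection volume range. It assumes the solution has a density of about 1 g/mL, so that a percentage of body weight in grams maps directly to milliliters; the function name and the 20 g example animal are hypothetical and not taken from this disclosure.

```python
from typing import Tuple


def hydrodynamic_volume_ml(
    body_weight_g: float,
    fraction_low: float = 0.08,     # 8% of body weight, from the text
    fraction_high: float = 0.10,    # 10% of body weight, from the text
    density_g_per_ml: float = 1.0,  # assumed density of the injected solution
) -> Tuple[float, float]:
    """Return the (low, high) injection volume in mL for a given body weight."""
    if body_weight_g <= 0 or density_g_per_ml <= 0:
        raise ValueError("body weight and density must be positive")
    low = body_weight_g * fraction_low / density_g_per_ml
    high = body_weight_g * fraction_high / density_g_per_ml
    # Round to 0.01 mL, a precision finer than is practical for a syringe.
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    low, high = hydrodynamic_volume_ml(20.0)  # hypothetical 20 g mouse
    print(f"Injection volume: {low}-{high} mL")
```

For the hypothetical 20 g animal, the calculation gives a range of about 1.6 to 2.0 mL.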
Transfection
Cargo (e.g., nucleic acids) can be introduced into cells by transfection methods for introducing nucleic acids into cells. Examples of transfection methods include calcium phosphate-mediated transfection, cationic transfection, liposome transfection, dendrimer transfection, heat shock transfection, magnetofection, lipofection, impalefection, optical transfection, and proprietary agent-enhanced uptake of nucleic acids.
Delivery vehicle
The delivery system may include one or more delivery vehicles. The delivery vehicle may deliver the cargo into a cell, tissue, organ, or organism (e.g., an animal or plant). The cargo may be packaged, carried, or otherwise associated with the delivery vehicle. The delivery vehicle may be selected based on the type of cargo to be delivered and/or whether the delivery is in vitro and/or in vivo. Examples of delivery vehicles include vectors, viruses, non-viral vehicles, and other delivery agents described herein.
The delivery vehicle according to the present invention may have a maximum dimension (e.g., diameter) of less than 100 micrometers (μm). In one embodiment, the delivery vehicle has a maximum dimension of less than 10 μm. In one embodiment, the delivery vehicle may have a maximum dimension of less than 2000 nanometers (nm). In one embodiment, the delivery vehicle may have a maximum dimension of less than 1000 nm. In one embodiment, the delivery vehicle may have a maximum dimension (e.g., diameter) of less than 900 nm, less than 800 nm, less than 700 nm, less than 600 nm, less than 500 nm, less than 400 nm, less than 300 nm, less than 200 nm, less than 150 nm, less than 100 nm, or less than 50 nm. In one embodiment, the delivery vehicle may have a maximum dimension ranging between 25 nm and 200 nm.
In one embodiment, the delivery vehicle may be or comprise particles. For example, the delivery vehicle may be or comprise nanoparticles (e.g., particles having a largest dimension (e.g., diameter) of no greater than 1000 nm). The particles may be provided in different forms, for example as solid particles (e.g., metals such as silver, gold, iron, or titanium; non-metals; lipid-based solids; polymers), suspensions of particles, or combinations thereof. Metallic, dielectric, and semiconductor particles, as well as hybrid structures (e.g., core-shell particles) may be prepared. Nanoparticles may also be used to deliver compositions and systems to plant cells, for example, as described in International patent publication No. WO 2008042156, U.S. published application No. US20130185823, and International patent publication No. WO 2015/089419.
Vectors
The system, composition, and/or delivery system may comprise one or more vectors. The present disclosure also includes a vector system. The vector system may comprise one or more vectors. In one embodiment, a vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include single-stranded, double-stranded or partially double-stranded nucleic acid molecules; nucleic acid molecules comprising one or more free ends or no free ends (e.g., circular); nucleic acid molecules comprising DNA, RNA, or both; and other variants of polynucleotides known in the art. The vector may be a plasmid, for example, a circular double-stranded DNA loop into which additional DNA segments may be inserted, such as by standard molecular cloning techniques. Some vectors may be capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Some vectors (e.g., non-episomal mammalian vectors) integrate into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. In certain examples, the vector may be an expression vector, e.g., capable of directing expression of a gene to which it is operably linked. In some cases, the expression vector may be used for expression in eukaryotic cells. Common expression vectors used in recombinant DNA technology are often in the form of plasmids.
Examples of vectors include pGEX, pMAL, pRIT, Escherichia coli expression vectors (e.g., pTrc, pET 11d), yeast expression vectors (e.g., pYepSec1, pMFa, pJRY88, pYES2 and picZ), baculovirus vectors (e.g., for expression in insect cells such as SF9 cells) (e.g., pAc series and pVL series), and mammalian expression vectors (e.g., pCDM8 and pMT2PC).
The vector may comprise i) one or more protein component coding sequences, and/or ii) single or at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 32, at least 48, at least 50 nucleic acid component molecule coding sequences. In a single vector, a promoter for each RNA coding sequence may be present. Alternatively or additionally, in a single vector, there may be promoters that control (e.g., drive transcription and/or expression) multiple RNA coding sequences.
Furthermore, the composition or system may be delivered via a vector, e.g., a separate vector or the same vector encoding the complex. When provided by separate vectors, RNAs targeting expression of the TnpB polypeptide may be administered sequentially or simultaneously. When administered sequentially, RNA targeting expression of the TnpB polypeptide is delivered after the RNA intended for, e.g., gene editing or genetic engineering. The intervening time period may be several minutes (e.g., 5 minutes, 10 minutes, 20 minutes, 30 minutes, 45 minutes, 60 minutes), several hours (e.g., 2 hours, 4 hours, 6 hours, 8 hours, 12 hours, 24 hours), several days (e.g., 2 days, 3 days, 4 days, 7 days), several weeks (e.g., 2 weeks, 3 weeks, 4 weeks), several months (e.g., 2 months, 4 months, 8 months, 12 months), or several years (e.g., 2 years, 3 years, 4 years). In this way, the TnpB polypeptide first associates with a first nucleic acid component molecule capable of hybridizing to a first target (such as one or more genomic loci of interest) and performs the desired function of the system (e.g., genetic engineering); subsequently, the TnpB polypeptide may associate with a second nucleic acid component molecule capable of hybridizing to a sequence comprising at least a portion of the sequence encoding the TnpB polypeptide. When the nucleic acid component molecule targets the sequence encoding the TnpB polypeptide, the enzyme becomes blocked and the system becomes self-inactivating. In the same manner, RNA targeting expression of the TnpB polypeptide applied via, for example, liposomes, lipofection, particles, or microbubbles as explained herein may be administered sequentially or simultaneously.
Similarly, self-inactivation can be used to inactivate one or more nucleic acid component molecules used to target one or more targets.
Regulatory elements
The vector may comprise one or more regulatory elements. The regulatory element may be operably linked to the coding sequence of the TnpB polypeptide, the accessory protein, the nucleic acid component scaffold and/or the nucleic acid component molecule, or a combination thereof. The term "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory element in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system, or in a host cell when the vector is introduced into the host cell). In certain examples, the vector may comprise: a first regulatory element operably linked to a nucleotide sequence encoding a TnpB polypeptide, and a second regulatory element operably linked to a nucleotide sequence encoding a nucleic acid component molecule.
Examples of regulatory elements include promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells, as well as those that direct expression of the nucleotide sequence in only certain host cells (e.g., tissue-specific regulatory sequences). Tissue-specific promoters may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, a particular organ (e.g., liver, pancreas), or a particular cell type (e.g., lymphocyte). Regulatory elements may also direct expression in a time-dependent manner (such as in a cell cycle-dependent or developmental stage-dependent manner), which may or may not also be tissue or cell type specific.
Examples of promoters include one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous Sarcoma Virus (RSV) LTR promoter (optionally with an RSV enhancer), the Cytomegalovirus (CMV) promoter (optionally with a CMV enhancer), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter.
Viral vectors
The cargo may be delivered by a virus. In one embodiment, a viral vector is used. Viral vectors may comprise viral-derived DNA or RNA sequences for packaging into viruses (e.g., retroviruses, replication-defective retroviruses, adenoviruses, replication-defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Viruses and viral vectors may be used for in vitro, ex vivo, and/or in vivo delivery.
Adeno-associated virus (AAV)
The systems and compositions herein may be delivered by adeno-associated virus (AAV). AAV vectors may be used for such delivery. AAV belongs to the Dependovirus genus of the Parvoviridae family and is a single-stranded DNA virus. In one embodiment, AAV may provide a persistent source of the delivered DNA, because the genomic material delivered by AAV may be present in the cell indefinitely, for example as exogenous DNA or, with some modification, integrated directly into host DNA. In one embodiment, the AAV does not cause or is not associated with any human disease. The virus itself is able to effectively infect cells with little or no innate or adaptive immune response or associated toxicity.
Examples of AAV useful herein include AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAV-6, AAV-8, and AAV-9. The type of AAV may be selected according to the cell to be targeted; for example, AAV serotypes 1, 2, 5 or hybrid capsid AAV1, AAV2, AAV5, or any combination thereof, can be selected to target brain or neuronal cells; and AAV4 may be selected to target heart tissue. AAV8 may be used for delivery to the liver. AAV-2 based vectors were originally proposed for delivering CFTR to the CF airways, but other serotypes (such as AAV-1, AAV-5, AAV-6, and AAV-9) exhibit improved gene transfer efficiency in multiple lung epithelial models. Examples of AAV-targeted cell types are described in Grimm, D. et al., J. Virol. 82:5887-5911 (2008), and are shown in Table 3 below:
Table 3.
AAV particles can be produced in HEK 293T cells. Once particles with a specific tropism are produced, they are used to infect target cell lines, just like natural viral particles. This may allow for the persistence of the component in the infected cell type and makes this delivery mode particularly suitable where long term expression is desired. Examples of dosages and formulations of AAV that may be used include those described in U.S. patent nos. 8,454,972 and 8,404,658.
Various strategies may be used to deliver the systems and compositions herein with AAV. In some examples, the coding sequences for the TnpB polypeptide and nucleic acid components can be packaged directly onto a DNA plasmid vector and delivered via an AAV particle. In some examples, AAV can be used to deliver a nucleic acid component into cells that have been previously engineered to express a TnpB polypeptide. In some examples, the coding sequences for the TnpB polypeptide and the nucleic acid component can be made into two separate AAV particles for co-transfecting target cells. In some examples, the markers, tags, and other sequences can be packaged in the same AAV particle as the coding sequences of the TnpB polypeptide and/or nucleic acid components.
Lentivirus
The systems and compositions herein may be delivered by lentiviruses. Lentiviral vectors may be used for such delivery. Lentiviruses are complex retroviruses that have the ability to infect and express their genes in mitotic and postmitotic cells.
Examples of lentiviruses include Human Immunodeficiency Virus (HIV), which can utilize the envelope glycoproteins of other viruses to target a broad range of cell types, and minimal non-primate lentiviral vectors based on Equine Infectious Anemia Virus (EIAV), which are useful for ocular therapy. In one embodiment, self-inactivating lentiviral vectors with an siRNA targeting a common exon shared by HIV tat/rev, a nucleolar-localizing TAR decoy, and an anti-CCR5-specific hammerhead ribozyme (see, e.g., DiGiusto et al. (2010) Sci Transl Med 2:36ra43) are useful and/or adaptable to the TnpB system herein.
Lentiviruses may be pseudotyped with other viral proteins, such as the G protein of vesicular stomatitis virus. In this process, the cell tropism of the lentivirus can be changed to broad or narrow as desired. In some cases, to increase safety, second and third generation lentiviral systems may split essential genes onto three plasmids, which may reduce the likelihood of accidental reconstitution of live intracellular viral particles.
In some examples, with the ability to integrate, lentiviruses can be used to create libraries of cells comprising various genetic modifications, for example, for screening and/or studying genes and signaling pathways.
Adenovirus
The systems and compositions herein may be delivered by adenovirus. Adenovirus vectors may be used for such delivery. Adenoviruses include non-enveloped viruses having an icosahedral nucleocapsid containing a double-stranded DNA genome. Adenovirus can infect dividing cells and non-dividing cells. In one embodiment, the adenovirus is not integrated into the genome of the host cell, which can be used to limit off-target effects of the system in gene editing applications.
Viral vector for delivery to plants
The systems and compositions can be delivered to plant cells using viral vehicles. In particular embodiments, the compositions and systems may be introduced into plant cells using plant viral vectors (e.g., as described in Scholthof et al. 1996, Annu Rev Phytopathol. 1996;34:299-323). Such viral vectors may be vectors derived from DNA viruses such as geminiviruses (e.g., cabbage leaf curl virus, bean yellow dwarf virus, wheat dwarf virus, tomato leaf curl virus, maize streak virus, tobacco leaf curl virus, or tomato golden mosaic virus) or nanoviruses (e.g., Faba bean necrotic yellow virus). The viral vector may be a vector derived from an RNA virus such as a tobravirus (e.g., tobacco rattle virus, tobacco mosaic virus), a potexvirus (e.g., potato virus X), or a hordeivirus (e.g., barley stripe mosaic virus). The replicating genome of the plant virus may be a non-integrating vector.
Non-viral vehicle
The delivery vehicle may comprise a non-viral vehicle. Generally, methods and vehicles capable of delivering nucleic acids and/or proteins can be used to deliver the systems and compositions herein. Examples of non-viral vehicles include lipid nanoparticles, cell-penetrating peptides (CPPs), DNA nanoclews, gold nanoparticles, streptolysin O, multifunctional envelope-type nanodevices (MENDs), lipid-coated mesoporous silica particles, and other inorganic nanoparticles. Targeted delivery of RNA and endosomal escape are often requirements for effective RNA use. As described in further detail below, lipids (including lipid nanoparticles and lipid-like materials) and polymers are particularly preferred delivery vehicles for RNA.
Nanoparticles
The delivery vehicle for use with the present compositions may comprise nanoparticles, including lipid nanoparticles. Other particle systems may be used, including calcium phosphate-silicate nanoparticles, calcium phosphate nanoparticles, silica nanoparticles, and polymer-based materials such as poly(amido-amine), poly-beta amino esters (PBAEs), and polyethylenimine (PEI). See, e.g., Trepotec et al., Mol. Therapy 27:4, April 2019. In embodiments, exemplary nanoparticles comprise a modified dendrimer comprising a core, one or more homogeneous or heterogeneous intermediate layers for encapsulation and delivery of nucleic acids (e.g., mRNA), and a terminal layer. The modified dendrimers may preferably comprise one or more polyester dendrimers, for example, comprising a core branching into one or more generations of polyester units, wherein the polyester is surface-linked to hydrophobic units (e.g., fatty acid derivatives) via amine linkers (e.g., polyamines), including polyamidoamine (PAMAM) dendrimers, polypropyleneimine (PPI) dendrimers, or polyethyleneimine (PEI) dendrimers. The plurality of intermediate layers may comprise at least one layer modified for endosomal escape and a polyfluorocarbon. Exemplary molecules and methods of preparation are found in WO/2020/132196 and WO 2021/207020, which are incorporated herein by reference. Formulas IB, II and III of international patent publication WO 2021/207020 are expressly incorporated herein by reference as exemplary nanoparticle delivery vehicles for delivering nucleic acids.
Lipid particles
The delivery vehicle may comprise lipid particles, such as lipid nanoparticles (LNPs) and liposomes. Lipo-aminoglycosides and derivatives thereof are known in the art for delivery of RNA, including dioleylamine-A-succinyl-neomycin ("DOSN"), dioleylamine-A-succinyl-paromomycin ("DOSP"), NeoChol, NeoSucChol, ParomoChol, ParomoCapSucDOLA, ParomoLysSucDOLA, NeoDiSucDODA, NeodiLysSucDOLA, and [ParomoLys]2-Glu-Lys-[SucDOLA]2, as described in detail in international patent publication WO 2008/040792, which is incorporated herein by reference.
Lipid Nanoparticles (LNP)
LNP can encapsulate nucleic acids within cationic lipid particles (e.g., liposomes) and can be delivered to cells relatively easily. In some examples, the lipid nanoparticle does not contain any viral components, which helps minimize safety and immunogenicity issues. Lipid particles can be used for in vitro, ex vivo, and in vivo delivery. Lipid particles can be used in cell populations of various sizes.
In some examples, LNP can be used to deliver DNA molecules (e.g., those comprising coding sequences for TnpB polypeptides and/or nucleic acid components) and/or RNA molecules (e.g., mRNA for TnpB polypeptides, nucleic acid component molecules). In certain instances, LNP can be used to deliver RNP complexes of TnpB polypeptide/nucleic acid components.
The cationic lipids form complexes with the mRNA (lipid complexes), which are then endocytosed by the cell. In an exemplary embodiment, the LNP comprises a cationic lipid, a helper lipid, cholesterol, and polyethylene glycol (PEG). In an exemplary embodiment, the LNP may comprise, on the one hand, a cationic paromomycin-based lipid having an amide or phosphoramide linker and, on the other hand, two imidazole-based neutral lipids that likewise have an amide or phosphoramide linker. In embodiments, an assembly may be obtained when the cationic lipid and the helper lipid comprise different linkers. See Colombani et al., Self-assembling complexes between binary mixtures of lipids with different linkers and nucleic acids promote universal mRNA, DNA and siRNA delivery. J. Control Release (2017) doi 10.1016/j.jconrel.2017.01.041.
In embodiments, nanoparticles may be developed in accordance with selective organ targeting (SORT), wherein various classes of lipid nanoparticles are systematically engineered to specifically edit extrahepatic tissue via the addition of supplemental SORT molecules. See, e.g., Cheng et al., Nature Nanotechnology 15, 313-320 (2020). The method has been demonstrated with dendrimer lipid nanoparticles (DLNPs), stabilized nucleic acid lipid particles (SNALPs) and lipid-like nanoparticles (LLNPs), including the use of ionizable cationic lipids (5A2-SC8, C12-200 or DLin-MC3-DMA), zwitterionic lipids (DOPE or DSPC), cholesterol, DMG-PEG and permanently cationic lipids (DOTAP, DDAB or EPC). See Wei et al., Systemic nanoparticle delivery of CRISPR-Cas9 ribonucleoproteins for effective tissue specific genome editing, Nature Comm (2020) 11:3232, doi:10.1038/s41467-020-17029-3, incorporated herein by reference.
In one embodiment, the composition comprises a plurality of lipid nanoparticles comprising a cationic lipid, a neutral lipid, cholesterol, a PEG lipid, or a combination thereof, wherein the plurality of lipid nanoparticles optionally have an average particle size between 80nm and 160 nm; and wherein the lipid nanoparticle comprises one or more polynucleotides encoding at least one polypeptide of the invention (e.g., a TnpB polypeptide).
The components in the LNP may comprise the cationic lipids 1, 2-dioleoyl-3-dimethylammonium-propane (DLinDAP), 1, 2-dioleoyloxy-3-N, N-dimethylaminopropane (DLinDMA), 1, 2-dioleoyloxy-ketone-N, N-dimethyl-3-aminopropane (DLinK-DMA), 1, 2-dioleoyl-4- (2-dimethylaminoethyl) - [1,3] -dioxolane (DLinKC 2-DMA), (3-o- [2"- (methoxypolyethylene glycol 2000) succinyl ] -1, 2-dimyristoyl-sn-ethylene glycol (PEG-S-DMG), R-3- [ (m-methoxy-poly (ethylene glycol) 2000) carbamoyl ] -1, 2-dimyristoyloxypropyl-3-amine (PEG-C-DOMG), and any combination thereof, the preparation and packaging of the LNP may be adjusted according to Rosen et al, molecular Therapy, volume 19, page 2200, 2011-2200.
Additional cationic lipids may include di-O-octadecenyl-3-trimethylammonium-propane (DOTMA), 1, 2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1, 2-dioleoyl-3-trimethylammonium-propane (DOTAP), biodegradable analogs of DOTMA, alone or in combination with other materials such as cholesterol. Such cationic lipid LNPs may be delivered, for example, in the form of nanoemulsions, and may be further incorporated into apatite carbonate (increasing the interaction between particles and cell membrane), or conjugated with fibronectin, thereby accelerating endocytosis. Other quaternary ammonium lipids are also contemplated for delivery, such as Dimethyl Dioctadecyl Ammonium Bromide (DDAB) which is also 2, 3-dioleoyloxy-N- [2- (spermatid-o) ethyl ] -N, N-dimethyl-1-trifluoroacetate propanammonium (DOSPA).
Lipid nanoparticles for mRNA delivery may include 2-(((((3S,8S,9S,10R,13R,14S,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl)oxy)carbonyl)amino)-N,N-bis(2-hydroxyethyl)-N-methylethan-1-aminium bromide (BHEM-Cholesterol). See Zhang, Y. et al., In situ repurposing of dendritic cells with CRISPR/Cas9-based nanomedicine to induce transplant tolerance, Biomaterials 217, 119302 (2019), incorporated herein by reference.
In some embodiments, the lipid nanoparticle comprises a molar ratio of 20-60% ionizable cationic lipid, 5-25% non-cationic lipid, 25-55% sterol, and 0.5-15% PEG-modified lipid.
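By way of illustration only, the molar ranges above can be checked programmatically. The following is a minimal sketch, not part of the disclosed methods; the component keys, the function name `check_molar_composition`, the example values, and the assumption that the four listed components account for 100 mol% of the lipid are all hypothetical.

```python
# Hypothetical helper: checks a lipid nanoparticle formulation against the
# molar-percentage ranges stated in the text. All names are illustrative.
MOLAR_RANGES = {
    "ionizable_cationic_lipid": (20.0, 60.0),
    "non_cationic_lipid": (5.0, 25.0),
    "sterol": (25.0, 55.0),
    "peg_modified_lipid": (0.5, 15.0),
}


def check_molar_composition(mol_percent, tolerance=0.01):
    """Return a list of problems; an empty list means the formulation fits."""
    problems = []
    for name, (low, high) in MOLAR_RANGES.items():
        value = mol_percent.get(name)
        if value is None:
            problems.append(f"missing component: {name}")
        elif not (low <= value <= high):
            problems.append(f"{name} = {value} mol% is outside {low}-{high} mol%")
    # Assumption: the four components together make up all of the lipid.
    total = sum(mol_percent.get(name, 0.0) for name in MOLAR_RANGES)
    if abs(total - 100.0) > tolerance:
        problems.append(f"components sum to {total} mol%, not 100 mol%")
    return problems


# Example formulation (values chosen for illustration, not taken from the text).
example = {
    "ionizable_cationic_lipid": 50.0,
    "non_cationic_lipid": 10.0,
    "sterol": 38.5,
    "peg_modified_lipid": 1.5,
}
print(check_molar_composition(example))
```

A formulation that falls outside any range yields one message per violated range, which makes it straightforward to see which component needs adjusting.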
In some embodiments, the lipid nanoparticle is any nanoparticle described in U.S. patent No. 10,442,756 and/or comprises any compound described in U.S. patent No. 10,442,756, including, but not limited to, a nanoparticle according to any of formulas (IA) or (II) described therein.
In some embodiments, the lipid nanoparticle is any nanoparticle described in U.S. patent No. 10,266,485 and/or comprises any compound described in U.S. patent No. 10,266,485, including, but not limited to, a nanoparticle according to formula (II) described therein.
In some embodiments, the lipid nanoparticle is a nanoparticle described in U.S. patent No. 9,868,692 and/or comprises a compound described, for example, in U.S. patent No. 9,868,692, including, but not limited to, nanoparticles according to formulas (I), (IA), (II), (IIa), (IIb), (IIc), (IId), (IIe).
In some embodiments, the lipid nanoparticle comprises a compound of formula (I) and/or formula (II) as described in U.S. patent No. 10,272,150.
In some embodiments, the mRNA is formulated in a lipid nanoparticle comprising a compound selected from compounds 3, 18, 20, 25, 26, 29, 30, 60, 108-112, and 122 of U.S. patent No. 10,272,150.
In some embodiments, at least 80% (e.g., 85%, 90%, 95%, 98%, 99%) of the uracils in the open reading frame have chemical modifications, optionally wherein the vaccine is formulated in a lipid nanoparticle (e.g., the lipid nanoparticle comprises a cationic lipid, a PEG-modified lipid, a sterol, and a non-cationic lipid).
In some embodiments, the lipid nanoparticle has an average diameter of 50-200 nm.
In some embodiments, the lipid nanoparticle comprises compound 3, 18, 20, 25, 26, 29, 30, 60, 108-112, or 122 as listed in U.S. patent No. 10,272,150.
In some embodiments, the lipid nanoparticle has a polydispersity value of less than 0.4 (e.g., less than 0.3, 0.2, or 0.1).
In some embodiments, the plurality of lipid nanoparticles, such as when included in a formulation, have an average PDI between 0.02 and 0.2. In some embodiments, the plurality of lipid nanoparticles, such as when included in a formulation comprising one or more polynucleotides, have an average lipid to polynucleotide ratio (weight/weight) of between 10 and 20.
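The size, polydispersity, and weight-ratio ranges recited above lend themselves to a simple screening check. The sketch below is illustrative only; the function names, the units, and the choice to treat the range endpoints as inclusive are assumptions that do not come from the text.

```python
# Hypothetical quality check for a lipid nanoparticle preparation.
def lipid_to_polynucleotide_ratio(lipid_mass_ug, polynucleotide_mass_ug):
    """Weight/weight ratio of total lipid to polynucleotide."""
    if polynucleotide_mass_ug <= 0:
        raise ValueError("polynucleotide mass must be positive")
    return lipid_mass_ug / polynucleotide_mass_ug


def meets_formulation_ranges(mean_diameter_nm, pdi, lipid_mass_ug,
                             polynucleotide_mass_ug):
    """Report, per criterion, whether a preparation falls in the stated range."""
    ratio = lipid_to_polynucleotide_ratio(lipid_mass_ug, polynucleotide_mass_ug)
    return {
        "diameter_50_to_200_nm": 50 <= mean_diameter_nm <= 200,
        "pdi_0.02_to_0.2": 0.02 <= pdi <= 0.2,
        "weight_ratio_10_to_20": 10 <= ratio <= 20,
    }


# Illustrative values only: 100 nm particles, PDI 0.1, 150 ug lipid per 10 ug RNA.
print(meets_formulation_ranges(100, 0.1, 150, 10))
```

Returning one flag per criterion, rather than a single pass/fail value, keeps the alternative embodiments (for example, the broader polydispersity limit of less than 0.4) easy to substitute.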
In some embodiments, the lipid nanoparticle has a net neutral charge at neutral pH.
Liposome
In one embodiment, the lipid particle may be a liposome. Liposomes are spherical vesicle structures composed of a unilamellar or multilamellar lipid bilayer surrounding an inner aqueous compartment and a relatively impermeable outer lipophilic phospholipid bilayer. In one embodiment, the liposomes are biocompatible and nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood-brain barrier (BBB).
Liposomes can be made from several different types of lipids (e.g., phospholipids). Liposomes can comprise natural phospholipids and lipids, such as 1, 2-distearoyl-sn-glycero-3-phosphatidylcholine (DSPC), sphingomyelin, lecithin, monosialoganglioside, or any combination thereof.
Several other additives may be added to the liposomes in order to alter their structure and properties. For example, the liposomes may also contain cholesterol, sphingomyelin, and/or 1, 2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), for example, to increase stability and/or prevent leakage of cargo within the liposome.
In one embodiment, the liposome comprises a transport polymer, which may optionally be branched, comprising at least 10 amino acids and a histidine to non-histidine amino acid ratio of greater than 1.5 and less than 10. The branched transport polymer may comprise one or more backbones, one or more terminal branches, and optionally one or more non-terminal branches. See U.S. Pat. No. 7,070,807, which is incorporated herein by reference in its entirety. In one embodiment, the transport polymer is a histidine-lysine copolymer (HKP) for packaging and delivering mRNA and other cargo. See U.S. Pat. Nos. 7,163,695 and 7,772,201, which are incorporated herein by reference in their entirety.
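The length and histidine-ratio criteria for the transport polymer can be expressed as a short check. This is a minimal sketch under stated assumptions: the polymer is represented as a single one-letter amino acid string (branches simply concatenated), and the function names and example sequences are hypothetical rather than taken from the cited patents.

```python
# Hypothetical check of the transport polymer criteria described above.
def histidine_ratio(sequence):
    """Ratio of histidine residues to all non-histidine residues."""
    seq = sequence.upper()
    his = seq.count("H")
    non_his = len(seq) - his
    if non_his == 0:
        return float("inf")  # a pure polyhistidine has no finite ratio
    return his / non_his


def fits_transport_polymer_criteria(sequence):
    """At least 10 residues and a His:non-His ratio above 1.5 and below 10."""
    return len(sequence) >= 10 and 1.5 < histidine_ratio(sequence) < 10


# Illustrative histidine-lysine sequences (not from the cited patents).
print(fits_transport_polymer_criteria("HHKHHKHHKHH"))  # 8 His, 3 Lys
print(fits_transport_polymer_criteria("KHKHKHKHKH"))   # 5 His, 5 Lys
```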
Stabilized Nucleic Acid Lipid Particles (SNALP)
In one embodiment, the lipid particle may be a stabilized nucleic acid lipid particle (SNALP). A SNALP may comprise an ionizable lipid (DLinDMA) (e.g., cationic at low pH), a neutral helper lipid, cholesterol, a diffusible polyethylene glycol (PEG)-lipid, or any combination thereof. In some examples, a SNALP may comprise synthetic cholesterol, dipalmitoylphosphatidylcholine, 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and the cationic lipid 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane. In some examples, a SNALP may comprise synthetic cholesterol, 1,2-distearoyl-sn-glycero-3-phosphocholine, PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA).
Other lipids
The lipid particles may also comprise one or more other types of lipids, for example cationic lipids such as the amino lipid 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA), DLin-KC2-DMA4, and C12-200, and the co-lipids distearoylphosphatidylcholine, cholesterol, and PEG-DMG.
Lipid/polymeric complexes
In one embodiment, the delivery vehicle comprises a lipid complex (lipoplex) and/or a polymeric complex (polyplex). The lipid complex can bind to negatively charged cell membranes and induce endocytosis into the cell. Examples of lipid complexes may be complexes comprising lipids and non-lipid components. Examples of lipid complexes and polymeric complexes include FuGENE-6 reagent (a non-liposomal solution containing lipids and other components), zwitterionic amino lipids (ZALs), Ca2+ (e.g., forming DNA/Ca2+ microcomplexes), polyethylenimine (PEI) (e.g., branched PEI), and poly(L-lysine) (PLL). A core-shell structured lipid complex delivery platform can also be used and is one preferred vehicle for delivering mRNA, particularly because core-shell structured particles can protect the mRNA and release it gradually as the polymer degrades. See U.S. patent publication 2018/0360756, which is incorporated herein by reference.
Cell penetrating peptides
In one embodiment, the delivery vehicle comprises a Cell Penetrating Peptide (CPP). CPPs are short peptides that promote cellular uptake of various molecular cargo (e.g., from nanometer-sized particles to small chemical molecules and large DNA fragments).
CPPs can have different sizes, amino acid sequences, and charges. In some examples, the CPP can translocate the plasma membrane and facilitate delivery of various molecular cargo to the cytoplasm or organelle. CPPs can be introduced into cells via different mechanisms, for example, direct membrane penetration, endocytosis-mediated entry, and translocation through formation of temporary structures.
CPPs may have an amino acid composition containing a high relative abundance of positively charged amino acids such as lysine or arginine, or have a sequence containing alternating patterns of polar/charged amino acids and non-polar, hydrophobic amino acids. These two types of structures are referred to as polycationic or amphipathic, respectively. A third class of CPPs are hydrophobic peptides that contain only non-polar residues, have a low net charge, or have hydrophobic amino acid groups that are critical for cellular uptake. Another type of CPP is the trans-activating transcriptional activator (Tat) from human immunodeficiency virus 1 (HIV-1). Examples of CPPs include Penetratin, Tat (48-60), Transportan, (R-AhX-R4) (Ahx refers to aminohexanoyl), the Kaposi fibroblast growth factor (FGF) signal peptide sequence, the integrin β3 signal peptide sequence, polyarginine peptide sequences, guanine-rich molecular transporters, and sweet arrow peptide. Examples of CPPs and related applications also include those described in U.S. patent No. 8,372,951.
CPPs can be easily used for both in vitro and ex vivo work and generally require extensive optimization for each cargo and cell type. In some examples, the CPP can be directly covalently attached to a TnpB polypeptide, which is then complexed with a nucleic acid component and delivered to a cell. In some examples, the CPP-TnpB and CPP-nucleic acid components can be delivered to multiple cells separately. CPPs may also be used to deliver RNPs.
CPPs can be used to deliver compositions and systems to plants. In some examples, CPPs can be used to deliver components to plant protoplasts that are then regenerated into plant cells and further regenerated into plants.
DNA nanoclews
In one embodiment, the delivery vehicle comprises a DNA nanoclew. A DNA nanoclew refers to a sphere-like structure of DNA (e.g., having the shape of a ball of yarn). The nanoclew can be synthesized by rolling circle amplification using palindromic sequences that facilitate self-assembly of the structure. The sphere may then be loaded with a payload. Examples of DNA nanoclews are described in Sun W et al., J Am Chem Soc. 2014 Oct 22;136:14722-5; and Sun W et al., Angew Chem Int Ed Engl. 2015 Oct 5;54(41):12029-33. The DNA nanoclew may have a palindromic sequence that is partially complementary to the nucleic acid component molecule within the TnpB polypeptide:nucleic acid component ribonucleoprotein complex. The DNA nanoclew may be coated, for example with PEI, to induce endosomal escape.
Gold nanoparticles
In one embodiment, the delivery vehicle comprises gold nanoparticles (also known as AuNPs or colloidal gold). Gold nanoparticles can form complexes with cargo, such as a TnpB polypeptide:nucleic acid component RNP. Gold nanoparticles may be coated, for example, in a silicate and an endosomal disruptive polymer, PAsp(DET). Examples of gold nanoparticles include AuraSense Therapeutics' Spherical Nucleic Acid (SNA™) constructs and those described in Mout R et al. (2017), ACS Nano 11:2452-8; and Lee K et al. (2017), Nat Biomed Eng 1:889-901.
iTOP
In one embodiment, the delivery vehicle comprises iTOP. iTOP refers to a combination of small molecules that drives efficient intracellular delivery of native proteins, independent of any transduction peptide. iTOP (induced transduction by osmocytosis and propanebetaine) uses NaCl-mediated hyperosmolality together with a transduction compound (propanebetaine) to trigger macropinocytotic uptake of extracellular macromolecules by cells. Examples of iTOP methods and reagents include those described in D'Astolfo DS, Pagliero RJ, Pras A et al. (2015), Cell 161:674-690.
Polymer-based particles
In one embodiment, the delivery vehicle may include polymer-based particles (e.g., nanoparticles). In one embodiment, the polymer-based particles may mimic the membrane fusion mechanism of a virus. The polymer-based particles may be a synthetic copy of the influenza virus machinery and form transfection complexes with various types of nucleic acids (siRNA, miRNA, plasmid DNA or nucleic acid components, mRNA) that cells take up via the endocytic pathway, a process that involves the formation of an acidic compartment. The low pH in late endosomes acts as a chemical switch that renders the particle surface hydrophobic and facilitates membrane crossing. Once in the cytosol, the particles release their payload for cellular action. This active endosomal escape technique is safe and maximizes transfection efficiency because it uses a natural uptake pathway. In one embodiment, the polymer-based particles may comprise alkylated and carboxyalkylated branched polyethylenimine. In some examples, the polymer-based particle is a VIROMER, e.g., VIROMER RNAi, VIROMER RED, VIROMER mRNA. Exemplary methods of delivering the systems and compositions herein include those described in Bawage SS et al., Synthetic mRNA expressed Cas13a mitigates RNA virus infections, biorxiv.org/content/10.1101/370460v1.full, doi:10.1101/370460; Viromer RED, a powerful tool for transfection of keratinocytes, doi:10.13140/RG.2.2.16993.61281; and Viromer Transfection - Factbook 2018: technology, product overview, users' data, doi:10.13140/RG.2.2.23912.16642.
Streptolysin O (SLO)
The delivery vehicle may be streptolysin O (SLO). SLO is a toxin produced by Group A streptococci that acts by forming pores in mammalian cell membranes. SLO can act in a reversible manner, which allows the delivery of proteins (e.g., up to 100 kDa) to the cytosol of the cell without compromising overall viability. Examples of SLO include those described in Sierig G et al. (2003), Infect Immun 71:446-55; Walev I et al. (2001), Proc Natl Acad Sci U S A 98:3185-90; and Teng KW et al. (2017), eLife 6:e25460.
Multifunctional envelope-type nanodevice (MEND)
The delivery vehicle may include a multifunctional envelope-type nanodevice (MEND). The MEND may comprise condensed plasmid DNA, a PLL core, and a lipid film shell. The MEND may further comprise a cell penetrating peptide (e.g., stearyl octaarginine). The cell penetrating peptide may be located in the lipid shell. The lipid envelope may be modified with one or more functional components, for example, one or more of the following: polyethylene glycol (e.g., to increase vascular circulation time), ligands for targeting specific tissues/cells, additional cell penetrating peptides (e.g., for greater cellular delivery), lipids to enhance endosomal escape, and nuclear delivery tags. In some examples, the MEND may be a tetra-lamellar MEND (T-MEND), which may target the nucleus and mitochondria. In certain examples, the MEND may be a PEG-peptide-DOPE-conjugated MEND (PPD-MEND), which may target bladder cancer cells. Examples of MENDs include those described in Kogure K et al. (2004), J Control Release 98:317-23; and Nakamura T et al. (2012), Acc Chem Res 45:1113-21.
Lipid coated mesoporous silica particles
The delivery vehicle may comprise lipid-coated mesoporous silica particles. The lipid-coated mesoporous silica particles may comprise a mesoporous silica nanoparticle core and a lipid membrane shell. The silica core may have a large internal surface area, resulting in a high cargo loading capacity. In one embodiment, pore size, pore chemistry, and overall particle size may be modified to load different types of cargo. The lipid coating of the particles can also be modified to maximize cargo loading, increase circulation time, and provide precise targeting and cargo release. Examples of lipid-coated mesoporous silica particles include those described in Du X et al. (2014), Biomaterials 35:5580-90; and Durfee PN et al. (2016), ACS Nano 10:8325-45.
Inorganic nanoparticles
The delivery vehicle may include inorganic nanoparticles. Examples of inorganic nanoparticles include Carbon Nanotubes (CNTs) (e.g., as described in Bates K and Kostarelos K. (2013). Adv Drug Deliv Rev 65:2023-33), bare Mesoporous Silica Nanoparticles (MSNPs) (e.g., as described in Luo GF et al (2014). Sci Rep 4:6064), and dense silica nanoparticles (SiNPs) (as described in Luo D and Saltzman WM. (2000). Nat Biotechnol 18:893-5).
Exosomes
The delivery vehicle may comprise exosomes. Exosomes include membrane-bound extracellular vesicles that can be used to contain and deliver various types of biomolecules, such as proteins, carbohydrates, lipids, and nucleic acids, and complexes thereof (e.g., RNPs). Examples of exosomes include those described in Schroeder A et al., J Intern Med. 2010 Jan;267:9-21; El-Andaloussi S et al., Nat Protoc. 2012 Dec;7(12):2112-26; Uno Y et al., Hum Gene Ther. 2011 Jun;22(6):711-9; and Zou W et al., Hum Gene Ther. 2011 Apr;22(4):465-75. Exemplary exosomes may be produced by 293F cells; in some cases, mRNA-loaded exosomes drive higher mRNA expression than mRNA-loaded LNPs. See, e.g., J. Biol. Chem. (2021) 297(5):101266.
In some examples, the exosomes may form a complex with one or more components of the cargo (e.g., by direct or indirect binding). In some examples, a molecule of the exosome may be fused to a first adapter protein and a component of the cargo may be fused to a second adapter protein. The first and second adapter proteins can specifically bind to each other, thereby associating the cargo with the exosomes. Examples of such exosomes include those described in Ye Y et al., Biomater Sci. 2020 Apr 28, doi:10.1039/d0bm00427h.
Retrovirus-like delivery system
The delivery vehicle may include a retrovirus-like protein, such as PEG10, which is capable of incorporating the cargo into a virus-like particle (VLP). Because such systems can be reprogrammed to package specific cargo, polynucleotides encoding components of the TnpB systems disclosed herein can be further modified with recognition sequences that result in selective packaging of the TnpB components into such retrovirus-like VLPs. The VLP may be further modified with a fusion protein that confers tissue or cell specificity. Exemplary systems are disclosed in Segel et al., Mammalian retrovirus-like protein PEG10 packages its own mRNA and can be pseudotyped for mRNA delivery, Science 373, 882-889 (2021), which is incorporated herein by reference. The use of native proteins that form virus-like particles and can deliver mRNA cargo, or Selective Endogenous eNcapsidation for cellular Delivery (SEND), can reduce the immunogenic response compared to other delivery methods.
Genetically modified cells and organisms
The present disclosure also provides cells comprising one or more components of the compositions and systems herein (e.g., TnpB polypeptide and/or nucleic acid components). Also provided are cells modified by the systems and methods herein, as well as cell cultures, tissues, organs, and organisms comprising such cells or their progeny. In one embodiment, the present disclosure provides a method of modifying a cell or organism. The cells may be prokaryotic or eukaryotic. The cell may be a mammalian cell. Mammalian cells may be non-human primate, bovine, porcine, rodent or mouse cells. The cells may be non-mammalian eukaryotic cells such as poultry, fish or shrimp cells. The cells may be therapeutic T cells or antibody-producing B cells. The cell may also be a plant cell. The plant cell may be a cell of a crop plant (such as cassava, maize, sorghum, wheat, or rice). The plant cell may also be a cell of an alga, tree or vegetable. Modifications introduced into cells by the present invention can allow cells and cell progeny to be altered to improve the production of biological products such as antibodies, starches, alcohols, or other desired cellular outputs. The modification introduced into the cells by the present invention may be such that the cells and cell progeny include alterations that change the biological product produced.
In one embodiment, one or more polynucleotide molecules, vectors, or vector systems that drive expression of one or more elements of the composition, system, or delivery system comprising one or more elements of the TnpB system are introduced into a host cell such that expression of the elements of the TnpB system directs formation of the TnpB-targeted complex at one or more target sites. In one embodiment of the invention, the host cell may be a eukaryotic cell, a prokaryotic cell, or a plant cell.
In a particular embodiment, the host cell is a cell of a cell line. Cell lines can be obtained from a number of sources known to those skilled in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Va.)). In one embodiment, cells transfected with one or more vectors described herein are used to establish a new cell line comprising one or more vector-derived sequences. In one embodiment, cells transiently transfected with components of the systems as described herein (such as transiently transfected with one or more vectors, or transfected with RNA) and modified by the activity of the complex are used to establish a new cell line comprising cells containing the modification but lacking any other exogenous sequence. In one embodiment, cells transiently or non-transiently transfected with one or more vectors described herein or cell lines derived from such cells are used to evaluate one or more test compounds.
Also intended is an isolated human cell or tissue, plant or non-human animal comprising one or more of the polynucleotide molecules, vectors, vector systems or cells described in any of the embodiments herein. In one aspect, host cells and cell lines modified by or comprising the compositions, systems or modified enzymes of the invention, including (isolated) stem cells and their progeny are provided.
In one embodiment, the plant or non-human animal comprises at least one of the system components, polynucleotide molecules, vectors, vector systems or cells described in any of the embodiments herein in at least one tissue type of the plant or non-human animal. In one embodiment, the non-human animal comprises at least one of the system components, polynucleotide molecules, vectors, vector systems, or cells described in any of the embodiments herein in at least one tissue type. In one embodiment, the presence of system components is temporary in that they degrade over time. In one embodiment, expression of the components of the systems and compositions described in any of the embodiments, including in a polynucleotide molecule, vector system, or cell, is limited to certain tissue types or regions in plants or non-human animals. In one embodiment, the expression of the components of the systems and compositions described in any of the embodiments that are contained in a polynucleotide molecule, vector system, or cell is dependent on a physiological signal. In one embodiment, expression of the components of the systems and compositions described in any of the embodiments that are contained in a polynucleotide molecule, vector system, or cell can be triggered by an exogenous molecule. In one embodiment, the expression of the components of the systems and compositions described in any of the embodiments that are contained in a polynucleotide molecule, vector system or cell is dependent on the expression of a non-TnpB molecule in a plant or non-human animal.
General application and use
The systems, vector systems, vectors, and compositions described herein can be used in a variety of nucleic acid targeting applications to alter or modify synthesis of gene products (such as proteins), nucleic acid cleavage, nucleic acid editing, nucleic acid splicing; transport of target nucleic acid, tracking of target nucleic acid, isolation of target nucleic acid, visualization of target nucleic acid, and the like.
Thus, aspects of the invention also encompass methods and uses of the compositions and systems described herein in genomic engineering, e.g., for altering or manipulating expression of one or more genes or one or more gene products in prokaryotic or eukaryotic cells in vitro, in vivo, or ex vivo. In some examples, the target polynucleotide is a target sequence within genomic DNA (including nuclear genomic DNA, mitochondrial DNA, or chloroplast DNA).
Generally, in the context of a TnpB system, formation of a TnpB complex comprising a nucleic acid component molecule (ωRNA) that hybridizes to a target sequence and is complexed with one or more nucleic acid targeting effector proteins results in cleavage of one or both DNA or RNA strands in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs from the target sequence). As used herein, the term "sequence associated with a target locus of interest" refers to a sequence that is in close proximity to the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs from the target sequence, wherein the target sequence is contained within the target locus of interest).
In one embodiment, the present disclosure provides a method of targeting a polynucleotide comprising contacting a sample (such as a cell, cell population, tissue, organ, or organism) comprising a target polynucleotide with a composition, system, polynucleotide, or vector. The contacting may result in a modification of the gene product or a modification of the amount or expression of the gene product. In some examples, the target sequence of the polynucleotide is a disease-related target sequence.
In one embodiment, the present disclosure provides a method of modifying a target polynucleotide comprising delivering a composition, one or more polynucleotides, or one or more vectors to a cell or population of cells comprising a target polynucleotide, wherein the complex directs a reverse transcriptase to the target sequence, and the reverse transcriptase facilitates insertion of a donor sequence from a nucleic acid component into the target polynucleotide.
Examples of target polynucleotides include sequences associated with signaling biochemical pathways, such as signaling biochemical pathway-associated genes or polynucleotides. Examples of target polynucleotides include disease-related genes or polynucleotides. A "disease-related" gene or polynucleotide means any gene or polynucleotide that produces a transcriptional or translational product at an abnormal level or in an abnormal form in cells derived from a disease-affected tissue, as compared with tissues or cells of a non-disease control. It may be a gene expressed at an abnormally high level, or it may be a gene expressed at an abnormally low level, where the altered expression is associated with the occurrence and/or progression of the disease. A disease-related gene also refers to a gene having one or more mutations or genetic variations that are directly responsible for, or in linkage disequilibrium with one or more genes responsible for, the etiology of a disease. The transcribed or translated product may be known or unknown and may be at normal or abnormal levels.
The target polynucleotide of the complex may be any polynucleotide endogenous or exogenous to the eukaryotic cell. For example, the target polynucleotide may be a polynucleotide present in the nucleus of a eukaryotic cell. The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA). Without wishing to be bound by theory, it is believed that the target sequence should be associated with a TAM (target-adjacent motif), i.e., a short sequence recognized by the complex. The exact sequence and length requirements of the TAM vary depending on the TnpB polypeptide used, but typically the TAM is a 2-5 base pair sequence adjacent to the protospacer (i.e., the target sequence). TAM specificity may be determined, for example, according to the experimental setup described in FIG. 8. In one embodiment, the TAM sequence comprises TCA. In embodiments, the TAM sequence is TCAN, where N may be any nucleotide. In one embodiment, the TAM sequence comprises TCAG or TCAT. The skilled artisan will be able to identify additional TAM sequences for use with a given TnpB polypeptide. In addition, engineering of the TAM-interacting (PI) domain may allow programming of TAM specificity, improve target site recognition fidelity, and increase the versatility of the TnpB polypeptide genome engineering platform. TnpB polypeptides can be engineered to alter their TAM specificity, e.g., as described for Cas9 by Kleinstiver BP et al., Engineered CRISPR-Cas9 nucleases with altered PAM specificities, Nature. 2015 Jul 23;523(7561):481-5, doi:10.1038/nature14592.
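As an illustration of how candidate target sites associated with a TAM might be located in a sequence, the following minimal sketch scans one strand of a DNA string for the TCAN motif and reports the adjacent sequence. It is not the method of the disclosure: the function name `find_tam_sites`, the 20-nucleotide target length, and the assumption that the target lies immediately 3' of the TAM on the scanned strand are all hypothetical choices made for the example.

```python
import re


def find_tam_sites(dna, tam="TCAN", target_length=20):
    """Return (tam_start, tam_sequence, adjacent_target) tuples for one strand.

    Assumptions for illustration only: the target sits immediately 3' of the
    TAM, and it is exactly `target_length` nucleotides long.
    """
    dna = dna.upper()
    # Translate the motif into a regular expression; N matches any nucleotide.
    pattern = "".join("[ACGT]" if base == "N" else base for base in tam.upper())
    sites = []
    # A lookahead lets overlapping motif occurrences be reported as well.
    for match in re.finditer(f"(?=({pattern}))", dna):
        tam_start = match.start()
        target_start = tam_start + len(tam)
        target = dna[target_start:target_start + target_length]
        if len(target) == target_length:  # skip sites too close to the end
            sites.append((tam_start, match.group(1), target))
    return sites


# Illustrative sequence: two flanking bases, a TCAG motif, a 20-nt target.
sequence = "GG" + "TCAG" + "ACGT" * 5 + "AA"
for site in find_tam_sites(sequence):
    print(site)
```

Scanning the reverse complement in the same way would cover the opposite strand; that step is omitted here to keep the sketch short.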
Aspects of the invention relate to a method of targeting a polynucleotide comprising contacting a sample comprising the polynucleotide with: a composition, system, or TnpB polypeptide as described in any embodiment herein, a delivery system comprising a composition, system, or TnpB polypeptide as described in any embodiment herein, a polynucleotide comprising a composition, system, or TnpB polypeptide as described in any embodiment herein, a vector comprising a composition, system, or TnpB polypeptide as described in any embodiment herein, or a vector system comprising a composition, system, or TnpB polypeptide as described in any embodiment herein. In one embodiment, the target polynucleotide is contacted with at least two different compositions, systems or TnpB polypeptides. In other embodiments, the two different TnpB polypeptides have different target polynucleotide specificities or degrees of specificity. In one embodiment, the two different TnpB polypeptides have different TAM specificities.
Also contemplated are methods of targeting a polynucleotide comprising contacting a sample comprising a polynucleotide with the compositions, systems, vectors, or polynucleotides herein, wherein the contacting results in a modification of a gene product or a modification of the amount or expression of a gene product. In one embodiment, the expression of the targeted gene product is increased by the method. In one embodiment, expression of the targeted gene product is increased by at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%. In one embodiment, expression of the targeted gene product is increased by at least 1.5 fold, at least 2 fold, at least 2.5 fold, at least 3 fold, at least 3.5 fold, at least 4 fold, at least 4.5 fold, at least 5 fold, at least 10 fold, at least 15 fold, at least 20 fold, at least 25 fold, at least 50 fold, or at least 100 fold. In alternative embodiments, the expression of the targeted gene product is reduced by the method. In one embodiment, expression of the targeted gene product is reduced by at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%. In one embodiment, expression of the targeted gene product is reduced by at least 1.5 fold, at least 2 fold, at least 2.5 fold, at least 3 fold, at least 3.5 fold, at least 4 fold, at least 4.5 fold, at least 5 fold, at least 10 fold, at least 15 fold, at least 20 fold, at least 25 fold, at least 50 fold, or at least 100 fold.
In other embodiments, expression of the targeted gene may be completely eliminated, or expression of the targeted gene may be considered to be eliminated when the residual expression level of the targeted gene is below the detection limit of methods known in the art for quantifying, detecting, or monitoring the expression level of the gene.
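The percent and fold thresholds above, and the detection-limit criterion for eliminated expression, correspond to simple arithmetic on a baseline and a treated expression measurement. The sketch below is illustrative only; the function names, the units, and the example values are assumptions, and the measurements are taken to be on a linear (not logarithmic) scale.

```python
# Hypothetical helpers for comparing expression before and after targeting.
def percent_change(baseline, treated):
    """Signed percent change relative to baseline (negative means reduced)."""
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return (treated - baseline) / baseline * 100.0


def fold_increase(baseline, treated):
    """Treated expression as a multiple of baseline (2.0 means doubled)."""
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return treated / baseline


def fold_reduction(baseline, treated):
    """Baseline expression as a multiple of treated (4.0 means quartered)."""
    if treated <= 0:
        raise ValueError("treated expression must be positive")
    return baseline / treated


def is_considered_eliminated(treated, detection_limit):
    """True when residual expression is below the assay's detection limit."""
    return treated < detection_limit


# Illustrative values in arbitrary expression units.
print(percent_change(100.0, 150.0))   # a 50% increase
print(fold_reduction(100.0, 25.0))    # a 4-fold reduction
print(is_considered_eliminated(0.5, detection_limit=1.0))
```

Note that a percent reduction cannot exceed 100%, whereas a fold reduction is unbounded, which is why the two scales are recited separately.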
In one embodiment, one or more polynucleotide molecules, vectors, or vector systems that drive expression of one or more elements of the TnpB system, or a delivery system comprising one or more elements of the TnpB system, are introduced into a host cell such that expression of the elements of the TnpB system directs formation of a TnpB-targeted complex at one or more target sites. In one embodiment of the invention, the host cell may be a eukaryotic cell, a prokaryotic cell, or a plant cell.
In a particular embodiment, the host cell is a cell of a cell line. Cell lines can be obtained from a number of sources known to those skilled in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Va.)). In one embodiment, cells transfected with one or more vectors described herein are used to establish a new cell line comprising one or more vector-derived sequences. In one embodiment, cells transiently transfected with a composition or component of a system as described herein (such as by transient transfection with one or more vectors, or by transfection with RNA) and modified by the activity of the complex are used to establish a new cell line comprising cells that contain the modification but lack any other exogenous sequence. In one embodiment, cells transiently or non-transiently transfected with one or more vectors described herein, or cell lines derived from such cells, are used to evaluate one or more test compounds.
Also intended is an isolated human cell or tissue, plant or non-human animal comprising one or more of the polynucleotide molecules, vectors, vector systems or cells described in any of the embodiments herein. In one aspect, host cells and cell lines modified by or comprising the compositions, systems or modified enzymes of the invention, including (isolated) stem cells and their progeny are provided.
In one embodiment, the plant or non-human animal comprises at least one of the compositions, polynucleotide molecules, vectors, vector systems, or cells described in any of the embodiments herein in at least one tissue type of the plant or non-human animal. In certain embodiments, the non-human animal comprises at least one of the compositions, polynucleotide molecules, vectors, vector systems, or cells described in any of the embodiments herein in at least one tissue type. In one embodiment, the presence of the compositions is temporary in that they degrade over time. In one embodiment, expression of a composition described in any embodiment as comprised in a polynucleotide molecule, vector system or cell is limited to certain tissue types or regions in a plant or non-human animal. In one embodiment, the expression of a composition described in any embodiment as comprised in a polynucleotide molecule, vector system or cell is dependent on a physiological signal. In one embodiment, expression of a composition described in any of the embodiments as comprised in a polynucleotide molecule, vector system or cell may be triggered by an exogenous molecule. In one embodiment, expression of a composition described in any embodiment as comprised in a polynucleotide molecule, vector system, or cell is dependent on expression of the non-Cas molecule in a plant or non-human animal.
In one aspect, the invention provides a method for using one or more elements of a TnpB system. The TnpB targeting complex of the invention provides an efficient means for modifying target DNA or RNA (single-or double-stranded, linear or supercoiled). The TnpB targeting complexes of the invention have a variety of uses, including modification (e.g., deletion, insertion, translocation, inactivation, activation) of target DNA or RNA in a variety of cell types. Thus, the TnpB targeting complexes of the invention have broad applications in, for example, gene therapy, drug screening, disease diagnosis and prognosis. An exemplary TnpB targeting complex comprises a DNA or RNA targeting effector protein complexed with a nucleic acid component molecule that hybridizes to a target sequence within a target locus of interest.
In one embodiment, the invention provides a method of cleaving a target polynucleotide. The method may comprise modifying the target polynucleotide using a TnpB targeting complex that binds to the target polynucleotide and effects cleavage of the target polynucleotide. In embodiments, the TnpB targeting complexes of the invention can generate breaks (e.g., single- or double-strand breaks) in polynucleotide sequences when introduced into cells. For example, the method may be used to cleave disease polynucleotides in cells. For example, an exogenous template comprising a sequence to be integrated, flanked by an upstream sequence and a downstream sequence, may be introduced into the cell. The upstream and downstream sequences share sequence similarity with either side of the integration site in the polynucleotide. The exogenous template comprises the sequence to be integrated (e.g., mutated RNA). The sequence for integration may be a sequence endogenous or exogenous to the cell. Examples of sequences to be integrated include polynucleotides encoding proteins or non-coding RNAs (e.g., microRNAs). Thus, the sequence for integration may be operably linked to one or more appropriate control sequences. Alternatively, the sequence to be integrated may provide a regulatory function. The upstream and downstream sequences in the recombination template are selected to facilitate recombination between the RNA sequence of interest and the recombination template. The upstream sequence is a polynucleotide sequence having sequence similarity to the sequence upstream of the target site for integration. Similarly, the downstream sequence is a polynucleotide sequence having sequence similarity to the polynucleotide sequence downstream of the target site for integration. The upstream and downstream sequences in the recombinant templates may have 75%, 80%, 85%, 90%, 95% or 100% sequence identity to the targeting sequence. 
Preferably, the upstream and downstream sequences in the recombinant template have about 95%, 96%, 97%, 98%, 99% or 100% sequence identity to the targeting sequence. In some methods, the upstream and downstream sequences in the recombinant template have about 99% or 100% sequence identity to the target sequence. The upstream or downstream sequence may comprise about 20bp to about 2500bp, for example about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500bp. In some methods, an exemplary upstream or downstream sequence has about 200bp to about 2000bp, about 600bp to about 1000bp, or more specifically about 700bp to about 1000bp. In some methods, the recombinant templates may further comprise a marker. Such markers may facilitate screening for targeted integration. Examples of suitable markers include restriction sites, fluorescent proteins or selectable markers. Recombinant templates of the invention can be constructed using recombinant techniques (see, e.g., sambrook et al, 2001 and Ausubel et al, 1996). In a method for modifying a target sequence by integrating a recombinant template, a break (e.g., a double-stranded or single-stranded break in double-stranded or single-stranded DNA or RNA) is introduced into the DNA or RNA sequence by a TnpB targeting complex, which break is repaired via homologous recombination with the recombinant template, such that the template is integrated into the target. The presence of double strand breaks facilitates integration of the template. In other embodiments, the invention provides a method of modifying RNA expression in a eukaryotic cell. The methods include increasing or decreasing expression of the target polynucleotide by using a TnpB targeting complex that binds DNA or RNA (e.g., mRNA or pre-mRNA). In some methods, the target may be inactivated to affect the modification of expression in the cell. 
For example, when the TnpB targeting complex binds to a target sequence in a cell, the target is inactivated such that the sequence is not translated, the encoded protein is not produced, or the sequence does not function as the wild-type sequence does. For example, a protein or microRNA coding sequence may be inactivated such that no protein, microRNA, or pre-microRNA transcript is produced. The target of the TnpB targeting complex may be any polynucleotide endogenous or exogenous to the eukaryotic cell. For example, the target polynucleotide may be a polynucleotide present in the nucleus of a eukaryotic cell. The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., ncRNA, lncRNA, tRNA or rRNA). Examples of target RNAs include sequences associated with signaling biochemical pathways, e.g., signaling biochemical pathway-associated polynucleotides. Examples of target polynucleotides include disease-related polynucleotides. A "disease-related" polynucleotide means any polynucleotide that yields a translation product at an abnormal level or in an abnormal form in cells derived from disease-affected tissue, as compared with tissues or cells of a non-disease control. It may be a gene expressed at an abnormally high level, or a gene expressed at an abnormally low level, where the altered expression correlates with the occurrence and/or progression of the disease. A disease-related polynucleotide also refers to a gene having one or more mutations or genetic variations that are directly responsible for, or in linkage disequilibrium with one or more genes responsible for, the etiology of a disease. The translated product may be known or unknown and may be at a normal or abnormal level. The target RNA of the TnpB targeting complex may be any polynucleotide endogenous or exogenous to the eukaryotic cell. For example, the target RNA may be RNA present in the nucleus of a eukaryotic cell. 
The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., ncRNA, lncRNA, tRNA or rRNA).
In one embodiment, a method can include allowing a composition to bind to a target DNA or RNA to effect cleavage of the target DNA or RNA, thereby modifying the target DNA or RNA, wherein the TnpB targeting complex comprises a nucleic acid targeting effector protein that is complexed with a nucleic acid component molecule that hybridizes to a target sequence within the target DNA or RNA. In one aspect, the invention provides a method of modifying DNA or RNA expression in a eukaryotic cell. In one embodiment, the method comprises allowing the TnpB targeting complex to bind to DNA or RNA such that the binding results in increased or decreased expression of the DNA or RNA; wherein the TnpB targeting complex comprises a nucleic acid targeting effector protein complexed with a nucleic acid component molecule. Similar considerations and conditions apply to the method of modifying target DNA or RNA as described above. In fact, these sampling, culturing and reintroduction options are applicable to various aspects of the invention. In one aspect, the invention provides methods of modifying target DNA or RNA in eukaryotic cells, which methods may be performed in vivo, ex vivo, or in vitro. In one embodiment, the method comprises sampling a cell or population of cells from a human or non-human animal and modifying the one or more cells. The culturing may be carried out at any stage ex vivo. One or more cells may even be reintroduced into the non-human animal or plant. For reintroduced cells, it is particularly preferred that the cells are stem cells. The composition as described in any of the embodiments herein can be used to detect a nucleic acid identifier. The nucleic acid identifier is a non-coding nucleic acid that can be used to identify a particular article. Exemplary nucleic acid identifiers, such as DNA watermarks, are described in Heider and Barnekow, "DNA watermarks: A proof of concept" BMC Molecular Biology 9:40 (2008). 
The nucleic acid identifier may also be a nucleic acid barcode. Nucleic acid-based barcodes are short nucleotide sequences (e.g., DNA, RNA, or a combination thereof) that serve as identifiers for related molecules, such as target molecules and/or target nucleic acids. The nucleic acid barcode may have a length of at least, for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides, and may be in single-stranded or double-stranded form. One or more nucleic acid barcodes may be attached or "tagged" to a target molecule and/or target nucleic acid. The target molecules and/or target nucleic acids may be labeled with a plurality of nucleic acid barcodes in combination (such as a nucleic acid barcode concatemer).
In embodiments, the composition induces double strand breaks to induce HDR-mediated correction. In another embodiment, two or more nucleic acid component molecules complexed with a TnpB polypeptide or ortholog or homolog thereof can be used to induce multiple breaks to induce HDR-mediated correction.
The term recombinant template nucleic acid as used herein refers to a nucleic acid sequence that can be used in combination with the compositions disclosed herein to alter the structure of a target location. In embodiments, the target nucleic acid is modified to have some or all of the sequence of the recombinant template nucleic acid, typically at or near the cleavage site. In embodiments, the recombinant template nucleic acid is single stranded. In an alternative embodiment, the recombinant template nucleic acid is double stranded. In embodiments, the recombinant template nucleic acid is DNA, e.g., double stranded DNA. In an alternative embodiment, the recombinant template nucleic acid is single stranded DNA.
In one embodiment, a recombination template is provided to act as a template in homologous recombination, such as within or near a target sequence that is nicked or cleaved by a nucleic acid targeting effector protein as part of a TnpB targeting complex.
The recombinant templates may be components of another vector described herein, contained in a separate vector, or provided as separate polynucleotides. The recombinant template polynucleotide may have any suitable length, such as a length of about or greater than about 10, 15, 20, 25, 50, 75, 100, 150, 200, 500, 1000, or more nucleotides. In one embodiment, the recombinant template polynucleotide is complementary to a portion of a polynucleotide comprising a target sequence. When optimally aligned, the recombinant template polynucleotide may overlap with one or more nucleotides of the target sequence (e.g., about or greater than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, or more nucleotides). In one embodiment, when the recombinant template sequence and the polynucleotide comprising the target sequence are optimally aligned, the nearest nucleotide of the recombinant template polynucleotide is within about 1, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 400, 500, 1000, 5000, 10000 or more nucleotides from the target sequence.
In embodiments, the recombinant template nucleic acid alters the structure of the target location by participating in homologous recombination. In embodiments, the recombinant template nucleic acid alters the sequence of the target location. In embodiments, the recombinant template nucleic acid results in the incorporation of modified or non-naturally occurring bases into the target nucleic acid.
The recombinant template sequence may undergo cleavage-mediated or catalytic recombination with the target sequence. In embodiments, the recombinant template nucleic acid may include a sequence corresponding to a site on a target sequence that is cleaved by a TnpB polypeptide-mediated cleavage event. In embodiments, the recombinant template nucleic acid may include sequences corresponding to both: a first site on the target sequence that is cleaved in a first TnpB polypeptide mediated event and a second site on the target sequence that is cleaved in a second TnpB polypeptide mediated event.
In one embodiment, the recombinant template nucleic acid may include sequences that result in a change in the coding sequence of the translated sequence, such as a sequence that results in a substitution of one amino acid for another amino acid in the protein product, such as conversion of a mutant allele to a wild-type allele, conversion of a wild-type allele to a mutant allele, and/or introduction of a stop codon, insertion of an amino acid residue, deletion of an amino acid residue, or nonsense mutation. In one embodiment, the recombinant template nucleic acid may include sequences that result in a change in a non-coding sequence (e.g., an exon or a 5 'or 3' untranslated or non-transcribed region). Such changes include changes in control elements (e.g., promoters, enhancers), and changes in cis-acting or trans-acting control elements.
Recombinant template nucleic acids having homology to a target position in a target gene can be used to alter the structure of a target sequence. Recombinant template sequences can be used to alter unwanted structures, such as unwanted or mutated nucleotides. The recombinant template nucleic acid may comprise sequences that when integrated result in: reducing the activity of the positive control element; increasing the activity of the positive control element; decreasing the activity of the negative control element; increasing the activity of the negative control element; reducing expression of the gene; increasing expression of the gene; increasing resistance to a disorder or disease; enhancing resistance to viral entry; correction of mutations or alterations of unwanted amino acid residues, conferring, increasing, eliminating or reducing a biological property of the gene product, e.g. increasing the enzymatic activity of an enzyme, or increasing the ability of the gene product to interact with another molecule.
The recombinant template nucleic acid may comprise sequences that result in sequence changes of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more nucleotides of the target sequence. In embodiments, the length of the recombinant template nucleic acid may be 20+/-10, 30+/-10, 40+/-10, 50+/-10, 60+/-10, 70+/-10, 80+/-10, 90+/-10, 100+/-10, 110+/-10, 120+/-10, 130+/-10, 140+/-10, 150+/-10, 160+/-10, 170+/-10, 180+/-10, 190+/-10, 200+/-10, 210+/-10, or 220+/-10 nucleotides. In embodiments, the length of the recombinant template nucleic acid may be 30+/-20, 40+/-20, 50+/-20, 60+/-20, 70+/-20, 80+/-20, 90+/-20, 100+/-20, 110+/-20, 120+/-20, 130+/-20, 140+/-20, 150+/-20, 160+/-20, 170+/-20, 180+/-20, 190+/-20, 200+/-20, 210+/-20, or 220+/-20 nucleotides. In embodiments, the recombinant template nucleic acid is 10 to 1,000, 20 to 900, 30 to 800, 40 to 700, 50 to 600, 50 to 500, 50 to 400, 50 to 300, 50 to 200, or 50 to 100 nucleotides in length.
The recombinant template nucleic acid comprises the following components: [5' homology arm]-[replacement sequence]-[3' homology arm]. The homology arms provide for recombination into the chromosome, thereby replacing the undesired element, such as a mutation or marker, with the replacement sequence. In embodiments, the homology arms flank the most distal cleavage sites. In embodiments, the 3' end of the 5' homology arm is located immediately adjacent to the 5' end of the replacement sequence. In embodiments, the 5' homology arm may extend at least 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, or 2000 nucleotides in the 5' direction from the 5' end of the replacement sequence. In embodiments, the 5' end of the 3' homology arm is located immediately adjacent to the 3' end of the replacement sequence. In embodiments, the 3' homology arm may extend at least 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, or 2000 nucleotides in the 3' direction from the 3' end of the replacement sequence.
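By way of illustration only, the [5' homology arm]-[replacement sequence]-[3' homology arm] layout can be sketched in Python as follows. The function name, its parameters, and the example sequence are hypothetical and are not part of the disclosure; the sketch assumes both arms have the same length and are copied unchanged from a reference sequence on either side of the interval to be replaced.

```python
def build_recombination_template(reference: str, start: int, end: int,
                                 replacement: str, arm_length: int) -> str:
    """Return 5' arm + replacement + 3' arm for replacing reference[start:end].

    Hypothetical helper: positions are 0-based, `end` is exclusive, and both
    homology arms are taken verbatim from the reference sequence.
    """
    if not 0 <= start <= end <= len(reference):
        raise ValueError("replaced interval must lie within the reference")
    if arm_length < 1:
        raise ValueError("arm_length must be positive")
    if start < arm_length or len(reference) - end < arm_length:
        raise ValueError("reference is too short for the requested arm length")
    # The 3' end of the 5' arm abuts the 5' end of the replacement sequence.
    five_prime_arm = reference[start - arm_length:start]
    # The 5' end of the 3' arm abuts the 3' end of the replacement sequence.
    three_prime_arm = reference[end:end + arm_length]
    return five_prime_arm + replacement + three_prime_arm


# Illustrative (made-up) reference: replace the single base at index 10.
reference = "ACGTACGTAC" + "G" + "TTGACCATGG"
template = build_recombination_template(reference, 10, 11, "A", arm_length=10)
print(template)
```

In practice the arm length would be chosen from the ranges recited above, and one or both arms could be shortened to avoid sequence repeat elements.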
In one embodiment, one or both homology arms may be shortened to avoid the inclusion of certain sequence repeat elements. For example, the 5' homology arm can be shortened to avoid sequence repeat elements. In other embodiments, the 3' homology arm may be shortened to avoid sequence repeat elements. In one embodiment, both 5 'and 3' homology arms may be shortened to avoid including certain sequence repeat elements.
In one embodiment, the recombinant template nucleic acid used to correct for the mutation may be designed for use as a single stranded oligonucleotide. When single stranded oligonucleotides are used, the 5 'and 3' homology arms can range in length up to about 200 base pairs (bp), for example at least 25, 50, 75, 100, 125, 150, 175 or 200bp in length.
In contrast to TnpB polypeptide-mediated gene knockout, which permanently eliminates expression by mutating the gene at the DNA level, TnpB polypeptide knockdown allows for temporary reduction of gene expression through the use of artificial transcription factors. Mutating key residues in both DNA cleavage domains of the TnpB polypeptide results in a catalytically inactive TnpB polypeptide. The catalytically inactive TnpB polypeptide complexes with the nucleic acid component molecule and localizes to the DNA sequence specified by the targeting domain of the nucleic acid component molecule; however, it does not cleave the target DNA. Fusion of the inactivated TnpB polypeptide with an effector domain (e.g., a transcription repression domain) enables recruitment of the effector to any DNA site specified by the nucleic acid component molecule. In one embodiment, the TnpB polypeptide may be fused to a transcriptional repression domain and recruited to the promoter region of a gene. In particular for gene repression, it is contemplated herein that blocking the binding site of an endogenous transcription factor will help to down-regulate gene expression. In another embodiment, the inactivated TnpB polypeptide may be fused to a chromatin modifying protein. Altering chromatin state can result in reduced expression of the target gene.
In embodiments, the nucleic acid component molecules may be targeted to known transcription response elements (e.g., promoters, enhancers, etc.), known Upstream Activating Sequences (UAS), and/or functionally known or unknown sequences suspected of being capable of controlling expression of the target DNA.
In some methods, the target polynucleotide may be inactivated to affect modification of expression in the cell. For example, when the composition binds to a target sequence in a cell, the target polynucleotide is inactivated such that the sequence is not transcribed, does not produce the encoded protein, or the sequence does not function as a wild-type sequence. For example, the protein or microRNA coding sequence may be inactivated such that no protein is produced.
Non-homologous end joining
In one embodiment, nuclease-induced non-homologous end joining (NHEJ) can be used for target gene-specific knockouts. Nuclease-induced NHEJ can also be used to remove (e.g., delete) sequences in a gene of interest. Generally, NHEJ repairs a double-strand break in DNA by joining the two ends together; however, the original sequence is generally restored only if the two compatible ends, exactly as they were formed by the double-strand break, are perfectly ligated. The DNA ends of a double-strand break are frequently subject to enzymatic processing, resulting in the addition or removal of nucleotides at one or both strands before the ends are rejoined. This results in insertion and/or deletion (indel) mutations in the DNA sequence at the site of NHEJ repair. Two-thirds of these mutations typically change the reading frame and thus produce a nonfunctional protein. Furthermore, mutations that maintain the reading frame but insert or delete a significant amount of sequence may disrupt the function of the protein. This is locus dependent, as mutations in critical functional domains are likely to be less tolerated than mutations in non-critical regions of the protein. Indel mutations generated by NHEJ are inherently unpredictable; however, at a given cleavage site, certain indel sequences are favored and are over-represented in the population, possibly due to small regions of microhomology. The lengths of deletions can vary widely; they are most commonly in the range of 1-50bp, but they can readily be greater than 50bp, e.g., they can readily exceed about 100-200bp. Insertions tend to be short and often include short duplications of the sequence immediately surrounding the cleavage site. However, large insertions can be obtained, and in these cases the inserted sequence has often been traced to other regions of the genome or to plasmid DNA present in the cell.
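The two-thirds figure follows from simple arithmetic: an indel preserves the reading frame only when its net length is a multiple of three. The following Python sketch illustrates this under the simplifying assumption, made here for illustration only, that every indel length from 1 to a chosen maximum is equally likely; as noted above, actual indel length distributions are site dependent. The function names are hypothetical.

```python
from fractions import Fraction


def is_frameshift(indel_length: int) -> bool:
    """True if an indel of this net length (negative = deletion) shifts the frame."""
    if indel_length == 0:
        raise ValueError("an indel must add or remove at least one nucleotide")
    # Only net changes that are a multiple of 3 nt keep the codon frame intact.
    return abs(indel_length) % 3 != 0


def frameshift_fraction(max_length: int) -> Fraction:
    """Fraction of indel lengths 1..max_length that shift the reading frame,
    assuming (for illustration) that all lengths are equally likely."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    shifted = sum(1 for n in range(1, max_length + 1) if is_frameshift(n))
    return Fraction(shifted, max_length)


print(frameshift_fraction(30))  # 2/3
```

When the maximum length is a multiple of three, exactly one length in every three preserves the frame, which gives the two-thirds proportion.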
Since NHEJ is a mutagenesis process, it can also be used to delete small sequence motifs, as long as no specific final sequence is required to be generated. If a double strand break is targeted near a short target sequence, the deletion mutation caused by NHEJ repair typically spans and thus removes unwanted nucleotides. For the deletion of larger DNA segments, the introduction of two double strand breaks (one on each side of the sequence) can result in NHEJ between the ends, while removing the complete intervening sequence. Both methods can be used to delete specific DNA sequences; however, the error-prone nature of NHEJ may still produce indel mutations at repair sites.
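As a hedged illustration of the two-break deletion strategy, the Python sketch below models the idealized outcome in which the two outer ends are joined precisely and the complete intervening sequence is lost. The function name and coordinate convention are assumptions made for this example; the sketch does not model the additional indels that error-prone NHEJ may leave at the junction.

```python
def delete_between_cuts(sequence: str, first_cut: int, second_cut: int) -> str:
    """Return the sequence after precise rejoining of two double-strand breaks.

    Hypothetical helper: a cut position i denotes a break between
    sequence[i - 1] and sequence[i]; the cuts may be given in either order.
    """
    left, right = sorted((first_cut, second_cut))
    if not 0 <= left <= right <= len(sequence):
        raise ValueError("cut positions must lie within the sequence")
    # Everything between the two breaks is removed; the flanks are joined.
    return sequence[:left] + sequence[right:]


# Illustrative (made-up) locus: breaks on either side of the CCCGGG segment.
print(delete_between_cuts("AAACCCGGGTTT", 3, 9))  # AAATTT
```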
Both double-strand-cleaving TnpB polypeptides or orthologs or homologs thereof and single-strand-cleaving, or nicking enzyme, TnpB polypeptides or orthologs or homologs thereof can be used in the methods and compositions described herein to generate NHEJ-mediated indels. NHEJ-mediated indels targeted to a gene (e.g., to a coding region, such as the early coding region of a gene of interest) can be used to knock out the gene of interest (i.e., eliminate its expression). For example, the early coding region of a gene of interest includes sequence immediately following the transcription start site, within the first exon of the coding sequence, or within 500bp (e.g., less than 500, 450, 400, 350, 300, 250, 200, 150, 100, or 50bp) of the transcription start site.
In embodiments wherein the nucleic acid component molecule and the TnpB polypeptide or ortholog or homolog thereof create a double strand break to induce NHEJ-mediated indels, the RNA component molecule may be configured to position one double strand break in close proximity to a nucleotide at the target position. In embodiments, the cleavage site may be 0-500bp from the target location (e.g., less than 500, 400, 300, 200, 100, 50, 40, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1bp from the target location).
In embodiments, wherein two nucleic acid component molecules complexed with a TnpB polypeptide or an ortholog or homolog thereof (e.g., a TnpB polypeptide nickase) induce two single-strand breaks to induce NHEJ-mediated indels, the two nucleic acid component molecules can be configured to position the two single-strand breaks to provide a nucleotide of a target site for NHEJ repair.
In some examples, the systems herein may introduce one or more indels via the NHEJ pathway and insert a sequence from a recombination template via HDR.
Exemplary applications
The present invention provides a non-naturally occurring or engineered composition for modifying a target cell in vivo, ex vivo or in vitro, or one or more polynucleotides encoding components of the composition, or a vector or delivery system comprising one or more polynucleotides encoding components of the composition, and can be performed in a manner that alters the cell such that once modified, the progeny or cell line of the TnpB polypeptide modified cell retains an altered phenotype. The modified cells and offspring may be part of a multicellular organism (such as a plant or animal) in which the composition is applied to the desired cell type ex vivo or in vivo. The methods herein include therapeutic treatment methods. Therapeutic treatment methods may include gene or genome editing, or gene therapy.
In one embodiment, one or more vectors described herein are used to produce a non-human transgenic animal or transgenic plant. In one embodiment, the transgenic animal is a mammal, such as a mouse, rat, or rabbit. Methods for producing transgenic animals and plants are known in the art and generally begin with cell transfection methods (such as described herein).
Use of orthogonal catalytically inactive TnpB polypeptides
In particular embodiments, a TnpB polypeptide nicking enzyme is used in combination with an orthogonal catalytically inactive TnpB polypeptide to increase the efficiency of the nicking enzyme (e.g., as described in Chen et al. 2017, Nature Communications 8:14958; doi:10.1038/ncomms14958). More particularly, the orthogonal catalytically inactive TnpB polypeptide is characterized by a different TAM recognition site than the TnpB nicking enzyme used in the AD-functionalized composition, and the corresponding nucleic acid component molecule sequence is selected to bind to a target sequence proximal to the target sequence of the TnpB nicking enzyme of the functionalized composition. The orthogonal catalytically inactive TnpB polypeptide as used in the context of the present invention does not form part of the functionalized composition, but merely functions to increase the efficiency of the nicking enzyme, and is used in combination with a standard nucleic acid component for the TnpB polypeptide as described in the art. In particular embodiments, the orthogonal catalytically inactive TnpB polypeptide is a dead TnpB polypeptide, i.e., it comprises one or more mutations that eliminate the nuclease activity of the TnpB polypeptide. In particular embodiments, the catalytically inactive orthogonal TnpB polypeptide is provided with two or more nucleic acid components capable of hybridizing to target sequences proximal to the target sequence of the nicking enzyme. 
In a particular embodiment, at least two nucleic acid components are used to target the catalytically inactive TnpB polypeptide, wherein at least one nucleic acid component is capable of hybridizing to a target sequence 5' of the target sequence of the nicking enzyme of the functionalized composition and at least one nucleic acid component is capable of hybridizing to a target sequence 3' of the target sequence of the nicking enzyme, whereby the one or more target sequences may be located on the same DNA strand as, or on the opposite DNA strand from, the target sequence of the TnpB nicking enzyme. In particular embodiments, the guide sequences of the one or more omega nucleic acid components of the orthogonal catalytically inactive TnpB polypeptide are selected such that their target sequences are proximal to the target sequence of the nucleic acid component used to target the functionalized composition (e.g., to target the nicking enzyme). In particular embodiments, the one or more target sequences of the orthogonal catalytically inactive TnpB polypeptide are each more than 5 but less than 450 base pairs away from the target sequence of the nicking enzyme. The optimal distance between the target sequence of the nucleic acid component molecule used with the orthogonal catalytically inactive TnpB polypeptide and the target sequence of the functionalized composition can be determined by the skilled artisan. In certain embodiments, the catalytically inactive orthogonal TnpB polypeptide has been modified to alter its TAM specificity as described elsewhere herein. In particular embodiments, the TnpB polypeptide nicking enzyme is a nicking enzyme that by itself has limited activity in human cells, but which, in combination with an inactive orthogonal TnpB polypeptide and one or more corresponding proximal nucleic acid component molecules, ensures the desired nicking enzyme activity.
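The spacing constraint of more than 5 but less than 450 base pairs can be expressed as a simple filter over candidate target positions. The Python sketch below is illustrative only: the function names are hypothetical, each target sequence is represented by a single coordinate, and distance is taken as the absolute difference between coordinates, which is an assumption because the text does not specify the reference points between which the distance is measured.

```python
def within_proximal_window(nickase_target_pos: int, candidate_pos: int,
                           min_exclusive: int = 5,
                           max_exclusive: int = 450) -> bool:
    """True if the candidate lies more than 5 but less than 450 bp away.

    Hypothetical helper: both positions are single coordinates on the same
    reference, and the candidate may lie 5' or 3' of the nickase target.
    """
    distance = abs(candidate_pos - nickase_target_pos)
    return min_exclusive < distance < max_exclusive


def select_proximal_targets(nickase_target_pos: int,
                            candidate_positions: list[int]) -> list[int]:
    """Keep only candidate positions inside the proximal window, in input order."""
    return [pos for pos in candidate_positions
            if within_proximal_window(nickase_target_pos, pos)]


# Made-up coordinates: distances of 3, 6, 100, 449, 450 and 600 bp.
print(select_proximal_targets(1000, [1003, 1006, 900, 1449, 1450, 400]))
# [1006, 900, 1449]
```

Both bounds are exclusive in this sketch, matching the "more than 5 but less than 450" wording; the optimal spacing within that window would still be determined empirically.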
Detection methods such as FISH
In one aspect, the invention provides an engineered, non-naturally occurring composition comprising a catalytically inactive TnpB polypeptide described herein, and the use of this system in detection methods such as fluorescence in situ hybridization (FISH). A dead TnpB polypeptide that lacks the ability to generate DNA double-strand breaks can be fused to a marker, such as a fluorescent protein, e.g., enhanced green fluorescent protein (eGFP), and co-expressed with small nucleic acid component molecules to target pericentric, centric, and telomeric repeats in vivo. The dead TnpB polypeptide system can be used to visualize both repetitive sequences and individual genes in the human genome. Such new applications of labeled dead TnpB polypeptides may be important for cell imaging and for studying functional nuclear architecture, particularly in cases of small nuclear volume or complex 3-D structure. (Chen B, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li GW, Park J, Blackburn EH, Weissman JS, Qi LS, Huang B. 2013. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155(7):1479-91. doi:10.1016/j.cell.2013.12.001).
Patient-specific screening methods
Nucleic acid targeting systems that target DNA (e.g., trinucleotide repeats) can be used to screen patients or patient samples for the presence of such repeats. The repeat may be the target of the omega RNA of the TnpB system, and if the TnpB system binds to the repeat via the omega RNA, the binding may be detected, thereby indicating the presence of such a repeat. Thus, the TnpB system can be used to screen patients or patient samples for the presence of repeats. Appropriate compounds may then be administered to the patient to treat the condition; alternatively, the TnpB system may be applied to bind the repeat and cause insertions, deletions or mutations, thereby alleviating the condition.
Genetic and epigenetic condition models
The methods of the invention can be used to generate plants, animals, or cells that can be used to model and/or study genetic or epigenetic conditions of interest, such as by a mutation model or disease model of interest. As used herein, "disease" refers to a disease, disorder, or indication in a subject. For example, the methods of the invention can be used to produce a modified animal or cell comprising one or more nucleic acid sequences associated with a disease, or a plant, animal or cell in which expression of one or more nucleic acid sequences associated with a disease is altered. Such nucleic acid sequences may encode disease-related protein sequences or may be disease-related control sequences. Thus, it should be understood that in embodiments of the invention, a plant, subject, patient, organism, or cell may be a non-human subject, patient, organism, or cell. Thus, the present invention provides plants, animals or cells or progeny thereof produced by the methods of the invention. The offspring may be clones of the plant or animal produced, or may result from sexual reproduction by crossing with other individuals of the same species to introgress the further desired trait into the offspring. In the case of multicellular organisms, in particular animals or plants, the cells may be in vivo or ex vivo. In the case of culturing cells, a cell line may be established if appropriate culture conditions are met and preferably if the cells are suitable for the purpose (e.g. stem cells). Bacterial cell lines produced by the present invention are also contemplated. Thus, cell lines are also contemplated.
In some methods, disease models can be used to study the effects of mutations on the animal or cell and on the development and/or progression of the disease, using measures commonly used in the study of the disease. Alternatively, such disease models may be used to study the effect of pharmaceutically active compounds on the disease.
In some methods, disease models may be used to assess the efficacy of potential gene therapy strategies. That is, a disease-associated gene or polynucleotide may be modified such that the development and/or progression of the disease is inhibited or reduced. In particular, the method comprises modifying a disease-associated gene or polynucleotide such that an altered protein is produced and, as a result, the animal or cell has an altered response. Thus, in some methods, genetically modified animals can be compared to animals predisposed to development of the disease such that the effect of the gene therapy event can be assessed.
In another embodiment, the invention provides a method of developing a bioactive agent that modulates a cellular signaling event associated with a disease gene. The method comprises contacting a test compound with a cell comprising one or more vectors driving expression of one or more of the TnpB polypeptides and a conserved nucleotide sequence linked to a guide/spacer sequence; and detecting a change in a readout that is indicative of a decrease or increase in a cell signaling event associated with, for example, a mutation in a disease gene contained in the cell.
A cell model or animal model can be constructed in combination with the methods of the invention for screening for changes in cellular function. Such models can be used to study the effect of genomic sequences modified by complexes of the invention on cellular functions of interest. For example, a cellular functional model may be used to study the effect of modified genomic sequences on intracellular signaling or extracellular signaling. Alternatively, a cellular functional model may be used to study the effect of modified genomic sequences on sensory perception. In some such models, one or more genomic sequences associated with a signaling biochemical pathway in the model are modified.
Several disease models have been specifically investigated. These include the de novo autism risk genes CHD8, KATNAL2 and SCN2A, and the syndromic autism (Angelman syndrome) gene UBE3A. These genes and the resulting autism models are of course preferred, but they serve to demonstrate the broad applicability of the invention across genes and corresponding models. Altered expression of one or more genomic sequences associated with a signaling biochemical pathway can be determined by assaying for a difference in the mRNA levels of the corresponding genes between test model cells and control cells when the test model cells and control cells are contacted with a candidate agent. Alternatively, differential expression of sequences associated with a signaling biochemical pathway is determined by detecting differences in the levels of the encoded polypeptide or gene product.
To determine agent-induced changes in the levels of mRNA transcripts or corresponding polynucleotides, the nucleic acids contained in the sample are first extracted according to standard methods in the art. For example, mRNA can be isolated using various lytic enzymes or chemical solutions according to the procedures set forth in Sambrook et al. (1989), or extracted with nucleic acid binding resins according to the manufacturer's instructions. mRNA contained in the extracted nucleic acid sample is then detected by amplification procedures or conventional hybridization assays (e.g., Northern blot analysis) according to methods widely known in the art or based on the methods exemplified herein.
For the purposes of the present invention, amplification means any method employing a primer and a polymerase capable of replicating the target sequence with reasonable fidelity. Amplification may be performed by natural or recombinant DNA polymerases, such as TaqGold™, T7 DNA polymerase, the Klenow fragment of E. coli DNA polymerase, and reverse transcriptase. A preferred method of amplification is PCR. In particular, isolated RNA may be subjected to a reverse transcription assay coupled with quantitative polymerase chain reaction (RT-PCR) in order to quantify the expression level of sequences associated with the signaling biochemical pathway.
The detection of the level of gene expression can be performed in real time in an amplification assay. In one aspect, the amplified product may be visualized directly with fluorescent DNA-binding agents, including, but not limited to, DNA intercalators and DNA minor groove binders. Since the amount of intercalator incorporated into double-stranded DNA molecules is generally proportional to the amount of amplified DNA product, the amount of amplified product can be conveniently determined by quantifying the fluorescence of the intercalating dye using conventional optical systems in the art. DNA-binding dyes suitable for this application include SYBR Green, SYBR Blue, DAPI, propidium iodide, Hoechst, SYBR Gold, ethidium bromide, acridines, proflavine, acridine orange, acriflavine, fluorocoumarin, ellipticine, daunomycin, chloroquine, distamycin D, chromomycin, homidium, mithramycin, ruthenium polypyridyls, anthracycline, and the like.
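As a non-limiting illustration of how real-time amplification data may be converted into a relative expression level between test model cells and control cells, the following Python sketch applies the commonly used comparative threshold-cycle (2^-ΔΔCt) calculation. This calculation is not recited in the present disclosure and is offered only as one possible approach; the function name relative_expression, the use of a reference (housekeeping) gene for normalization, and the assumed per-cycle amplification efficiency of 2.0 are all assumptions.

```python
def relative_expression(ct_target_test, ct_ref_test,
                        ct_target_ctrl, ct_ref_ctrl,
                        efficiency=2.0):
    """Fold change of a target transcript in test cells relative to
    control cells by the comparative Ct method (an assumed method).

    Each argument is a threshold cycle (Ct); the reference gene is a
    hypothetical normalizer. efficiency is the assumed fold
    amplification per cycle (2.0 means perfect doubling).
    """
    delta_test = ct_target_test - ct_ref_test   # normalize test sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize control sample
    delta_delta = delta_test - delta_ctrl
    return efficiency ** -delta_delta


# Example: the target crosses threshold two cycles earlier in test cells.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```

A lower threshold cycle corresponds to more starting template, so a negative ΔΔCt yields a fold change greater than one.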
In another aspect, other fluorescent labels (such as sequence-specific probes) may be used in the amplification reaction to facilitate detection and quantification of the amplified product. Probe-based quantitative amplification relies on sequence-specific detection of the desired amplification product. It utilizes fluorescent, target-specific probes (e.g., TaqMan® probes), resulting in increased specificity and sensitivity. Methods for performing probe-based quantitative amplification are well established in the art and are taught in U.S. Pat. No. 5,210,015.
In yet another aspect, conventional hybridization assays may be performed using hybridization probes having sequence homology to sequences associated with signaling biochemical pathways. Typically, in hybridization reactions, probes are allowed to form stable complexes with sequences associated with signaling biochemical pathways contained in a biological sample derived from a test subject. It will be appreciated by those skilled in the art that in the case where an antisense nucleic acid is used as the probe nucleic acid, the target polynucleotide provided in the sample is selected to be complementary to the sequence of the antisense nucleic acid. In contrast, where the nucleotide probe is a sense nucleic acid, the target polynucleotide is selected to be complementary to the sequence of the sense nucleic acid.
Hybridization can be performed under conditions of varying stringency. Suitable hybridization conditions for practicing the present invention provide sufficient specificity and sufficient stability for the recognition interaction between the probe and the sequence associated with the signaling biochemical pathway. Conditions for increasing the stringency of hybridization reactions are widely known and published in the art. See, e.g., Sambrook et al. (1989); Nonradioactive In Situ Hybridization Application Manual, Boehringer Mannheim, second edition. Hybridization assays can be performed using probes immobilized on any solid support, including, but not limited to, nitrocellulose, glass, silicon, and a variety of gene arrays. Preferred hybridization assays are performed on high-density gene chips as described in U.S. Pat. No. 5,445,934.
To facilitate detection of probe-target complexes formed during the hybridization assay, the nucleotide probes are conjugated to a detectable label. Detectable labels suitable for use in the present invention include any composition that is detectable photochemically, biochemically, spectroscopically, immunochemically, electrically, optically or chemically. A variety of suitable detectable labels are known in the art, including fluorescent or chemiluminescent labels, radioisotope labels, enzymes, or other ligands. In preferred embodiments, it may be desirable to employ a fluorescent label or an enzymatic tag, such as digoxigenin, beta-galactosidase, urease, alkaline phosphatase or peroxidase, or an avidin/biotin complex.
The detection method used to detect or quantify the hybridization intensity typically depends on the label selected above. For example, radiolabels may be detected using photographic film or a phosphorimager. Fluorescent markers may be detected and quantified using a photodetector to detect the emitted light. Enzyme labels are typically detected by providing the enzyme with a substrate and measuring the reaction product resulting from the action of the enzyme on the substrate; finally, colorimetric labels are detected by simply visualizing the colored label.
Agent-induced changes in the expression of sequences associated with signaling biochemical pathways can also be determined by examining the corresponding gene products. Determining protein levels generally involves (a) contacting a protein contained in a biological sample with an agent that specifically binds to a protein associated with a signaling biochemical pathway; and (b) identifying any agent:protein complex so formed. In one aspect of this embodiment, the agent that specifically binds to a protein associated with a signaling biochemical pathway is an antibody, preferably a monoclonal antibody.
The reaction is performed by contacting the agent with a sample of the proteins associated with the signaling biochemical pathway derived from the test sample, under conditions that allow a complex to form between the agent and the protein associated with the signaling biochemical pathway. The formation of the complex may be detected directly or indirectly according to standard procedures in the art. In a direct detection method, the agent is provided with a detectable label and unreacted agent can be removed from the complex; the amount of label remaining thereby indicates the amount of complex formed. For such a method, it is preferable to select a label that remains attached to the agent even under stringent wash conditions. Labels that do not interfere with the binding reaction are preferred. In the alternative, an indirect detection procedure may use an agent containing a chemically or enzymatically introduced label. The desired label generally does not interfere with the binding or stability of the resulting agent:polypeptide complex. However, labels are typically designed to be accessible to antibodies for efficient binding, thereby producing a detectable signal.
A variety of labels suitable for detecting protein levels are known in the art. Non-limiting examples include radioisotopes, enzymes, colloidal metals, fluorescent compounds, bioluminescent compounds, and chemiluminescent compounds.
The amount of agent:polypeptide complex formed during the binding reaction can be quantified by standard quantitative assays. As described above, the formation of the agent:polypeptide complex can be measured directly by the amount of label retained at the binding site. In an alternative, proteins associated with signaling biochemical pathways are tested for their ability to compete with labeled analogs for binding sites on a particular agent. In this competitive assay, the amount of label captured is inversely proportional to the amount of protein sequence present in the test sample that is associated with the signaling biochemical pathway.
Many techniques for protein analysis based on the general principles outlined above are available in the art. These include, but are not limited to, radioimmunoassays, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, in situ immunoassays (using, for example, colloidal gold, enzyme or radioisotope labels), western blot analysis, immunoprecipitation assays, immunofluorescent assays, and SDS-PAGE.
Antibodies that specifically recognize or bind to proteins associated with signaling biochemical pathways are preferred for performing the protein assays mentioned above. Where desired, antibodies that recognize a particular type of post-translational modification (e.g., a modification induced by a signaling biochemical pathway) may be used. Post-translational modifications include, but are not limited to, glycosylation, lipidation, acetylation, and phosphorylation. These antibodies can be purchased from commercial suppliers. For example, anti-phosphotyrosine antibodies that specifically recognize tyrosine-phosphorylated proteins are available from a number of suppliers, including Invitrogen and PerkinElmer. Anti-phosphotyrosine antibodies are particularly useful for detecting proteins that are differentially phosphorylated at their tyrosine residues in response to ER stress. Such proteins include, but are not limited to, eukaryotic translation initiation factor 2 alpha (eIF-2α). Alternatively, conventional polyclonal or monoclonal antibody techniques can be used to produce such antibodies by immunizing a host animal or antibody-producing cell with a target protein that exhibits the desired post-translational modification.
Whole genome knockout screening
The TnpB polypeptides and systems described herein can be used to perform efficient and cost-effective functional genomic screening. Such screening may utilize a whole genome library based on TnpB polypeptides. Such screens and libraries may provide for determining the function of the gene, the cellular pathways in which the gene is involved, and how any alteration in gene expression may result in a particular biological process. One advantage of the present invention is that the composition avoids off-target binding and its resulting side effects. This is achieved using a system arranged to have a high degree of sequence specificity for the target DNA. In a preferred embodiment of the invention, the TnpB polypeptide complex is a TnpB polypeptide complex.
In embodiments of the invention, a whole genome library may comprise a plurality of TnpB polypeptide nucleic acid component molecules, as described herein, comprising guide/spacer sequences capable of targeting a plurality of target sequences in a plurality of genomic loci in a eukaryotic cell population. The cell population may be an embryonic stem (ES) cell population. The target sequence in the genomic locus may be a non-coding sequence. The non-coding sequence may be an intron, a regulatory sequence, a splice site, a 3' UTR, a 5' UTR, or a polyadenylation signal. The gene function of one or more gene products can be altered by the targeting. Targeting can result in knockout of gene function. The targeting of a gene product may comprise more than one nucleic acid component molecule. A gene product may be targeted by 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleic acid component molecules, preferably 3 to 4 per gene. Off-target modifications (see, e.g., DNA targeting specificity of RNA-guided Cas9 nucleases. Hsu, P., Scott, D., Weinstein, J., Ran, FA., Konermann, S., Agarwala, V., Li, Y., Fine, E., Wu, X., Shalem, O., Cradick, TJ., Marraffini, LA., Bao, G., & Zhang, F. Nat Biotechnol doi:10.1038/nbt.2647 (2013), which is incorporated herein by reference) can be minimized by use of staggered double-strand breaks created by the TnpB polypeptide complex or by use of methods similar to those used in the compositions. Targeting may be of about 100 or more sequences. Targeting may be of about 1000 or more sequences. Targeting may be of about 20,000 or more sequences. Targeting may be of the entire genome. Targeting may be of a set of target sequences focused on a relevant or desired pathway. The pathway may be an immune pathway. The pathway may be a cell division pathway.
One aspect of the invention encompasses whole genome libraries, which may comprise a plurality of nucleic acid component molecules, which may comprise guide/spacer sequences capable of targeting a plurality of target sequences in a plurality of genomic loci, wherein the targeting results in a knockout of gene function. This library may contain nucleic acid component molecules that target each gene in the genome of the organism.
In one embodiment of the invention, the organism or subject is a eukaryotic organism (including mammals, including humans) or a non-human eukaryotic organism or a non-human animal or a non-human mammal. In one embodiment, the organism or subject is a non-human animal and may be an arthropod, such as an insect, or may be a nematode. In some methods of the invention, the organism or subject is a plant. In some methods of the invention, the organism or subject is a mammal or a non-human mammal. The non-human mammal may be, for example, a rodent (preferably a mouse or a rat), ungulate or primate. In some methods of the invention, the organism or subject is an alga, including microalgae, or a fungus.
The knockout of the gene function may include introducing a vector system comprising one or more vectors of the engineered, non-naturally occurring compositions herein into each cell of the population of cells. The nucleic acid component molecule sequence can target a unique gene in each cell, wherein the TnpB polypeptide is operably linked to regulatory elements, wherein upon transcription, the nucleic acid component molecule comprising a spacer sequence directs sequence-specific binding of the TnpB polypeptide to a target sequence in a genomic locus of the unique gene to induce cleavage of the genomic locus by the TnpB polypeptide and confirm different knockout mutations in a plurality of unique genes in each cell of the cell population, thereby generating a library of gene knockout cells. The invention encompasses that the cell population is a eukaryotic cell population, and in a preferred embodiment, the cell population is an Embryonic Stem (ES) cell population.
The one or more vectors may be plasmid vectors. The vector may be a single vector comprising the TnpB polypeptide, the nucleic acid component and, optionally, a selectable marker for entry into a target cell. Without being bound by theory, the ability to simultaneously deliver the TnpB polypeptide and the nucleic acid component via a single vector enables application to any cell type of interest without first generating a cell line that expresses the TnpB polypeptide. The regulatory element may be an inducible promoter. The inducible promoter may be a doxycycline-inducible promoter. In some methods of the invention, expression of the nucleic acid component molecule sequence is under the control of a T7 promoter and is driven by expression of a T7 polymerase. Confirmation of the different knockout mutations can be performed by whole-exome sequencing. Knockout mutations can be achieved in 100 or more unique genes. Knockout mutations can be achieved in 1000 or more unique genes. Knockout mutations can be achieved in 20,000 or more unique genes. Knockout mutations can be achieved throughout the genome. The knockout of gene function may be accomplished in a plurality of unique genes that function in a specific physiological pathway or condition. The pathway or condition may be an immune pathway or condition. The pathway or condition may be a cell division pathway or condition.
Functional alterations and screening
In another aspect, the invention provides a method of functional assessment and screening of genes. The composition may be used to precisely deliver functional domains, to activate or repress genes, or to alter epigenetic state by precisely altering the methylation site at a specific locus of interest. Such use may be ex vivo or in vivo, with one or more nucleic acid component molecules applied to a single cell or a population of cells, or with a genome-wide library applied to a pool of cells, and includes administration or expression of a library comprising a plurality of nucleic acid components (comprising spacer molecules), wherein the screening further includes use of a TnpB polypeptide and wherein the complex comprising the TnpB polypeptide is modified to comprise a heterologous functional domain. In one aspect, the invention provides a method for screening a genome comprising administering a library to a host or expressing the library in a host. In one aspect, the invention provides a method as discussed herein, further comprising administering to or expressing in a host an activator. In one aspect, the invention provides a method as discussed herein, wherein the activator is attached to the TnpB polypeptide. In one aspect, the invention provides a method as discussed herein, wherein the activator is attached to the N-terminus or the C-terminus of the TnpB polypeptide. In one aspect, the invention provides a method as discussed herein, wherein the activator is attached to a loop of the nucleic acid component. In one aspect, the invention provides a method as discussed herein, further comprising administering to or expressing in a host a repressor. In one aspect, the invention provides a method as discussed herein, wherein screening comprises affecting and detecting gene activation, gene repression, or cleavage in a locus.
For example, in addition to promoters or promoter proximal elements, targeting endogenous (regulatory) control elements (such as enhancers and silencers) are also preferred. Thus, in addition to targeting promoters, the invention can also be used to target endogenous control elements (including enhancers and silencers). These control elements can be located upstream and downstream of the Transcription Start Site (TSS), from 200bp to 100kb away from the TSS. Targeting known control elements may be used to activate or repress genes of interest. In some cases, a single control element may affect transcription of multiple target genes. Thus, targeting a single control element can be used to control transcription of multiple genes simultaneously.
In another aspect, targeting putative control elements (e.g., by tiling the region of the putative control element as well as 200 bp up to 100 kb around the element) can be used as a means to verify such elements (by measuring transcription of the gene of interest) or to detect novel control elements (e.g., by tiling 100 kb upstream and downstream of the TSS of the gene of interest). Furthermore, targeting putative control elements may be useful in the context of understanding the genetic causes of disease. Many mutations and common SNP variants associated with disease phenotypes are located outside coding regions. After targeting such regions with the activation or repression systems described herein, transcription can be read out for either a) a set of putative targets (e.g., a set of genes located in closest proximity to the control element) or b) the whole transcriptome, by, for example, RNAseq or microarray. This would allow the identification of likely candidate genes involved in the disease phenotype. Such candidate genes may be useful as novel drug targets.
Histone acetyltransferase (HAT) inhibitors are mentioned herein. However, in alternative embodiments the one or more functional domains comprise an acetyltransferase, preferably a histone acetyltransferase. These are useful in the field of epigenomics, for example in methods of interrogating the epigenome. Methods of interrogating the epigenome can include, for example, targeting epigenomic sequences. Targeting an epigenomic sequence may include directing a nucleic acid component molecule to an epigenomic target sequence. The epigenomic target sequence may include a promoter, silencer or enhancer sequence.
Saturation mutagenesis
The compositions herein can be used to perform saturation or deep scanning mutagenesis of genomic loci in conjunction with a cellular phenotype, e.g., to determine the critical minimal features and discrete vulnerabilities of functional elements required for gene expression, drug resistance, and reversal of disease. Saturation or deep scanning mutagenesis means that every or substantially every DNA base within a genomic locus is cleaved. A library of TnpB polypeptide nucleic acid component molecules can be introduced into a population of cells. The library may be introduced such that each cell receives a single nucleic acid component. Where the library is introduced by transduction with a viral vector, a low multiplicity of infection (MOI) is used, as described herein. The library may include nucleic acid components that target every sequence upstream of a target adjacent motif (TAM) sequence in the genomic locus. For every 1000 base pairs within a genomic locus, the library may include at least 100 non-overlapping genomic sequences upstream of a TAM sequence. The library may comprise nucleic acid components that target sequences upstream of at least one different TAM sequence. The composition may comprise more than one TnpB polypeptide. Any TnpB polypeptide as described herein, including orthologs or engineered TnpB polypeptides, may be used. The off-target site frequency of a nucleic acid component may be less than 500. Off-target scores can be generated to select the nucleic acid components with the fewest off-target sites. Any phenotype determined to be associated with cleavage at a target site of a nucleic acid component can be confirmed by using a nucleic acid component targeting the same site in a single experiment. Verification of the target site may also be performed by using a modified TnpB polypeptide as described herein and two nucleic acid components targeting the genomic site of interest.
Without being bound by theory, if a phenotypic change is observed in the validation experiment, the target site is a true hit.
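As a non-limiting illustration of how a saturating library might be enumerated, the following Python sketch lists every candidate target sequence lying immediately upstream of a TAM occurrence in a locus, following the orientation recited above. The function name tile_targets, the default motif "TTGAT", the 20-nucleotide target length, and the restriction to a single strand are assumptions chosen for illustration; the actual TAM, target length, and orientation depend on the TnpB polypeptide used.

```python
def tile_targets(locus, tam="TTGAT", spacer_len=20):
    """Return (start, sequence) for each spacer_len window lying
    immediately upstream (5') of a TAM occurrence on the given strand.

    tam and spacer_len are illustrative placeholders, not values
    prescribed by the disclosure. Overlapping TAM hits are included.
    """
    locus = locus.upper()
    hits = []
    pos = locus.find(tam)
    while pos != -1:
        start = pos - spacer_len
        if start >= 0:  # skip TAMs too close to the 5' end of the locus
            hits.append((start, locus[start:pos]))
        pos = locus.find(tam, pos + 1)
    return hits


# Example with a short, made-up locus and a 10-nt target length.
print(tile_targets("ACGTACGTACTTGATGGGG", spacer_len=10))
```

The number of returned targets per 1000 base pairs of locus can then be compared against the library density contemplated above.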
The genomic locus may comprise at least one contiguous genomic region. The at least one contiguous genomic region may comprise up to the entire genome. The at least one contiguous genomic region may comprise a functional element of a genome. The functional element may be located within a non-coding region, a coding gene, an intronic region, a promoter or an enhancer. The at least one contiguous genomic region may comprise at least 1 kb, preferably at least 50 kb, of genomic DNA. The at least one contiguous genomic region may comprise a transcription factor binding site. The at least one contiguous genomic region may comprise a DNase I hypersensitive region. The at least one contiguous genomic region may comprise transcriptional enhancer or repressor elements. The at least one contiguous genomic region may comprise a site enriched for an epigenetic signature. The at least one contiguous genomic DNA region may comprise an epigenetic insulator. The at least one contiguous genomic region may comprise two or more contiguous genomic regions that physically interact. Genomic regions that interact can be determined by the "4C technique". The 4C technique allows screening of the entire genome, in an unbiased manner, for DNA segments that physically interact with a selected DNA fragment, as described in Zhao et al. ((2006) Nat Genet 38, 1341-7) and U.S. Pat. No. 8,642,295, both of which are incorporated herein by reference in their entirety. The epigenetic signature may be histone acetylation, histone methylation, histone ubiquitination, histone phosphorylation, DNA methylation, or a lack thereof.
Compositions for saturation or deep scanning mutagenesis may be used in cell populations. The compositions may be used in eukaryotic cells, including but not limited to mammalian and plant cells. The population of cells may be prokaryotic cells. The eukaryotic cell population may be a population of embryonic stem (ES) cells, neuronal cells, epithelial cells, immune cells, endocrine cells, muscle cells, erythrocytes, lymphocytes, plant cells, or yeast cells.
In one aspect, the invention provides a method of screening for a functional element associated with a phenotypic change. The library may be introduced into a population of cells suitable for containing the TnpB polypeptide. Cells can be sorted into at least two groups based on phenotype. The phenotype may be expression of a gene, cell growth, or cell viability. The relative representation of the nucleic acid component molecules present in each group is determined, whereby the genomic locus associated with the phenotypic change is determined from the representation of the nucleic acid component molecules present in each group. The phenotypic change may be a change in the expression of the gene of interest. The gene of interest may be up-regulated, down-regulated or knocked out. Cells can be sorted into high-expression groups and low-expression groups. The cell population may include a reporter construct for determining a phenotype. The reporter construct may comprise a detectable marker. Cells can be sorted by using a detectable marker.
In another aspect, the invention provides a method of screening for genomic loci associated with resistance to a chemical compound. The chemical compound may be a drug or an insecticide. The method may comprise introducing the library into a population of cells suitable for containing the TnpB polypeptide, wherein each cell of the population contains no more than one nucleic acid component molecule; treating the population of cells with the chemical compound; and determining the representation of the nucleic acid component molecules at a later time point as compared to an early time point after treatment with the chemical compound, whereby the genomic loci associated with resistance to the chemical compound are determined by enrichment of the nucleic acid components. The representation of the nucleic acid components can be determined by deep sequencing methods.
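As a non-limiting illustration of how enrichment may be computed from deep sequencing data, the following Python sketch compares the representation of each nucleic acid component between an early and a later time point as a log2 fold change of read fractions. The function name log2_enrichment, the dictionary input format, and the pseudocount of 1.0 (added so that components with zero reads do not cause division by zero) are assumptions for illustration and are not prescribed by the present disclosure.

```python
import math


def log2_enrichment(early_counts, late_counts, pseudocount=1.0):
    """Per-component log2 fold change in read fraction, late vs early.

    early_counts and late_counts map a component identifier to its
    sequencing read count. Positive values indicate enrichment at the
    later time point. The pseudocount is an assumed smoothing choice.
    """
    ids = set(early_counts) | set(late_counts)
    early_total = sum(early_counts.values()) + pseudocount * len(ids)
    late_total = sum(late_counts.values()) + pseudocount * len(ids)
    scores = {}
    for cid in ids:
        early_frac = (early_counts.get(cid, 0) + pseudocount) / early_total
        late_frac = (late_counts.get(cid, 0) + pseudocount) / late_total
        scores[cid] = math.log2(late_frac / early_frac)
    return scores


# Example with made-up read counts for two components.
scores = log2_enrichment({"g1": 99, "g2": 99}, {"g1": 399, "g2": 99})
for cid in sorted(scores):
    print(cid, round(scores[cid], 3))
```

The same calculation can be applied to two phenotype-sorted groups, for example high-expression and low-expression groups, in place of the two time points.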
Methods useful in the practice of the present invention with the compositions are described in the article entitled "BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis," Canver, M.C., Smith, E.C., Sher, F., Pinello, L., Sanjana, N.E., Shalem, O., Chen, D.D., Schupp, P.G., Vinjamur, D.S., Garcia, S.P., Luc, S., Kurita, R., Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G., Zhang, F., Orkin, S.H., and Bauer, D.E., doi:10.1038/nature15521, published online September 16, 2015, which is incorporated herein by reference in its entirety for all purposes and is discussed briefly below.
Canver et al. relates to novel pooled guide RNA libraries for in situ saturating mutagenesis of the human and mouse BCL11A erythroid enhancers, which were previously identified as enhancers associated with fetal hemoglobin (HbF) levels and whose mouse ortholog is essential for erythroid BCL11A expression. This approach revealed the critical minimal features and discrete vulnerabilities of these enhancers. By editing primary human progenitor cells and by mouse transgenesis, the authors validated the BCL11A erythroid enhancer as a target for HbF reinduction. The authors generated a detailed enhancer map that informs therapeutic genome editing.
Modification of cells or organisms
The present disclosure also provides cells comprising one or more components of the systems herein (e.g., TnpB polypeptide and/or nucleic acid components). Also provided are cells modified by the systems and methods herein, as well as cell cultures, tissues, organs, and organisms comprising such cells or their progeny. In one embodiment, the invention encompasses a method of modifying a cell or organism. The cells may be prokaryotic or eukaryotic. The cell may be a mammalian cell. Mammalian cells may be non-human primate, bovine, porcine, rodent or mouse cells. The cells may be non-mammalian eukaryotic cells, such as poultry, fish or shrimp cells. The cell may also be a plant cell. The plant cell may be a cell of a crop plant (such as cassava, maize, sorghum, wheat, or rice). The plant cell may also be a cell of an alga, tree or vegetable. Modifications introduced into cells by the present invention can allow cells and cell progeny to be altered to improve the production of biological products such as antibodies, starches, alcohols, or other desired cellular outputs. The modification introduced into the cells by the present invention may be such that the cells and cell progeny include alterations that change the biological product produced.
Therapeutic uses and methods of treatment
Also provided herein are methods of diagnosing, prognosing, treating and/or preventing a disease, state, or condition in a subject. Generally, methods of diagnosing, prognosing, treating, and/or preventing a disease, state, or condition in a subject can include modifying a polynucleotide in a subject or a cell thereof using a composition, system, or component thereof described herein, and/or include detecting a diseased or healthy polynucleotide in a subject or a cell thereof using a composition, system, or component thereof described herein. In one embodiment, a method of treatment or prophylaxis may comprise modifying a polynucleotide of an infectious organism (e.g., a bacterium or virus) in a subject or cell thereof using a composition, system, or component thereof. In one embodiment, a method of treatment or prevention may include using a composition, system, or component thereof to modify a polynucleotide of an infectious or symbiotic organism in a subject. The compositions, systems, and components thereof may be used to develop models of diseases, states, or conditions. The compositions, systems, and components thereof may be used to detect a disease state or correction thereof, such as by the therapeutic or prophylactic methods described herein. The compositions, systems, and components thereof can be used to screen and select cells that can be used, for example, as a treatment or prophylaxis as described herein. The compositions, systems, and components thereof may be used to develop bioactive agents that may be used to alter one or more biological functions or activities in a subject or cells thereof.
Generally, the methods can include delivering the compositions, systems, and/or components thereof to a subject or cells thereof, or to an infectious or symbiotic organism, by suitable delivery techniques and/or compositions. Once administered, the components may function as described elsewhere herein to trigger a nucleic acid modification event. In some aspects, the nucleic acid modification event can occur at the genomic, epigenomic, and/or transcriptomic levels. DNA and/or RNA cleavage, gene activation and/or gene deactivation may occur. Additional features, uses, and advantages are described in more detail below. Based on this concept, several variations are suitable for triggering genomic locus events, including DNA cleavage, gene activation or gene deactivation. Using the provided compositions, one of skill in the art can advantageously and specifically target single or multiple loci with the same or different functional domains to trigger one or more genomic locus events. In addition to treating and/or preventing a disease in a subject, the compositions can also be applied to a variety of methods for screening in cell libraries and for in vivo functional modeling (e.g., gene activation and functional identification of lincRNA; gain-of-function modeling; loss-of-function modeling; and use of the compositions of the invention to establish cell lines and transgenic animals for optimization and screening purposes).
The compositions, systems, and components thereof described elsewhere herein may be used to treat and/or prevent a disease, such as a genetic and/or epigenetic disease, in a subject. The compositions, systems, and components thereof described elsewhere herein may be used to treat and/or prevent genetic infectious diseases in a subject, such as bacterial infections, viral infections, fungal infections, parasitic infections, and combinations thereof. The compositions, systems, and components thereof described elsewhere herein may be used to alter the composition or characteristics of a subject's microbiome, which in turn may alter the health state of the subject. The compositions, systems described herein can be used to modify cells ex vivo, which can then be administered to a subject, whereby the modified cells can treat or prevent a disease or symptom thereof. In some cases, this is also referred to as adoptive therapy. The compositions, systems described herein can be used to treat mitochondrial diseases, wherein the etiology of the mitochondrial disease involves mutations in mitochondrial DNA.
Also provided is a method of treating a subject (e.g., a subject in need thereof) comprising inducing gene editing by transforming the subject with a polynucleotide encoding one or more components of a composition, system, or complex or any of the polynucleotides or vectors described herein and administering them to the subject. Suitable repair templates may also be provided, for example, for delivery by a vector comprising the repair templates. The repair template herein may be a recombinant template. Also provided is a method of treating a subject (e.g., a subject in need thereof) comprising inducing transcriptional activation or repression of a plurality of target loci by transforming the subject with a polynucleotide or vector as described herein, wherein the polynucleotide or vector encodes or comprises one or more components of a composition, system, complex, or component thereof comprising a plurality of TnpB polypeptides. In the case where any treatment occurs ex vivo (e.g., in a cell culture), then it is understood that the term "subject" may be replaced by the phrase "cell or cell culture".
Also provided is a method of treating a subject (e.g., a subject in need thereof) comprising inducing gene editing by transforming the subject with a TnpB polypeptide, thereby advantageously encoding and expressing the remainder of the composition, system (e.g., RNA) in vivo. Suitable repair templates may also be provided, for example, for delivery by a vector comprising the repair templates. Also provided is a method of treating a subject (e.g., a subject in need thereof) comprising inducing transcriptional activation or repression by transforming the subject with a TnpB polypeptide that advantageously encodes and expresses the composition, the remainder of the system (e.g., a nucleic acid component molecule) in vivo; advantageously, in one embodiment, the TnpB polypeptide is a catalytically inactive TnpB polypeptide and comprises one or more relevant functional domains. In the case where any treatment occurs ex vivo (e.g., in a cell culture), then it is understood that the term "subject" may be replaced by the phrase "cell or cell culture".
One or more components of the compositions and systems described herein may be included in a composition, such as a pharmaceutical composition, and administered to a host, either alone or in combination. Alternatively, these components may be provided in the form of a single composition for administration to a host. The administration to a host may be via a viral vector (e.g., lentiviral vector, adenoviral vector, AAV vector) known to the skilled artisan or described herein for delivery to the host. As explained herein, the use of different selection markers (e.g., for lentiviral nucleic acid component selection) and nucleic acid component concentrations (e.g., depending on whether multiple nucleic acid components are used) may be advantageous to elicit an improved effect.
Thus, also described herein are methods of inducing one or more polynucleotide modifications in a eukaryotic or prokaryotic cell of a subject, or a component thereof (e.g., mitochondria), an infectious organism, and/or an organism of a microbiome of a subject. Modification may include the introduction, deletion or substitution of one or more nucleotides at the target sequence of a polynucleotide of one or more cells. Modification may be performed in vitro, ex vivo, in situ, or in vivo.
In one embodiment, a method of treating or inhibiting a condition or disease caused by one or more mutations in a genomic locus in a eukaryotic organism or a non-human organism may comprise manipulating a target sequence within coding, non-coding or regulatory elements of the genomic locus in a target sequence of a subject or non-human subject in need thereof, including modifying the subject or non-human subject by manipulation of the target sequence, and wherein the condition or disease is susceptible to treatment or inhibition by manipulation of the target sequence, comprising providing a treatment comprising delivering a composition comprising a particle delivery system or viral particle of any of the embodiments or a cell of any of the embodiments.
Also provided herein is the use of the particle delivery system or viral particle of any of the above embodiments or the cell of any of the above embodiments in ex vivo or in vivo gene or genome editing; or for in vitro, ex vivo or in vivo gene therapy. Also provided herein are particle delivery systems, non-viral delivery systems and/or viral particles of any of the above embodiments or cells of any of the above embodiments for use in the manufacture of a medicament for in vitro, ex vivo or in vivo gene or genome editing, or for use in vitro, ex vivo or in vivo gene therapy, or for use in a method of modifying an organism or non-human organism by manipulating a target sequence in a genomic locus associated with a disease, or for use in a method of treating or inhibiting a condition or disease caused by one or more mutations in a genomic locus of a eukaryotic organism or non-human organism.
In one embodiment, polynucleotide modification may include the introduction, deletion or substitution of 1-75 nucleotides at each target sequence of the polynucleotide of the cell. Modification may include introducing, deleting, or substituting at least 1, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence. The modification may comprise introducing, deleting or substituting at least 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50 or 75 nucleotides at each target sequence of the cell. The modification may comprise introducing, deleting or substituting at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50 or 75 nucleotides at each target sequence of the cell. The modification may comprise introducing, deleting or substituting at least 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50 or 75 nucleotides at each target sequence of the cell. The modification may comprise introducing, deleting or substituting at least 40, 45, 50, 75, 100, 200, 300, 400 or 500 nucleotides at each target sequence of the cell. The modification may include introducing, deleting, or substituting at least 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 4900, 5000, 5100, 5200, 5300, 5400, 5500, 5600, 5700, 5800, 5900, 6000, 6100, 6200, 6300, 6400, 6500, 6600, 6700, 6800, 6900, 7000, 7100, 7200, 7300, 7400, 7500, 7600, 7700, 7800, 7900, 8000, 8100, 8200, 8300, 8400, 8500, 8600, 8700, 8800, 8900, 9000, 9100, 9200, 9300, 9400, 9500, 9600, 9700, 9800, or 9900 nucleotides at each target sequence of the cell.
In one embodiment, the modification may include the introduction, deletion, or substitution of nucleotides at each target sequence of the cell via a nucleic acid component (e.g., a nucleic acid component molecule, RNA, or nucleic acid component), such as those mediated by the compositions, systems, or components thereof described elsewhere herein. In one embodiment, the modification may include introducing, deleting or substituting nucleotides at the target sequence or random sequence of the cell via a composition, system or technique.
In one embodiment, the composition, system, or component thereof may promote non-homologous end joining (NHEJ). In one embodiment, the modification of a polynucleotide (such as a diseased polynucleotide) by a composition, system, or component thereof may include NHEJ. In one embodiment, promotion of this repair pathway by a composition, system, or component thereof may be used for targeted gene-specific or polynucleotide-specific knockouts and/or knockins. In one embodiment, promotion of this repair pathway by a composition, system, or component thereof may be used to create NHEJ-mediated indels. Nuclease-induced NHEJ can also be used to remove (e.g., delete) sequences in a gene of interest. Generally, NHEJ repairs a double-strand break in DNA by joining the two ends together; however, the original sequence is generally restored only if two compatible ends, exactly as they were formed by the double-strand break, are perfectly ligated. The DNA ends of a double-strand break are frequently subject to enzymatic processing, resulting in the addition or removal of nucleotides at one or both strands prior to rejoining of the ends. This results in the presence of insertion and/or deletion (indel) mutations in the DNA sequence at the site of NHEJ repair. Indels can range in size from 1 to 50 base pairs or more.
In one embodiment, the indel may be 1, 2, 3, and so on in increments of one base pair through 499 base pairs in length, or more. If a double-strand break is targeted near a short target sequence, the deletion mutations caused by NHEJ repair often span, and therefore remove, the unwanted nucleotides. For the deletion of larger DNA segments, introducing two double-strand breaks (one on each side of the sequence) can result in NHEJ between the ends with removal of the entire intervening sequence. Both methods can be used to delete specific DNA sequences.
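As a simplified, non-limiting sketch of the two-break deletion described above, the following example joins the sequence flanking two double-strand breaks and drops the intervening segment. The function names, the toy sequence, and the cut coordinates are hypothetical; the frameshift check reflects the general observation that an indel whose length is not a multiple of three shifts the reading frame, and is included here only as an illustrative assumption about how a deletion might be assessed.

```python
def delete_between_cuts(seq, cut_left, cut_right):
    """Join the ends flanking two double-strand breaks, dropping the
    intervening sequence. Cut positions are 0-based indices between bases:
    a cut at i separates seq[:i] from seq[i:]."""
    if not 0 <= cut_left <= cut_right <= len(seq):
        raise ValueError("cut positions must satisfy 0 <= left <= right <= len(seq)")
    return seq[:cut_left] + seq[cut_right:]

def is_frameshift(indel_length):
    """An indel shifts the reading frame when its length is not a multiple of 3."""
    return indel_length % 3 != 0

# Toy locus with one break on each side of the CCCCGGGG segment.
locus = "AAAACCCCGGGGTTTT"
edited = delete_between_cuts(locus, 4, 12)
deleted_length = len(locus) - len(edited)
```

This idealized model assumes perfect joining of the two outer ends; as noted above, end processing during NHEJ may add or remove further nucleotides.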
In one embodiment, NHEJ mediated by the composition or system may be used in a method of deleting small sequence motifs. In one embodiment, NHEJ mediated by the composition or system can be used in a method of producing NHEJ-mediated indels that target a gene (e.g., a coding region, such as an early coding region of a gene of interest) and can be used to knock out (i.e., eliminate expression of) the gene of interest. For example, the early coding region of the gene of interest includes a sequence immediately after the transcription start site, within the first exon of the coding sequence, or within 500 bp (e.g., less than 500, 450, 400, 350, 300, 250, 200, 150, 100, or 50 bp) of the transcription start site. In embodiments wherein the nucleic acid component and the TnpB polypeptide create a double-strand break to induce NHEJ-mediated indels, the nucleic acid component may be configured to position one double-strand break in close proximity to a nucleotide at the target position. In embodiments, the cleavage site may be 0-500 bp from the target position (e.g., less than 500, 400, 300, 200, 100, 50, 40, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 bp from the target position). In embodiments wherein two component RNAs complexed with one or more nicking enzymes induce two single-strand breaks to induce NHEJ-mediated indels, the two component RNAs may be configured to position the two single-strand breaks so as to provide NHEJ repair of a nucleotide at the target position.
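As a hedged illustration of the distance criterion above, the following sketch filters candidate cleavage sites to those lying within a chosen number of base pairs of a target position. The function name `sites_within_window`, the coordinates, and the 50 bp window are hypothetical values chosen for the example; they are not parameters defined by the disclosure.

```python
def sites_within_window(cut_sites, target_position, max_distance=500):
    """Return (site, distance) pairs for cleavage sites lying within
    max_distance bp of the target position, nearest first."""
    hits = [
        (site, abs(site - target_position))
        for site in cut_sites
        if abs(site - target_position) <= max_distance
    ]
    return sorted(hits, key=lambda pair: pair[1])

# Hypothetical cleavage coordinates for five candidate nucleic acid components.
candidates = [120, 980, 1495, 1510, 2300]
nearby = sites_within_window(candidates, target_position=1500, max_distance=50)
```

The default window of 500 bp mirrors the broadest range recited above; a narrower window can be passed to prefer breaks closer to the target position.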
To minimize toxicity and off-target effects, it may be important to control the concentration of the delivered TnpB polypeptide mRNA and component RNAs. The optimal concentrations of the TnpB polypeptide mRNA and component RNAs can be determined by testing different concentrations in a cellular or non-human eukaryotic animal model and analyzing the extent of modification of potentially off-target genomic sites using deep sequencing. Alternatively, to minimize the level of toxicity and off-target effects, a nicking enzyme mRNA (e.g., mutated TnpB) may be delivered with a pair of nucleic acid components targeting a site of interest.
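The titration described above can be sketched, under stated assumptions, as follows: for each tested concentration, the fraction of modified reads is computed from deep sequencing at the on-target site and at potential off-target sites, and the concentration with the lowest worst-case off-target modification is chosen among those reaching a minimum on-target level. The threshold, dose values, read counts, and function names are hypothetical and serve only to illustrate the comparison.

```python
def indel_fraction(edited_reads, total_reads):
    """Fraction of deep-sequencing reads at a site that carry a modification."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return edited_reads / total_reads

def pick_dose(results, min_on_target=0.2):
    """Among doses whose on-target modification reaches min_on_target, return
    the dose with the lowest worst-case off-target modification (None if no
    dose qualifies)."""
    best = None
    for dose, sites in results.items():
        on_target = indel_fraction(*sites["on"])
        if on_target < min_on_target:
            continue
        worst_off = max((indel_fraction(*pair) for pair in sites["off"]), default=0.0)
        if best is None or worst_off < best[1]:
            best = (dose, worst_off)
    return best[0] if best else None

# Hypothetical titration: dose -> (edited reads, total reads) per site.
titration = {
    50: {"on": (100, 1000), "off": [(1, 1000), (0, 1000)]},
    100: {"on": (350, 1000), "off": [(5, 1000), (2, 1000)]},
    200: {"on": (600, 1000), "off": [(40, 1000), (10, 1000)]},
}
chosen = pick_dose(titration)
```

Other selection rules (for example, maximizing the ratio of on-target to off-target modification) would be equally consistent with the description; the rule used here is one possible choice.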
Generally, in the context of endogenous TnpB polypeptides, formation of a TnpB polypeptide or complex (comprising a polynucleotide component sequence that hybridizes to a target sequence and is complexed with one or more TnpB polypeptides) results in cleavage, nicking, and/or another modification of one or both strands in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from the target sequence).
In one embodiment, a method of modifying a target polynucleotide in a cell to treat or prevent a disease can include allowing a composition, system, or component thereof to bind to the target polynucleotide, e.g., to effect a cleavage, nicking, or other modification of the target polynucleotide by the composition, system, or component thereof, thereby modifying the target polynucleotide, wherein the composition, system, or component thereof is complexed with a nucleic acid component molecular sequence, and hybridizing the nucleic acid component molecular sequence to a target sequence within the target polynucleotide, wherein the nucleic acid component molecular sequence is optionally linked to a nucleic acid component scaffold sequence. In some of these embodiments, the composition, system, or component thereof may be or include a TnpB polypeptide complexed with a nucleic acid component molecular sequence. In one embodiment, the modification may include cleavage or nicking of one or both strands at the position of the target sequence by one or more components of the composition, system, or component thereof.
Cleavage, nicking or other modification that can be made by the composition, system can modify transcription of the target polynucleotide. In one embodiment, modification of transcription may include reducing transcription of the target polynucleotide. In one embodiment, the modification may include increasing transcription of the target polynucleotide. In one embodiment, the method comprises repairing the cleaved target polynucleotide by homologous recombination with a recombinant template polynucleotide, wherein the repair results in a modification such as, but not limited to, an insertion, deletion, or substitution of one or more nucleotides of the target polynucleotide. In one embodiment, the modification results in one or more amino acid changes in a protein expressed by a gene comprising the target sequence. In one embodiment, the modifications conferred by the composition, system or component thereof provide transcripts and/or proteins that can correct a disease or symptom thereof, including but not limited to any disease or symptom thereof described in more detail elsewhere herein.
In one embodiment, a method of treating or preventing a disease may comprise delivering one or more vectors or vector systems to a cell, such as a eukaryotic or prokaryotic cell, wherein the one or more vectors or vector systems comprise a composition, system, or component thereof. In one embodiment, the vector or vector system may be a viral vector or vector system, such as an AAV or lentiviral vector system, described in more detail elsewhere herein. In one embodiment, a method of treating or preventing a disease may comprise delivering one or more viral particles, such as AAV or lentiviral particles, comprising a composition, system, or component thereof. In one embodiment, the viral particles have tissue-specific tropism. In one embodiment, the viral particles have liver, muscle, eye, heart, pancreas, kidney, neuron, epithelial cell, endothelial cell, astrocyte, glial cell, immune cell or erythrocyte specific tropism.
It will be appreciated that the compositions and systems according to the invention as described herein, such as those for use in the methods according to the invention as described herein, may be suitably used for any type of application known for such compositions and systems, preferably in eukaryotes. In certain aspects, the application is therapeutic, preferably therapeutic in a eukaryotic organism, including but not limited to animals (including humans), plants, algae, fungi (including yeasts), and the like. Alternatively or additionally, in certain aspects, the application may involve achieving or inducing one or more specific traits or characteristics, such as genotypic and/or phenotypic traits or characteristics, as also described elsewhere herein.
Treatment of circulatory diseases
In one embodiment, the compositions, systems, and/or components thereof described herein may be used to treat and/or prevent circulatory disorders. Exemplary diseases are provided, for example, in Tables 2 and 3. In one embodiment, the plasma exosomes of Wahlgren et al. (Nucleic Acids Research, 2012, Vol. 40, No. 17, e130) can be used to deliver the compositions, systems, and/or components thereof described herein to the blood. In one embodiment, circulatory diseases can be treated by using lentiviral delivery of the compositions, systems described herein to modify hematopoietic stem cells (HSCs) in vivo or ex vivo (see, e.g., Drakopoulou, "Review Article, The Ongoing Challenge of Hematopoietic Stem Cell-Based Gene Therapy for β-Thalassemia," Stem Cells International, Volume 2011, Article ID 987980, 10 pages, doi:10.4061/2011/987980, which may be adapted for use with the compositions, systems herein in view of the description herein). In one embodiment, the circulatory disorder can be treated by correcting HSCs for the disease using a composition, system, or component thereof herein, wherein the composition, system optionally comprises a suitable HDR repair template (see, e.g., Cavazzana, "Outcomes of Gene Therapy for β-Thalassemia Major via Transplantation of Autologous Hematopoietic Stem Cells Transduced Ex Vivo with a Lentiviral βA-T87Q-Globin Vector"; Cavazzana-Calvo, "Transfusion independence and HMGA2 activation after gene therapy of human β-thalassaemia," Nature 467, 318-322 (September 16, 2010), doi:10.1038/nature09328; Nienhuis, "Development of Gene Therapy for Thalassemia," Cold Spring Harbor Perspectives in Medicine, doi:10.1101/cshperspect.a011833 (2012); LentiGlobin BB305, a lentiviral vector containing an engineered β-globin gene (βA-T87Q); Xie et al., "Seamless gene correction of β-thalassaemia mutations in patient-specific iPSCs using CRISPR/Cas9 and piggyBac," Genome Research gr.173427.114 (2014), www.genome.org/cgi/doi/10.1101/gr.173427.114 (Cold Spring Harbor Laboratory Press); and Watts, "Hematopoietic Stem Cell Expansion and Gene Therapy," Cytotherapy 13(10):1164-1171, doi:10.3109/14653249.2011.620748 (2011), each of which may be adapted for use with the compositions, systems herein in view of the description herein, in order to correct circulatory diseases as described herein). The teachings of Xu et al. (Sci Rep. 2015 Jul 9;5:12065. doi:10.1038/srep12065) and Song et al. (Stem Cells Dev. 2015;24(9):1053-65. doi:10.1089/scd.2014.0347) for modification of iPSCs may be adapted for use with the compositions, systems described herein.
The term "hematopoietic stem cell" or "HSC" broadly refers to those cells considered to be HSCs, e.g., blood cells that give rise to all other blood cells and are derived from mesoderm, and that are located in the red bone marrow contained in the core of most bones. HSCs of the invention include cells having a hematopoietic stem cell phenotype, identified by small size, lack of lineage (lin) markers, and markers belonging to the cluster of differentiation series (e.g., CD34, CD38, CD90, CD133, CD105, CD45, and c-kit, the receptor for stem cell factor). Hematopoietic stem cells are negative for the markers used to detect lineage commitment and are therefore called Lin-. During purification by FACS, up to 14 different mature blood-lineage markers may be used, e.g., for humans, CD13 and CD33 for myeloid cells, CD71 for erythroid cells, CD19 for B cells, CD61 for megakaryocytic cells, etc.; and for mice, B220 (murine CD45) for B cells, Mac-1 (CD11b/CD18) for monocytes, Gr-1 for granulocytes, Ter119 for erythroid cells, and IL7Ra, CD3, CD4, CD5, CD8 for T cells, etc. Mouse HSC markers: CD34lo/-, SCA-1+, Thy1.1+/lo, CD38+, c-kit+, lin-. Human HSC markers: CD34+, CD59+, Thy1/CD90+, CD38lo/-, c-kit/CD117+, and lin-. HSCs are identified by markers. Thus, in the embodiments discussed herein, HSCs may be CD34+ cells. HSCs may also be hematopoietic stem cells that are CD34-/CD38-. Stem cells that may lack c-kit on the cell surface and that are considered in the art to be HSCs, as well as CD133+ cells that are likewise considered in the art to be HSCs, are within the scope of the present invention.
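Purely as an illustrative sketch, the human HSC marker set listed above can be expressed as a lookup table against which a cell's measured marker status is compared. The dictionary layout, the status labels ("+", "lo", "-"), and the function name are assumptions made for this example; as noted above, other phenotypes (e.g., CD34-/CD38- cells or cells lacking c-kit) may also be considered HSCs and would require a different profile.

```python
# Human HSC marker profile as listed above; each marker maps to the set of
# acceptable statuses. The string labels are an assumption of this sketch.
HUMAN_HSC_PROFILE = {
    "CD34": {"+"},
    "CD59": {"+"},
    "CD90": {"+"},         # Thy1/CD90
    "CD38": {"lo", "-"},
    "CD117": {"+"},        # c-kit/CD117
    "lin": {"-"},
}

def matches_profile(cell, profile=HUMAN_HSC_PROFILE):
    """Return True when every marker in the profile has an acceptable status
    in the cell; a marker missing from the cell counts as a mismatch."""
    return all(cell.get(marker) in allowed for marker, allowed in profile.items())

# Hypothetical marker calls for two sorted cells.
candidate = {"CD34": "+", "CD59": "+", "CD90": "+", "CD38": "lo", "CD117": "+", "lin": "-"}
committed = {"CD34": "+", "CD59": "+", "CD90": "+", "CD38": "+", "CD117": "+", "lin": "-"}

is_candidate_hsc = matches_profile(candidate)
is_committed_hsc = matches_profile(committed)
```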
In one embodiment, the treatment or prevention of a circulatory system or hematological disorder may comprise modifying human umbilical cord blood cells with any of the modifications described herein. In one embodiment, the treatment or prevention of a circulatory system or hematological disorder may comprise modifying granulocyte colony stimulating factor mobilized peripheral blood (mPB) cells with any of the modifications described herein. In one embodiment, the human umbilical cord blood cells or mPB cells may be CD34+. In one embodiment, the modified cord blood cells or mPB cells may be autologous. In one embodiment, the cord blood cells or mPB cells may be allogeneic. In addition to modification of disease genes, the allogeneic cells may be further modified using the compositions, systems described herein to reduce immunogenicity of the cells when delivered to a recipient. Such techniques are described elsewhere herein and in, for example, Cartier, "MINI-SYMPOSIUM: X-Linked Adrenoleukodystrophy, Hematopoietic Stem Cell Transplantation and Hematopoietic Stem Cell Gene Therapy in X-Linked Adrenoleukodystrophy," Brain Pathology 20 (2010) 857-862, which may be adapted for use with the compositions, systems herein. The modified cord blood cells or mPB cells may optionally be expanded in vitro. The modified cord blood cells or mPB cells may be delivered to a subject in need thereof using any suitable delivery technique.
The composition may be engineered to target one or more loci in the HSC. In one embodiment, the TnpB polypeptide may be codon optimized for eukaryotic cells and particularly mammalian cells (e.g., human cells, e.g., HSCs or iPSCs), and nucleic acid components targeting one or more loci in HSCs (such as loci associated with circulatory diseases) may be prepared. These may be delivered via particles. The particles may be formed by mixing the TnpB polypeptide and nucleic acid components. The nucleic acid component and TnpB polypeptide mixture may, for example, be mixed with a mixture comprising, consisting essentially of, or consisting of a surfactant, a phospholipid, a biodegradable polymer, a lipoprotein and an alcohol, whereby particles containing the nucleic acid component and the TnpB polypeptide may be formed. The invention encompasses the particles so prepared, as well as the particles formed by such a method and uses thereof. Suitable delivery of particles of the composition, or of HSCs, to the blood or circulatory system is described in more detail elsewhere herein.
In one embodiment, after ex vivo modification, the HSCs or iPSCs may be expanded prior to administration to a subject. Expansion of HSCs may be performed via any suitable method, such as the method described in Lee, "Improved ex vivo expansion of adult hematopoietic stem cells by overcoming CUL4-mediated degradation of HOXB4." Blood. 2013 May 16;121(20):4082-9. doi:10.1182/blood-2012-09-455204. Epub 2013 Mar 21.
In one embodiment, the modified HSCs or iPSCs may be autologous. In one embodiment, the HSCs or iPSCs may be allogeneic. In addition to modification of disease genes, the allogeneic cells may be further modified using the compositions, systems described herein to reduce immunogenicity of the cells when delivered to a recipient. Such techniques are described elsewhere herein and in, for example, Cartier, "MINI-SYMPOSIUM: X-Linked Adrenoleukodystrophy, Hematopoietic Stem Cell Transplantation and Hematopoietic Stem Cell Gene Therapy in X-Linked Adrenoleukodystrophy," Brain Pathology 20 (2010) 857-862, which may be adapted for use with the compositions, systems herein.
Treatment of neurological disorders
In one embodiment, the compositions, systems described herein may be used to treat brain and CNS disorders. Brain delivery options include encapsulating the TnpB polypeptide and nucleic acid component molecules, in DNA or RNA form, into liposomes and conjugating them to molecular Trojan horses for delivery across the blood-brain barrier (BBB). Molecular Trojan horses have been shown to be effective in delivering B-gal expression vectors into the brain of non-human primates. The same method can be used to deliver vectors containing the TnpB polypeptide and nucleic acid component molecules. For example, Xia CF, Boado RJ, and Pardridge WM ("Antibody-mediated targeting of siRNA via the human insulin receptor using avidin-biotin technology," Mol Pharm. 2009 May-Jun; 6(3):747-51. doi:10.1021/mp800194) describe how short interfering RNAs (siRNAs) can be delivered to cells in culture and in vivo using a combination of a receptor-specific monoclonal antibody (mAb) and avidin-biotin technology. The authors also report that, because the bond between the targeting mAb and the siRNA is stable with avidin-biotin technology, RNAi effects at distant sites such as the brain are observed in vivo after intravenous administration of the targeted siRNA. The teachings of this document may be adapted for use with the compositions, systems herein. In other embodiments, artificial viruses for CNS and/or brain delivery may be produced. See Zhang et al. (Mol Ther. 2003 Jan; 7(1):11-8), the teachings of which may be adapted for use with the compositions, systems herein.
Treatment of hearing disorders
In one embodiment, the compositions and systems described herein may be used to treat hearing disorders or hearing loss in one or both ears. Deafness is usually caused by loss of or damage to hair cells, which then fail to transmit signals to auditory neurons. In such cases, cochlear implants may be used to respond to sound and transmit electrical signals to nerve cells. However, these neurons typically degenerate and retract from the cochlea because damaged hair cells release fewer growth factors.
In one embodiment, the composition, system, or modified cell may be delivered to one or both ears by any suitable method or technique to treat or prevent a hearing disorder or hearing loss. Suitable methods and techniques include, but are not limited to, those set forth in U.S. patent publication No. 20120328580, which describes the use of a syringe (e.g., a single-dose syringe) to inject a pharmaceutical composition into the ear (e.g., auricular administration), such as into the lumen of the cochlea (e.g., the scala media, scala vestibuli, or scala tympani). For example, one or more of the compounds described herein may be administered by intratympanic injection (e.g., into the middle ear) and/or by injection into the outer, middle, and/or inner ear; by in situ administration via a catheter or pump (see, e.g., McKenna et al., U.S. patent publication No. 2006/0030837, and Jacobsen et al., U.S. patent No. 7,206,639); or in combination with a mechanical device worn in the outer ear, such as a cochlear implant or hearing aid (see, e.g., U.S. patent publication No. 2007/0093878, which provides an exemplary cochlear implant suitable for delivering the compositions, systems described herein to the ear).
Generally, the cell therapy methods described in U.S. patent publication No. 20120328580 can be used to promote complete or partial differentiation of cells into mature cell types of the inner ear (e.g., hair cells) in vitro. Cells resulting from such methods can then be transplanted or implanted into a patient in need of such treatment. Cell culture methods required to practice these methods are described below, including methods for identifying and selecting appropriate cell types, methods for promoting full or partial differentiation of selected cells, methods for identifying fully or partially differentiated cell types, and methods for implanting fully or partially differentiated cells.
Cells suitable for use in the present invention include, but are not limited to, cells capable of fully or partially differentiating into mature cells of the inner ear, such as hair cells (e.g., inner and/or outer hair cells), when contacted (e.g., in vitro) with one or more of the compounds described herein. Exemplary cells capable of differentiating into hair cells include, but are not limited to, stem cells (e.g., inner ear stem cells, adult stem cells, bone marrow-derived stem cells, embryonic stem cells, mesenchymal stem cells, skin stem cells, iPS cells, and adipose-derived stem cells), progenitor cells (e.g., inner ear progenitor cells), support cells (e.g., Deiters' cells, pillar cells, inner phalangeal cells, tectal cells, and Hensen's cells), and/or germ cells. The use of stem cells for replacing inner ear sensory cells is described in Li et al. (U.S. patent publication No. 2005/0287127) and Li et al. (U.S. patent application Ser. No. 11/953,797). The use of bone marrow-derived stem cells to replace inner ear sensory cells is described in Edge et al., PCT/US2007/084654. iPS cells are described, for example, in Takahashi et al., Cell, volume 131, issue 5, pages 861-872 (2007); Takahashi and Yamanaka, Cell 126, 663-76 (2006); Okita et al., Nature 448, 260-262 (2007); Yu, J. et al., Science 318(5858):1917-1920 (2007); Nakagawa et al., Nat. Biotechnol. 26:101-106 (2008); and Zaehres and Scholer, Cell 131(5):834-835 (2007). Such suitable cells can be identified by analyzing (e.g., qualitatively or quantitatively) for the presence of one or more tissue-specific genes. For example, gene expression may be detected by detecting the protein products of one or more tissue-specific genes. Protein detection techniques involve staining the protein with antibodies against the appropriate antigen (e.g., using cell extracts or whole cells). In this case, the appropriate antigen is the protein product of tissue-specific gene expression.
Although in principle it is possible to label the primary antibody (i.e. the antibody that binds the antigen), it is more common to use a secondary antibody (e.g. anti-IgG) against the primary antibody (and improve visualization). This secondary antibody is conjugated to a fluorescent dye, or to an appropriate enzyme for colorimetric reaction, or to gold beads (for electron microscopy) or to a biotin-avidin system, so that the position of the primary antibody can be identified, and thus the position of the antigen.
The compositions and systems can be delivered to the ear by applying the pharmaceutical composition directly to the outer ear, wherein the composition is modified according to U.S. patent publication No. 20110142917. In one embodiment, the pharmaceutical composition is applied to the ear canal. Delivery to the ear may also be referred to as auditory or aural delivery.
In one embodiment, the composition, system, or components thereof and/or vector system can be delivered to the ear by transfection into the inner ear through the intact round window using novel protein delivery techniques applicable to the TnpB system of the invention (see, e.g., Qi et al., Gene Therapy (2013), 1-9). About 40 μl of 10 mM RNA may be considered as a dose to be applied to the ear.
According to Rejali et al. (Hear Res. 2007 Jun; 228(1-2):180-7), the function of cochlear implants can be improved by good preservation of the spiral ganglion neurons (the target of electrical stimulation by the implant), and brain-derived neurotrophic factor (BDNF) has previously been shown to enhance spiral ganglion survival in experimentally deafened ears. Rejali et al. tested a modified design of the cochlear implant electrode that includes a coating of fibroblasts transduced by a viral vector carrying a BDNF gene insert. To accomplish this type of ex vivo gene transfer, Rejali et al. transduced guinea pig fibroblasts with an adenovirus carrying a BDNF gene cassette insert, determined that these cells secreted BDNF, and then attached the BDNF-secreting cells to the cochlear implant electrode via an agarose gel and implanted the electrode into the scala tympani. Rejali et al. determined that, 48 days after implantation, the BDNF-expressing electrodes preserved significantly more spiral ganglion neurons in the basal turn of the cochlea than control electrodes, and demonstrated the feasibility of combining cochlear implant therapy with ex vivo gene transfer to enhance spiral ganglion neuron survival. Such a system may be applied to the TnpB system of the invention for delivery to the ear.
In one embodiment, the system set forth in Mukherjea et al. (Antioxidants & Redox Signaling, vol. 13, no. 5, 2010) may be adapted for administration of the composition, system, or components thereof to the ear via the tympanic membrane. In one embodiment, the dosage of the TnpB polypeptide for administration to a human is from about 2 mg to about 4 mg.
In one embodiment, the system set forth in Jung et al. (Molecular Therapy, vol. 21, no. 4, pp. 834-841, April 2013) may be adapted for delivery of the composition, system, or components thereof to the ear through the vestibular epithelium. In one embodiment, the dosage of the TnpB polypeptide for administration to a human is from about 1 mg to about 30 mg.
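As a rough check on the quantity implied by such a dose, the molar amount follows from concentration times volume. The sketch below is illustrative only: the function name is hypothetical and not part of this disclosure, and it simply takes the stated figures (about 40 μl at 10 mM) at face value.

```python
def amount_nmol(concentration_mM: float, volume_ul: float) -> float:
    """Return the amount of solute in nanomoles.

    1 mM is 1 mmol/L, which equals 1 nmol per microliter,
    so multiplying mM by microliters gives nanomoles directly.
    """
    return concentration_mM * volume_ul


# Figures taken from the text: about 40 ul of a 10 mM RNA solution.
dose_nmol = amount_nmol(concentration_mM=10.0, volume_ul=40.0)
print(f"Approximate RNA amount: {dose_nmol:.0f} nmol")
```

Under these assumptions the dose corresponds to roughly 400 nmol of RNA.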
Treatment of non-dividing cell diseases
In one embodiment, the gene or transcript to be corrected is located in a non-dividing cell. Exemplary non-dividing cells are muscle cells or neurons. Non-dividing (especially non-dividing, fully differentiated) cell types present problems for gene targeting or genome engineering, for example because homologous recombination (HR) is normally suppressed during the G1 phase of the cell cycle. However, while studying the mechanisms by which cells control normal DNA repair systems, Durocher discovered a previously unknown switch that keeps HR "off" in non-dividing cells and devised a strategy to turn this switch back on. A recent report by Orthwein et al. (Daniel Durocher's lab at the Mount Sinai Hospital in Toronto, Canada; Nature 16142, published online 9 December 2015) showed that the suppression of HR can be lifted and gene targeting successfully completed in kidney (293T) and osteosarcoma (U2OS) cells. The tumor suppressors BRCA1, PALB2 and BRCA2 are known to promote DNA DSB repair by HR. The authors found that formation of the complex of BRCA1 with PALB2-BRCA2 is governed by a ubiquitination site on PALB2 that is acted on by an E3 ubiquitin ligase. This E3 ubiquitin ligase is composed of KEAP1 (a PALB2-interacting protein) in complex with cullin-3 (CUL3)-RBX1. PALB2 ubiquitination suppresses its interaction with BRCA1 and is counteracted by the deubiquitinase USP11, which is itself under cell cycle control. Restoration of the BRCA1-PALB2 interaction combined with activation of DNA end resection is sufficient to induce homologous recombination in G1, as measured by a number of methods, including a Cas polypeptide nuclease-based gene targeting assay directed at USP11 or KEAP1 (expressed from a pX459 vector). Notably, when the BRCA1-PALB2 interaction was restored in resection-competent G1 cells using either KEAP1 depletion or expression of the PALB2-KR mutant, a robust increase in gene targeting events was detected.
These teachings may be adapted for and/or applied to the compositions, systems described herein.
Thus, in one embodiment, reactivation of HR in cells, particularly non-dividing, fully differentiated cell types, is preferred. In one embodiment, promoting the BRCA1-PALB2 interaction is preferred. In one embodiment, the target cell is a non-dividing cell. In one embodiment, the target cell is a neuron or a muscle cell. In one embodiment, the target cell is targeted in vivo. In one embodiment, the cell is in G1 phase and HR is suppressed. In one embodiment, KEAP1 depletion, e.g., inhibition of KEAP1 expression or activity, is preferably used. KEAP1 depletion may be achieved by siRNA, for example as shown in Orthwein et al. Alternatively, a PALB2-KR mutant (lacking all 8 lysine residues in the BRCA1 interaction domain) is preferably expressed, either in combination with KEAP1 depletion or alone. PALB2-KR interacts with BRCA1 regardless of cell cycle position. Therefore, promoting or restoring the BRCA1-PALB2 interaction, in particular in G1 cells, is preferred in one embodiment, particularly where the target cells are non-dividing, or where removal and return (ex vivo gene targeting) is problematic, such as with neurons or muscle cells. KEAP1 siRNA is available from ThermoFisher. In one embodiment, the BRCA1-PALB2 complex may be delivered to G1 cells. In one embodiment, PALB2 deubiquitination may be promoted, for example, by increasing expression of the deubiquitinase USP11; thus, it is contemplated that constructs may be provided to promote or up-regulate expression or activity of the deubiquitinase USP11.
Treatment of ocular disorders
In one embodiment, the disease to be treated is a disease affecting the eye. Thus, in one embodiment, the compositions, systems, or components thereof described herein are delivered to one or both eyes.
The compositions, systems may be used to correct ocular defects arising from several genetic mutations, further described in Genetic Diseases of the Eye, second edition, edited by Elias I. Traboulsi, Oxford University Press, 2012.
In one embodiment, the condition to be treated or targeted is an ocular disorder. In one embodiment, the ocular condition may comprise glaucoma. In one embodiment, the ocular condition comprises a retinal degenerative disease. In one embodiment, the retinal degenerative disease is selected from: Stargardt disease, Bardet-Biedl syndrome, Best disease, blue cone monochromacy, choroideremia, cone-rod dystrophy, congenital stationary night blindness, enhanced S-cone syndrome, juvenile X-linked retinoschisis, Leber congenital amaurosis, Malattia Leventinese, Norrie disease or X-linked familial exudative vitreoretinopathy, pattern dystrophy, Sorsby fundus dystrophy, Usher syndrome, retinitis pigmentosa, achromatopsia, macular dystrophy or degeneration, and age-related macular degeneration. In one embodiment, the retinal degenerative disease is Leber congenital amaurosis (LCA) or retinitis pigmentosa. Other exemplary ocular diseases are described in more detail elsewhere herein.
In one embodiment, the composition, system is delivered to the eye, optionally via intravitreal injection or subretinal injection. Intraocular injection may be performed with the aid of a surgical microscope. For subretinal and intravitreal injections, the eye may be proptosed by gentle digital pressure and the fundus visualized using a contact lens system consisting of a drop of coupling medium solution on the cornea covered with a glass microscope slide coverslip. For subretinal injection, the tip of a 10-mm 34-gauge needle mounted on a 5-μl Hamilton syringe may be advanced tangentially through the superior equatorial sclera under direct visualization until the aperture of the needle is visible in the subretinal space. Then, 2 μl of vector suspension can be injected to produce a superior bullous retinal detachment, confirming subretinal vector administration. This approach creates a self-sealing sclerotomy that allows the vector suspension to remain in the subretinal space until it is absorbed by the RPE, typically within 48 hours of the procedure. This procedure may be repeated in the inferior hemisphere to produce an inferior retinal detachment. This technique results in exposure of approximately 70% of the neurosensory retina and RPE to the vector suspension. For intravitreal injection, the needle tip may be advanced through the sclera 1 mm posterior to the scleral spur and 2 μl of vector suspension injected into the vitreous cavity. For intracameral injection, the needle tip may be advanced toward the central cornea through a corneal puncture, and 2 μl of vector suspension may be injected. These vectors may be injected at titers of 1.0 to 1.4 × 10^10 or 1.0 to 1.4 × 10^9 transducing units (TU)/ml.
In one embodiment, a lentiviral vector is administered to the eye. In one embodiment, the lentiviral vector is an equine infectious anemia virus (EIAV) vector. Exemplary EIAV vectors for ocular delivery are described in Balaggan et al., J Gene Med 2006; 8:275-285, published online 21 November 2005 in Wiley InterScience (www.interscience.wiley.com), DOI: 10.1002/jgm.845; and Binley et al., HUMAN GENE THERAPY 23:980-991 (September 2012), the teachings of which may be adapted for use with the compositions, systems described herein. In one embodiment, the dose may be 1.1 × 10^5 transducing units per eye (TU/eye) in a total volume of 100 μl.
Other viral vectors, including AAV vectors, may also be used for delivery to the eye, such as those described in Campochiaro et al., Human Gene Therapy 17:167-176 (February 2006); Millington-Ward et al., Molecular Therapy, vol. 19, no. 4, 642-649 (April 2011); and Dalkara et al., Sci Transl Med 5, 189ra76 (2013), the teachings of which may be adapted for use with the compositions, systems described herein.
In one embodiment, the sd-rxRNA system of RXi Pharmaceuticals may be used and/or adapted to deliver the composition, system to the eye. In this system, a single intravitreal administration of 3 μg of sd-rxRNA results in a sequence-specific reduction of PPIB mRNA levels for 14 days. The sd-rxRNA system may be applied to the TnpB system of the invention, contemplating a dose of about 3 to 20 mg of the composition administered to a human.
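The number of transducing units delivered per injection follows from the titer and the injected volume. The sketch below is illustrative only: the function name is hypothetical, and it assumes the titer range reads 1.0 to 1.4 × 10^10 TU/ml with a 2 μl injection, as described above.

```python
def transducing_units(titer_tu_per_ml: float, volume_ul: float) -> float:
    """Return the transducing units contained in a given injection volume.

    The titer is expressed per milliliter, so the volume in microliters
    is divided by 1000 to convert it to milliliters.
    """
    return titer_tu_per_ml * volume_ul / 1000.0


# Assumed inputs: 2 ul injected at the lower and upper ends of the titer range.
low = transducing_units(1.0e10, 2.0)
high = transducing_units(1.4e10, 2.0)
print(f"TU per 2 ul injection: {low:.1e} to {high:.1e}")
```

Under these assumptions, a single 2 μl injection carries on the order of 2 × 10^7 to 2.8 × 10^7 TU.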
In other embodiments, the method of U.S. patent publication No. 20130183282 (which relates to a method of cleaving a target sequence from a human rhodopsin gene) may also be modified for the TnpB system of the invention.
In other embodiments, the methods of U.S. patent publication No. 20130202678 for treating retinopathies and sight-threatening ophthalmologic disorders, which involve delivering the Puf-A gene (which is expressed in retinal ganglion cells and pigmented cells of eye tissues and exhibits unique anti-apoptotic activity) to the subretinal or intravitreal space of the eye, can be used or adapted. In particular, the desired targets are zgc:193933, prdm1a, spata2, tex10, rbb4, ddx3, zp2.2, Blimp-1 and HtrA2, all of which can be targeted by the compositions, systems of the invention.
Wu (Cell Stem Cell, 13:659-62, 2013) designed a guide RNA that directed Cas9 to a single base pair mutation that causes cataracts in mice, where Cas9 induced DNA cleavage. Then, using either the other wild-type allele or oligonucleotides supplied to the zygotes, the repair mechanisms corrected the sequence of the broken allele and thereby corrected the cataract-causing genetic defect in the mutant mice. This method may be adapted for and/or applied to the TnpB compositions, systems described herein.
U.S. patent publication No. 20120159653 describes the use of zinc finger nucleases to genetically modify cells, animals and proteins associated with macular degeneration (MD), the teachings of which can be applied to and/or adapted for the TnpB compositions, systems described herein.
One aspect of U.S. patent publication No. 20120159653 relates to editing any chromosomal sequence encoding a protein associated with MD, which is applicable to the TnpB system of the invention.
Treating muscle diseases and cardiovascular diseases
In one embodiment, the compositions, systems may be used to treat and/or prevent muscle diseases and related circulatory or cardiovascular diseases or conditions. The invention also contemplates delivery of the compositions, systems described herein (e.g., the TnpB effector protein system) to the heart. For the heart, a myocardium-tropic adeno-associated virus (AAVM) is preferred, in particular AAVM41, which shows preferential gene transfer in the heart (see, e.g., Lin-Yanga et al., PNAS, March 10, vol. 106, no. 10). Administration may be systemic or local. For systemic administration, a dose of about 1-10 × 10^14 vector genomes is contemplated. See also, e.g., Eulalio et al. (2012) Nature 492:376 and Somasuntharam et al. (2013) Biomaterials 34:7790, the teachings of which may be adapted for and/or applied to the compositions, systems described herein.
For example, U.S. patent publication No. 20110023139 (the teachings of which may be adapted for and/or applied to the compositions, systems described herein) describes the use of zinc finger nucleases to genetically modify cells, animals, and proteins associated with cardiovascular disease. Cardiovascular diseases generally include hypertension, heart disease, heart failure, stroke and TIA. Any chromosomal sequence involved in cardiovascular disease, or the protein encoded by any such chromosomal sequence, may be used in the methods described in the present disclosure. Cardiovascular-related proteins are typically selected based on an experimental association with the development of cardiovascular disease. For example, the production rate or circulating concentration of a cardiovascular-related protein may be elevated or depressed in a population having a cardiovascular disorder relative to a population lacking the disorder. Differences in protein levels can be assessed using proteomic techniques including, but not limited to, western blotting, immunohistochemical staining, enzyme-linked immunosorbent assay (ELISA), and mass spectrometry. Alternatively, cardiovascular-related proteins can be identified by obtaining gene expression profiles of the genes encoding the proteins using genomic techniques including, but not limited to, DNA microarray analysis, serial analysis of gene expression (SAGE), and quantitative real-time polymerase chain reaction (Q-PCR).
The compositions, systems herein may be used to treat diseases of the muscular system. The present invention also contemplates delivery of the compositions, systems, and effector protein systems described herein to muscle.
In one embodiment, the muscle disorder to be treated is a muscular dystrophy, such as DMD. In one embodiment, the compositions, systems described herein (such as those capable of RNA modification) can be used to effect exon skipping in order to correct a diseased gene. As used herein, the term "exon skipping" refers to the modification of pre-mRNA splicing by targeting splice donor and/or acceptor sites within a pre-mRNA with one or more complementary antisense oligonucleotides (AONs). By blocking access of the spliceosome to one or more splice donor or acceptor sites, an AON can prevent a splicing reaction, thereby causing the deletion of one or more exons from the fully processed mRNA. Exon skipping can be achieved in the nucleus during maturation of the pre-mRNA. In some examples, exon skipping can include masking key sequences involved in the splicing of targeted exons by using the compositions, systems described herein that are capable of RNA modification. In one embodiment, exon skipping can be achieved in a dystrophin mRNA. In one embodiment, the composition, system can induce exon skipping at exons 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, or any combination thereof, of a dystrophin mRNA. In one embodiment, the composition, system, or method may induce exon skipping at exons 43, 44, 50, 51, 52, 55, or any combination thereof, of a dystrophin mRNA. Mutations in these exons can also be corrected using polynucleotide modification methods that do not rely on exon skipping.
In one embodiment, for the treatment of muscle disorders, the method of Bortolanza et al. (Molecular Therapy, vol. 19, no. 11, 2055-2064, November 2011) may be applied to an AAV expressing the TnpB polypeptide and injected into a human at a dose of about 2 × 10^15 or 2 × 10^16 vg of vector. The teachings of Bortolanza et al. may be adapted for and/or applied to the compositions, systems described herein.
In one embodiment, the method of Dumonceaux et al. (Molecular Therapy, vol. 18, no. 5, 881-887, May 2010) may be applied to an AAV expressing the TnpB polypeptide and injected into a human, for example, at a dose of about 10^14 to about 10^15 vg of vector. The teachings of Dumonceaux et al. may be adapted for and/or applied to the compositions, systems described herein.
In one embodiment, the method of Kinouchi et al. (Gene Therapy (2008) 15, 1126-1130) may be applied to the compositions described herein, which may be injected into a human, for example into muscle, at a dose of about 500 to 1000 ml of a 40 μM solution.
In one embodiment, the method of Hagstrom et al. (Molecular Therapy, vol. 10, no. 2, August 2004) may be adapted for and/or applied to the compositions, systems herein, which may be injected into the great saphenous vein of a human at a dose of about 15 to about 50 mg.
In one embodiment, the method comprises treating a sickle cell-associated disease, e.g., sickle cell trait, sickle cell disease such as sickle cell anemia, or β-thalassemia. For example, the methods and systems can be used to modify the genome of a sickle cell, for example, by correcting one or more mutations in the β-globin gene. In the case of β-thalassemia or sickle cell anemia, the disease can be corrected by modifying HSCs with the system. The system allows specific editing of the cell genome by cleaving the DNA of the cell and then letting it repair itself. The TnpB polypeptide is introduced and directed by the nucleic acid component molecule to the point of mutation, where it cleaves the DNA. At the same time, a healthy version of the sequence is introduced. The cell's own repair system uses this sequence to repair the induced cut. In this way, the TnpB polypeptide allows correction of the mutation in previously obtained stem cells. The methods and systems can be used to correct HSCs for sickle cell anemia using a system that targets and corrects the mutation (e.g., with a suitable HDR template that delivers a coding sequence for β-globin, advantageously non-sickling β-globin); in particular, the nucleic acid component molecule may target the mutation that causes sickle cell anemia, and HDR may provide the coding sequence for proper expression of β-globin. A particle containing the TnpB polypeptide and a nucleic acid component molecule targeting the mutation is contacted with HSCs carrying the mutation. The particle may also contain a suitable HDR template to correct the mutation for proper expression of β-globin; or the HSCs may be contacted with a second particle or a vector that contains or delivers the HDR template. The cells so contacted may be administered, optionally after being treated and/or expanded; see Cartier. The HDR template may provide for the HSCs to express an engineered β-globin gene (e.g., βA-T87Q) or β-globin.
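One consideration often raised when choosing exons to skip, offered here as general background rather than as a requirement of this disclosure, is whether removing them keeps the remaining coding sequence in frame, that is, whether the total skipped length is a multiple of three nucleotides. The sketch below illustrates that check. The function name and the exon lengths are hypothetical placeholders, not dystrophin data.

```python
def skip_keeps_frame(exon_lengths: dict[int, int], skipped: set[int]) -> bool:
    """Return True if removing the skipped exons preserves the reading frame.

    The frame is preserved when the total number of nucleotides removed
    is divisible by three.
    """
    removed = sum(exon_lengths[exon] for exon in skipped)
    return removed % 3 == 0


# Hypothetical exon lengths in nucleotides (placeholders, not real gene data).
toy_lengths = {1: 90, 2: 100, 3: 110}

print(skip_keeps_frame(toy_lengths, {1}))     # 90 nt removed
print(skip_keeps_frame(toy_lengths, {2}))     # 100 nt removed
print(skip_keeps_frame(toy_lengths, {2, 3}))  # 210 nt removed
```

With these placeholder lengths, skipping exon 2 alone would shift the frame, while skipping exons 2 and 3 together would not, which is why combinations of exons are sometimes considered.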
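The cut-and-repair logic described above can be sketched as a toy string model. This is not a biological simulation and not the method of this disclosure: the function name, the short sequences, and the homology-arm length are all illustrative assumptions. The sequences are modeled on the widely described GAG to GTG change in the β-globin coding sequence and are not validated reference sequences.

```python
def hdr_correct(genome: str, target: str, template: str, arm: int) -> str:
    """Toy model of targeted cleavage followed by template-directed repair.

    If the targeted site is present, the region of the genome flanked by the
    template's homology arms is replaced with the template sequence.
    """
    # No cut, and therefore no repair, unless the targeted site is present.
    if target not in genome:
        return genome
    left_arm, right_arm = template[:arm], template[-arm:]
    start = genome.find(left_arm)
    if start == -1:
        return genome
    end = genome.find(right_arm, start + arm)
    if end == -1:
        return genome
    # Replace everything from the left arm through the right arm.
    return genome[:start] + template + genome[end + arm:]


# Illustrative sequences only (assumptions, not reference data).
MUTANT = "ATGGTGCATCTGACTCCTGTGGAGAAGTCTGCC"  # carries GTG at the affected codon
TARGET = "CTCCTGTGGAGAAG"                     # site recognized via the guide
TEMPLATE = "CATCTGACTCCTGAGGAGAAGTCTGCC"      # repair template carrying GAG

corrected = hdr_correct(MUTANT, TARGET, TEMPLATE, arm=8)
print(corrected)
```

Because the corrected sequence no longer contains the targeted site, running the function a second time leaves it unchanged, which mirrors the idea that a corrected allele is no longer a target.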
Treating liver and kidney diseases
In one embodiment, the compositions, systems, or components thereof described herein may be used to treat kidney or liver disease. Thus, in one embodiment, the compositions described herein or components thereof are delivered to the liver or kidney.
Delivery strategies that induce cellular uptake of therapeutic nucleic acids include physical force or vector systems, such as viral-, lipid- or complex-based delivery, or nanocarriers. Since the initial applications of lesser clinical relevance, in which nucleic acids were delivered to renal cells by systemic hydrodynamic high-pressure injection, a wide range of gene therapeutic viral and non-viral carriers have been applied to target posttranscriptional events in different animal kidney disease models in vivo (Csaba Révész and Péter Hamar (2011), Delivery Methods to Target RNAs in the Kidney, Gene Therapy Applications, Prof. Chunsheng Kang (ed.), ISBN: 978-953-307-541-9, InTech, available from www.intechopen.com/book/gene-therapy-applications/delivery-methods-to-target-rnas-interest-kidney). Methods of delivery to the kidneys may include those of Yuan et al. (Am J Physiol Renal Physiol 295: F605-F617, 2008). The method of Yuan et al. may be applied to the compositions of the present invention, contemplating subcutaneous injection of 1-2 g of cholesterol-conjugated polypeptide nuclease into a human for delivery to the kidneys. In one embodiment, the method of Molitoris et al. (J Am Soc Nephrol 20:1754-1764, 2009) may be adapted for the composition, and a cumulative dose of 12-20 mg/kg to a human may be used for delivery to the proximal tubule cells of the kidney. In one embodiment, the method of Thompson et al. (Nucleic Acid Therapeutics, vol. 22, no. 4, 2012) may be adapted for the composition, and doses of up to 25 mg/kg may be delivered via intravenous administration. In one embodiment, the method of Shimizu et al. (J Am Soc Nephrol 21:622-633, 2010) may be adapted for the composition, and a dose of about 10-20 μmol of the composition complexed with a nanocarrier in about 1-2 liters of physiological fluid may be used for intraperitoneal administration.
Various other delivery vehicles may be used to deliver the compositions, systems to the kidneys, such as viral, hydrodynamic, lipid, polymer nanoparticle, and aptamer vehicles, and various combinations thereof (see, e.g., Larson et al., Surgery (August 2007), vol. 142, no. 2, pp. 262-269; Hamar et al., Proc Natl Acad Sci (October 2004), vol. 101, no. 41, pp. 14883-14888; Zheng et al., Am J Pathol (October 2008), vol. 173, no. 4, pp. 973-980; Feng et al., Transplantation (May 2009), vol. 87, no. 9, pp. 1283-1289; Q. Zhang et al., PloS ONE (July 2010), vol. 5, no. 7, e11709, pp. 1-13; Kushibiki et al., J Controlled Release (July 2005), vol. 105, no. 3, pp. 318-331; Wang et al., Gene Therapy (July 2006), vol. 13, no. 14, pp. 1097-1103; Kobayashi et al., Journal of Pharmacology and Experimental Therapeutics (2004), vol. 308, no. 2, pp. 688-693; Wolfrum et al., Nature Biotechnology, (2007, 25, 10, 9, 115, 150, 16, 150, 16, 150, etc.), no. 3, pp. 217-226; Zhang et al., J Am Soc Nephrol (April 2006), vol. 17, no. 4, pp. 1090-1101; Singhal et al., Cancer Res (May 2009), vol. 69, no. 10, pp. 4244-4251; Malek et al., Toxicology and Applied Pharmacology (April 2009), vol. 236, no. 1, pp. 97-108; Shimizu et al., J Am Soc Nephrology (April 2010), vol. 21, no. 4, pp. 622-633; Jiang et al., Molecular Pharmaceutics (May-June 2009), vol. 6, no. 3, pp. 727-737; Cao et al., J Controlled Release (June 2010), vol. 144, no. 2, pp. 203-212; Ninichuk et al., Am J Pathol (March 2008), vol. 172, no. 3, pp. 628-637; Purschke et al., Proc Natl Acad Sci (March 2006), vol. 103, no. 13, pp. 5173-5178).
In one embodiment, the delivery is to liver cells. In one embodiment, the liver cell is a hepatocyte. Delivery of the compositions and systems herein may be via viral vectors, particularly AAV (and in particular AAV2/6) vectors. These may be administered by intravenous injection. A preferred target for the liver, whether in vitro or in vivo, is the albumin gene. This is a so-called "safe harbor" because albumin is expressed at very high levels, so that some reduction in albumin production following successful gene editing is tolerated. It is also preferred because the high level of expression observed from the albumin promoter/enhancer allows useful levels of correct or transgene production (from the inserted recombination template) to be achieved even if only a small fraction of hepatocytes are edited. See Wechsler et al. (reported at the 57th Annual Meeting and Exposition of the American Society of Hematology; abstract available online at ash.confex.com/ash/2015/webprogram/Paper86495.html and presented in December 2015), the teachings of which may be adapted for use with the compositions, systems herein.
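The weight-based doses mentioned above can be converted to total amounts for a given subject. The sketch below is illustrative only: the function name is hypothetical, and the 70 kg body weight is an assumption chosen for the example, not a value from this disclosure.

```python
def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-based dose (mg/kg) into a total dose (mg)."""
    return dose_mg_per_kg * body_weight_kg


ASSUMED_WEIGHT_KG = 70.0  # assumption for illustration only

# Cumulative 12-20 mg/kg range and the 25 mg/kg upper dose from the text.
cumulative_low = total_dose_mg(12.0, ASSUMED_WEIGHT_KG)
cumulative_high = total_dose_mg(20.0, ASSUMED_WEIGHT_KG)
intravenous_max = total_dose_mg(25.0, ASSUMED_WEIGHT_KG)

print(f"Cumulative dose: {cumulative_low:.0f} to {cumulative_high:.0f} mg")
print(f"Intravenous dose, upper bound: {intravenous_max:.0f} mg")
```

For the assumed 70 kg subject, the cumulative range corresponds to 840 to 1400 mg and the upper intravenous dose to 1750 mg; other body weights scale linearly.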
Exemplary liver and kidney diseases that can be treated and/or prevented are described elsewhere herein.
Treating epithelial diseases and pulmonary diseases
In one embodiment, the disease treated or prevented by the compositions and systems described herein may be a pulmonary or epithelial disease. The compositions and systems described herein may be used to treat epithelial diseases and/or pulmonary diseases. The present invention also contemplates delivery of the compositions, systems described herein to one or both lungs.
In one embodiment, viral vectors may be used to deliver the composition, system, or components thereof to the lungs. In one embodiment, the viral vector is AAV-1, AAV-2, AAV-5, AAV-6, and/or AAV-9 for delivery to the lungs (see, e.g., Li et al., Molecular Therapy, vol. 17, no. 12, 2067-2077, December 2009). In one embodiment, the MOI may vary from 1 × 10³ to 4 × 10⁵ vector genomes/cell. In one embodiment, the delivery vehicle may be an RSV vehicle, such as described by Zamora et al. (Am J Respir Crit Care Med, vol. 183, pages 531-538, 2011). The methods of Zamora et al. may be applied to the TnpB system of the present invention, and the present invention contemplates, for example, an aerosolized composition at a dose of 0.6 mg/kg.
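As a worked illustration of the ranges stated above, the following minimal Python sketch computes the total vector genomes implied by an MOI between 1 × 10³ and 4 × 10⁵ vector genomes/cell and the aerosolized dose implied by 0.6 mg/kg. The cell count, the body weight, and the function names are hypothetical values chosen only for illustration; they are not taken from the cited references.

```python
# Illustrative arithmetic only; cell count and body weight below are assumptions.

MOI_LOW = 1e3    # vector genomes per cell (lower bound stated in the text)
MOI_HIGH = 4e5   # vector genomes per cell (upper bound stated in the text)
AEROSOL_DOSE_MG_PER_KG = 0.6  # aerosolized dose stated in the text


def vector_genomes_needed(cell_count: float, moi: float) -> float:
    """Total vector genomes = number of cells x MOI (vector genomes per cell)."""
    if cell_count < 0 or moi < 0:
        raise ValueError("cell_count and moi must be non-negative")
    return cell_count * moi


def aerosol_dose_mg(body_weight_kg: float,
                    dose_mg_per_kg: float = AEROSOL_DOSE_MG_PER_KG) -> float:
    """Total dose in mg = body weight (kg) x dose (mg/kg)."""
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    return body_weight_kg * dose_mg_per_kg


hypothetical_cells = 1_000_000      # assumption: one million target cells
low_vg = vector_genomes_needed(hypothetical_cells, MOI_LOW)
high_vg = vector_genomes_needed(hypothetical_cells, MOI_HIGH)
dose_for_70_kg = aerosol_dose_mg(70.0)  # assumption: 70 kg subject

print(f"{low_vg:.1e} to {high_vg:.1e} vector genomes")
print(f"about {dose_for_70_kg:.1f} mg aerosolized composition")
```

Under these assumed inputs the MOI range spans more than two orders of magnitude in total vector genomes, which is why the MOI is stated as a range rather than a single value.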
A subject receiving treatment for a pulmonary disease may, for example, receive a pharmaceutically effective amount of the aerosolized AAV vector system per lung, delivered intrabronchially while spontaneously breathing. Thus, in general, aerosolized delivery is preferred for AAV delivery. Adenovirus or AAV particles may be used for delivery. Suitable genetic constructs, each operably linked to one or more regulatory sequences, may be cloned into the delivery vector. In this case, the following constructs are provided as examples: a Cbh or EF1a promoter for TnpB, and a U6 or H1 promoter for the nucleic acid component molecule. A preferred arrangement is to use a nucleic acid component molecule targeting CFTRdelta508, a repair template for the deltaF508 mutation, and a codon-optimized composition, optionally with one or more nuclear localization signals or sequences (NLS), e.g., two (2) NLSs.
Treatment of skin diseases
The compositions and systems described herein are useful for treating skin disorders. The present invention also contemplates delivery of the compositions and systems described herein to the skin.
In one embodiment, the composition, system, or component thereof may be delivered to the skin via one or more microneedles or a microneedle-containing device (intradermal delivery). For example, in one embodiment, the devices and methods of Hickerson et al (Molecular Therapy-Nucleic Acids (2013) 2, e 129) may be used and/or adapted to deliver compositions, systems described herein to the skin, for example, at a dose of up to 300 μl of 0.1mg/ml composition.
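The upper bound of the intradermal dose stated above follows from volume multiplied by concentration. The short Python sketch below shows that arithmetic; the function name is hypothetical and the calculation is illustrative only.

```python
def delivered_mass_ug(volume_ul: float, concentration_mg_per_ml: float) -> float:
    """Mass in micrograms. 1 mg/ml equals 1 ug/ul, so ul x mg/ml gives ug."""
    if volume_ul < 0 or concentration_mg_per_ml < 0:
        raise ValueError("inputs must be non-negative")
    return volume_ul * concentration_mg_per_ml


# Up to 300 ul of a 0.1 mg/ml composition, as stated in the text:
# roughly 30 micrograms of composition in total.
max_mass_ug = delivered_mass_ug(300, 0.1)
print(f"about {max_mass_ug:.0f} ug")
```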
In one embodiment, the methods and techniques of Leachman et al. (Molecular Therapy, vol. 18, no. 2, pages 442-446, February 2010) can be used and/or adapted to deliver the compositions described herein to the skin.
In one embodiment, the methods and techniques of Zheng et al. (PNAS, July 24, 2012, vol. 109, no. 30, 11975-11980) may be used and/or adapted to deliver the compositions described herein to the skin via nanoparticles. In one embodiment, gene knockdown in skin can be achieved with a single application at a dose of about 25 nM.
Treatment of cancer
The compositions, systems described herein may be used to treat cancer. The present invention also contemplates delivery of the compositions, systems described herein to cancer cells. Furthermore, as described elsewhere herein, the compositions, systems can be used to modify immune cells, such as CARs or CAR T cells, which can then in turn be used to treat and/or prevent cancer. This is also described in International patent publication No. WO 2015/161276, the disclosure of which is hereby incorporated by reference and described below.
Target genes suitable for treating or preventing cancer may include those listed in tables 2 and 3. In one embodiment, target genes for cancer treatment and prevention may also include those described in international patent publication No. WO 2015/048577, the disclosure of which is hereby incorporated by reference and may be suitable for and/or applied to the compositions, systems described herein.
Adoptive cell therapy
The compositions, systems, and components thereof described herein can be used to modify cells for adoptive cell therapy. In one aspect of the invention, methods and compositions that involve editing a target nucleic acid sequence, or modulating expression of a target nucleic acid sequence, and uses thereof in connection with cancer immunotherapy are comprehended by adapting the compositions, systems of the invention. In some examples, the compositions, systems, and methods can be used to modify stem cells (e.g., induced pluripotent cells) to derive modified natural killer cells, γδ T cells, and αβ T cells, which can be used in adoptive cell therapies. In certain examples, the compositions, systems, and methods can be used to modify natural killer cells, γδ T cells, and αβ T cells.
As used herein, "ACT," "adoptive cell therapy," and "adoptive cell transfer" are used interchangeably. In one embodiment, adoptive cell therapy (ACT) may refer to the transfer of cells to a patient with the goal of transferring functionality and characteristics into the new host by engraftment of the cells (see, e.g., Mettananda et al., Editing an alpha-globin enhancer in primary human hematopoietic stem cells as a treatment for beta-thalassemia, Nat Commun. 2017 Sep 4; 8 (1): 424). As used herein, the term "engraft" or "engraftment" refers to the process by which cells are incorporated into a tissue of interest in vivo through contact with existing cells of the tissue. Adoptive cell therapy (ACT) may also refer to the transfer of cells, most commonly immune-derived cells, back into the same patient or into a new recipient host with the goal of transferring immune functionality and characteristics into the new host. If possible, use of autologous cells helps the recipient by minimizing graft-versus-host disease (GVHD) issues. The adoptive transfer of autologous tumor infiltrating lymphocytes (TIL) (Zacharakis et al., (2018) Nat Med. 2018 Jun; 24 (6): 724-730; Besser et al., (2010) Clin. Cancer Res 16 (9): 2646-55; Dudley et al., (2002) Science 298 (5594): 850-4; and Dudley et al., (2005) Journal of Clinical Oncology 23 (10): 2346-57) or genetically redirected peripheral blood mononuclear cells (Johnson et al., (2009) Blood 114 (3): 535-46; and Morgan et al., (2006) Science 314 (5796): 126-9) has been used to successfully treat patients with advanced solid tumors, including melanoma, metastatic breast cancer, and colorectal cancer, as well as patients with CD19-expressing hematological malignancies (Kalos et al., (2011) Science Translational Medicine 3 (95): 95ra73). In one embodiment, allogeneic immune cells are transferred (see, e.g., Ren et al., (2017) Clin Cancer Res 23 (9): 2255-2266). As further described herein, allogeneic cells may be edited to reduce alloreactivity and prevent graft-versus-host disease.
Thus, the use of allogeneic cells allows cells to be obtained from a healthy donor and prepared for use in a patient, rather than preparing autologous cells from the patient after diagnosis.
Aspects of the invention relate to the adoptive transfer of immune system cells (such as T cells) that are specific for a selected antigen (such as a tumor-associated antigen or a tumor-specific neoantigen) (see, e.g., Maus et al., 2014, Adoptive Immunotherapy for Cancer or Viruses, Annual Review of Immunology, vol. 32: 189-225; Rosenberg and Restifo, 2015, Science, vol. 348, no. 6230, pages 62-68; Restifo et al., 2015, Adoptive immunotherapy for cancer: harnessing the T cell response. Nat. Rev. Immunol. 12 (4): 269-281; Jenson and Riddell, 2014, Design and implementation of adoptive therapy with chimeric antigen receptor-modified T cells. Immunol Rev. 257 (1): 127-144; and Rajasagi et al., 2014, Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood. 2014 Jul 17; 124 (3): 453-62).
In one embodiment, the antigen (such as a tumor antigen) targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) may be selected from the group consisting of: MR1 (see, e.g., Crowther et al., 2020, Genome-wide CRISPR-Cas9 screening reveals ubiquitous T cell cancer targeting via the monomorphic MHC class I-related protein MR1, Nature Immunology, pages 178-185); B-cell maturation antigen (BCMA) (see, e.g., Friedman et al., Effective Targeting of Multiple BCMA-Expressing Hematological Malignancies by Anti-BCMA CAR T Cells, Hum Gene Ther. 2018 Mar 8; Berdeja JG et al., Durable clinical responses in heavily pretreated patients with relapsed/refractory multiple myeloma: updated results from a multicenter study of bb2121 anti-Bcma CAR T cell therapy. Blood. 2017; 130: 740; and Mouhieddine and Ghobrial, Immunotherapy in Multiple Myeloma: The Era of CAR T Cell Therapy, Hematologist, May-June 2018, volume 15, issue 3); PSA (prostate-specific antigen); prostate-specific membrane antigen (PSMA); PSCA (prostate stem cell antigen); tyrosine-protein kinase transmembrane receptor ROR1; fibroblast activation protein (FAP); tumor-associated glycoprotein 72 (TAG72); carcinoembryonic antigen (CEA); epithelial cell adhesion molecule (EPCAM); mesothelin; human epidermal growth factor receptor 2 (ERBB2 (Her2/neu)); prostase; prostatic acid phosphatase (PAP); elongation factor 2 mutant (ELF2M); insulin-like growth factor 1 receptor (IGF-1R); gp100; BCR-ABL (breakpoint cluster region-Abelson); tyrosinase; New York esophageal squamous cell carcinoma 1 (NY-ESO-1); κ-light chain; LAGE (L antigen); MAGE (melanoma antigen); melanoma-associated antigen 1 (MAGE-A1); MAGE A3; MAGE A6; legumain; human papillomavirus (HPV) E6; HPV E7; prostein; survivin; PCTA1 (galectin 8); Melan-A/MART-1; Ras mutant; TRP-1 (tyrosinase-related protein 1, or gp75); tyrosinase-related protein 2 (TRP2); TRP-2/INT2 (TRP-2/intron 2); RAGE (renal antigen); receptor for advanced glycation end products 1 (RAGE1); renal ubiquitous 1, 2 (RU1, RU2); intestinal carboxyl esterase (iCE); heat shock protein 70-2 (HSP70-2) mutant; thyroid stimulating hormone receptor (TSHR); CD123; CD171; CD19; CD20; CD22; CD26; CD30; CD33; CD44v7/8 (cluster of differentiation 44, exons 7/8); CD53; CD92; CD100; CD148; CD150; CD200; CD261; CD262; CD362; CS-1 (CD2 subset 1, CRACC, SLAMF7, CD319, and 19A24); C-type lectin-like molecule 1 (CLL-1); ganglioside GD3 (aNeu5Ac(2-8)aNeu5Ac(2-3)bDGalp(1-4)bDGlcp(1-1)Cer); Tn antigen (Tn Ag); Fms-like tyrosine kinase 3 (FLT3); CD38; CD138; CD44v6; B7H3 (CD276); KIT (CD117); interleukin 13 receptor subunit alpha 2 (IL-13Ra2); interleukin 11 receptor alpha (IL-11Ra); prostate stem cell antigen (PSCA); protease serine 21 (PRSS21); vascular endothelial growth factor receptor 2 (VEGFR2); Lewis(Y) antigen; CD24; platelet-derived growth factor receptor beta (PDGFR-beta); stage-specific embryonic antigen 4 (SSEA-4); mucin 1, cell surface associated (MUC1); mucin 16 (MUC16); epidermal growth factor receptor (EGFR); epidermal growth factor receptor variant III (EGFRvIII); neural cell adhesion molecule (NCAM); carbonic anhydrase IX (CAIX); proteasome (prosome, macropain) subunit, beta type, 9 (LMP2); ephrin type-A receptor 2 (EphA2); ephrin B2; fucosyl GM1; sialyl Lewis adhesion molecule (sLe); ganglioside GM3 (aNeu5Ac(2-3)bDGalp(1-4)bDGlcp(1-1)Cer); TGS5; high molecular weight melanoma-associated antigen (HMWMAA); o-acetyl-GD2 ganglioside (OAcGD2); folate receptor alpha; folate receptor beta; tumor endothelial marker 1 (TEM1/CD248); tumor endothelial marker 7-related (TEM7R); claudin 6 (CLDN6); G protein-coupled receptor class C group 5 member D (GPRC5D); chromosome X open reading frame 61 (CXORF61); CD97; CD179a; anaplastic lymphoma kinase (ALK); polysialic acid; placenta-specific 1 (PLAC1); hexasaccharide portion of globoH glycoceramide (GloboH); mammary gland differentiation antigen (NY-BR-1); uroplakin 2 (UPK2); hepatitis A virus cellular receptor 1 (HAVCR1); adrenoceptor beta 3 (ADRB3); pannexin 3 (PANX3); G protein-coupled receptor 20 (GPR20); lymphocyte antigen 6 complex, locus K 9 (LY6K); olfactory receptor 51E2 (OR51E2); TCR gamma alternate reading frame protein (TARP); Wilms tumor protein (WT1); ETS translocation-variant gene 6, located on chromosome 12p (ETV6-AML); sperm protein 17 (SPA17); X antigen family member 1A (XAGE1); angiopoietin-binding cell surface receptor 2 (Tie 2); CT (cancer/testis (antigen)); melanoma cancer testis antigen 1 (MAD-CT-1); melanoma cancer testis antigen 2 (MAD-CT-2); Fos-related antigen 1; p53; p53 mutant; human telomerase reverse transcriptase (hTERT); sarcoma translocation breakpoints; melanoma inhibitor of apoptosis (ML-IAP); ERG (transmembrane protease, serine 2 (TMPRSS2) ETS fusion gene); N-acetylglucosaminyltransferase V (NA17); paired box protein Pax-3 (PAX3); androgen receptor; cyclin B1; cyclin D1; v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog (MYCN); Ras homolog family member C (RhoC); cytochrome P450 1B1 (CYP1B1); CCCTC-binding factor (zinc finger protein)-like (BORIS); squamous cell carcinoma antigen recognized by T cells 1 or 3 (SART1, SART3); paired box protein Pax-5 (PAX5); proacrosin binding protein sp32 (OY-TES1); lymphocyte-specific protein tyrosine kinase (LCK); A kinase anchor protein 4 (AKAP-4); synovial sarcoma, X breakpoint 1, 2, 3, or 4 (SSX1, SSX2, SSX3, SSX4); CD79a; CD79b; CD72; leukocyte-associated immunoglobulin-like receptor 1 (LAIR1); Fc fragment of IgA receptor (FCAR); leukocyte immunoglobulin-like receptor subfamily A member 2 (LILRA2); CD300 molecule-like family member f (CD300LF); C-type lectin domain family 12 member A (CLEC12A); bone marrow stromal cell antigen 2 (BST2); EGF-like module-containing mucin-like hormone receptor-like 2 (EMR2); lymphocyte antigen 75 (LY75); glypican 3 (GPC3); Fc receptor-like 5 (FCRL5); mouse double minute 2 homolog (MDM2); livin; alpha-fetoprotein (AFP); transmembrane activator and CAML interactor (TACI); B-cell activating factor receptor (BAFF-R); V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS); immunoglobulin lambda-like polypeptide 1 (IGLL1); 707-AP (707 alanine proline); ART-4 (adenocarcinoma antigen recognized by T4 cells); BAGE (B antigen); β-catenin/m (β-catenin/mutated); CAMEL (CTL-recognized antigen on melanoma); CAP1 (carcinoembryonic antigen peptide 1); CASP-8 (caspase 8); CDC27m (cell division cycle 27 mutated); CDK4/m (cyclin-dependent kinase 4 mutated); Cyp-B (cyclophilin B); DAM (differentiation antigen melanoma); EGP-2 (epithelial glycoprotein 2); EGP-40 (epithelial glycoprotein 40); Erbb2, 3, 4 (erythroblastic leukemia viral oncogene homologs 2, 3, 4); FBP (folate binding protein); fAchR (fetal acetylcholine receptor); G250 (glycoprotein 250); GAGE (G antigen); GnT-V (N-acetylglucosaminyltransferase V); HAGE (helicase antigen); HLA-A (human leukocyte antigen A); HST2 (human signet ring tumor 2); KIAA0205; KDR (kinase insert domain receptor); LDLR/FUT (low density lipid receptor/GDP-L-fucose: b-D-galactosidase 2-a-L fucosyltransferase); L1CAM (L1 cell adhesion molecule); MC1R (melanocortin 1 receptor); myosin/m (myosin mutated); MUM-1, -2, -3 (melanoma ubiquitous mutated 1, 2, 3); NA88-A (NA cDNA clone of patient M88); NKG2D (natural killer group 2 member D) ligands; oncofetal antigen (h5T4); p190 minor bcr-abl (190 kD bcr-abl protein); Pml/RARa (promyelocytic leukemia/retinoic acid receptor a); PRAME (preferentially expressed antigen of melanoma); SAGE (sarcoma antigen); TEL/AML1 (translocation Ets family leukemia/acute myeloid leukemia 1); TPI/m (triosephosphate isomerase mutated); CD70; and any combination thereof.
In one embodiment, the antigen targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) is a Tumor Specific Antigen (TSA).
In one embodiment, the antigen targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) is a neoantigen.
In one embodiment, the antigen targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) is a Tumor Associated Antigen (TAA).
In one embodiment, the antigen targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) is a universal tumor antigen. In certain preferred embodiments, the universal tumor antigen is selected from the group consisting of: human telomerase reverse transcriptase (hTERT), survivin, mouse double minute 2 homolog (MDM2), cytochrome P450 1B1 (CYP1B1), HER2/neu, Wilms tumor gene 1 (WT1), livin, alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), mucin 16 (MUC16), MUC1, prostate-specific membrane antigen (PSMA), p53, cyclin D1, and any combination thereof.
In one embodiment, the antigen (such as a tumor antigen) targeted in adoptive cell therapy (such as in particular CAR or TCR T cell therapy) of a disease (such as in particular a tumor or cancer) may be selected from the group consisting of: CD19, BCMA, CD70, CLL-1, MAGE A3, MAGE A6, HPV E7, WT1, CD22, CD171, ROR1, MUC16 and SSX2. In certain preferred embodiments, the antigen may be CD19. For example, CD19 may be targeted in hematological malignancies, such as lymphomas, more particularly B-cell lymphomas, such as, but not limited to, diffuse large B-cell lymphoma, primary mediastinal B-cell lymphoma, transformed follicular lymphoma, marginal zone lymphoma, mantle cell lymphoma, acute lymphoblastic leukemia (including adult and pediatric ALL), non-Hodgkin lymphoma, indolent non-Hodgkin lymphoma, or chronic lymphocytic leukemia. For example, BCMA may be targeted in multiple myeloma or plasma cell leukemia (see, e.g., 2018 American Association for Cancer Research (AACR) Annual Meeting poster: Allogeneic Chimeric Antigen Receptor T Cells Targeting B Cell Maturation Antigen). For example, CLL1 may be targeted in acute myeloid leukemia. For example, MAGE A3, MAGE A6, SSX2, and/or KRAS may be targeted in solid tumors. For example, HPV E6 and/or HPV E7 may be targeted in cervical cancer or head and neck cancer. For example, WT1 may be targeted in acute myeloid leukemia (AML), myelodysplastic syndrome (MDS), chronic myeloid leukemia (CML), non-small cell lung cancer, breast cancer, pancreatic cancer, ovarian cancer, colorectal cancer, or mesothelioma. For example, CD22 may be targeted in B-cell malignancies, including non-Hodgkin lymphoma, diffuse large B-cell lymphoma, or acute lymphoblastic leukemia. For example, CD171 may be targeted in neuroblastoma, glioblastoma, or lung, pancreatic, or ovarian cancer.
For example, ROR1 may be targeted in ROR1+ malignancies, including non-small cell lung cancer, triple negative breast cancer, pancreatic cancer, prostate cancer, ALL, chronic lymphocytic leukemia, or mantle cell lymphoma. For example, MUC16 may be targeted in MUC16ecto+ epithelial ovarian cancer, fallopian tube cancer, or primary peritoneal cancer. For example, CD70 may be targeted in hematological malignancies as well as solid cancers, such as renal cell carcinoma (RCC), glioma (e.g., GBM), and head and neck cancer (HNSCC). CD70 is expressed in hematological malignancies as well as solid cancers, whereas its expression in normal tissues is limited to a subset of lymphoid cell types (see, e.g., 2018 American Association for Cancer Research (AACR) Annual Meeting poster: Allogeneic CRISPR Engineered Anti-CD70 CAR-T Cells Demonstrate Potent Preclinical Activity Against Both Solid and Hematological Cancer Cells).
For example, T cells can be genetically modified using various strategies by altering the specificity of T Cell Receptors (TCRs), for example by introducing new TCR alpha and beta chains with the specificity of the selected peptide (see U.S. Pat. No. 8,697,854; PCT patent publication: WO2003020763, WO2004033685, WO2004044004, WO2005114215, WO2006000830, WO2008038002, WO2008039818, WO2004074322, WO2005113595, WO2006125962, WO2013166321, WO2013039889, WO2014018863, WO2014083173; U.S. Pat. No. 8,088,379).
As an alternative or in addition to TCR modification, chimeric antigen receptors (CARs) can be used to generate immunoreactive cells (such as T cells) specific for a selected target (such as malignant cells), and a variety of receptor chimeric constructs have been described (see U.S. Pat. Nos. 5,843,728; 5,851,828; 5,912,170; 6,004,811; 6,284,240; 6,392,013; 6,410,014; 6,753,162; 8,211,422; and PCT publication WO 9215322).
Generally, a CAR consists of an extracellular domain, a transmembrane domain, and an intracellular domain, wherein the extracellular domain comprises an antigen binding domain that is specific for a predetermined target. Although the antigen binding domain of a CAR is typically an antibody or antibody fragment (e.g., a single chain variable fragment scFv), the binding domain is not particularly limited as long as it results in specific recognition of the target. For example, in one embodiment, the antigen binding domain can comprise a receptor such that the CAR is capable of binding to a ligand of the receptor. Alternatively, the antigen binding domain may comprise a ligand such that the CAR is capable of binding to an endogenous receptor of the ligand.
The antigen binding domain of a CAR is typically separated from the transmembrane domain by a hinge or spacer. The spacer is also not particularly limited and is designed to provide flexibility to the CAR. For example, the spacer domain may comprise a portion of a human Fc domain (including a portion of a CH3 domain), or the hinge region of any immunoglobulin (such as IgA, IgD, IgE, IgG, or IgM, or a variant thereof). In addition, the hinge region may be modified to prevent off-target binding by FcRs or other potential interfering agents. For example, the hinge may comprise an IgG4 Fc domain with or without S228P, L235E, and/or N297Q mutations (numbering according to Kabat) in order to reduce binding to FcRs. Additional spacers/hinges include, but are not limited to, CD4, CD8, and CD28 hinge regions.
The transmembrane domain of the CAR may be derived from natural or synthetic sources. When the source is natural, the domain may be derived from any membrane-bound protein or transmembrane protein. The transmembrane regions particularly useful in the present disclosure may be derived from CD8, CD28, CD3, CD45, CD4, CD5, CDS, CD9, CD16, CD22, CD33, CD37, CD64, CD80, CD86, CD134, CD137, CD154, or TCR. Alternatively, the transmembrane domain may be synthetic, in which case it will predominantly comprise hydrophobic residues such as leucine and valine. Preferably, a triplet of phenylalanine, tryptophan, and valine will be found at each end of the synthetic transmembrane domain. Optionally, a short oligopeptide or polypeptide linker (preferably between 2 and 10 amino acids in length) may form a link between the transmembrane domain and the cytoplasmic signaling domain of the CAR. A glycine-serine doublet provides a particularly suitable linker.
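The synthetic transmembrane design rules described above (a predominantly leucine/valine core, a phenylalanine-tryptophan-valine triplet at each end, and an optional linker of 2 to 10 amino acids such as a glycine-serine doublet) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the core length of 18 residues, the alternating leucine/valine pattern, the hydrophobic residue set, and all function names are hypothetical choices and are not specified in the text.

```python
# One-letter amino acid codes. The residue set counted as hydrophobic is an assumption.
HYDROPHOBIC = set("AVLIMFW")
END_TRIPLET = "FWV"  # phenylalanine, tryptophan, valine


def synthetic_tm_domain(core_length: int = 18) -> str:
    """Build a toy transmembrane sequence: FWV + alternating L/V core + FWV.

    core_length is a hypothetical parameter; the text does not give a length.
    """
    if core_length < 1:
        raise ValueError("core_length must be at least 1")
    core = ("LV" * core_length)[:core_length]
    return END_TRIPLET + core + END_TRIPLET


def gs_linker(doublets: int = 1) -> str:
    """Glycine-serine linker; enforces the preferred 2 to 10 amino acid length."""
    linker = "GS" * doublets
    if not 2 <= len(linker) <= 10:
        raise ValueError("linker length should be between 2 and 10 amino acids")
    return linker


def hydrophobic_fraction(seq: str) -> float:
    """Fraction of residues that fall in the assumed hydrophobic set."""
    return sum(res in HYDROPHOBIC for res in seq) / len(seq)


tm = synthetic_tm_domain()
construct = tm + gs_linker()  # transmembrane domain followed by a GS doublet
print(tm, hydrophobic_fraction(tm))
```

The sketch only checks composition and length; it says nothing about whether such a sequence would fold or insert into a membrane.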
Alternative CAR constructs may be characterized as belonging to successive generations. First generation CARs typically consist of a single-chain variable fragment of an antibody specific for an antigen, e.g., comprising a VL linked to a VH of a specific antibody, linked by a flexible linker (e.g., via a CD8α hinge domain and a CD8α transmembrane domain) to the transmembrane and intracellular signaling domains of CD3ζ or FcRγ (scFv-CD3ζ or scFv-FcRγ; see U.S. Patent No. 7,741,465; U.S. Patent No. 5,912,172; U.S. Patent No. 5,906,936). Second generation CARs incorporate the intracellular domains of one or more co-stimulatory molecules, such as CD28, OX40 (CD134), or 4-1BB (CD137), within the intracellular domain (e.g., scFv-CD28/OX40/4-1BB-CD3ζ; see U.S. Pat. Nos. 8,911,993; 8,916,381; 8,975,071; 9,101,584; 9,102,760; 9,102,761). Third generation CARs include a combination of co-stimulatory intracellular domains, such as the CD3ζ chain, CD97, CD11a-CD18, CD2, ICOS, CD27, CD154, CDS, OX40, 4-1BB, CD2, CD7, LIGHT, LFA-1, NKG2C, B7-H3, CD30, CD40, PD-1, or CD28 signaling domains (e.g., scFv-CD28-4-1BB-CD3ζ or scFv-CD28-OX40-CD3ζ; see U.S. Patent No. 8,906,682; U.S. Patent No. 8,399,645; U.S. Patent No. 5,686,281; PCT Publication No. WO 2014/134165; PCT Publication No. WO 2012/079000). In one embodiment, the primary signaling domain comprises a functional signaling domain of a protein selected from the group consisting of: CD3ζ, CD3γ, CD3δ, CD3ε, common FcRγ (FCER1G), FcRβ (FcεR1b), CD79a, CD79b, FcγRIIa, DAP10, and DAP12. In certain preferred embodiments, the primary signaling domain comprises a functional signaling domain of CD3ζ or FcRγ.
In one embodiment, the one or more co-stimulatory signaling domains comprise a functional signaling domain of a protein each independently selected from the group consisting of: CD27, CD28, 4-1BB (CD137), OX40, CD30, CD40, PD-1, ICOS, lymphocyte function-associated antigen-1 (LFA-1), CD2, CD7, LIGHT, NKG2C, B7-H3, a ligand that specifically binds to CD83, CDS, ICAM-1, GITR, BAFFR, HVEM (LIGHTR), SLAMF7, NKp80 (KLRF1), CD160, CD19, CD4, CD8 alpha, CD8 beta, IL2R beta, IL2R gamma, IL7R alpha, ITGA4, VLA1, CD49a, ITGA4, IA4, CD49D, ITGA6, VLA-6, CD49f, ITGAD, CD11d, ITGAE, CD103, ITGAL, CD11a, LFA-1, ITGAM, CD11b, ITGAX, CD11c, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, TRANCE/RANKL, DNAM1 (CD226), SLAMF4 (CD244, 2B4), CD84, CD96 (Tactile), CEACAM1, CRTAM, Ly9 (CD229), CD160 (BY55), PSGL1, CD100 (SEMA4D), CD69, SLAMF6 (NTB-A, Ly108), SLAM (SLAMF1, CD150, IPO-3), BLAME (SLAMF8), SELPLG (CD162), LTBR, LAT, GADS, SLP-76, PAG/Cbp, NKp44, NKp30, NKp46, and NKG2D. In one embodiment, the one or more co-stimulatory signaling domains comprise a functional signaling domain of a protein each independently selected from the group consisting of: 4-1BB, CD27, and CD28. In one embodiment, the chimeric antigen receptor may have a design as described in U.S. Pat. No. 7,446,190, which comprises the intracellular domain of the CD3 zeta chain (such as amino acid residues 52-163 of the human CD3 zeta chain, as shown in SEQ ID NO:14 of U.S. Pat. No. 7,446,190), a signaling region from CD28, and an antigen binding element (or portion or domain; such as an scFv).
When between the zeta chain portion and the antigen binding element, the CD28 portion may suitably comprise the transmembrane and signaling domains of CD28 (such as amino acid residues 114-220 of SEQ ID NO:10, with the complete sequence shown in SEQ ID NO:6 of US 7,446,190; these may comprise the portion of CD28 set forth in Genbank identifier NM_006139). Alternatively, when the zeta sequence is located between the CD28 sequence and the antigen binding element, the intracellular domain of CD28 (such as the amino acid sequence set forth in SEQ ID NO:9 of US 7,446,190) may be used alone. Thus, certain embodiments employ a CAR comprising (a) a zeta chain portion comprising the intracellular domain of the human CD3 zeta chain, (b) a costimulatory signaling region comprising the amino acid sequence encoded by SEQ ID NO:6 of US 7,446,190, and (c) an antigen binding element (or portion or domain).
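The modular arrangement of CAR domains and the generation scheme described above can be summarized in a small data structure. The Python sketch below is illustrative only and rests on a simplifying assumption: it classifies a design as first generation when it has no co-stimulatory domain, second generation when it has exactly one, and third generation when it combines two or more, which is a common convention but is narrower than the wording of the text. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CarDesign:
    """Toy model of a CAR as an ordered set of named domains."""
    antigen_binding: str             # e.g., "scFv"
    costimulatory: Tuple[str, ...]   # e.g., ("CD28",) or ("CD28", "4-1BB")
    primary_signaling: str           # e.g., "CD3zeta" or "FcRgamma"

    def label(self) -> str:
        """Shorthand in the style used in the text, e.g., scFv-CD28-CD3zeta."""
        return "-".join((self.antigen_binding, *self.costimulatory,
                         self.primary_signaling))


def generation(car: CarDesign) -> int:
    """Assumed convention: 0 co-stimulatory domains -> 1, one -> 2, more -> 3."""
    count = len(car.costimulatory)
    if count == 0:
        return 1
    if count == 1:
        return 2
    return 3


first = CarDesign("scFv", (), "CD3zeta")
second = CarDesign("scFv", ("CD28",), "CD3zeta")
third = CarDesign("scFv", ("CD28", "4-1BB"), "CD3zeta")

for car in (first, second, third):
    print(car.label(), "generation", generation(car))
```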
Alternatively, co-stimulation may be coordinated by expressing the CAR in antigen-specific T cells selected to activate and expand upon binding of their native αβ TCR, e.g., by antigen on professional antigen presenting cells, with concomitant co-stimulation. In addition, additional engineered receptors may be provided on immune response cells, e.g., to improve targeting of T cell attack and/or minimize side effects.
By way of example and not limitation, Kochenderfer et al., (2009) J Immunother. 32 (7): 689-702 describes anti-CD19 chimeric antigen receptors (CARs). The FMC63-28Z CAR contains a single chain variable region moiety (scFv) recognizing CD19, derived from the FMC63 mouse hybridoma (described in Nicholson et al., (1997) Molecular Immunology 34: 1157-1165), a portion of the human CD28 molecule, and the intracellular component of the human TCR-ζ molecule. The FMC63-CD828BBZ CAR contains the FMC63 scFv, the hinge and transmembrane regions of the CD8 molecule, the cytoplasmic portions of CD28 and 4-1BB, and the cytoplasmic component of the TCR-ζ molecule. The exact sequence of the CD28 molecule included in the FMC63-28Z CAR corresponds to Genbank identifier NM_006139; the sequence includes all amino acids starting with the amino acid sequence IEVMYPPPY (SEQ ID NO: 64,306) and continuing to the carboxy terminus of the protein. To encode the anti-CD19 scFv component of the vector, the authors designed a DNA sequence based on a portion of a previously published CAR (Cooper et al., (2003) Blood 101: 1637-1644). This sequence encodes the following components in frame from the 5' end to the 3' end: an XhoI site, the human granulocyte-macrophage colony-stimulating factor (GM-CSF) receptor α-chain signal sequence, the FMC63 light chain variable region (as in Nicholson et al., supra), a linker peptide (as in Cooper et al., supra), the FMC63 heavy chain variable region (as in Nicholson et al., supra), and a NotI site. A plasmid encoding this sequence was digested with XhoI and NotI.
To form the MSGV-FMC63-28Z retroviral vector, the XhoI and NotI-digested fragment encoding the FMC63 scFv was ligated into a second XhoI and NotI-digested fragment that encoded the MSGV retroviral backbone (as in Hughes et al., (2005) Human Gene Therapy 16: 457-472) as well as part of the extracellular portion of human CD28, the entire transmembrane and cytoplasmic portions of human CD28, and the cytoplasmic portion of the human TCR-ζ molecule (as in Maher et al., (2002) Nature Biotechnology 20: 70-75). The FMC63-28Z CAR is included in the KTE-C19 (axicabtagene ciloleucel) anti-CD19 CAR-T therapy product developed by Kite Pharma, Inc. for the treatment of, inter alia, patients with relapsed/refractory aggressive B-cell non-Hodgkin lymphoma (NHL). Accordingly, in one embodiment, cells intended for adoptive cell therapy, more particularly immunoreactive cells (such as T cells), may express the FMC63-28Z CAR as described by Kochenderfer et al. (supra). Hence, in one embodiment, cells intended for adoptive cell therapy, more particularly immunoreactive cells (such as T cells), may comprise a CAR comprising an extracellular antigen binding element (or portion or domain; such as an scFv) that specifically binds an antigen, an intracellular signaling domain comprising the intracellular domain of the CD3 zeta chain, and a costimulatory signaling region comprising the signaling domain of CD28. Preferably, the CD28 amino acid sequence starts with the amino acid sequence IEVMYPPPY (SEQ ID NO: 64,306) and continues to the carboxy terminus of the protein, as set forth in Genbank identifier NM_006139 (sequence version 1, 2 or 3). Preferably, the antigen is CD19, more preferably the antigen binding element is an anti-CD19 scFv, even more preferably the anti-CD19 scFv as described by Kochenderfer et al. (supra).
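The cloning layout described above, in which an scFv-encoding insert is flanked by a 5' XhoI site and a 3' NotI site and then released by double digestion, can be sketched in Python as follows. The recognition sequences for XhoI (CTCGAG, cut C^TCGAG) and NotI (GCGGCCGC, cut GC^GGCCGC) are the standard ones; every other sequence in the sketch is a short placeholder and is not the real signal sequence, variable region, or linker sequence. The function names and the simple length-based frame check are assumptions made for illustration.

```python
XHOI_SITE = "CTCGAG"    # XhoI recognition sequence, cut after the first C
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence, cut after the first GC

# 5'-to-3' component order from the text. Sequences are PLACEHOLDERS only.
COMPONENTS = [
    ("XhoI site", XHOI_SITE),
    ("GM-CSF receptor alpha-chain signal sequence", "ATGCTTCTC"),
    ("FMC63 light chain variable region", "GACATCCAG"),
    ("linker peptide", "GGTGGTTCT"),
    ("FMC63 heavy chain variable region", "GAGGTGAAA"),
    ("NotI site", NOTI_SITE),
]


def assemble(components) -> str:
    """Concatenate component sequences in the listed 5'-to-3' order."""
    return "".join(seq for _, seq in components)


def is_valid_insert(seq: str) -> bool:
    """Check flanking sites, uniqueness of each site, and a codon-multiple body."""
    body = seq[len(XHOI_SITE):len(seq) - len(NOTI_SITE)]
    return (seq.startswith(XHOI_SITE)
            and seq.endswith(NOTI_SITE)
            and seq.count(XHOI_SITE) == 1
            and seq.count(NOTI_SITE) == 1
            and len(body) % 3 == 0)  # simplified stand-in for "in frame"


def digest_top_strand(seq: str) -> str:
    """Top-strand fragment released by an XhoI + NotI double digest."""
    start = seq.index(XHOI_SITE) + 1  # C^TCGAG
    end = seq.index(NOTI_SITE) + 2    # GC^GGCCGC
    return seq[start:end]


insert = assemble(COMPONENTS)
fragment = digest_top_strand(insert)
print(is_valid_insert(insert), fragment)
```

Checking that each site occurs exactly once matters in practice because an internal XhoI or NotI site would cut the insert itself during digestion.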
Additional anti-CD19 CARs are further described in International Patent Publication No. WO 2015/187528. More particularly, Example 1 and Table 1 of WO 2015/187528, incorporated herein by reference, demonstrate the generation of anti-CD19 CARs based on a fully human anti-CD19 monoclonal antibody (47G4, as described in US 20100104509) and a murine anti-CD19 monoclonal antibody (as described in Nicholson et al. and explained above). Various combinations of a signal sequence (human CD8-alpha or GM-CSF receptor), extracellular and transmembrane regions (human CD8-alpha), and intracellular T cell signaling domains (CD28-CD3ζ; 4-1BB-CD3ζ; CD27-CD3ζ; CD28-CD27-CD3ζ; 4-1BB-CD27-CD3ζ; CD27-4-1BB-CD3ζ; CD28-CD27-FcεRI gamma chain; or CD28-FcεRI gamma chain) are disclosed. Thus, in one embodiment, cells intended for adoptive cell therapy, more particularly immunoreactive cells (such as T cells), may comprise a CAR comprising an extracellular antigen binding element that specifically binds an antigen, an extracellular region and a transmembrane region as listed in Table 1 of WO 2015/187528, and an intracellular T cell signaling domain as listed in Table 1 of WO 2015/187528. Preferably, the antigen is CD19, more preferably the antigen binding element is an anti-CD19 scFv, even more preferably a mouse or human anti-CD19 scFv as described in Example 1 of WO 2015/187528. In one embodiment, the CAR comprises, consists essentially of, or consists of an amino acid sequence as set forth in Table 1 of WO 2015/187528: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, or SEQ ID NO: 13.
By way of example and not limitation, chimeric antigen receptors that recognize the CD70 antigen are described in WO2012058460A2 (see also Park et al., CD70 as a target for chimeric antigen receptor T cells in head and neck squamous cell carcinoma, Oral Oncol. 2018 Mar;78:145-150; and Jin et al., CD70, a novel target of CAR T-cell therapy for gliomas, Neuro Oncol. 2018 Jan 10;20(1):55-65). CD70 is expressed by diffuse large B-cell lymphoma and follicular lymphoma, by the malignant cells of Hodgkin's lymphoma, Waldenstrom's macroglobulinemia, and multiple myeloma, and by HTLV-1- and EBV-associated malignancies (Agathanggelou et al., Am. J. Pathol. 1995;147:1152-1160; Hunter et al., Blood 2004; 104:4811.26; Lens et al., J Immunol. 2005;174:6212-6219; Baba et al., J Virol. 2008;82:3843-3852). In addition, CD70 is expressed by non-hematological malignancies such as renal cell carcinoma and glioblastoma (Junker et al., J Urol. 2005;173:2150-2153; Chahlavi et al., Cancer Res 2005;65:5428-5438). Physiologically, CD70 expression is transient and restricted to a subset of highly activated T cells, B cells, and dendritic cells.
By way of example and not limitation, chimeric antigen receptors that recognize BCMA have been described (see, e.g., US20160046724A1, WO2016014789A2, WO2017211900A1, WO2015158671A1, US20180085444A1, WO2018028647A1, US20170283504A1, and WO2013154760 A1).
In one embodiment, the immune cell may comprise, in addition to a CAR or exogenous TCR as described herein, a chimeric inhibitory receptor (inhibitory CAR) that specifically binds to a second target antigen and is capable of inducing an inhibitory or immunosuppressive signal to the cell upon recognition of the second target antigen. In one embodiment, the chimeric inhibitory receptor comprises an extracellular antigen-binding element (or portion or domain) configured to specifically bind to a target antigen, a transmembrane domain, and an intracellular immunosuppressive or inhibitory signaling domain. In one embodiment, the second target antigen is an antigen that is not expressed on the surface of a cancer cell or an infected cell, or whose expression is down-regulated on a cancer cell or an infected cell. In one embodiment, the second target antigen is an MHC class I molecule. In one embodiment, the intracellular signaling domain comprises a functional signaling portion of an immune checkpoint molecule (such as, for example, PD-1 or CTLA4). Advantageously, the inclusion of such an inhibitory CAR reduces the chance that the engineered immune cells attack non-target (e.g., non-cancerous) tissue.
Alternatively, CAR-expressing T cells can be further modified to reduce or eliminate expression of endogenous TCRs. Reducing or eliminating endogenous TCRs can reduce off-target effects and increase the effectiveness of the T cells (U.S. 9,181,527). Various methods can be used to generate T cells that stably lack expression of a functional TCR. T cells internalize, sort, and degrade the entire T cell receptor as a complex, with a half-life of about 10 hours in resting T cells and 3 hours in stimulated T cells (von Essen, M. et al. 2004. J. Immunol. 173:384-393). Proper functioning of the TCR complex requires the proper stoichiometric ratio of the proteins that compose it. TCR function also requires two functioning TCR zeta proteins with ITAM motifs. Activation of the TCR upon engagement of its MHC-peptide ligand requires the engagement of several TCRs on the same T cell, all of which must signal properly. Thus, if a TCR complex is destabilized by proteins that do not associate properly or cannot signal optimally, the T cell will not become activated sufficiently to begin a cellular response.
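As a rough illustration of the half-lives cited above, the fraction of TCR complexes remaining after a given time can be estimated under an assumed first-order (exponential) decay model. The decay model, the function name, and the constant names are illustrative assumptions and are not taken from the cited study.

```python
def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of an initial pool left after `hours`, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours / half_life_hours)


# Half-lives quoted in the text (von Essen et al.); variable names are hypothetical.
RESTING_HALF_LIFE_H = 10.0
STIMULATED_HALF_LIFE_H = 3.0

for t in (3, 10, 24):
    resting = fraction_remaining(t, RESTING_HALF_LIFE_H)
    stimulated = fraction_remaining(t, STIMULATED_HALF_LIFE_H)
    print(f"{t:>2} h: resting {resting:.3f}, stimulated {stimulated:.3f}")
```

Under this simplified model, the shorter half-life in stimulated T cells means the existing TCR pool turns over several times faster than in resting cells.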
Thus, in one embodiment, TCR expression can be eliminated using RNA interference (e.g., nucleic acid components, siRNA, miRNA, etc.), TnpB polypeptides, or other methods that target the nucleic acids encoding specific TCRs (e.g., TCR-α and TCR-β) and/or CD3 chains in primary T cells. When expression of one or more of these proteins is blocked, the T cell no longer produces one or more key components of the TCR complex, which destabilizes the complex and prevents cell surface expression of a functional TCR.
In some cases, the CAR may also include a switching mechanism for controlling expression and/or activation of the CAR. For example, a CAR can comprise an extracellular domain, a transmembrane domain, and an intracellular domain, wherein the extracellular domain comprises a target-specific binding element comprising a label, binding domain, or tag that is specific for a molecule other than a target antigen expressed on or by a target cell. In such embodiments, the specificity of the CAR is provided by a second construct comprising a target antigen binding domain (e.g., an scFv or bispecific antibody that is specific for both the target antigen and a label or tag on the CAR) and a domain that is recognized by or binds to the label, binding domain or tag on the CAR. See, for example, WO 2013/044225, WO 2016/000304, WO 2015/057834, WO 2015/057852, WO 2016/070061, US 9,233,125, US2016/0129109. In this way, T cells expressing the CAR can be administered to a subject, but the CAR cannot bind its target antigen prior to administration of the second composition comprising the antigen-specific binding domain.
Alternative switching mechanisms include CARs that require multimerization in order to activate their signaling function (see, e.g., U.S. patent publication nos. US 2015/0368342, US 2016/0175359, US 2015/0368360) and/or an exogenous signal, such as a small molecule drug (US 2016/0166613, Yung et al., Science, 2015), in order to elicit a T cell response. Some CARs may also comprise a "suicide switch" to induce death of the CAR T cells after treatment (Budde et al., PLoS One, 2013) or to down-regulate expression of the CAR after binding to the target antigen (international patent publication No. WO 2016/01210).
Alternative techniques may be used to transform target immunoresponsive cells, such as protoplast fusion, lipofection, transfection, or electroporation. A wide variety of vectors may be used to introduce CARs, such as retroviral vectors, lentiviral vectors, adenoviral vectors, adeno-associated viral vectors, plasmids, or transposons such as the Sleeping Beauty transposon (see U.S. Pat. nos. 6,489,458; 7,148,203; 7,160,682; 7,985,739; 8,227,432), for example using second generation antigen-specific CARs that signal through CD3 zeta and either CD28 or CD137. Viral vectors may, for example, include vectors based on HIV, SV40, EBV, HSV, or BPV.
Cells targeted for transformation may include, for example, T cells, natural killer (NK) cells, cytotoxic T lymphocytes (CTLs), regulatory T cells, human embryonic stem cells, tumor-infiltrating lymphocytes (TILs), or pluripotent stem cells from which lymphoid cells may be differentiated. T cells expressing a desired CAR may be selected, for example, through co-culture with gamma-irradiated activating and propagating cells (AaPC) that co-express the cancer antigen and co-stimulatory molecules. The engineered CAR T cells may be expanded, for example, by co-culture on AaPC in the presence of soluble factors such as IL-2 and IL-21. Such expansion may, for example, be carried out so as to provide memory CAR+ T cells (which may be assayed, for example, by non-enzymatic digital array and/or multi-panel flow cytometry). In this way, CAR T cells may be provided that have specific cytotoxic activity against antigen-bearing tumors (optionally in conjunction with production of desired chemokines such as interferon-gamma). CAR T cells of this kind may, for example, be used in animal models, for example to treat tumor xenografts.
In one embodiment, ACT comprises co-transferring CD4+ Th1 cells and CD8+ CTLs to induce a synergistic anti-tumor response (see, e.g., Li et al., Adoptive cell therapy with CD4+ T helper 1 cells and CD8+ cytotoxic T cells enhances complete rejection of an established tumor, leading to generation of endogenous memory responses to non-targeted tumor epitopes, Clin Transl Immunology. 2017 Oct;6(10):e160).
In one embodiment, Th17 cells are transferred to a subject in need thereof. Th17 cells have been reported to directly eradicate melanoma tumors in mice to a greater extent than Th1 cells (Muranski P et al., Tumor-specific Th17-polarized cells eradicate large established melanoma. Blood. 2008 Jul 15;112(2):362-73; and Martin-Orozco N et al., T helper 17 cells promote cytotoxic T cell activation in tumor immunity. Immunity. 2009 Nov 20;31(5):787-98). Those studies involved an adoptive T cell transfer (ACT) therapy approach that uses CD4+ T cells expressing a TCR recognizing a tyrosinase tumor antigen. Exploiting the TCR leads to rapid ex vivo expansion of Th17 populations to large numbers for reinfusion into the autologous tumor-bearing hosts.
In one embodiment, ACT may comprise an autologous iPSC-based vaccine, such as irradiated iPSCs in an autologous anti-tumor vaccine (see, e.g., Kooreman, Nigel G. et al., Autologous iPSC-Based Vaccines Elicit Anti-tumor Responses In Vivo, Cell Stem Cell 22, 1-13, 2018, doi.org/10.1016/j.stem.2018.01.016).
Unlike T cell receptors (TCRs), which are MHC restricted, CARs can potentially bind any cell surface-expressed antigen and can thus be used more universally to treat patients (see Irving et al., Engineering Chimeric Antigen Receptor T-Cells for Racing in Solid Tumors: Don't Forget the Fuel, Front. Immunol., 3 April 2017, doi.org/10.3389/fimmu.2017.00267). In one embodiment, the transfer of CAR T cells can be used to treat a patient in the absence of endogenous T cell infiltration (e.g., due to aberrant antigen processing and presentation), which precludes the use of TIL therapy and immune checkpoint blockade (see, e.g., Hinrichs CS, Rosenberg SA. Exploiting the curative potential of adoptive T-cell therapy for cancer. Immunol Rev (2014) 257(1):56-71. doi:10.1111/imr.12132).
Methods such as those described above may be suitable for providing methods of treating and/or increasing survival of a subject having a disease (such as neoplasia), for example, by administering an effective amount of immune-responsive cells comprising antigen recognizing receptors that bind to a selected antigen, wherein the binding activates the immune-responsive cells, thereby treating or preventing the disease (such as neoplasia, pathogen infection, autoimmune disorder, or allograft reaction).
In one embodiment, the treatment may be administered after lymphodepleting pretreatment in the form of chemotherapy (typically a combination of cyclophosphamide and fludarabine) or radiation therapy. Initial studies of ACT produced only short-lived responses, and the transferred cells did not persist in vivo for long (Houot et al., T-cell-based immunotherapy: adoptive cell transfer and checkpoint inhibition. Cancer Immunol Res (2015) 3(10):1115-22; and Kamta et al., Advancing Cancer Therapy with Present and Emerging Immuno-Oncology Approaches. Front. Oncol. (2017) 7:64). Immunosuppressive cells (such as Tregs and MDSCs) may attenuate the activity of the transferred cells by outcompeting them for the necessary cytokines. Without being bound by theory, lymphodepleting pretreatment may eliminate the suppressor cells, allowing the TILs to persist.
In one embodiment, the treatment may be administered to a patient undergoing immunosuppressive therapy (e.g., glucocorticoid therapy). Cells or cell populations may be rendered resistant to at least one such immunosuppressant due to inactivation of genes encoding receptors for such immunosuppressant. In one embodiment, immunosuppressive therapy provides for the selection and expansion of immunoreactive T cells in a patient.
In one embodiment, the treatment may be administered prior to primary treatment (e.g., surgery or radiation therapy) to shrink the tumor before the primary treatment. In another embodiment, the treatment may be administered after the primary treatment to remove any remaining cancer cells.
In one embodiment, immunometabolic barriers may be targeted therapeutically prior to and/or during ACT to enhance the response to ACT or CAR T cell therapy and to support endogenous immunity (see, e.g., Irving et al., Engineering Chimeric Antigen Receptor T-Cells for Racing in Solid Tumors: Don't Forget the Fuel, Front. Immunol., 3 April 2017, doi.org/10.3389/fimmu.2017.00267).
Administration of a cell or cell population as disclosed herein (such as an immune system cell or cell population, such as more particularly an immune responsive cell or cell population) may be performed in any convenient manner, including by aerosol inhalation, injection, ingestion, transfusion, implantation or transplantation. The cell or cell population may be administered to the patient subcutaneously, intradermally, intratumorally, intranodal, intramedullary, intramuscularly, intrathecally, by intravenous or intralymphatic injection, or intraperitoneally. In one embodiment, the disclosed CARs can be delivered or administered into a cavity formed by excision of tumor tissue (i.e., endoluminal delivery) or directly into the tumor prior to excision (i.e., intratumoral delivery). In one embodiment, the cell composition of the invention is preferably administered by intravenous injection.
The administration of the cells or cell populations may consist of administration of 10^4 to 10^9 cells/kg body weight (preferably 10^5 to 10^6 cells/kg body weight), including all integer values of cell numbers within those ranges. Dosing in CAR T cell therapies may, for example, involve administration of 10^6 to 10^9 cells/kg, with or without a course of lymphodepletion, for example with cyclophosphamide. The cells or cell populations may be administered in one or more doses. In another embodiment, an effective amount of the cells is administered as a single dose. In another embodiment, an effective amount of cells is administered as more than one dose over a period of time. The timing of administration is within the judgment of the attending physician and depends on the clinical condition of the patient. The cells or cell populations may be obtained from any source, such as a blood bank or a donor. Although individual needs vary, determining the optimal range of effective amounts of a given cell type for a particular disease or condition is within the skill of one of ordinary skill in the art. An effective amount means an amount that provides a therapeutic or prophylactic benefit. The dosage administered will depend on the age, health, and weight of the recipient, the kind of concurrent treatment (if any), the frequency of treatment, and the nature of the desired effect.
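The per-kilogram figures above translate into a total cell number by simple multiplication with body weight. The sketch below shows only that arithmetic, reading the stated range as 10^4 to 10^9 cells/kg; the function names and the 70 kg body weight are hypothetical, and the result is not a dosing recommendation.

```python
def total_cell_dose(weight_kg: float, cells_per_kg: float) -> float:
    """Total number of cells for a given body weight and per-kilogram dose."""
    if weight_kg <= 0 or cells_per_kg <= 0:
        raise ValueError("weight_kg and cells_per_kg must be positive")
    return weight_kg * cells_per_kg


def dose_range(weight_kg: float,
               low_per_kg: float = 1e4,
               high_per_kg: float = 1e9) -> tuple[float, float]:
    """Lower and upper total cell numbers for a per-kilogram dose range."""
    return (total_cell_dose(weight_kg, low_per_kg),
            total_cell_dose(weight_kg, high_per_kg))


# Hypothetical 70 kg recipient, broad range (1e4 to 1e9 cells/kg).
low, high = dose_range(70.0)
print(f"broad range: {low:.1e} to {high:.1e} cells")

# Same recipient, preferred range from the text (1e5 to 1e6 cells/kg).
low, high = dose_range(70.0, 1e5, 1e6)
print(f"preferred range: {low:.1e} to {high:.1e} cells")
```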
In another embodiment, an effective amount of cells or a composition comprising those cells are administered parenterally. The administration may be intravenous administration. Administration can be directly by injection within the tumor.
To guard against possible adverse reactions, engineered immunoresponsive cells may be equipped with a transgenic safety switch, in the form of a transgene that renders the cells vulnerable to exposure to a specific signal. For example, the herpes simplex virus thymidine kinase (TK) gene may be used in this way, for example by introduction into allogeneic T lymphocytes used as donor lymphocyte infusions following stem cell transplantation (Greco et al., Improving the safety of cell therapy with the TK-suicide gene. Front. Pharmacol. 2015;6:95). In such cells, administration of a nucleoside prodrug such as ganciclovir or acyclovir causes cell death. Alternative safety switch constructs include inducible caspase 9, triggered, for example, by administration of a small-molecule dimerizer that brings together two nonfunctional iCasp9 molecules to form the active enzyme. A wide variety of alternative approaches to implementing cellular proliferation controls have been described (see U.S. patent publication No. 20130071414; international patent publication No. WO 2011/146862; international patent publication No. WO 2014/01987; international patent publication No. WO 2013/040371; Zhou et al., Blood, 2014, 123/25:3895-3905; Di Stasi et al., The New England Journal of Medicine 2011;365:1673-1683; Sadelain M, The New England Journal of Medicine 2011;365:1735-173; Ramos et al., Stem Cells 28(6):1107-15 (2010)).
In a further refinement of adoptive therapy, genome editing may be used to tailor immunoresponsive cells to alternative implementations, for example to provide edited CAR T cells (see Poirot et al., 2015, Multiplex genome edited T-cell manufacturing platform for "off-the-shelf" adoptive T-cell immunotherapies, Cancer Res 75(18):3853; Ren et al., 2017, Multiplex genome editing to generate universal CAR T cells resistant to PD1 inhibition, Clin Cancer Res. 2017 May 1;23(9):2255-2266. doi:10.1158/1078-0432.CCR-16-1300. Epub 2016 Nov 4; Qasim et al., 2017, Molecular remission of infant B-ALL after infusion of universal TALEN gene-edited CAR T cells, Sci Transl Med. 2017 Jan 25;9(374); Legut et al., 2018, CRISPR-mediated TCR replacement generates superior anticancer transgenic T cells. Blood, 131(3), 311-322; and Georgiadis et al., Long Terminal Repeat CRISPR-CAR-Coupled "Universal" T Cells Mediate Potent Anti-leukemic Effects, Molecular Therapy, in press). Cells can be edited using any of the CRISPR systems described herein and methods of using the same. The compositions and systems may be delivered to immune cells by any of the methods described herein. In a preferred embodiment, the cells are edited ex vivo and transferred to a subject in need thereof. Immunoresponsive cells, CAR T cells, or any cells used for adoptive cell transfer may be edited.
Editing may be performed, for example, to insert or knock in an exogenous gene, such as an exogenous gene encoding a CAR or a TCR, at a preselected locus (e.g., a TRAC locus) in a cell; to eliminate potential alloreactive T cell receptors (TCRs) or to prevent inappropriate pairing between endogenous and exogenous TCR chains, such as by knocking out or knocking down expression of an endogenous TCR in a cell; to disrupt the target of a chemotherapeutic agent in a cell; to block an immune checkpoint, such as by knocking out or knocking down expression of an immune checkpoint protein or receptor in a cell; to knock out or knock down expression of one or more other genes in a cell, the reduced expression or lack of expression of which can enhance the efficacy of adoptive therapies using the cell; to knock out or knock down expression of an endogenous gene in a cell, the endogenous gene encoding an antigen targeted by an exogenous CAR or TCR; to knock out or knock down expression of one or more MHC constituent proteins in a cell; to activate a T cell; to modulate cells such that the cells are resistant to exhaustion or dysfunction; and/or to increase the differentiation and/or proliferation of functionally exhausted or dysfunctional CD8+ T cells (see international patent publication nos. WO 2013/176915, WO 2014/059173, WO 2014/172606, WO 2014/184744 and WO 2014/191128).
In one embodiment, editing may result in inactivation of a gene. By inactivating a gene, it is intended that the gene of interest is not expressed in the form of a functional protein. In certain embodiments, the system specifically catalyzes cleavage of a targeted gene, thereby inactivating the targeted gene. The resulting nucleic acid strand breaks are commonly repaired through the distinct mechanisms of homologous recombination or non-homologous end joining (NHEJ). NHEJ is an imperfect repair process that often changes the DNA sequence at the cleavage site; such repair generally produces small insertions or deletions (indels) and can be used to create specific gene knockouts. Cells in which a cleavage-induced mutagenesis event has occurred can be identified and/or selected by methods well known in the art. In one embodiment, homology directed repair (HDR) is used to simultaneously inactivate a gene (e.g., TRAC) and insert an endogenous TCR or CAR into the inactivated locus.
Thus, in one embodiment, editing can be performed on cells (particularly cells intended for adoptive cell therapy, more particularly immune response cells such as T cells) to insert or knock in an exogenous gene, such as an exogenous gene encoding a CAR or TCR, at a preselected locus in the cell. Traditionally, nucleic acid molecules encoding CARs or TCRs have been transfected or transduced into cells using random integration vectors, which, depending on the site of integration, may result in clonal expansion, oncogenic transformation, diversified transgene expression, and/or transgene transcriptional silencing. Directing the transgene to a particular locus in the cell can minimize or avoid such risks and advantageously provide for uniform expression of the transgene by the cell. Suitable "safe harbor" loci for targeted transgene integration include, without limitation, CCR5 or AAVS1. Homology Directed Repair (HDR) strategies are known and described elsewhere in this specification, which allow insertion of transgenes into desired loci (e.g., TRAC loci).
Other suitable loci for insertion of transgenes (particularly CAR or exogenous TCR transgenes) include, but are not limited to, loci comprising genes encoding endogenous T cell receptor components, such as the T cell receptor alpha locus (TRA) or the T cell receptor beta locus (TRB), e.g., the T cell receptor alpha constant (TRAC) locus, the T cell receptor beta constant 1 (TRBC1) locus, or the T cell receptor beta constant 2 (TRBC2) locus. Advantageously, insertion of a transgene into such a locus can simultaneously achieve expression of the transgene (possibly under the control of the endogenous promoter) and knockout of expression of the endogenous TCR. This approach has been illustrated in Eyquem et al., (2017) Nature 543:113-117, where the authors used CRISPR/Cas9 gene editing to knock a DNA molecule encoding a CD19-specific CAR into the TRAC locus downstream of the endogenous promoter; the CAR-T cells obtained by CRISPR were significantly superior in terms of reduced CAR signaling and exhaustion.
T cell receptors (TCRs) are cell surface receptors that participate in the activation of T cells in response to the presentation of antigen. The TCR is generally made of two chains, α and β, which assemble to form a heterodimer and associate with the CD3-transducing subunits to form the T cell receptor complex present on the cell surface. Each α and β chain of the TCR consists of an immunoglobulin-like N-terminal variable (V) and constant (C) region, a hydrophobic transmembrane domain, and a short cytoplasmic region. As for immunoglobulin molecules, the variable regions of the α and β chains are generated by V(D)J recombination, creating a large diversity of antigen specificities within the T cell population. However, in contrast to immunoglobulins, which recognize intact antigen, T cells are activated by processed peptide fragments in association with an MHC molecule, which introduces an extra dimension to antigen recognition by T cells, known as MHC restriction. Recognition of MHC disparities between donor and recipient through the T cell receptor leads to T cell proliferation and the potential development of graft versus host disease (GVHD). Inactivation of TCRα or TCRβ can eliminate the TCR from the T cell surface, preventing recognition of alloantigen and thus GVHD. However, TCR disruption generally results in the elimination of the CD3 signaling component and alters the means of further T cell expansion.
Thus, in one embodiment, editing may be performed on cells (particularly cells intended for adoptive cell therapy, more particularly immune response cells such as T cells) to knock out or knock down expression of endogenous TCRs in the cells. For example, NHEJ-based or HDR-based gene editing methods may be used to disrupt endogenous TCR alpha and/or beta chain genes. For example, one or more gene editing systems, such as one or more TnpB systems, may be designed to target sequences present within the TCR β chain that are conserved between β1 and β2 constant region genes (TRBC 1 and TRBC 2) and/or to target the constant region of the TCR α chain (TRAC) gene.
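One simple way to look for candidate target sequences shared by two related genes, such as the β1 and β2 constant region genes mentioned above, is to list the subsequences of a fixed length that occur in both. The following is a minimal sketch under stated assumptions: the function name, the window length, and the two toy sequences are hypothetical placeholders (not TRBC1 or TRBC2 sequence), and the sketch ignores any adjacent-motif, strand, or specificity requirements that a real TnpB targeting design would impose.

```python
def shared_kmers(seq_a: str, seq_b: str, k: int = 20) -> list[str]:
    """Return k-length subsequences of seq_a that also occur in seq_b.

    Results are unique and listed in order of first appearance in seq_a.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    seen: set[str] = set()
    shared: list[str] = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        if kmer in kmers_b and kmer not in seen:
            seen.add(kmer)
            shared.append(kmer)
    return shared


# Toy placeholder sequences for illustration only.
TOY_GENE_1 = "ACGTACGTTTGACCA"
TOY_GENE_2 = "GGGTACGTTTGAGGG"

for kmer in shared_kmers(TOY_GENE_1, TOY_GENE_2, k=8):
    print(kmer)
```

A single targeting component directed at a subsequence returned by such a search would, in principle, address both genes at once, which is the rationale given above for targeting sequences conserved between TRBC1 and TRBC2.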
Allogeneic cells are rapidly rejected by the host immune system. It has been demonstrated that allogeneic leukocytes present in non-irradiated blood products persist for no more than 5 to 6 days (Boni, Muranski et al. 2008 Blood 1;112(12):4746-54). Thus, to prevent rejection of allogeneic cells, the host's immune system usually has to be suppressed to some extent. However, in the case of adoptive cell transfer, the use of immunosuppressive drugs also has a detrimental effect on the introduced therapeutic T cells. Therefore, to use an adoptive immunotherapy approach effectively under these conditions, the introduced cells need to be resistant to the immunosuppressive treatment. Thus, in a particular embodiment, the invention further comprises a step of modifying T cells to make them resistant to an immunosuppressive agent, preferably by inactivating at least one gene encoding a target for an immunosuppressive agent. An immunosuppressive agent is an agent that suppresses immune function by one of several mechanisms of action. An immunosuppressive agent can be, but is not limited to, a calcineurin inhibitor, a target of rapamycin, an interleukin-2 receptor α-chain blocker, an inhibitor of inosine monophosphate dehydrogenase, an inhibitor of dihydrofolate reductase, a corticosteroid, or an immunosuppressive antimetabolite. The present invention allows conferring immunosuppressive resistance on T cells for immunotherapy by inactivating the target of the immunosuppressive agent in T cells. As non-limiting examples, targets for an immunosuppressive agent can be a receptor for an immunosuppressive agent, such as: CD52, glucocorticoid receptor (GR), an FKBP family gene member, and a cyclophilin family gene member.
In one embodiment, editing may be performed on cells (particularly cells intended for adoptive cell therapy, more particularly immunoresponsive cells such as T cells) to block an immune checkpoint, such as by knocking out or knocking down expression of an immune checkpoint protein or receptor in the cells. Immune checkpoints are inhibitory pathways that slow down or stop immune reactions and prevent excessive tissue damage from uncontrolled activity of immune cells. In one embodiment, the immune checkpoint targeted is the programmed death-1 (PD-1 or CD279) gene (PDCD1). In other embodiments, the immune checkpoint targeted is cytotoxic T lymphocyte-associated antigen 4 (CTLA-4). In further embodiments, the targeted immune checkpoint is another member of the CD28 and CTLA4 Ig superfamily, such as BTLA, LAG3, ICOS, PDL1, or KIR. In yet further embodiments, the targeted immune checkpoint is a member of the TNFR superfamily, such as CD40, OX40, CD137, GITR, CD27, or TIM-3.
Additional immune checkpoints include Src homology 2 domain-containing protein tyrosine phosphatase 1 (SHP-1) (Watson HA et al., SHP-1: the next checkpoint target for cancer immunotherapy? Biochem Soc Trans. 2016 Apr 15;44(2):356-62). SHP-1 is a widely expressed inhibitory protein tyrosine phosphatase (PTP). In T cells, it is a negative regulator of antigen-dependent activation and proliferation. It is a cytosolic protein and is therefore not amenable to antibody-mediated therapies, but its role in activation and proliferation makes it an attractive target for genetic manipulation in adoptive transfer strategies such as chimeric antigen receptor (CAR) T cells. Immune checkpoints may also include T cell immunoreceptor with Ig and ITIM domains (TIGIT/Vstm3/WUCAM/VSIG9) and VISTA (Le Mercier I et al., (2015) Beyond CTLA-4 and PD-1, the generation Z of negative checkpoint regulators. Front. Immunol. 6:418).
International patent publication No. WO 2014/172606 relates to the use of MT1 and/or MT2 inhibitors to increase proliferation and/or activity of exhausted CD8+ T cells and to decrease CD8+ T cell exhaustion (e.g., to decrease functionally exhausted or unresponsive CD8+ immune cells). In one embodiment, metallothioneins are targeted by gene editing in adoptively transferred T cells.
In one embodiment, the target of gene editing may be at least one targeted locus involved in the expression of an immune checkpoint protein. Such targets may include, but are not limited to, CTLA4, PPP2CA, PPP2CB, PTPN6, PTPN22, PDCD1, ICOS (CD278), PDL1, KIR, LAG3, HAVCR2, BTLA, CD160, TIGIT, CD96, CRTAM, LAIR1, SIGLEC7, SIGLEC9, CD244 (2B4), TNFRSF10B, TNFRSF10A, CASP8, CASP10, CASP3, CASP6, CASP7, FADD, FAS, TGFBRII, TGFBRI, SMAD2, SMAD3, SMAD4, SMAD10, SKI, SKIL, TGIF1, IL10RA, IL10RB, HMOX2, IL6R, IL6ST, EIF2AK4, CSK, PAG1, SIT1, FOXP3, PRDM1, BATF, VISTA, GUCY1A2, GUCY1B3, MT1, MT2, CD40, OX40, CD137, GITR, CD27, TIM-3, CEACAM-1, CEACAM-3, or CEACAM-5. In preferred embodiments, a locus involved in the expression of the PD-1 or CTLA-4 gene is targeted. In other preferred embodiments, combinations of genes are targeted, such as, but not limited to, PD-1 and TIGIT.
By way of example and not limitation, international patent publication No. WO 2016/196388 relates to an engineered T cell comprising (a) a genetically engineered antigen receptor that specifically binds an antigen, which receptor may be a CAR; and (b) a disrupted gene encoding PD-L1, an agent for disrupting a gene encoding PD-L1, and/or disruption of a gene encoding PD-L1, wherein the disruption of the gene can be mediated by a gene editing nuclease, a zinc finger nuclease (ZFN), CRISPR/Cas9, and/or a TALEN. WO2015142675 relates to immune effector cells comprising a CAR in combination with an agent (such as a composition or system herein) that increases the efficacy of the immune effector cells in the treatment of cancer, wherein the agent can inhibit an immunosuppressive molecule such as PD1, PD-L1, CTLA-4, TIM-3, LAG-3, VISTA, BTLA, TIGIT, LAIR1, CD160, 2B4, TGFR beta, CEACAM-1, CEACAM-3, or CEACAM-5. Ren et al., (2017) Clin Cancer Res 23(9):2255-2266 performed lentiviral delivery of a CAR together with electrotransfer of Cas9 mRNA and gRNAs targeting the endogenous TCR, beta-2 microglobulin (B2M), and PD1 to generate gene-disrupted allogeneic CAR T cells deficient in TCR, HLA class I molecules, and PD1.
In one embodiment, the cell can be engineered to express a CAR, wherein expression and/or function of a methylcytosine dioxygenase gene (TET1, TET2, and/or TET3) in the cell has been reduced or eliminated, such as by a composition or system herein (e.g., as described in WO 201704916).
In one embodiment, editing may be performed on cells (particularly cells intended for adoptive cell therapy, more particularly immunoresponsive cells such as T cells) to knock out or knock down expression of an endogenous gene in the cells that encodes an antigen targeted by an exogenous CAR or TCR, thereby reducing the likelihood that the engineered cells are themselves targeted. In one embodiment, the targeted antigen may be one or more antigens selected from the group consisting of: CD38, CD138, CS-1, CD33, CD26, CD30, CD53, CD92, CD100, CD148, CD150, CD200, CD261, CD262, CD362, human telomerase reverse transcriptase (hTERT), survivin, mouse double minute 2 homolog (MDM2), cytochrome P450 1B1 (CYP1B), HER2/neu, Wilms' tumor gene 1 (WT1), livin, alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), mucin 16 (MUC16), MUC1, prostate-specific membrane antigen (PSMA), p53, cyclin (D1), B cell maturation antigen (BCMA), transmembrane activator and CAML interactor (TACI), and B-cell activating factor receptor (BAFF-R) (e.g., as described in international patent publication nos. WO 2016/011440 and WO 2017/01804).
In one embodiment, editing may be performed on cells (particularly cells intended for adoptive cell therapy, more particularly immunoresponsive cells such as T cells) to knock out or knock down expression of one or more MHC constituent proteins, such as one or more HLA proteins and/or beta-2 microglobulin (B2M), in the cells, such that rejection of non-autologous (e.g., allogeneic) cells by the recipient's immune system can be reduced or avoided. In preferred embodiments, one or more HLA class I proteins, such as HLA-A, B and/or C, and/or B2M may be knocked out or knocked down. Preferably, B2M may be knocked out or knocked down. For example, Ren et al., (2017) Clin Cancer Res 23(9):2255-2266 performed lentiviral delivery of a CAR together with electrotransfer of Cas9 mRNA and gRNAs targeting the endogenous TCR, β-2 microglobulin (B2M), and PD1 to generate gene-disrupted allogeneic CAR T cells deficient in TCR, HLA class I molecules, and PD1.
In other embodiments, at least two genes are edited. Gene pairs may include, but are not limited to, PD1 and TCR α, PD1 and TCR β, CTLA-4 and TCR α, CTLA-4 and TCR β, LAG3 and TCR α, LAG3 and TCR β, tim3 and TCR α, tim3 and TCR β, BTLA and TCR β, BY55 and TCR α, BY55 and TCR β, TIGIT and TCR α, TIGIT and TCR β, B7H5 and TCR α, B7H5 and TCR β, LAIR1 and TCR α, LAIR1 and TCR β, SIGLEC10 and TCR α, SIGLEC10 and TCR β, 2B4 and TCR α, 2B4 and TCR β, B2M and TCR α, B2M and TCR β.
In one embodiment, the cells can be subjected to multiplexed editing (multiplex genome editing) as taught herein to (1) knock out or knock down expression of an endogenous TCR (e.g., TRBC1, TRBC2, and/or TRAC); (2) knock out or knock down expression of an immune checkpoint protein or receptor (e.g., PD1, PD-L1, and/or CTLA4); and (3) knock out or knock down expression of one or more MHC constituent proteins (e.g., HLA-A, B and/or C, and/or B2M, preferably B2M).
Whether before or after genetic modification of the T cells, the T cells can generally be activated and expanded using methods such as those described in U.S. Patent Nos. 6,352,694; 6,534,055; 6,905,680; 5,858,358; 6,887,466; 6,905,681; 7,144,575; 7,232,566; 7,175,843; 5,883,223; 6,905,874; 6,797,514; 6,867,041; and 7,572,631. T cells can be expanded in vitro or in vivo.
The immune cells may be obtained using any method known in the art. In one embodiment, allogeneic T cells may be obtained from a healthy subject. In one embodiment, T cells that have infiltrated a tumor are isolated. T cells may be removed during surgery. T cells may be isolated after removal of tumor tissue by biopsy. T cells may be isolated by any means known in the art. In one embodiment, the T cells are obtained by apheresis. In one embodiment, the method may comprise obtaining a bulk population of T cells from a tumor sample by any suitable method known in the art. For example, a bulk population of T cells can be obtained from a tumor sample by dissociating the tumor sample into a cell suspension from which specific cell populations can be selected. Suitable methods of obtaining a bulk population of T cells may include, but are not limited to, any one or more of mechanically dissociating (e.g., mincing) the tumor, enzymatically dissociating (e.g., digesting) the tumor, and aspiration (e.g., with a needle).
The bulk population of T cells obtained from the tumor sample may comprise any suitable type of T cell. Preferably, the bulk population of T cells obtained from a tumor sample comprises tumor infiltrating lymphocytes (TILs).
The tumor sample may be obtained from any mammal. As used herein, unless otherwise indicated, the term "mammal" refers to any mammal including, but not limited to, mammals of the order Lagomorpha, such as rabbits; the order Carnivora, including felines (cats) and canines (dogs); the order Artiodactyla, including bovines (cows) and swine (pigs); or the order Perissodactyla, including equines (horses). The mammal may be a non-human primate, e.g., of the order Primates, Ceboids, or Simoids (monkeys), or of the order Anthropoids (humans and apes). In one embodiment, the mammal may be a mammal of the order Rodentia, such as a mouse or hamster. Preferably, the mammal is a non-human primate or a human. A particularly preferred mammal is the human.
T cells can be obtained from a variety of sources, including peripheral blood mononuclear cells (PBMCs), bone marrow, lymph node tissue, spleen tissue, and tumors. In one embodiment of the invention, T cells may be obtained from a unit of blood collected from a subject using any of a variety of techniques known to those skilled in the art, such as Ficoll separation. In a preferred embodiment, cells from the circulating blood of an individual are obtained by apheresis or leukapheresis. The apheresis product typically contains lymphocytes, including T cells, monocytes, granulocytes, B cells, other nucleated white blood cells, red blood cells, and platelets. In one embodiment, cells collected by apheresis may be washed to remove the plasma fraction and placed in an appropriate buffer or medium for subsequent processing steps. In one embodiment of the invention, the cells are washed with phosphate buffered saline (PBS). In alternative embodiments, the wash solution lacks calcium and may lack magnesium, or may lack many, if not all, divalent cations. Initial activation steps in the absence of calcium lead to magnified activation. Those of ordinary skill in the art will readily appreciate that a washing step may be accomplished by methods known in the art, such as by using a semi-automated "flow-through" centrifuge (e.g., the Cobe 2991 cell processor) according to the manufacturer's instructions. After washing, the cells may be resuspended in a variety of biocompatible buffers, such as, for example, Ca-free, Mg-free PBS. Alternatively, the undesirable components of the apheresis sample may be removed and the cells resuspended directly in culture medium.
In another embodiment, T cells are isolated from peripheral blood lymphocytes by lysing the red blood cells and depleting the monocytes, for example, by centrifugation through a PERCOLL™ gradient. Specific T cell subpopulations, such as CD28+, CD4+, CDC, CD45RA+, and CD45RO+ T cells, may be further isolated by positive or negative selection techniques. For example, in a preferred embodiment, T cells are isolated by incubation with anti-CD3/anti-CD28 (i.e., 3×28)-conjugated beads, such as M-450 CD3/CD28 T or XCyte DYNABEADS™, for a period of time sufficient for positive selection of the desired T cells. In one embodiment, the period of time is about 30 minutes. In another embodiment, the period of time ranges from 30 minutes to 36 hours or longer, and all integer values therebetween. In another embodiment, the period of time is at least 1, 2, 3, 4, 5, or 6 hours. In yet another preferred embodiment, the period of time is 10 to 24 hours. In a preferred embodiment, the incubation period is 24 hours. For isolation of T cells from patients with leukemia, use of longer incubation times, such as 24 hours, can increase cell yield. Longer incubation times may be used to isolate T cells in any situation where there are few T cells compared with other cell types, such as in isolating tumor infiltrating lymphocytes (TILs) from tumor tissue or from immunocompromised individuals. In addition, use of longer incubation times can increase the efficiency of capture of CD8+ T cells.
Enrichment of a T cell population by negative selection can be accomplished with a combination of antibodies directed to surface markers unique to the negatively selected cells. A preferred method is cell sorting and/or selection via negative magnetic immunoadherence or flow cytometry using a cocktail of monoclonal antibodies directed to cell surface markers present on the negatively selected cells. For example, to enrich for CD4+ cells by negative selection, a monoclonal antibody cocktail typically includes antibodies to CD14, CD20, CD11b, CD16, HLA-DR, and CD8.
In addition, depletion of monocyte populations (e.g., CD14+ cells) from blood preparations can be facilitated by a variety of methods, including anti-CD14 coated beads or columns, or by utilizing the phagocytic activity of these cells. Thus, in one embodiment, the invention uses paramagnetic particles of a size sufficient to be engulfed by phagocytic monocytes. In one embodiment, the paramagnetic particles are commercially available beads, for example, those produced by Life Technologies under the trade name Dynabeads™. In one embodiment, other non-specific cells are removed by coating the paramagnetic particles with "irrelevant" proteins (e.g., serum proteins or antibodies). Irrelevant proteins and antibodies include those proteins and antibodies, or fragments thereof, that do not specifically target the T cells to be isolated. In one embodiment, the irrelevant beads include beads coated with sheep anti-mouse antibodies, goat anti-mouse antibodies, and human serum albumin.
Briefly, such depletion of monocytes is performed by pre-incubating T cells isolated from whole blood, apheresed peripheral blood, or tumors with one or more varieties of irrelevant or non-antibody-coupled paramagnetic particles, in any amount that allows for removal of monocytes (approximately a 20:1 bead:cell ratio), for about 30 minutes to 2 hours at 22 °C to 37 °C, followed by magnetic removal of cells that have attached to or engulfed the paramagnetic particles. Such separation can be performed using standard methods available in the art. For example, any magnetic separation methodology may be used, including a variety of commercially available ones (e.g., a DYNAL magnetic particle concentrator). Assurance of requisite depletion can be monitored by a variety of methodologies known to those of ordinary skill in the art, including flow cytometric analysis of CD14-positive cells before and after depletion.
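The bead quantity implied by the approximately 20:1 bead:cell ratio, and the stated incubation window (30 minutes to 2 hours at 22 °C to 37 °C), can be illustrated with a short calculation. The following Python sketch is illustrative only; the function names and the example cell count are hypothetical and do not come from the specification.

```python
def beads_required(cell_count: int, beads_per_cell: float = 20.0) -> int:
    """Number of paramagnetic beads for a given bead:cell ratio.

    The default of 20 beads per cell mirrors the approximate ratio in the
    text; the function itself is a hypothetical helper.
    """
    if cell_count < 0 or beads_per_cell <= 0:
        raise ValueError("cell_count must be >= 0 and beads_per_cell > 0")
    return round(cell_count * beads_per_cell)


def incubation_in_range(minutes: float, temp_c: float) -> bool:
    """True if the pre-incubation falls inside the window given in the text:
    30 minutes to 2 hours at 22 to 37 degrees C."""
    return 30 <= minutes <= 120 and 22 <= temp_c <= 37


if __name__ == "__main__":
    # Hypothetical sample of 5 million cells.
    print(beads_required(5_000_000))
    print(incubation_in_range(minutes=60, temp_c=37))
```

Other ratios can be explored by passing a different `beads_per_cell` value, since the text describes the ratio as approximate.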
To isolate a desired population of cells by positive or negative selection, the concentration of cells and surface (e.g., particles, such as beads) can be varied. In one embodiment, it may be desirable to significantly decrease the volume in which beads and cells are mixed together (i.e., increase the concentration of cells) to ensure maximum contact of cells and beads. For example, in one embodiment, a concentration of 2 billion cells/ml is used. In one embodiment, a concentration of 1 billion cells/ml is used. In another embodiment, greater than 100 million cells/ml is used. In another embodiment, a cell concentration of 10, 15, 20, 25, 30, 35, 40, 45, or 50 million cells/ml is used. In yet another embodiment, a cell concentration of 75, 80, 85, 90, 95, or 100 million cells/ml is used. In other embodiments, concentrations of 125 or 150 million cells/ml may be used. Using high concentrations can result in increased cell yield, cell activation, and cell expansion. In addition, use of high cell concentrations allows more efficient capture of cells that may weakly express the target antigen of interest, such as CD28-negative T cells, or of cells from samples where many tumor cells are present (i.e., leukemic blood, tumor tissue, etc.). Such populations of cells may have therapeutic value and would be desirable to obtain. For example, using a high concentration of cells allows more efficient selection of CD8+ T cells, which normally have weaker CD28 expression.
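Because the concentrations above are set by adjusting the mixing volume, the volume needed to reach a target concentration follows from simple division. The Python sketch below is a minimal illustration; the function name and the example cell count are hypothetical.

```python
def resuspension_volume_ml(total_cells: float, target_cells_per_ml: float) -> float:
    """Volume (ml) in which to mix cells and beads to reach a target
    cell concentration. Hypothetical helper for illustration only."""
    if total_cells < 0 or target_cells_per_ml <= 0:
        raise ValueError("total_cells must be >= 0 and target concentration > 0")
    return total_cells / target_cells_per_ml


if __name__ == "__main__":
    cells = 5e9  # hypothetical yield of 5 billion cells
    # 100 million cells/ml and 2 billion cells/ml are concentrations named in the text.
    for conc in (1e8, 2e9):
        print(f"{conc:.0e} cells/ml -> {resuspension_volume_ml(cells, conc)} ml")
```

The same helper applies to the lower concentrations described for dilute selection, with a correspondingly larger volume.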
In related embodiments, it may be desirable to use lower concentrations of cells. By significantly diluting the mixture of T cells and surface (e.g., particles, such as beads), interactions between the particles and cells are minimized. This selects for cells that express high amounts of the desired antigen to be bound to the particles. For example, CD4+ T cells express higher levels of CD28 and are captured more efficiently than CD8+ T cells at dilute concentrations. In one embodiment, the concentration of cells used is 5 × 10^6/ml. In other embodiments, the concentration used can be from about 1 × 10^5/ml to 1 × 10^6/ml, and any integer value therebetween.
T cells may also be frozen. Without wishing to be bound by theory, the freeze and subsequent thaw step provides a more uniform product by removing granulocytes and, to some extent, monocytes from the cell population. After the washing step that removes plasma and platelets, the cells may be suspended in a freezing solution. While many freezing solutions and parameters are known in the art and will be useful in this context, one method involves using PBS containing 20% DMSO and 8% human serum albumin, or other suitable cell freezing media; the cells are then frozen to -80 °C at a rate of 1 °C per minute and stored in the vapor phase of a liquid nitrogen storage tank. Other methods of controlled freezing may be used, as well as uncontrolled freezing immediately at -20 °C or in liquid nitrogen.
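The duration of the controlled-rate freeze described above follows directly from the cooling rate. The Python sketch below is illustrative only; the starting temperature of 4 °C is an assumption, as the specification does not state one, and the function name is hypothetical.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float = 1.0) -> float:
    """Minutes needed to cool from start_c to end_c at a constant rate.

    The default rate of 1 degree C per minute and the -80 degree C target
    used below come from the text; the starting temperature is assumed.
    """
    if rate_c_per_min <= 0:
        raise ValueError("rate_c_per_min must be positive")
    if end_c > start_c:
        raise ValueError("end_c must not exceed start_c for a cooling ramp")
    return (start_c - end_c) / rate_c_per_min


if __name__ == "__main__":
    # Assumed start at 4 degrees C (hypothetical), target -80 degrees C.
    print(ramp_minutes(start_c=4.0, end_c=-80.0))
```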
The T cells used in the present invention may also be antigen-specific T cells. For example, tumor-specific T cells may be used. In one embodiment, antigen-specific T cells may be isolated from a patient of interest, such as a patient afflicted with a cancer or an infectious disease. In one embodiment, neoepitopes are determined for the subject and T cells specific for these antigens are isolated. Antigen-specific cells for use in expansion may also be generated in vitro using any of the methods known in the art, for example, as described in U.S. Patent Publication No. US 20040224402 or U.S. Patent No. 6,040,177, entitled Generation and Isolation of Antigen-Specific T Cells. Antigen-specific cells for use in the present invention may also be generated in vitro using any of a variety of methods known in the art, for example, as described in Current Protocols in Immunology or Current Protocols in Cell Biology, both published by John Wiley & Sons, Inc.
In related embodiments, it may be desirable to sort or otherwise positively select the antigen-specific cells (e.g., via magnetic selection) before or after one or two rounds of expansion. Sorting or positive selection of antigen-specific cells can be performed using peptide-MHC tetramers (Altman et al., Science. 1996 Oct 4;274(5284):94-6). In another embodiment, the adaptable tetramer technology approach is used (Andersen et al., 2012 Nat Protoc. 7:891-902). Tetramers are limited by the need to use predicted binding peptides based on prior hypotheses, and by the restriction to specific HLAs. Peptide-MHC tetramers can be generated using techniques known in the art and can be made with any MHC molecule of interest and any antigen of interest as described herein. Specific epitopes to be used in this context can be identified using numerous assays known in the art. For example, the ability of a polypeptide to bind to MHC class I may be evaluated indirectly by monitoring the ability to promote incorporation of 125I-labeled β2-microglobulin (β2m) into MHC class I/β2m/peptide heterotrimeric complexes.
In one embodiment, the cells are directly labeled with an epitope-specific reagent for isolation by flow cytometry, followed by characterization of phenotype and TCRs. In one embodiment, the T cells are isolated by contacting them with T cell-specific antibodies. Sorting of antigen-specific T cells, or generally any of the cells of the invention, may be performed using any of a variety of commercially available cell sorters, including, but not limited to, the MoFlo sorter (DakoCytomation, Fort Collins, Colo.), FACSAria™, FACSArray™, FACSVantage™, BD™ LSR II, and FACSCalibur™ (BD Biosciences, San Jose, Calif.).
In a preferred embodiment, the method comprises selecting cells that also express CD3. The method may comprise specifically selecting the cells in any suitable manner. Preferably, the selecting is carried out using flow cytometry. Flow cytometry may be carried out using any suitable method known in the art, and may employ any suitable antibodies and stains. Preferably, the antibody is chosen such that it specifically recognizes and binds to the particular biomarker being selected. For example, specific selection of CD3, CD8, TIM-3, LAG-3, 4-1BB, or PD-1 may be carried out using anti-CD3, anti-CD8, anti-TIM-3, anti-LAG-3, anti-4-1BB, or anti-PD-1 antibodies, respectively. The antibody or antibodies may be conjugated to a bead (e.g., a magnetic bead) or to a fluorochrome. Preferably, the flow cytometry is fluorescence-activated cell sorting (FACS). TCRs expressed on T cells can be selected based on reactivity to autologous tumors. Furthermore, T cells that are reactive to tumors can be selected based on markers using the methods described in Patent Publication Nos. WO 2014133567 and WO 2014133568, which are incorporated herein by reference in their entirety. In addition, activated T cells can be selected based on surface expression of CD107a.
In one embodiment of the invention, the method further comprises expanding the number of T cells in the enriched cell population. Such methods are described in U.S. Patent No. 8,637,307, which is incorporated herein by reference in its entirety. The number of T cells may be increased at least about 3-fold (or 4-, 5-, 6-, 7-, 8-, or 9-fold), more preferably at least about 10-fold (or 20-, 30-, 40-, 50-, 60-, 70-, 80-, or 90-fold), more preferably at least about 100-fold, more preferably at least about 1,000-fold, or most preferably at least about 100,000-fold. The number of T cells may be expanded using any suitable method known in the art. Exemplary methods of expanding the number of cells are described in Patent Publication No. WO 2003/057171, U.S. Patent No. 8,034,334, and U.S. Patent Publication No. 2012/0244233, each of which is incorporated herein by reference.
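The fold-expansion thresholds above can be related to cell counts and to the number of population doublings they imply. The Python sketch below is illustrative only; the function names and the example counts are hypothetical, and the doubling calculation is a standard log2 conversion rather than anything stated in the specification.

```python
import math


def fold_expansion(initial_cells: float, final_cells: float) -> float:
    """Fold increase in T cell number (final / initial)."""
    if initial_cells <= 0 or final_cells < 0:
        raise ValueError("initial_cells must be > 0 and final_cells >= 0")
    return final_cells / initial_cells


def population_doublings(fold: float) -> float:
    """Number of doublings that corresponds to a given fold expansion."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return math.log2(fold)


def meets_threshold(initial_cells: float, final_cells: float, min_fold: float) -> bool:
    """True if the expansion reaches at least min_fold (e.g., 3, 10, 100)."""
    return fold_expansion(initial_cells, final_cells) >= min_fold


if __name__ == "__main__":
    # Hypothetical counts: 1 million cells expanded to 100 million cells.
    fold = fold_expansion(1e6, 1e8)
    print(fold, round(population_doublings(fold), 2))
```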
In one embodiment, ex vivo T cell expansion may be performed by isolating T cells and then stimulating or activating followed by further expansion. In one embodiment of the invention, T cells may be stimulated or activated by a single agent. In another embodiment, T cells are stimulated or activated with two agents, one inducing a primary signal and the second inducing a co-stimulatory signal. The ligand that can be used to stimulate a single signal or stimulate a primary signal and the accessory molecule that stimulates a secondary signal can be used in soluble form. The ligand may be attached to the cell surface, attached to an Engineered Multivalent Signaling Platform (EMSP), or immobilized on a surface. In a preferred embodiment, the first and second agents are co-immobilized on a surface (e.g., a bead or cell). In one embodiment, the molecule that provides the primary activation signal may be a CD3 ligand and the co-stimulatory molecule may be a CD28 ligand or a 4-1BB ligand.
In one embodiment, T cells comprising a CAR or exogenous TCR can be made as described in international patent publication No. WO 2015/120096 by a method comprising: enriching a population of lymphocytes obtained from a donor subject; stimulating a lymphocyte population with one or more T cell stimulatory agents to produce an activated T cell population, wherein the stimulation is performed in a closed system using a serum-free medium; transducing the activated T cell population with a viral vector comprising a nucleic acid molecule encoding a CAR or TCR, using single cycle transduction to produce a transduced T cell population, wherein transduction is performed in a closed system using serum-free medium; and expanding the transduced T cell population for a predetermined time to produce an engineered T cell population, wherein the expanding is performed in a closed system using serum-free medium. In one embodiment, T cells comprising a CAR or exogenous TCR can be made as described in WO 2015/120096 by a method comprising: obtaining a lymphocyte population; stimulating a lymphocyte population with one or more stimulatory agents to produce an activated T cell population, wherein the stimulation is performed in a closed system using a serum-free medium; transducing the activated T cell population with a viral vector comprising a nucleic acid molecule encoding a CAR or TCR, using at least one cycle of transduction to produce a transduced T cell population, wherein transduction is performed in a closed system using serum-free medium; and expanding the transduced T cell population to produce an engineered T cell population, wherein the expanding is performed in a closed system using serum-free medium. The predetermined time for expanding the transduced T cell population may be 3 days. The time from enrichment of the lymphocyte population to production of the engineered T cells may be 6 days. The closed system may be a closed bag system. 
Also provided is a population of T cells comprising a CAR or exogenous TCR obtainable or obtained by the method, and pharmaceutical compositions comprising such cells.
In one embodiment, maturation or differentiation of T cells in vitro can be delayed or inhibited by a method described in international patent publication No. WO 2017/070395, which comprises contacting one or more T cells from a subject in need of T cell therapy with an AKT inhibitor (such as, for example, one AKT inhibitor or a combination of two or more AKT inhibitors as disclosed in claim 8 of WO 2017070395) and at least one of exogenous interleukin 7 (IL-7) and exogenous interleukin 15 (IL-15), wherein the resulting T cells exhibit delayed maturation or differentiation and/or wherein the resulting T cells exhibit improved T cell function (such as, for example, increased proliferation of T cells; increased cytokine production, and/or increased cytolytic activity) relative to T cells cultured in the absence of the AKT inhibitor.
In one embodiment, a patient in need of T cell therapy may be conditioned by a method as described in International Patent Publication No. WO 2016/191756, comprising administering to the patient a dose of cyclophosphamide between 200 mg/m²/day and 2000 mg/m²/day and a dose of fludarabine between 20 mg/m²/day and 900 mg/m²/day.
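The per-area dose ranges above convert to an absolute daily amount by multiplying by body surface area. The Python sketch below only illustrates that arithmetic; the function name, the dictionary, the body surface area value, and the chosen example doses are hypothetical and are not taken from the specification.

```python
# Dose ranges in mg/m^2/day as stated in the text.
DOSE_RANGES_MG_PER_M2 = {
    "cyclophosphamide": (200.0, 2000.0),
    "fludarabine": (20.0, 900.0),
}


def daily_dose_mg(agent: str, dose_mg_per_m2: float, bsa_m2: float) -> float:
    """Absolute daily dose (mg) from a per-area dose and body surface area.

    Raises ValueError if the per-area dose lies outside the range given
    in the text for that agent.
    """
    low, high = DOSE_RANGES_MG_PER_M2[agent]
    if not low <= dose_mg_per_m2 <= high:
        raise ValueError(f"{agent} dose must be between {low} and {high} mg/m^2/day")
    if bsa_m2 <= 0:
        raise ValueError("bsa_m2 must be positive")
    return dose_mg_per_m2 * bsa_m2


if __name__ == "__main__":
    bsa = 1.8  # hypothetical body surface area in m^2
    print(daily_dose_mg("cyclophosphamide", 500.0, bsa))
    print(daily_dose_mg("fludarabine", 30.0, bsa))
```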
Diseases
Genetic diseases and diseases with genetic and/or epigenetic characteristics
The compositions, systems, or components thereof may be used to treat and/or prevent genetic diseases and diseases having genetic and/or epigenetic characteristics. The genes and conditions exemplified herein are not exhaustive. In one embodiment, a method of treating and/or preventing a genetic disease may comprise administering to a subject a composition, system, and/or one or more components thereof, wherein the composition, system, and/or one or more components thereof are capable of modifying one or more copies of one or more genes associated with the genetic disease or a disease having genetic and/or epigenetic characteristics in one or more cells of the subject. In one embodiment, modifying one or more copies of one or more genes associated with a genetic disease or a disease having genetic and/or epigenetic characteristics in a subject can eliminate the genetic disease or symptoms thereof in the subject. In one embodiment, modifying one or more copies of one or more genes associated with a genetic disease or a disease having genetic and/or epigenetic characteristics in a subject can reduce the severity of the genetic disease or symptoms thereof in the subject. In one embodiment, the composition, system, or component thereof may modify one or more genes or polynucleotides associated with one or more diseases, including genetic diseases and/or diseases having genetic and/or epigenetic characteristics, including, but not limited to, any one or more of those listed in table 4A. It should be understood that those diseases and related genes listed herein are non-exhaustive and non-limiting. In addition, some genes play a role in the development of various diseases.
In one embodiment, the compositions, systems, or components thereof may be used to treat or prevent a disease in a subject by modifying one or more genes associated with one or more cellular functions, such as any one or more of those in table 4B. In one embodiment, the disease is a genetic disease or disorder. In some embodiments, the composition, system, or component thereof may modify one or more genes or polynucleotides associated with one or more genetic diseases (such as any of the diseases listed in table 4B).
In one aspect, the invention provides a method of individualized or personalized treatment of a genetic disease in a subject in need of such treatment, comprising: (a) introducing one or more mutations ex vivo in a tissue, organ, or cell line, or in vivo in a transgenic non-human mammal, comprising delivering to cells of the tissue, organ, cell line, or mammal a composition comprising the particle delivery system or the delivery system of any of the above embodiments, or the viral particle or cell of any of the above embodiments, wherein the specific mutations or precise sequence substitutions are or have been correlated with the genetic disease; (b) testing treatment for the genetic disease on the cells to which the vector carrying the specific mutations or precise sequence substitutions correlated with the genetic disease has been delivered; and (c) treating the subject based on the results of the testing of treatment of step (b).
Infectious diseases
In one embodiment, the compositions, systems, or components thereof may be used to diagnose, prognose, treat, and/or prevent infectious diseases caused by microorganisms (such as bacteria, viruses, fungi, parasites, or combinations thereof).
In one embodiment, the system or components thereof are capable of targeting specific microorganisms within a mixed population. Exemplary methods of such techniques are described, for example, in Gomaa AA, Klumpe HE, Luo ML, Selle K, Barrangou R, Beisel CL. 2014. Programmable removal of bacterial strains by use of genome-targeting CRISPR-Cas systems. mBio 5:e00928-13; and Citorik RJ, Mimee M, Lu TK. 2014. Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases. Nat Biotechnol 32:1141-1145, the teachings of which may be adapted for use with the compositions, systems, and components thereof described herein.
In one embodiment, the compositions, systems, and/or components thereof are capable of targeting pathogenic and/or drug-resistant microorganisms, such as bacteria, viruses, parasites, and fungi. In one embodiment, the compositions, systems, and/or components thereof are capable of targeting and modifying one or more polynucleotides in a pathogenic microorganism such that the microorganism is less virulent, killed, inhibited, or otherwise rendered unable to cause disease and/or to infect and/or replicate in a host cell.
In one embodiment, pathogenic bacteria that can be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, pathogenic bacteria of the following genera: Actinomyces (e.g., A. israelii), Bacillus (e.g., B. anthracis, B. cereus), Bacteroides (e.g., B. fragilis), Bartonella (B. henselae, B. quintana), Bordetella (B. pertussis), Borrelia (e.g., B. burgdorferi, B. garinii, B. afzelii, and B. recurrentis), Brucella (e.g., B. abortus, B. canis, B. melitensis, and B. suis), Campylobacter (e.g., C. jejuni), Chlamydia (e.g., C. pneumoniae and C. trachomatis), Chlamydophila (e.g., C. psittaci), Clostridium (e.g., C. botulinum, C. difficile, C. perfringens, C. tetani), Corynebacterium (e.g., C. diphtheriae), Enterococcus (e.g., E. faecalis, E. faecium), Ehrlichia (E. canis and E. chaffeensis), Escherichia (e.g., E. coli), Francisella (e.g., F. tularensis), Haemophilus (e.g., H. influenzae), Helicobacter, Klebsiella (e.g., K. pneumoniae), Legionella (e.g., L. pneumophila), Leptospira (e.g., L. interrogans, L. santarosai, L. weilii, L. noguchii), Listeria (e.g., L. monocytogenes), Mycobacterium (e.g., M. leprae, M. tuberculosis, M. ulcerans), Mycoplasma (M. pneumoniae), Neisseria (N. gonorrhoeae and N. meningitidis), Nocardia (e.g., N. asteroides), Pseudomonas (P. aeruginosa), Rickettsia (R. rickettsii), Salmonella (S. typhi and S. typhimurium), Shigella (S. sonnei and S. dysenteriae), Staphylococcus (S. aureus, S. epidermidis, and S. saprophyticus), Streptococcus (S. agalactiae, S. pneumoniae, S. pyogenes), Treponema (T. pallidum), Ureaplasma (e.g., U. urealyticum), Vibrio (e.g., V. cholerae), and Yersinia (e.g., Y. pestis, Y. enterocolitica, and Y. pseudotuberculosis).
In one embodiment, pathogenic viruses that may be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, double-stranded DNA viruses, partially double-stranded DNA viruses, single-stranded DNA viruses, positive-sense single-stranded RNA viruses, negative-sense single-stranded RNA viruses, and double-stranded RNA viruses. In one embodiment, the pathogenic virus may be from one of the following viral families: Adenoviridae (e.g., adenovirus), Herpesviridae (e.g., herpes simplex virus type 1, herpes simplex virus type 2, varicella-zoster virus, Epstein-Barr virus, human cytomegalovirus, human herpesvirus type 8), Papillomaviridae (e.g., human papillomavirus), Polyomaviridae (e.g., BK virus, JC virus), Poxviridae (e.g., smallpox), Hepadnaviridae (e.g., hepatitis B virus), Parvoviridae (e.g., parvovirus B19), Astroviridae (e.g., human astrovirus), Caliciviridae (e.g., Norwalk virus), Picornaviridae (e.g., rhinovirus), Pneumoviridae, Coronaviridae (e.g., severe acute respiratory syndrome-related coronavirus strains: severe acute respiratory syndrome virus, severe acute respiratory syndrome coronavirus 2 (COVID-19)), Flaviviridae (e.g., hepatitis C virus, yellow fever virus, dengue virus, West Nile virus, TBE virus), Togaviridae (e.g., rubella virus), Hepeviridae (e.g., hepatitis E virus), Retroviridae (e.g., HIV), Orthomyxoviridae (e.g., influenza virus), Arenaviridae (e.g., Lassa virus), Bunyaviridae (e.g., Crimean-Congo hemorrhagic fever virus, Hantaan virus), Filoviridae (e.g., Ebola virus and Marburg virus), Paramyxoviridae (e.g., measles virus, mumps virus, parainfluenza virus), Rhabdoviridae (e.g., rabies virus), hepatitis delta virus, and Reoviridae (e.g., rotavirus, orbivirus, coltivirus, Banna virus).
In one embodiment, pathogenic fungi that may be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, fungi of the following genera: Candida (e.g., C. albicans), Aspergillus (e.g., A. fumigatus, A. flavus, A. clavatus), Cryptococcus (e.g., C. neoformans, C. gattii), Histoplasma (H. capsulatum), Pneumocystis (e.g., P. jirovecii), and Stachybotrys (e.g., S. chartarum).
In one embodiment, pathogenic parasites that can be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, protozoa, helminths, and ectoparasites. In one embodiment, pathogenic protozoa that may be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, protozoa of the following groups: Sarcodina (e.g., amoebae such as Entamoeba), Mastigophora (e.g., flagellates such as Giardia and Leishmania), Ciliophora (e.g., ciliates such as Balantidium), and Sporozoa (e.g., Plasmodium and Cryptosporidium). In one embodiment, pathogenic helminths that may be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, flatworms (platyhelminths), thorny-headed worms (acanthocephalans), and roundworms (nematodes). In one embodiment, pathogenic ectoparasites that can be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, ticks, fleas, lice, and mites.
In one embodiment, pathogenic parasites that can be targeted and/or modified by the compositions, systems, and/or components thereof described herein include, but are not limited to, Acanthamoeba spp., Balamuthia mandrillaris, Babesia spp. (e.g., B. divergens, B. bigemina, B. equi, B. microti, B. duncani), Balantidium spp. (e.g., Balantidium coli), Blastocystis spp., Cryptosporidium spp., Cyclospora spp. (e.g., Cyclospora cayetanensis), Dientamoeba spp. (e.g., Dientamoeba fragilis), Entamoeba spp. (e.g., Entamoeba histolytica), Giardia spp. (e.g., Giardia lamblia), Isospora spp. (e.g., Isospora belli), Leishmania spp., Naegleria spp. (e.g., Naegleria fowleri), Plasmodium spp. (e.g., Plasmodium falciparum, Plasmodium vivax, Plasmodium ovale curtisi, Plasmodium ovale wallikeri, Plasmodium malariae, Plasmodium knowlesi), Rhinosporidium spp. (e.g., Rhinosporidium seeberi), Sarcocystis spp. (e.g., Sarcocystis bovihominis, Sarcocystis suihominis), Toxoplasma spp. (e.g., Toxoplasma gondii), Trichomonas spp.
(e.g., trichomonas vaginalis (Trichomonas vaginalis)), trypanosoma species (e.g., trypanosoma brucei (Trypanosoma brucei)), trypanosoma species (e.g., trypanosoma cruzi (Trypanosoma cruzi)), tapeworm (e.g., cestoda, multicameria (Taenia multiceps), beef tapeworm (Taenia samgita), pork tapeworm (Taenia solium)), echinococcus species (Diphyllobothrium latum spp.), echinococcus species (Echinococcus spp.) (e.g., echinococcus granulosus spp.) (Echinococcus granulosus), echinococcus (Echinococcus multilocularis), echinococcus multifidum, echinococcus fumago (E.vogeli), echinococcus selinum (E.oligoarmywrus), echinococcus species (Clonorchis spinosa), echinococcus species (Hymenoepsis spp.) (e.g., echinococcus (Hymenoepis nap), reduced Echinococcus (Hymenolepis diminuta)), bertonia species (Bertoniella spp.) (e.g., gubam (Bertiella mucronata), echinococcus (Bertiella studeri)), echinococcus (Spira) (e.g., erinaceus) Gong Taochong (Spirometra erinaceieuropaei)), testosoma (Clonorchis spp.) (e.g., clonorchis sinensis (Clonorchis sinensis), thailand testosterone (Clonorchis viverrini)), dicrococelium spp.) (e.g., branch double chamber (Dicrocoelium dendriticum)), fasciola species (Fasciola spp.) (e.g., fasciola hepatica (Fasciola hepatica), fasciola megaschistosum (e.g., fasciola megafascians (67), fasciola Fasciola (e.g., fascintica fascians (37), fascintica Fasciola (e.g., fascintica fascians), fascintica (37) and Fascintica Fasciola (e.g., fascintida (37) of Fascintida (37), fascintida (e.g., fascintida (37) of Fascintillans), fascintimus (e.g., fascintimus, fascintida (37), fascintimus (37) and Fascion) of Fascintimus (Fascintillans (Fascintionas (37), and Fascintimus (e.g., fascintillans) of Fascintipede) And genus species (Paragonimus spp.) 
(e.g., fabricius katzerland (Paragonimus westermani), african and Fasciola (Paragonimus africanus), karCatzfeldt (Paragonimus caliensis), fetzfeldt-Jakob (Paragonimus kellicotti), skikukob (Paragonimus skrjabini), double sided Catzfeldt-Jakob (Paragonimus uterobilateralis)), schistosoma species (Schistoma sp.), schistosoma species (e.g., schistosoma mansoni (Schistosoma mansoni), egypt schistosome (Schistosoma haematobium), japanese schistosome (Schistosoma japonicum), fasciola (Schistosoma mekongi) and Odorsalis (Schistosoma intercalatum)), echinoma spp.) (e.g., echinococcus spinosa (E. Echinocpontum)), mao Bi species (Trichosporozozia spp.) (e.g., fasciences Mao Bi (Trichobilharzia regent)), ancylodes species (Ancyspp.) (e.g., trichikukola (Ancylostoma duodenale)), fasciola species (e.g., fasciola Necator (3886), bactrum species (e.g., fasciola species (43), bactrum species (32), bactrum species (e.g., fasciola (43)) (32) and Fasciola species (e.g., fasciences (45)), ma Laibu brucella (Brugia malayi), imperial brucella (Brugia timori)), meloidogyne species (dioctophne sp.) (e.g., reniform meloidogyne (Dioctophyme renale)), dragon species (draguncius sp.) (e.g., maidenhair nematode (Dracunculus medinensis)), enterobia species (Enterobacterium sp.) (e.g., helminth enterobiasis (Enterobius vermicularis), grignard pinworm (Enterobius gregorii)), jaw nematode species (gnaphomal sp.) (e.g., acanthus (Gnathostoma spinigerum), rigid acanthocerate nematode (Gnathostoma hispidum)), cachexia species (halichondrin sp.) (e.g., gum bakuch. Halicephalobus gingivalis)), rogue species (Loa sp.) (Luo Aluo filaria) filaria (Loa filaria) species (e.g., loa filaria sp.), mandson species (e.g., 5), point (e.g., toxoplasma sp.) (35), point-like nematodes (e.g., toxicona sp.) (45), point-5), point (e.g., toxicona sp.) (54.g., toxicona sp.) 
(45), point-type nematodes (54.g., toxicona), point (67), point-like nematodes (54.g., oenon., catchfly (Toxocara cati), lion cati (Toxascaris leonine)), trichinella spp (e.g., trichinella (Trichinella spiralis), bristletail (Trichinella britovi), nauplii (Trichinella nelsoni), xiang (Trichinella nativa)), trichinella spp (e.g., dinoflagellate (Trichuris trichiura), canine flagellate (Trichuris vulpis)), evohium spp (Wuchereria spp) (e.g., banmatola spp (Wuchereria bancrofti)), dermatidae spp (e.g., human fly (Dermatobia hominis)), flea spp (e.g., tunea spp), the species of the genus trichina (tunea penetrans)), trypanosoma (e.g., spirochete (Cochliomyia hominivorax)), glossociata (lingutatus spp.) (e.g., serratia lingualis (Linguatula serrata)), protoechinococcales (archibalpha sp.)), nostoc (Moniliformis sp.) (e.g., nostoc (Moniliformis Moniliformis)), pedia (pedicolus spp.) (e.g., head lice (Pediculus humanus capitis), body lice (Pediculus humanus humanus)), anidae (Pthirus spp.) (e.g., anidae (Pthirus pubis)), arachnida spp.) (e (e.g., tsutsugambiridae (Trombicalidae), tricidae), hard ticks (Ixodidae), soft ticks (Argaside)), certain species of the order pariphylla (Siphonaptera sp.) (e.g., the order pariphylla: fleaceae (Siphonaptera: puricinase)), bedbug (e.g., temperate bed bugs (Cimex lectularius) and hemiptera (Cimex hepfit)), diptera (Diptera spp.), demodex spp (e.g., hair follicle/sebum/canine Demodex (Demodex folliculorum/brevis/canis)), sarcoptica (sarcoptica spp.) (e.g., scabies (Sarcoptes scabiei)), dermatophachus (dermanssus spp.) (e.g., gallica (Dermanyssus gallinae)), fowl spp. (Ornithina spp.) (e.g., forest fowl spp. (Ornithonyssus sylviarum), tropical fowl spp. (Ornithonyssus bursa), platycladus (Ornithonyssus bacoti)), rhizopus spp.) (e.g., hair follicle/sebum/dog Demodex spp.) (e.g., 69 (Sarcoptes scabiei)), dermatophagomphasis spp.) (9743, etc.).
In one embodiment, the gene target may be any of those listed in Table 1 of Strich and Chertow, 2019. J. Clin. Microbiol. 57(4):e01307-18, which is incorporated herein by reference as if expressed in its entirety herein.
In one embodiment, a method may include delivering a composition, system, and/or component thereof to a pathogenic organism described herein, thereby allowing the composition, system, and/or component thereof to specifically bind to and modify one or more targets in the pathogenic organism, whereby the modification kills or inhibits the pathogenic organism, reduces its pathogenicity, or otherwise renders it non-pathogenic. In one embodiment, delivery of the composition or system occurs in vivo (i.e., in a treated subject). In one embodiment, delivery occurs through an intermediary, such as a microorganism or phage that is non-pathogenic to the subject but is capable of transferring polynucleotides to and/or infecting the pathogenic microorganism. In one embodiment, the intermediate microorganism may be an engineered bacterium, virus, or phage containing the composition, system, and/or components thereof and/or a vector system. The method may comprise administering to the subject to be treated an intermediate microorganism comprising the composition, system, and/or components thereof and/or vector system. The intermediate microorganism may then produce the composition, system, and/or components thereof, or transfer a polynucleotide encoding the composition or system to the pathogenic organism. In embodiments where the composition and/or component, vector, or vector system thereof is transferred to the pathogenic microorganism, the composition, system, or component thereof is then produced in the pathogenic microorganism and modifies it such that it is less virulent, killed, inhibited, or otherwise unable to cause disease and/or infection and/or replicate in a host cell.
In one embodiment, where the pathogenic microorganism inserts its genetic material into the genome of the host cell (e.g., a virus), the composition or system may be designed such that it modifies the genome of the host cell so that the viral DNA or cDNA cannot be replicated by the machinery of the host cell into a functional virus. In one embodiment, where the pathogenic microorganism inserts its genetic material into the genome of the host cell (e.g., a virus), the composition or system may be designed such that it modifies the genome of the host cell so that the viral DNA or cDNA is deleted from the genome of the host cell.
It will be appreciated that inhibition or killing of pathogenic microorganisms may treat or prevent diseases and/or conditions in a subject caused by their infection. Accordingly, also provided herein are methods of treating and/or preventing one or more diseases or symptoms thereof caused by any one or more pathogenic microorganisms, such as any of the pathogenic microorganisms described herein.
Mitochondrial diseases
Some of the most challenging mitochondrial disorders are caused by mutations in mitochondrial DNA (mtDNA), a maternally inherited, high-copy-number genome. In one embodiment, mtDNA mutations can be modified using the compositions and systems described herein. In one embodiment, the mitochondrial disease that can be diagnosed, prognosed, treated, and/or prevented may be MELAS (mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes), CPEO/PEO (chronic progressive external ophthalmoplegia syndrome/progressive external ophthalmoplegia), KSS (Kearns-Sayre syndrome), MIDD (maternally inherited diabetes and deafness), MERRF (myoclonic epilepsy associated with ragged red fibers), NIDDM (non-insulin-dependent diabetes mellitus), LHON (Leber hereditary optic neuropathy), LS (Leigh syndrome), aminoglycoside-induced hearing disorders, NARP (neuropathy, ataxia, and retinitis pigmentosa), an extrapyramidal disorder with akinesia, psychosis, and hearing loss, nonsyndromic hearing loss, a cardiomyopathy, an encephalomyopathy, Pearson's syndrome, or a combination thereof.
In one embodiment, the mtDNA of the subject can be modified in vivo or ex vivo. In one embodiment, where the mtDNA is modified ex vivo, the cells containing the modified mitochondria can be administered back to the subject. In one embodiment, the composition, system, or component thereof is capable of correcting one or more mtDNA mutations or a combination thereof.
In one embodiment, at least one of the one or more mtDNA mutations is selected from the group consisting of: A3243G, C3256T, T3271C, G1019 1304C, G15533C, G1494C, G4467C, G1658C, G12315C, G3421C, G8344C, G8356C, G8363C, G13042C, G3200C, G3242C, G3252C, G3264C, G3316C, G3394C, G14577C, G4833C, G3460C, G9804) C, G11778C, G14459C, G15257C, G8993C, G8993C, G10197C, G1095C, G5214 1555C, G1541C, G1634C, G3260C, G5269C, G7587C, G8238C, G8348C, G8363C, G9957C, G9997C, G12192 12297C, G15059C, G-series (SEQ ID NO: C, G) repeats at positions 305-314 and/or 956-965, deletions at positions 8,469-13,447, 4,308-14,874 and/or 4,398-14,822, 961ins/delC, the mitochondrial common deletion (e.g., the 4,977 bp mtDNA deletion), and combinations thereof.
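Point substitutions in mtDNA are conventionally written as reference base, position, and variant base (for example, A3243G). As a minimal illustrative sketch, and not part of the disclosed methods, such labels could be parsed and sanity-checked as follows. The names `MtSubstitution` and `parse_substitution` are hypothetical, and the position check assumes the 16,569 bp human reference mitochondrial genome.

```python
import re
from typing import NamedTuple, Optional

# Assumption: positions are numbered on the 16,569 bp human reference mtDNA.
MTDNA_LENGTH = 16569

# Matches labels of the form <ref base><position><alt base>, e.g. "A3243G".
_SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


class MtSubstitution(NamedTuple):
    """Hypothetical container for one mtDNA point substitution."""
    ref: str
    position: int
    alt: str


def parse_substitution(label: str) -> Optional[MtSubstitution]:
    """Parse a label such as 'A3243G'; return None if it is malformed."""
    match = _SUBSTITUTION.match(label.strip().upper())
    if match is None:
        return None
    ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
    # Reject no-op substitutions and positions outside the reference genome.
    if ref == alt or not 1 <= position <= MTDNA_LENGTH:
        return None
    return MtSubstitution(ref, position, alt)
```

Labels that do not fit the pattern, such as entries with embedded spaces, are returned as `None` rather than guessed at, so that a malformed entry is flagged for manual review instead of being silently repaired.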
In one embodiment, the mitochondrial mutation may be any mutation set forth in, or identified by use of, one or more bioinformatics tools available on MITOMAP. Such tools include, but are not limited to, "Variant Search" (also known as "Marker Finder"), "Find Sequences for Any Haplogroup" (also known as "Sequence Finder"), "Variant Info", "POLG Pathogenicity Prediction Server", "MITOMASTER", "Allele Search", "Sequence and Variant Downloads", and "Data Downloads". MITOMAP contains reports of mtDNA mutations that may be associated with disease, and maintains a database of reported mitochondrial DNA base substitution diseases, including rRNA/tRNA mutations.
In one embodiment, the method comprises delivering the composition, system, and/or components thereof to a cell, and more specifically to one or more mitochondria in the cell, thereby allowing the composition, system, and/or components thereof to modify one or more target polynucleotides in the cell, and more specifically to modify one or more mitochondria in the cell. The target polynucleotide may correspond to a mutation of mtDNA, such as any one or more of the mutations described herein. In one embodiment, the modification may alter the function of the mitochondria such that the mitochondria function normally, or at least are less dysfunctional than unmodified mitochondria. The modification may occur in vivo or ex vivo. When the modification is performed ex vivo, cells containing the modified mitochondria may be administered to a subject in need thereof in an autologous or allogeneic manner.
Microbiome modification
The microbiome plays an important role in health and disease. For example, the intestinal microbiome can support health by controlling digestion and preventing the growth of pathogenic microorganisms, and it is thought to affect mood and emotion. An unbalanced microbiome contributes to disease and is thought to lead to weight gain, uncontrolled blood glucose, high cholesterol, cancer, and other conditions. Healthy microbiomes have a set of compositional signatures that distinguish them from the microbiomes of unhealthy individuals, and therefore detection and identification of disease-associated microbiome compositions can be used to diagnose and detect disease in individuals. The compositions, systems, and components thereof may be used to screen microbiome cell populations and to identify microbiomes associated with disease. Cell screening methods utilizing the compositions, systems, and components thereof are described elsewhere herein, and can be applied to screen microbiomes of a subject, such as intestinal, skin, vaginal, and/or oral microbiomes.
In one embodiment, the compositions, systems, and/or components thereof described herein can be used to modify the microbial composition of a microbiome in a subject. In one embodiment, the compositions, systems, and/or components thereof may be used to identify and select one or more cell types in a microbiome and remove them from the microbiome. Exemplary methods of selecting cells using the compositions, systems, and/or components thereof are described elsewhere herein. In this way, the composition or microbial signature of the microbiome can be altered. In one embodiment, the alteration causes a change from a diseased microbiome composition to a healthy microbiome composition. In this way, the ratio of one or more microorganisms to another microorganism may be changed, for example from a diseased ratio to a healthy ratio. In one embodiment, the selected cell is a pathogenic microorganism.
In one embodiment, the compositions and systems described herein can be used to modify polynucleotides in a microorganism of a microbiome in a subject. In one embodiment, the microorganism is a pathogenic microorganism. In one embodiment, the microorganism is a commensal, nonpathogenic microorganism. Methods of modifying polynucleotides in a cell of a subject are described elsewhere herein and can be applied to these embodiments.
Models of diseases and conditions
In one aspect, the invention provides a method of modeling a disease associated with a genomic locus in a eukaryotic organism or a non-human organism, comprising manipulating a target sequence within coding, non-coding or regulatory elements of the genomic locus, comprising delivering a non-naturally occurring or engineered composition comprising a viral vector system comprising one or more viral vectors operably encoding a composition for expression thereof, wherein the composition comprises the particle delivery system or delivery system of any of the embodiments or the viral particle or the cell of any of the embodiments.
In one aspect, the invention provides a method of producing a model eukaryotic cell that may include one or more mutated disease genes and/or infectious microorganisms. In one embodiment, a disease gene is any gene associated with an increased risk of suffering from or developing a disease. In one embodiment, the method comprises (a) introducing one or more vectors into a eukaryotic cell, wherein the one or more vectors comprise a composition, system, and/or component thereof and/or a vector or vector system capable of driving expression of the composition, system, and/or component thereof, including, but not limited to: a nucleic acid component molecule sequence, one or more TnpB polypeptides, and combinations thereof, and (b) allowing a composition, system, or complex to bind to one or more target polynucleotides, e.g., to effect cleavage, nicking, or other modification of a target polynucleotide within the disease gene, wherein the composition, system, or complex consists of one or more TnpB polypeptides that are complexed with: (1) One or more nucleic acid component molecular sequences that hybridize to a target sequence within a target polynucleotide, and optionally (2) a nucleic acid component scaffold sequence, thereby producing a model eukaryotic cell that contains one or more mutated disease genes. Thus, in one embodiment, the compositions and systems contain nucleic acid molecules for and driving expression of one or more of the following: a TnpB polypeptide, a nucleic acid component molecular sequence, and/or a homologous recombination template and/or a stabilizing ligand (if the TnpB polypeptide has a destabilizing domain). In one embodiment, the cleavage comprises cleavage of one or both strands by the TnpB polypeptide at the position of the target sequence. In one embodiment, nicking comprises nicking one or both strands at the position of the target sequence by a TnpB polypeptide. 
In one embodiment, the cleavage or nicking results in modified transcription of the target polynucleotide. In one embodiment, the modification results in reduced transcription of the target polynucleotide. In one embodiment, the method further comprises repairing the cleaved or nicked target polynucleotide by homologous recombination with a recombination template polynucleotide, wherein the repair results in a mutation comprising an insertion, deletion, or substitution of one or more nucleotides of the target polynucleotide. In one embodiment, the mutation results in one or more amino acid changes in a protein expressed from a gene comprising the target sequence.
The modeled disease can be any disease having a genetic or epigenetic component. In one embodiment, the modeled disease can be any disease as discussed elsewhere herein.
In situ disease detection
The compositions, systems and/or components thereof may be used in diagnostic detection methods such as the following: CASFISH (see, e.g., deng et al 2015.PNAS USA 112 (38): 11870-11875), CRISPR-Live FISH (see, e.g., wang et al 2020.Science;365 (6459): 1301-1305), sm-FISH (Lee and Jefcoat.2017. Front. Endocrinol. Doi. Org/10.3389/fendo. 2017.00289), sequence FISH CRISPRainbow (Ma et al Nat Biotechnol,34 (2016), pages 528-530), CRISPR-Sirius (Nat Methods,15 (2016), pages 928-931), casilio (Cheng et al Cell Res,26 (2016), pages 254-257), halogen tag-based genomic locus visualization techniques (e.g., deng., et al 2015.PNAS USA 112 (38): 11870-11875; knit et al, science, 823), RNA aptamer-based Methods (e.g., pages 2016), pages 214-2016, pages 537, and so forth), beacon-100 (2016-2016), and so forth. Wu et al Nucleic Acids Res (2018)), quantum dot based systems (e.g., ma et al Chem,89 (2017), pages 12896-12901), multiple Methods (e.g., ma et al, proc Natl Acad Sci U S A,112 (2015), pages 3002-3007; fu et al Nat Commun,7 (2016), page 11707; ma et al Nat Biotechnol,34 (2016), pages 528-530; shao et al Nucleic Acids Res,44 (2016), article e 86), wang et al Sci Rep,6 (2016), page 26857) and other methods based on in situ CRISPR hybridization (e.g., chen et al Cell,155 (2013), pages 1479-1491; gu et al Science 359 (2018), pages 1050-1055; tanebaum et al Cell,159 (2014), pages 635-646; ye et al Protein Cell,8 (2017), pages 853-855; chen et al, nat Commun,9 (2018), page 5065; shao et al ACS Synth Biol (2017); fu et al Nat Commun,7 (2016), page 11707; shao et al Nucleic Acids Res,44 (2016), article e86; wang et al, sci Rep,6 (2016), page 26857), which is incorporated by reference in its entirety as if fully expressed, and in view of the description herein, the teachings of which may be applied to the compositions, systems, and components thereof described herein.
In one embodiment, the composition, system, or component thereof may be used in a detection method, such as an in situ detection method described herein. In one embodiment, the composition, system, or component thereof may comprise a catalytically inactive TnpB polypeptide described herein, and such a system is used in a detection method, such as fluorescence in situ hybridization (FISH) or any other method described herein. In one embodiment, an inactivated TnpB polypeptide lacking the ability to generate a DNA double-strand break may be fused to a marker, such as a fluorescent protein, such as enhanced green fluorescent protein (EGFP), and co-expressed with a small nucleic acid component molecule to target pericentric, centric, and telomeric repeats in vivo. The dead TnpB polypeptide or system thereof can be used to visualize both repeat sequences and individual genes in the human genome. Such novel uses of labeled dead TnpB polypeptides and compositions and systems thereof may be important for imaging cells and studying functional nuclear architecture, particularly in the case of small nuclear volumes or complex 3-D structures.
Cell selection
In one embodiment, the compositions, systems, and/or components thereof described herein can be used in methods of screening and/or selecting cells. In one embodiment, a screening/selection method based on the compositions or systems can be used to identify diseased cells in a population of cells. In one embodiment, selection of the cell results in modification of the cell such that the selected cell dies. In this way, diseased cells can be identified and removed from a healthy cell population. In one embodiment, the diseased cell may be a cancer cell, a precancerous cell, a cell infected with a virus or other pathogenic organism, or another abnormal cell. In one embodiment, the modification may confer another detectable change (e.g., a functional change and/or a genomic barcode) on the cell to be selected, which facilitates selection of the desired cell. In one embodiment, a negative selection protocol may be used to obtain the desired cell population. In these embodiments, the cells to be selected are modified so that they are removed from the population by their death, or are identified and sorted out based on a detectable change imparted to them. Thus, in these embodiments, the cells remaining after selection are the desired cell population.
In one embodiment, a method of selecting one or more cells containing a polynucleotide modification may comprise: introducing one or more compositions, systems, and/or components thereof and/or vectors or vector systems into a cell, wherein the compositions, systems, and/or components thereof and/or vectors or vector systems contain and/or are capable of expressing one or more of the following: a TnpB polypeptide, a nucleic acid component sequence, and a recombination template; wherein, for example, that which is expressed is contained within, and expressed in vivo by, the composition, system, vector, or vector system, and/or the recombination template comprises one or more mutations that abolish cleavage by the TnpB polypeptide; allowing homologous recombination of the recombination template with the target polynucleotide in the cell to be selected; and allowing a composition, system, or complex to bind to a target polynucleotide to effect cleavage of the target polynucleotide within the gene, wherein the complex comprises a TnpB polypeptide complexed with: (1) a nucleic acid component molecule sequence that hybridizes to a target sequence within the target polynucleotide, and (2) a nucleic acid component scaffold, wherein binding of the complex to the target polynucleotide induces cell death or imparts some other detectable change to the cell, thereby allowing selection of one or more cells into which one or more mutations have been introduced. In one embodiment, the cell to be selected may be a eukaryotic cell. In one embodiment, the cell to be selected may be a prokaryotic cell. Selection of a particular cell may be performed via the methods herein without the need for a selection marker or a two-step process that may include a counter-selection system.
Therapeutic agent development
The compositions, systems, and components thereof described herein can be used to develop a TnpB polypeptide-based bioactive agent, such as a small molecule therapeutic. Thus, described herein are methods for developing bioactive agents that modulate cellular functions and/or signaling events associated with diseases and/or disease genes. In one embodiment, the method comprises (a) contacting a test compound with a diseased cell and/or a cell containing a disease gene; and (b) detecting a change in a readout, the change being indicative of a decrease or increase in a cell signaling event or other cellular function associated with the disease or disease gene, thereby developing the bioactive agent that modulates the cell signaling event or other function associated with the disease gene. In one embodiment, the diseased cell is a model cell as described elsewhere herein. In one embodiment, the diseased cell is a diseased cell isolated from a subject in need of treatment. In one embodiment, the test compound is a small molecule agent. In one embodiment, the test compound is a biomolecular agent.
In one embodiment, the method involves developing a therapeutic agent based on the compositions and systems described herein. In particular embodiments, the therapeutic agent comprises a TnpB polypeptide and/or a nucleic acid component having a reprogrammable spacer capable of hybridizing to a target sequence of interest. In particular embodiments, the therapeutic agent is a vector or vector system that may contain a) a first regulatory element operably linked to a nucleotide sequence encoding a TnpB polypeptide; and b) a second regulatory element operably linked to one or more nucleotide sequences encoding one or more nucleic acid molecules comprising a nucleic acid component comprising a reprogrammable spacer sequence and a conserved RNA sequence; wherein components (a) and (b) are on the same or different vectors. In certain embodiments, the bioactive agent is a composition comprising a delivery system operably configured to deliver the composition, system, or component thereof and/or one or more polynucleotide sequences, vectors, or vector systems containing or encoding the components into a cell, wherein the delivered components are capable of forming a complex of the components of the compositions and systems herein, and wherein the complex is operable in the cell. In one embodiment, the complex may include a TnpB polypeptide as described herein, a nucleic acid component scaffold comprising a guide sequence (a reprogrammable spacer sequence), and a conserved nucleotide sequence. In any such composition, the delivery system may be a yeast system, a lipofection system, a microinjection system, a gene gun system, a virosome, a liposome, an immunoliposome, a polycation, a lipid:nucleic acid conjugate, or an artificial virion, or any other system as described herein. In particular embodiments, delivery is via particles, nanoparticles, lipids, or cell-penetrating peptides (CPPs).
Also described herein are methods for developing or designing a composition, system, optionally a composition, system-based therapy or therapeutic agent, comprising (a) selecting a (therapeutic) locus nucleic acid component target site of interest, wherein the target site has minimal sequence variation in a population, and sub-selecting a target site from the selected target sites, wherein the nucleic acid component for the target site identifies a minimal number of off-target sites in the population, or (b) selecting a (therapeutic) locus nucleic acid component target site of interest, wherein the target site has minimal sequence variation in a population, or selecting a (therapeutic) locus nucleic acid component target site of interest, wherein the nucleic acid component for the target site identifies a minimal number of off-target sites in the population, and optionally estimating the number of (sub) selected target sites required to treat or otherwise regulate or manipulate a population, and optionally verifying one or more of the (sub) selected target sites in a single subject, optionally designing one or more nucleic acid components of the (sub) selected target sites.
In one embodiment, a method for developing or designing a nucleic acid component for a composition, system, optionally composition, system-based therapy, or therapeutic agent can include (a) selecting a (therapeutic) locus nucleic acid component target site of interest, wherein the target site has a minimum sequence variation in a population, and selecting a target site from the selected target sites, wherein a nucleic acid component molecule for the target site recognizes a minimum number of off-target sites in the population, or (b) selecting a (therapeutic) locus nucleic acid component molecule target site of interest, wherein the target site has a minimum sequence variation in a population, or selecting a (therapeutic) locus nucleic acid component molecule target site of interest, wherein a nucleic acid component molecule for the target site recognizes a minimum number of off-target sites in the population, and optionally estimating the number of (sub) selected target sites required to treat or otherwise modulate or manipulate a population, optionally verifying one or more of the (sub) selected target sites in a single subject, optionally designing one or more of the (sub) selected target sites.
In one embodiment, a method for developing or designing a composition, system, optionally composition, system-based therapy, or therapeutic agent in a population can include (a) selecting a (therapeutic) locus-reprogrammable spacer target site of interest, wherein the target site has a minimum sequence variation in the population, and selecting a target site from the selected target sites, wherein a nucleic acid component for the target site identifies a minimum number of off-target sites in the population, or (b) selecting a (therapeutic) locus nucleic acid component-reprogrammable spacer target site, wherein the target site has a minimum sequence variation in the population, or selecting a (therapeutic) locus nucleic acid component-reprogrammable spacer target site of interest, wherein a nucleic acid component for the target site identifies a minimum number of off-target sites in the population, and optionally estimates the number of (sub) selected target sites required to treat or otherwise modulate or manipulate the population, optionally verifying the design of one or more of the (sub) selected target sites in a single subject, optionally one or more of the (sub) selected target sites.
In one embodiment, a method for developing or designing a composition or system, optionally a composition- or system-based therapy or therapeutic agent, for use in a population may comprise (a) selecting nucleic acid component molecule target sites for a (therapeutic) locus of interest, wherein the target sites have minimal sequence variation in the population, and selecting from those target sites one or more target sites for which the nucleic acid component molecule recognizes a minimal number of off-target sites in the population; or (b) selecting nucleic acid component molecule target sites for a (therapeutic) locus of interest, wherein the target sites have minimal sequence variation in the population, or selecting nucleic acid component molecule target sites for a (therapeutic) locus of interest for which the nucleic acid component molecule recognizes a minimal number of off-target sites in the population; and optionally estimating the number of (sub)selected target sites required to treat or otherwise modulate or manipulate the population, optionally validating one or more of the (sub)selected target sites in an individual subject, and optionally designing one or more nucleic acid component molecules that recognize one or more of the (sub)selected target sites.
In one embodiment, a method for developing or designing a composition or system (such as a composition- or system-based therapy or therapeutic agent), optionally in a population, or for developing or designing a reprogrammable spacer of a nucleic acid component for such a composition or system, optionally in a population, may comprise: selecting a set of target sequences in one or more loci of a target population, wherein the target sequences do not contain variants that occur above a threshold allele frequency in the target population (i.e., platinum target sequences); removing from the selected (platinum) target sequences any target sequences having high-frequency off-target candidates (relative to the other (platinum) targets in the set) to define a final set of target sequences; and preparing one or more compositions or systems, such as a set of compositions or systems, based on the final set of target sequences, optionally wherein the number of compositions prepared is based (at least in part) on the size of the target population.
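By way of illustration only, the two filtering steps described above (retaining platinum target sequences, then removing targets with comparatively many off-target candidates) might be sketched as follows. The record fields, the default allele-frequency threshold, and the use of the set median as the "relative to other targets" cutoff are assumptions made for this sketch and are not taken from the present disclosure.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class CandidateTarget:
    name: str
    max_variant_af: float   # highest allele frequency of any variant overlapping the target
    offtarget_count: int    # off-target candidates found for this target (hypothetical input)


def select_final_targets(candidates, af_threshold=1e-4):
    """Return the final target set after the two filtering steps."""
    # Step 1: keep "platinum" targets, i.e. no variant above the threshold allele frequency.
    platinum = [c for c in candidates if c.max_variant_af <= af_threshold]
    if not platinum:
        return []
    # Step 2 (assumed rule): drop targets with more off-target candidates than the set median.
    cutoff = median(c.offtarget_count for c in platinum)
    return [c for c in platinum if c.offtarget_count <= cutoff]
```

In practice the off-target cutoff could equally be an absolute count or a percentile; the median is used here only to make "relative to the other targets in the set" concrete.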
In one embodiment, a sequencing-based Double Strand Break (DSB) detection assay (such as described elsewhere herein) is used to identify or determine off-target candidates/off-targets, TAM restriction, target cleavage efficiency, or effector protein specificity. In one embodiment, a sequencing-based Double Strand Break (DSB) detection assay (such as described elsewhere herein) is used to identify or determine off-target candidates/off-targets. In one embodiment, the off-target or off-target candidate has at least 1, preferably 1-3 mismatches or (distal) TAM mismatches, such as 1 or more, such as 1, 2, 3 or more (distal) TAM mismatches. In one embodiment, the sequencing-based DSB detection assay comprises labeling the site of the DSB with an adapter comprising a primer binding site, labeling the site of the DSB with a barcode or unique molecular identifier, or a combination thereof, as described elsewhere herein.
It will be appreciated that the reprogrammable spacer sequence of the nucleic acid component is 100% complementary to the target site, i.e., it does not contain any mismatches with the target site. It will also be appreciated that "recognition" of an (off-)target site by a reprogrammable spacer presupposes functionality of the composition or system, i.e., an (off-)target site is recognized by the reprogrammable spacer RNA only if binding of the reprogrammable spacer RNA to the (off-)target site results in activity of the composition or system (such as induction of single- or double-stranded DNA cleavage, transcriptional regulation, etc.).
In one embodiment, the target site with minimal sequence variation in the population is characterized by the absence of sequence variation in at least 99%, preferably at least 99.9%, more preferably at least 99.99% of the population. In one embodiment, optimizing the target position comprises selecting a target sequence or locus that is free of sequence variation in at least 99%, preferably at least 99.9%, more preferably at least 99.99% of the population. These targets are also referred to herein elsewhere as "platinum targets". In one embodiment, the population comprises at least 1000 individuals, such as at least 5000 individuals, such as at least 10000 individuals, such as at least 50000 individuals.
In one embodiment, the off-target site is characterized by at least one mismatch between the off-target site and the nucleic acid component. In one embodiment, the off-target site is characterized by at most five, preferably at most four, more preferably at most three mismatches between the off-target site and the nucleic acid component. In one embodiment, the off-target site is characterized by at least one mismatch between the off-target site and the nucleic acid component, and at most five, preferably at most four, more preferably at most three mismatches between the off-target site and the nucleic acid component.
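As a non-limiting sketch, the mismatch-based definition of an off-target site given above can be expressed as a simple position-by-position comparison. The sketch assumes, for illustration, that the spacer-matched sequence and the candidate site are written in the same orientation and have equal length, so that zero mismatches corresponds to the on-target site; bulges and TAM mismatches are not modeled.

```python
def count_mismatches(reference, site):
    """Count position-wise mismatches between two equal-length sequences."""
    if len(reference) != len(site):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(reference.upper(), site.upper()))


def is_off_target_candidate(reference, site, max_mismatches=3):
    """True if the site has at least one and at most max_mismatches mismatches."""
    n = count_mismatches(reference, site)
    return 1 <= n <= max_mismatches
```

Setting `max_mismatches` to 5, 4, or 3 corresponds to the "at most five, preferably at most four, more preferably at most three" alternatives.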
In one embodiment, the minimum number of off-target sites in the population for high frequency haplotypes in the population is determined. In one embodiment, the minimum number of off-target sites in the population is determined for a high frequency haplotype of off-target site loci in the population. In one embodiment, the minimum number of off-target sites in the population for a high frequency haplotype of target site loci in the population is determined. In one embodiment, the high frequency haplotype is characterized as occurring in at least 0.1% of the population.
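For illustration, high-frequency haplotypes as characterized above (occurring in at least 0.1% of the population) could be identified from a list of observed haplotype sequences as in the following sketch. The input format, one haplotype string per sampled chromosome, is an assumption made for the example.

```python
from collections import Counter


def high_frequency_haplotypes(haplotypes, min_fraction=0.001):
    """Return {haplotype: fraction} for haplotypes at or above min_fraction."""
    if not haplotypes:
        return {}
    total = len(haplotypes)
    counts = Counter(haplotypes)
    return {h: n / total for h, n in counts.items() if n / total >= min_fraction}
```

Off-target searches can then be restricted to the haplotypes returned, rather than to a single reference sequence.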
In one embodiment, the number of (sub)selected target sites required to treat a population is estimated based on low-frequency sequence variations (such as those captured in a large-scale sequencing dataset). In one embodiment, the number of (sub)selected target sites required to treat a population of a given size is estimated.
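One possible way to make such an estimate concrete, offered purely as a sketch, is a greedy coverage calculation: given, for each candidate target site, the set of individuals in a sequencing dataset who carry no variant at that site, sites are added until a desired fraction of the population is covered. The greedy strategy, the input mapping, and all names below are assumptions for illustration and are not described in the present disclosure.

```python
import math


def estimate_sites_needed(coverage, population, goal=1.0):
    """Greedily pick sites until `goal` fraction of the population is covered.

    coverage: dict mapping site name -> set of individuals with no variant at that site.
    Returns (list of chosen sites, fraction of population covered).
    """
    pop = set(population)
    if not pop:
        raise ValueError("population must not be empty")
    needed = math.ceil(goal * len(pop))
    covered, chosen = set(), []
    while len(covered) < needed:
        # Pick the site that adds the most not-yet-covered individuals.
        best = max(coverage, key=lambda s: len((coverage[s] & pop) - covered), default=None)
        gain = (coverage[best] & pop) - covered if best is not None else set()
        if not gain:
            break  # remaining individuals cannot be covered by any site
        chosen.append(best)
        covered |= gain
    return chosen, len(covered) / len(pop)
```

The length of the returned list is the estimated number of target sites; a greedy choice is not guaranteed to be the minimum possible set.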
In one embodiment, the method further comprises obtaining genomic sequencing data of the subject to be treated; and treating the subject with a composition, system selected from the group of compositions, systems, wherein the selected composition, system is based (at least in part) on genomic sequencing data of the individual. In one embodiment, the target(s) (sub) selected is verified by genomic sequencing, preferably whole genome sequencing.
In one embodiment, the target sequence or locus as described herein is (further) selected based on optimization of one or more parameters, such as TAM type (natural or modified), TAM nucleotide content, TAM length, target sequence length, TAM restriction, target cleavage efficiency and target sequence position within the gene, locus or other genomic region. Optimization methods are discussed in more detail elsewhere herein.
In one embodiment, the target sequence or locus as described herein is (further) selected based on optimization of one or more of target locus position, target length, target specificity, and TAM characteristics. As used herein, TAM characteristics may include, for example, TAM sequences, TAM lengths, and/or TAM GC content. In one embodiment, optimizing TAM characteristics includes optimizing the nucleotide content of TAMs. In one embodiment, optimizing the nucleotide content of a TAM is selecting a TAM having motifs that maximize abundance in one or more target loci, minimize mutation frequency, or both. Minimizing the mutation frequency can be achieved, for example, by selecting TAM sequences that lack CpG or have low or minimal CpG.
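The TAM characteristics mentioned above (length, GC content, CpG content, and abundance in a target locus) can be computed directly from sequence. The following is a minimal sketch; the ranking rule (fewest CpG dinucleotides first, then greatest abundance in the locus) is an assumed way of combining the two stated goals, and only the forward strand of the locus is scanned.

```python
def tam_features(tam):
    """Return length, GC content, and CpG dinucleotide count of a TAM sequence."""
    tam = tam.upper()
    if not tam:
        raise ValueError("TAM must not be empty")
    gc = sum(base in "GC" for base in tam) / len(tam)
    cpg = sum(1 for i in range(len(tam) - 1) if tam[i:i + 2] == "CG")
    return {"length": len(tam), "gc_content": gc, "cpg_count": cpg}


def count_occurrences(locus, tam):
    """Count (possibly overlapping) forward-strand occurrences of the TAM in the locus."""
    locus, tam = locus.upper(), tam.upper()
    return sum(locus.startswith(tam, i) for i in range(len(locus) - len(tam) + 1))


def rank_tams(tams, locus):
    """Order TAMs by fewest CpGs, then by highest abundance in the locus."""
    return sorted(tams, key=lambda t: (tam_features(t)["cpg_count"],
                                       -count_occurrences(locus, t)))
```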
In one embodiment, the effector protein of each composition or system in the set of compositions or systems is selected based on optimization of one or more parameters selected from the group consisting of: effector protein size, ability of the effector protein to access regions of high chromatin accessibility, degree of uniformity of enzymatic activity across genomic targets, epigenetic tolerance, mismatch/bulge tolerance, effector protein specificity, effector protein stability or half-life, and effector protein immunogenicity or toxicity. Optimization methods are discussed in more detail elsewhere herein.
Optimization of a system
The methods of the present invention may involve optimizing selected parameters or variables related to the composition, system, and/or function thereof, as further described elsewhere herein. The optimization of the compositions or systems in the methods described herein may depend on the target, such as one or more therapeutic targets; on the mode or type of modulation by the composition or system, such as composition- or system-based modulation, modification, or manipulation of a therapeutic target; and on the delivery of the composition, system, or components thereof. One or more targets may be selected based on genotypic and/or phenotypic outcomes. For example, one or more therapeutic targets may be selected according to the etiology of the (genetic) disease or the desired therapeutic outcome. The (therapeutic) target may be a single gene, locus, or other genomic site, or may be a plurality of genes, loci, or other genomic sites. As known in the art, a single gene, locus, or other genomic site may be targeted more than once, such as through the use of multiple nucleic acid components, or of one nucleic acid component scaffold and multiple reprogrammable spacers.
The activity of a composition and/or system (such as a TnpB polypeptide-based therapy or therapeutic agent) may involve target disruption, such as target mutation, such as to result in a gene knockout. The activity of a composition and/or system (such as a TnpB polypeptide-based therapy or therapeutic agent) may involve the replacement of a particular target site, such as to result in target correction. TnpB polypeptide-based therapies or therapeutic agents may involve the removal of specific target sites, such as resulting in target deletions. The activity of a composition and/or system (such as a TnpB polypeptide-based therapy or therapeutic agent) may involve the modulation of target site function, such as target site activity or accessibility, resulting in, for example, (transcriptional and/or epigenetic) gene or genomic region activation or gene or genomic region silencing. The skilled artisan will appreciate that modulation of target site function may involve mutations in the TnpB polypeptide (such as, for example, production of a catalytically inactive TnpB polypeptide) and/or functionalization (such as, for example, fusion of the TnpB polypeptide with a heterologous functional domain, such as a transcriptional activator or repressor), as described elsewhere herein.
Accordingly, in one aspect, the present invention relates to a method as described herein, comprising selecting one or more (therapeutic) targets, selecting one or more functions of the composition and/or system, and optimizing selected parameters or variables related to the composition and/or its function. In a related aspect, the invention relates to a method as described herein, comprising (a) selecting one or more (therapeutic) target loci, (b) selecting one or more composition functions, (c) optionally selecting one or more delivery modes, and preparing, developing or designing a composition herein selected based on steps (a) - (c).
In one embodiment, the function of the composition and/or system includes genomic mutation. In one embodiment, the function of the composition and/or system comprises a single genomic mutation. In one embodiment, the function of the composition and/or system includes a plurality of genomic mutations. In one embodiment, the function of the composition and/or system comprises gene knockout. In one embodiment, the function of the composition and/or system comprises a single gene knockout. In one embodiment, the function of the composition and/or system includes multiple gene knockouts. In one embodiment, the function of the composition and/or system includes gene correction. In one embodiment, the function of the composition and/or system includes single gene correction. In one embodiment, the function of the composition and/or system includes multiple gene corrections. In one embodiment, the function of the composition and/or system includes genomic region correction. In one embodiment, the function of the composition and/or system includes single genome region correction. In one embodiment, the function of the composition and/or system includes multiple genomic region corrections. In one embodiment, the function of the composition and/or system includes a gene deletion. In one embodiment, the function of the composition and/or system includes a single gene deletion. In one embodiment, the function of the composition and/or system includes multiple gene deletions. In one embodiment, the function of the composition and/or system includes a genomic region deletion. In one embodiment, the function of the composition and/or system includes a single genomic region deletion. In one embodiment, the function of the composition and/or system includes multiple genomic region deletions. In one embodiment, the function of the composition and/or system includes modulation of the function of a gene or genomic region. 
In one embodiment, the function of the composition and/or system includes modulation of the function of a single gene or genomic region. In one embodiment, the function of the composition and/or system includes modulation of the function of multiple genes or genomic regions. In one embodiment, the function of the composition and/or system comprises a gene or genomic region function, such as a gene or genomic region activity. In one embodiment, the function of the composition and/or system includes a single gene or genomic region function, such as gene or genomic region activity. In one embodiment, the function of the composition and/or system includes multiple gene or genomic region functions, such as gene or genomic region activity. In one embodiment, the function of the composition and/or system comprises modulating gene activity or accessibility, optionally resulting in transcription and/or epigenetic gene or genomic region activation or gene or genomic region silencing. In one embodiment, the function of the composition and/or system comprises modulating the activity or accessibility of an individual gene, thereby optionally resulting in transcription and/or epigenetic gene or genomic region activation or silencing of the gene or genomic region. In one embodiment, the function of the composition and/or system comprises modulating a plurality of gene activities or accessibility, thereby optionally resulting in transcription and/or epigenetic gene or genomic region activation or gene or genomic region silencing.
Optimization of selected parameters or variables in the methods described herein can result in optimized or improved specificity, efficacy, and/or safety of the system, such as a TnpB polypeptide-based therapy or therapeutic agent. In one embodiment, one or more of the following parameters or variables are considered, selected, or optimized in the methods of the invention as described herein: TnpB polypeptide allosteric interactions, TnpB polypeptide functional domains and functional domain interactions, TnpB polypeptide specificity, nucleic acid component specificity, composition specificity, TAM restriction, TAM type (natural or modified), TAM nucleotide content, TAM length, TnpB polypeptide activity, nucleic acid component activity, TnpB polypeptide/nucleic acid component molecule complex activity, target cleavage efficiency, target site selection, target sequence length, ability of the effector protein to access regions of high chromatin accessibility, degree of uniformity of enzyme activity across genomic targets, epigenetic tolerance, mismatch/bulge tolerance, TnpB polypeptide stability, TnpB polypeptide mRNA stability, nucleic acid component molecule stability, TnpB polypeptide complex stability, TnpB polypeptide protein or mRNA immunogenicity or toxicity, nucleic acid component molecule immunogenicity or toxicity, TnpB polypeptide protein or mRNA dose or titer, TnpB polypeptide protein size, TnpB polypeptide expression level, nucleic acid component expression level, and spatiotemporal expression of the TnpB polypeptide and/or the nucleic acid component.
By way of example and not limitation, parameter or variable optimization may be accomplished as follows. TnpB polypeptide specificity can be optimized by selecting the most specific TnpB polypeptide. This can be accomplished, for example, by selecting the most specific TnpB polypeptide ortholog or by introducing mutations that increase the specificity of a particular TnpB polypeptide. The specificity of the nucleic acid component can be optimized by selecting the most specific nucleic acid component. This can be achieved, for example, by selecting nucleic acid components having low homology to off-target sites (i.e., having at least one, or preferably more, such as at least 2 or preferably at least 3, mismatches with the off-target sites). Composition specificity can be optimized by increasing TnpB polypeptide specificity and/or nucleic acid component specificity as described above.
The target length or target sequence length can be optimized, for example, by selecting an appropriate TnpB polypeptide (such as a TnpB polypeptide that recognizes a target or target sequence of the desired nucleotide length). Alternatively or additionally, target (sequence) length may be optimized by providing a target that deviates in length from the target (sequence) length typically associated with a TnpB polypeptide, such as a naturally occurring TnpB polypeptide. The TnpB polypeptide or the target (sequence) length may be naturally occurring or may be optimized, for example, based on TnpB polypeptide mutants having altered target (sequence) length recognition or a repertoire of target (sequence) length recognition. For example, increasing or decreasing the target (sequence) length may affect target recognition and/or off-target recognition. TnpB polypeptide activity can be optimized by selecting the most active TnpB polypeptide. This can be accomplished, for example, by selecting the most active TnpB polypeptide ortholog or by introducing mutations that increase the activity of a particular TnpB polypeptide. The ability of a TnpB polypeptide to access regions of high chromatin accessibility can be optimized by selecting an appropriate TnpB polypeptide or mutant thereof, and may take into account the size, charge, or other dimensional variables of the TnpB polypeptide. The uniformity of TnpB polypeptide activity can be optimized by selecting an appropriate TnpB polypeptide or mutant thereof, and may take into account TnpB polypeptide specificity and/or activity, TAM specificity, target length, mismatch tolerance, epigenetic tolerance, TnpB polypeptide and/or nucleic acid component stability and/or half-life, immunogenicity and/or toxicity of the TnpB polypeptide and/or nucleic acid component, and the like. The activity of the nucleic acid component can be optimized by selecting the most active nucleic acid component. 
In one embodiment, this may be achieved by increasing the stability of the nucleic acid component through RNA modification. Composition activity can be optimized by increasing TnpB polypeptide activity and/or nucleic acid component activity as described above.
Target site selection may be optimized by selecting the optimal location of the target site within a gene, locus, or other genomic region. Target site selection may be optimized by optimizing target locations, including selecting target sequences of genes, loci, or other genomic regions with low variability. This can be accomplished, for example, by selecting target sites in early and/or conserved exons or domains (i.e., having low variability within the population, such as polymorphisms).
In one embodiment, optimizing the target (sequence) length includes selecting one or more target sequences within a target locus between 5 and 25 nucleotides. In one embodiment, the target sequence is 20 nucleotides.
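By way of a hedged example, candidate target sequences of a chosen length within a locus can be enumerated by scanning for a TAM and taking the adjacent nucleotides. The sketch below assumes, for illustration only, that the TAM lies immediately 5' of the target on the scanned strand and that only the forward strand is searched; the TAM passed in is a placeholder supplied by the caller, and the 5 to 25 nucleotide bounds and 20 nucleotide default follow the lengths stated above.

```python
def find_targets(locus, tam, target_len=20):
    """Return (start index, target sequence) for each TAM-adjacent target in the locus."""
    if not 5 <= target_len <= 25:
        raise ValueError("target_len must be between 5 and 25 nucleotides")
    locus, tam = locus.upper(), tam.upper()
    hits = []
    start = locus.find(tam)
    while start != -1:
        t0 = start + len(tam)                 # target begins right after the TAM (assumption)
        target = locus[t0:t0 + target_len]
        if len(target) == target_len:         # skip targets truncated by the locus end
            hits.append((t0, target))
        start = locus.find(tam, start + 1)    # continue scanning, allowing overlapping TAMs
    return hits
```

Each returned candidate can then be passed to variation and off-target filters before a final selection is made.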
In one embodiment, optimizing target specificity includes selecting target loci that minimize off-target candidates.
In one embodiment, the target site may be selected by minimizing off-target effects (e.g., where an off-target is defined as having 1-5, 1-4, or preferably 1-3 mismatches compared to the target), preferably also taking into account variability within the population. The stability of a TnpB polypeptide can be optimized by selecting a TnpB polypeptide that has an appropriate half-life (preferably a short half-life) while still maintaining sufficient activity. In one embodiment, this can be accomplished by selecting an appropriate TnpB polypeptide ortholog having a particular half-life, or by mutation or modification of a particular TnpB polypeptide that affects half-life or stability, such as inclusion (e.g., by fusion) of a stabilizing or destabilizing domain or sequence. TnpB polypeptide mRNA stability can be optimized by increasing or decreasing the stability of the TnpB polypeptide mRNA. In one embodiment, this can be accomplished through mRNA modification. Nucleic acid component stability can be optimized by increasing or decreasing the stability of the nucleic acid component. In one embodiment, this can be accomplished through RNA modification. Stability may be optimized by increasing or decreasing TnpB polypeptide stability and/or nucleic acid component molecule stability as described above. TnpB polypeptide protein or mRNA immunogenicity or toxicity may be optimized by reducing the immunogenicity or toxicity of the TnpB polypeptide protein or mRNA. In one embodiment, this may be achieved by mRNA or protein modification. Similarly, in the case of DNA-based expression systems, DNA immunogenicity or toxicity may be reduced. The immunogenicity or toxicity of the nucleic acid component may be optimized by reducing the immunogenicity or toxicity of the nucleic acid component. 
In one embodiment, this may be achieved by modification of the nucleic acid component. Similarly, in the case of DNA-based expression systems, DNA immunogenicity or toxicity may be reduced. Immunogenicity or toxicity may be optimized by reducing the immunogenicity or toxicity of the TnpB polypeptide and/or the immunogenicity or toxicity of the nucleic acid component as described above, or by selecting a TnpB polypeptide/nucleic acid component combination that is least immunogenic or toxic. Similarly, in the case of DNA-based expression systems, DNA immunogenicity or toxicity may be reduced. The dosage or titer of the TnpB polypeptide protein or mRNA may be optimized by selecting a dosage or titer that minimizes toxicity and/or maximizes specificity and/or efficacy. The dose or titer of the nucleic acid component can be optimized by selecting a dose or titer that minimizes toxicity and/or maximizes specificity and/or efficacy. The dose or titer of the composition can be optimized by selecting a dose or titer that minimizes toxicity and/or maximizes specificity and/or efficacy. The TnpB polypeptide size may be optimized by selecting a minimum protein size to increase the efficiency of delivery, particularly virus-mediated delivery. The expression level of a TnpB polypeptide, nucleic acid component, or complex thereof may be optimized by limiting (or extending) the duration of expression and/or limiting (or increasing) the expression level. 
This may be achieved, for example, by using a self-inactivating composition or system, such as one including a self-targeting (e.g., TnpB polypeptide-targeting) nucleic acid component molecule; by using viral vectors with limited expression duration; by using appropriate promoters for low (or high) expression levels; or by combining different delivery methods for the respective TnpB system components, such as viral-mediated delivery of a nucleic acid encoding the TnpB polypeptide combined with non-viral delivery of the nucleic acid component, or viral-mediated delivery of the nucleic acid component combined with non-viral delivery of the TnpB polypeptide protein or mRNA. The temporal and spatial expression of a TnpB polypeptide, nucleic acid component, or TnpB complex can be optimized by appropriate selection of conditional and/or inducible expression systems, including controllable TnpB polypeptide activity, optionally destabilized TnpB polypeptides and/or split TnpB polypeptides, and/or cell- or tissue-specific expression systems.
In one aspect, the invention relates to a method as described herein comprising selecting one or more (therapeutic) targets, selecting a function of a composition and/or system, selecting a composition delivery mode, selecting a composition delivery vehicle or expression system, and optimizing selected parameters or variables related to the composition and/or function thereof, optionally wherein the parameters or variables are selected from one or more of the following: TnpB polypeptide specificity, nucleic acid component specificity, TnpB complex specificity, TnpB polypeptide activity, nucleic acid component molecule activity, TnpB polypeptide/nucleic acid component complex activity, target cleavage efficiency, target site selection, target sequence length, ability of the effector protein to access regions of high chromatin accessibility, degree of uniformity of enzyme activity across genomic targets, epigenetic tolerance, mismatch/bulge tolerance, TnpB polypeptide stability, TnpB polypeptide mRNA stability, nucleic acid component stability, TnpB complex stability, TnpB polypeptide protein or mRNA immunogenicity or toxicity, nucleic acid component immunogenicity or toxicity, TnpB polypeptide/nucleic acid component complex immunogenicity or toxicity, TnpB polypeptide protein or mRNA dose or titer, nucleic acid component dose or titer, TnpB complex dose or titer, TnpB polypeptide protein size, TnpB polypeptide expression level, nucleic acid component expression level, TnpB polypeptide/nucleic acid component molecule complex expression level, TnpB polypeptide spatiotemporal expression, nucleic acid component spatiotemporal expression, and TnpB polypeptide/nucleic acid component complex spatiotemporal expression.
It should be understood that the parameters or variables to be optimized as well as the nature of the optimization may depend on the (therapeutic) target, the function of the composition and/or the system, the mode of delivery of the system and/or the delivery vehicle or expression system of the composition.
In one aspect, the invention relates to a method as described herein, comprising optimizing nucleic acid component specificity at a population level. Preferably, said optimisation of nucleic acid component specificity comprises minimising nucleic acid component target site sequence variation in the population and/or minimising the incidence of nucleic acid component off-target in the population.
In one embodiment, the optimization may result in the selection of naturally occurring or modified TnpB polypeptides. In one embodiment, the optimization may result in selection of a TnpB polypeptide having nuclease, nickase, deaminase, and/or transposase activity, and/or having one or more effector functions that are inactivated or eliminated. In one embodiment, optimizing TAM specificity may include selecting TnpB polypeptides having modified TAM specificity. In one embodiment, optimizing may include selecting a TnpB polypeptide having the smallest size. In one embodiment, optimizing effector protein stability includes selecting effector proteins having a short half-life while maintaining sufficient activity, such as by selecting appropriate TnpB polypeptide orthologs having a particular half-life or stability. In one embodiment, optimizing immunogenicity or toxicity includes minimizing effector protein immunogenicity or toxicity by protein modification. In one embodiment, optimizing functional specificity includes selecting effector proteins with reduced tolerance to mismatches and/or bulges between the nucleic acid component molecules and one or more target loci.
In one embodiment, optimizing efficacy includes optimizing overall efficiency, epigenetic tolerance, or both. In one embodiment, maximizing overall efficiency includes selecting effector proteins having uniform enzymatic activity across target loci with different chromatin complexity, or selecting effector proteins whose enzymatic activity is limited to regions of open chromatin accessibility. In one embodiment, chromatin accessibility is measured using one or more of ATAC-seq or a DNA proximity ligation assay. In one embodiment, optimizing epigenetic tolerance includes optimizing methylation tolerance, epigenetic mark competition, or both. In one embodiment, optimizing methylation tolerance includes selecting effector proteins that modify methylated DNA. In one embodiment, optimizing epigenetic tolerance comprises selecting an effector protein that is incapable of modifying a silenced chromosomal region, selecting an effector protein that is capable of modifying a silenced chromosomal region, or selecting a target locus that is not enriched for epigenetic marks.
In one embodiment, selecting an optimized nucleic acid component molecule includes optimizing stability, immunogenicity, or both, or other related parameters or variables as described elsewhere herein.
In one embodiment, optimizing nucleic acid component molecule stability and/or nucleic acid component molecule immunogenicity includes RNA modification, or optimizing other nucleic acid component molecule-related parameters or variables as described elsewhere herein. In one embodiment, the modification comprises removing 1-3 nucleotides from the 3' end of the target-complementary region of the nucleic acid component molecule. In one embodiment, the modification comprises extending the nucleic acid component molecule and/or adding trans RNA/DNA elements that create stable structures in the nucleic acid component molecule that compete with base pairing of the nucleic acid component molecule at a target or off-target locus, or extending the complementary nucleotides between the nucleic acid component molecule and the target sequence, or both.
In one embodiment, the delivery mode comprises delivery of a nucleic acid component molecule and/or a TnpB polypeptide, delivery of a nucleic acid component molecule and/or a TnpB polypeptide mRNA, or delivery of a nucleic acid component molecule and/or a TnpB polypeptide as a DNA-based expression system. In one embodiment, the delivery mode further comprises selecting the delivery vehicle and/or expression system from the group consisting of liposomes, lipid particles, nanoparticles, gene gun, or virus-based expression/delivery system. In one embodiment, expression is spatiotemporal expression, which is optimized by selecting for conditional and/or inducible expression systems comprising controllable TnpB polypeptide activity, optionally destabilized TnpB polypeptides and/or split TnpB polypeptides and/or cell or tissue specific expression systems.
The methods as described herein may also involve selection of a delivery mode. In one embodiment, the nucleic acid component and/or the TnpB polypeptide is delivered or is to be delivered. In one embodiment, the nucleic acid component and/or the TnpB polypeptide mRNA is delivered or is to be delivered. In one embodiment, the nucleic acid component and/or TnpB polypeptide provided in the DNA-based expression system is delivered or is to be delivered. In one embodiment, the delivery of the individual system components includes a combination of the above delivery modes. In one embodiment, delivering comprises delivering a nucleic acid component and/or a TnpB polypeptide protein, delivering a nucleic acid component and/or a TnpB polypeptide mRNA, or delivering a nucleic acid component and/or a TnpB polypeptide as a DNA-based expression system.
The methods as described herein may also involve selection of a composition delivery vehicle and/or expression system. Delivery vehicles and expression systems are described elsewhere herein. For example, nucleic acid and/or protein delivery vehicles include nanoparticles, liposomes, and the like. Delivery vehicles for DNA (such as DNA-based expression systems) include, for example, gene guns, viral-based vector systems (e.g., adenovirus, AAV, lentivirus), and the like. The skilled artisan will appreciate that the mode of delivery and the choice of delivery vehicle or expression system may depend, for example, on the cell or tissue to be targeted. In one embodiment, the delivery vehicle and/or expression system for delivering the composition, system or component thereof comprises a liposome, a lipid particle, a nanoparticle, a gene gun, or a viral-based expression/delivery system.
Considerations for therapeutic applications
One consideration in genome editing therapy is the choice of sequence-specific nuclease (such as a variant of a TnpB polypeptide). Each nuclease variant may have its own unique set of strengths and weaknesses, many of which must be balanced in the context of treatment to maximize therapeutic benefit. For a specific editing therapy to be efficacious, a sufficiently high level of modification must be achieved in the target cell population to reverse disease symptoms. This therapeutic modification "threshold" is determined by the fitness of edited cells following treatment and by the amount of gene product needed to reverse symptoms. With regard to fitness, editing creates three potential outcomes for treated cells relative to unedited cells: increased, neutral, or decreased fitness. In the case of increased fitness, corrected cells may be able to expand relative to diseased cells to mediate therapy. In this case, where edited cells have a selective advantage, even small numbers of edited cells can be amplified through expansion, providing a therapeutic benefit to the patient. Where edited cells show no change in fitness, an increase in the therapeutic modification threshold can be warranted. Significantly higher levels of editing may therefore be needed to treat a disease in which editing is fitness-neutral than a disease in which editing increases the fitness of target cells. If editing imposes a fitness disadvantage, as would be the case for restoring the function of a tumor suppressor gene in cancer cells, modified cells would be outcompeted by diseased cells, so that the benefit of treatment is low relative to the editing rate. This can be overcome with supplemental therapies that increase the potency and/or fitness of edited cells relative to diseased cells.
In addition to cell fitness, the amount of gene product needed to treat a disease can also influence the minimal level of therapeutic genome editing that can treat or prevent the disease or a symptom thereof. Where a small change in gene product levels can produce a significant change in clinical outcome, the minimal level of therapeutic genome editing is lower than where a larger change in gene product levels is needed to obtain a clinically relevant response. In one embodiment, the minimal level of therapeutic genome editing may be in the range of 0.1-1%, 1-5%, 5-10%, 10-15%, 15-20%, 20-25%, 25-30%, 30-35%, 35-40%, 40-45%, 45-50%, or 50-55%. Thus, diseases in which a small change in gene product levels can influence clinical outcome, and diseases in which edited cells have a fitness advantage, are ideal targets for genome editing therapy, because the therapeutic modification threshold is low enough to permit a high chance of success.
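The three fitness outcomes discussed above can be illustrated with a toy calculation. The following Python sketch assumes a simple two-population selection model in which edited cells are weighted by a relative fitness value each generation; the model, the function name, and all parameter values are illustrative assumptions and are not part of the disclosure.

```python
def edited_fraction(initial_fraction: float, relative_fitness: float,
                    generations: int) -> float:
    """Fraction of edited cells after a number of generations.

    Toy model (an assumption for illustration): in each generation edited
    cells are weighted by relative_fitness, unedited cells by 1, and the
    two fractions are renormalized to sum to 1.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must lie in [0, 1]")
    if relative_fitness <= 0.0:
        raise ValueError("relative_fitness must be positive")
    fraction = initial_fraction
    for _ in range(generations):
        weighted = fraction * relative_fitness
        fraction = weighted / (weighted + (1.0 - fraction))
    return fraction


# Hypothetical starting point: 5% of cells edited, followed for 20 generations.
increased = edited_fraction(0.05, 1.2, 20)   # edited cells have a fitness advantage
neutral = edited_fraction(0.05, 1.0, 20)     # editing is fitness-neutral
decreased = edited_fraction(0.05, 0.8, 20)   # edited cells are at a disadvantage
```

Under these assumptions the edited fraction rises only when the relative fitness exceeds 1, which is consistent with the lower modification threshold described for diseases in which edited cells have a selective advantage.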
The activity of the NHEJ and HDR DSB repair pathways can vary with cell type and cell state. NHEJ is not highly regulated by the cell cycle and is efficient across cell types, allowing high levels of gene disruption in accessible target cell populations. In contrast, HDR acts primarily during S/G2 phase and is therefore restricted to actively dividing cells, limiting treatments that require precise genome modifications to mitotic cells [Ciccia, A. & Elledge, S.J. Molecular Cell 40, 179-204 (2010); Chapman, J.R. et al. Molecular Cell 47, 497-510 (2012)].
The efficiency of correction via HDR may be controlled by the epigenetic state or sequence of the targeted locus, or by the specific repair template configuration used (single vs. double stranded, long vs. short homology arms) [Hacein-Bey-Abina, S. et al. The New England Journal of Medicine, 1185-1193 (2002); Gaspar, H.B. et al. Lancet 364, 2181-2187 (2004); Beumer, K.J. et al. G3 (2013)]. The relative activity of the NHEJ and HDR machineries in target cells may also affect gene correction efficiency, as these pathways may compete to resolve DSBs [Beumer, K.J. et al. Proceedings of the National Academy of Sciences of the United States of America 105, 19821-19826 (2008)]. HDR also poses a delivery challenge not seen with NHEJ strategies, because it requires the concurrent delivery of nucleases and repair templates. These differences can therefore be kept in mind when designing, optimizing and/or selecting TnpB polypeptide-based therapeutics, as described in more detail elsewhere herein.
Polynucleotide modification applications based on TnpB polypeptides may include combinations of proteins, small RNA molecules, and/or repair templates, and in one embodiment may make delivery of these multiple moieties significantly more challenging than, for example, traditional small molecule therapies. Two main strategies have been developed for delivering compositions, systems and components thereof: ex vivo and in vivo. In one embodiment of ex vivo treatment, diseased cells are removed from the subject, edited, and then transplanted back into the patient. In other embodiments, cells from healthy allogeneic donors are harvested, modified with the composition or components thereof to confer various functions and/or reduced immunogenicity, and administered to an allogeneic recipient in need of treatment. The advantage of ex vivo editing is to allow for a well-defined target cell population and to specify a specific dose of therapeutic molecule delivered to the cells. The latter consideration may be particularly important when off-target modifications are considered, as titrating the amount of nuclease may reduce such mutations (Hsu et al 2013). Another advantage of ex vivo methods is the high rate of editing that can generally be achieved due to the development of efficient delivery systems that deliver proteins and nucleic acids into cultured cells for research and gene therapy applications.
In vivo polynucleotide modification via a composition, system, and/or component thereof involves delivering the composition, system, and/or component thereof directly to a cell type in its native tissue. In vivo polynucleotide modification via compositions, systems, and/or components thereof allows for the treatment of diseases where the affected cell population is not suitable for ex vivo manipulation. In addition, the in situ delivery of the compositions, systems, and/or components thereof to cells allows for the treatment of a variety of tissues and cell types.
In one embodiment, such as those in which a viral vector system is used to generate viral particles that deliver the composition and/or components thereof to a cell, the total cargo size of the composition and/or components thereof should be considered, as a vector system may limit the size of the polynucleotide that can be expressed from it and/or packaged as cargo inside a viral particle. In one embodiment, the tropism of a vector system (such as a viral vector system) should be considered, as it can affect the cell types to which the composition or components thereof can be efficiently and/or effectively delivered.
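The cargo-size consideration above amounts to a simple bookkeeping check, sketched here in Python. The component names and lengths are hypothetical, and the 4,700 bp limit is used only as a placeholder of the order often quoted for AAV vectors; the actual limit depends on the vector system chosen and should be taken from its documentation.

```python
def check_cargo(component_lengths_bp: dict, packaging_limit_bp: int) -> tuple:
    """Return (total length in bp, whether the total fits within the limit)."""
    total = sum(component_lengths_bp.values())
    return total, total <= packaging_limit_bp


# Hypothetical expression cassette; every length here is illustrative only.
cargo = {
    "promoter": 500,
    "tnpb_coding_sequence": 1200,
    "nucleic_acid_component_cassette": 400,
    "polyadenylation_signal": 200,
}

# Placeholder packaging limit (an assumption, not a value from this disclosure).
total_bp, fits = check_cargo(cargo, packaging_limit_bp=4700)
```

If the total exceeds the limit, options include choosing shorter regulatory elements or dividing the components between separate vectors.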
When delivering the system or components thereof via a virus-based system, it is important to consider the amount of viral particles needed to achieve a therapeutic effect in order to consider potential immune responses that the viral particles may elicit when delivered to a subject or cell. When delivering a system or components thereof via a virus-based system, it is important to consider the mechanism that controls the in vivo distribution and/or dosage of the system. Generally, to reduce the likelihood of off-target effects, it is optimal, but not necessary, for the amount of system to be as close as possible to the minimum or lowest effective dose. In practice, this can be difficult to do.
In one embodiment, it is important to consider the immunogenicity of the system or components thereof. In embodiments where the immunogenicity of the system or a component thereof is of concern, that immunogenicity may be reduced. By way of example only, the approach set forth in Tangri et al. may be used to reduce the immunogenicity of the system or a component thereof. Accordingly, directed evolution or rational design can be used to reduce the immunogenicity of a TnpB polypeptide in a host species (human or other species).
Xenograft
The invention also contemplates the use of the compositions described herein (e.g., the TnpB polypeptide protein system) to provide RNA-guided DNA nucleases adapted for use in providing modified tissues for transplantation. For example, RNA-guided DNA nucleases can be used to knock out, knock down, or disrupt selected genes in an animal, such as a transgenic pig (such as a human heme oxygenase-1 transgenic pig line), for example, by disrupting expression of genes that encode epitopes recognized by the human immune system (i.e., xenoantigen genes). Candidate pig genes for disruption may include, for example, the α(1,3)-galactosyltransferase and cytidine monophosphate-N-acetylneuraminic acid hydroxylase genes (see PCT patent publication WO 2014/066505). In addition, genes encoding endogenous retroviruses may be disrupted, for example, the genes encoding all porcine endogenous retroviruses (see Yang et al., 2015, Genome-wide inactivation of porcine endogenous retroviruses (PERVs), Science, 27 November 2015: Vol. 350, no. 6264, pp. 1101-1104). In addition, RNA-guided DNA nucleases can be used to target a site for integration of additional genes (such as the human CD55 gene) in xenotransplant donor animals to improve protection against hyperacute rejection.
Embodiments of the invention also relate to methods and compositions for knocking out genes, amplifying genes, and repairing specific mutations associated with DNA repeat instability and neurological disorders (Robert D. Wells, Tetsuo Ashizawa, Genetic Instabilities and Neurological Diseases, Second Edition, Academic Press, 2011). Specific aspects of tandem repeat sequences have been found to be responsible for more than twenty human diseases (New insights into repeat instability: role of RNA-DNA hybrids. McIvor EI, Polak U, Napierala M. RNA Biol. 2010 Sep-Oct;7(5):551-8). The effector protein system of the present invention may be used to correct these defects of genomic instability.
Several additional aspects of the invention relate to correcting defects associated with a wide range of genetic diseases that are further described on the website of the National Institutes of Health under the topic subsection Genetic Disorders (website is health. Hereditary brain diseases may include, but are not limited to, Adrenoleukodystrophy, Agenesis of the Corpus Callosum, Aicardi Syndrome, Alpers' Disease, Alzheimer's Disease, Barth Syndrome, Batten Disease, CADASIL, Cerebellar Degeneration, Fabry's Disease, Gerstmann-Straussler-Scheinker Disease, Huntington's Disease and other Triplet Repeat Disorders, Leigh's Disease, Lesch-Nyhan Syndrome, Menkes Disease, Mitochondrial Myopathies and NINDS Colpocephaly. These diseases are further described on the website of the National Institutes of Health under the subsection Genetic Brain Disorders.
Application in plants and fungi
The compositions, systems and methods described herein can be used to perform gene or genome interrogation, editing or manipulation in plants and fungi. For example, applications include investigation and/or selection and/or interrogation and/or comparison and/or manipulation and/or transformation of plant genes or genomes; for example, to create, identify, develop, optimize or confer traits or characteristics on plants, or to transform a plant or fungal genome. There can accordingly be improved production of plants, of new plants with new combinations of traits or characteristics, or of new plants with enhanced traits. The compositions, systems and methods may be used with regard to plants in Site-Directed Integration (SDI), Gene Editing (GE), or any Near Reverse Breeding (NRB) or Reverse Breeding (RB) technique.
The compositions, systems and methods herein can be used to impart desired traits (e.g., enhanced nutritional quality, increased resistance to disease and resistance to biotic and abiotic stress, and increased yield of commercially valuable plant products or heterologous compounds) to essentially any plant and fungus and cells and tissues thereof. The compositions, systems and methods can be used to modify endogenous genes or modify their expression without permanently introducing any foreign genes into the genome.
In one embodiment, the compositions, systems and methods can be used in the context of genome editing in plants, or where RNAi or similar genome editing techniques have been used previously; see, e.g., Nekrasov, "Plant genome editing made easy: targeted mutagenesis in model and crop plants using the CRISPR-Cas system," Plant Methods 2013, 9:39 (doi:10.1186/1746-4811-9-39); Brooks, "Efficient gene editing in tomato in the first generation using the CRISPR-Cas9 system," Plant Physiology September 2014 pp 114.247577; Shan, "Targeted genome modification of crop plants using a CRISPR-Cas system," Nature Biotechnology, 686-688 (2013); Feng, "Efficient genome editing in plants using a CRISPR/Cas system," Cell Research (2013) 23:1229-1232, doi:10.1038/cr.2013.114, published online 20 August 2013; Xie, "RNA-guided genome editing in plants using a CRISPR-Cas system," Mol Plant. 2013 Nov;6(6):1975-83, doi:10.1093/mp/sst119, Epub 2013 Aug 17; Xu, "Gene targeting using the Agrobacterium tumefaciens-mediated CRISPR-Cas system in rice," Rice 2014, 7:5 (2014); Zhou et al., "Exploiting SNPs for biallelic CRISPR mutations in the outcrossing woody perennial Populus reveals 4-coumarate:CoA ligase specificity and redundancy," New Phytologist (2015) (Forum) 1-4 (available online only at www.newphytologist.com); Caliando et al., "Targeted DNA degradation using a CRISPR device stably carried in the host genome," Nature Communications 6:6989, doi:10.1038/ncomms7989; U.S. Pat. No. 6,603,061, Agrobacterium-Mediated Plant Transformation Method; U.S. Pat. No. 7,868,149, Plant Genome Sequences and Uses Thereof; US 2009/0100536, Transgenic Plants with Enhanced Agronomic Traits; and Morrell et al., "Crop genomics: advances and applications," Nat Rev Genet. 2011 Dec 29;13(2):85-96, the entire contents and disclosures of each of which are incorporated herein by reference.
Aspects of utilizing the compositions, systems and methods may be analogous to the use of such compositions in plants, and mention is made of the University of Arizona website "CRISPR-PLANT" (genome.arizona.edu/crispr/) (supported by Penn State and AGI), which provides guidance on nucleic acid modification in plant systems.
The compositions, systems, and methods may also be used with protoplasts. A "protoplast" refers to a plant cell whose protective cell wall has been completely or partially removed, e.g., mechanically or enzymatically, resulting in an intact, biochemically competent unit of a living plant that can reform its cell wall, proliferate, and regenerate into a whole plant under appropriate growth conditions.
The compositions, systems, and methods can be used to screen genes of interest (e.g., endogenous genes, mutations). In some examples, genes of interest include those encoding enzymes involved in the production of components with added nutritional value, or genes that generally affect agronomic traits of interest across species, phylum and plant kingdom. By selectively targeting genes encoding enzymes of metabolic pathways, for example, genes responsible for certain nutritional characteristics of plants can be identified. Similarly, by selectively targeting genes that may affect a desired agronomic trait, related genes may be identified. Thus, the present invention encompasses methods of screening for genes encoding enzymes involved in the production of compounds having particular nutritional values and/or agronomic traits.
It will also be appreciated that references herein to animal cells, mutatis mutandis, can also be applied to plant or fungal cells, unless otherwise indicated; also, enzymes herein with reduced off-target effects and systems employing such enzymes may be used in plant applications, including those mentioned herein.
In some cases, nucleic acids introduced into plants and fungi can be codon optimized for expression in the plants and fungi. Methods of codon optimization include those described in Kwon KC, et al., Codon Optimization to Enhance Expression Yields Insights into Chloroplast Translation, Plant Physiol. 2016 Sep;172(1):62-77.
Components in the compositions and systems (e.g., TnpB polypeptides) may also comprise one or more functional domains described herein. In some examples, the functional domain may be an exonuclease. Such exonucleases can increase the efficiency of TnpB polypeptide function, e.g., mutagenesis efficiency. An example of a functional domain is Trex2, as described in Weiss T et al., www.biorxiv.org/content/10.1101/2020.04.11.037572v1, doi: doi.org/10.1101/2020.04.11.037572.
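As a minimal sketch of the general idea of codon optimization, and not of the method of the cited reference, the following Python code back-translates a short peptide using a single preferred codon per amino acid. The table of "preferred" codons is a hypothetical placeholder, not measured codon-usage data for any plant or fungus; only the codon-to-amino-acid assignments follow the standard genetic code.

```python
# Hypothetical preferred codons (placeholder values, NOT real usage data).
# Each codon does encode the listed amino acid under the standard genetic code.
PREFERRED_CODONS = {
    "M": "ATG",
    "A": "GCT",
    "K": "AAG",
    "L": "CTT",
    "S": "TCT",
    "*": "TAA",  # stop codon
}


def back_translate(protein: str, codon_table: dict) -> str:
    """Encode a protein sequence using one chosen codon per residue."""
    missing = sorted(set(protein) - set(codon_table))
    if missing:
        raise ValueError(f"no codon defined for residues: {missing}")
    return "".join(codon_table[residue] for residue in protein)


# Illustrative peptide followed by a stop codon.
optimized_dna = back_translate("MAKLS*", PREFERRED_CODONS)
```

A practical workflow would use a complete host-specific table and typically also screens for unwanted sequence features; this sketch shows only the substitution step.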
Examples of plants
The compositions, systems, and methods herein can be used to impart desirable traits to essentially any plant. A variety of plants and plant cell systems can be engineered to achieve desired physiological and agronomic characteristics. Generally, the term "plant" relates to any of a variety of photosynthetic, eukaryotic, unicellular, or multicellular organisms of the plant kingdom characterized by growth by cell division, containing chloroplasts, and having a cell wall composed of cellulose. The term plant encompasses both monocotyledonous and dicotyledonous plants.
The compositions, systems and methods are useful in a wide range of plants, such as, for example, dicotyledonous plants belonging to the orders Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santalales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales; monocotyledonous plants such as those belonging to the orders Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales; or gymnosperms, for example those belonging to the orders Pinales, Ginkgoales, Cycadales, Araucariales, Cupressales and Gnetales.
The compositions, systems, and methods herein can be used for a wide range of plant species, including those in the following non-limiting list of dicot, monocot, or gymnosperm genera: Atropa, Alseodaphne, Anacardium, Arachis, Beilschmiedia, Brassica, Carthamus, Cocculus, Croton, Cucumis, Citrus, Capsicum, Catharanthus, Cocos, Coffea, Cucurbita, Daucus, Duguetia, Eschscholzia, Ficus, Fragaria, Glaucium, Glycine, Gossypium, Helianthus, Hevea, Hyoscyamus, Lactuca, Landolphia, Linum, Litsea, Lycopersicon, Lupinus, Manihot, Majorana, Malus, Medicago, Nicotiana, Olea, Parthenium, Papaver, Persea, Phaseolus, Pistacia, Pisum, Pyrus, Prunus, Raphanus, Ricinus, Senecio, Sinomenium, Stephania, Sinapis, Solanum, Theobroma, Trifolium, Trigonella, Vicia, Vinca, Vitis and Vigna; and Allium, Andropogon, Argrostis, Asparagus, Avena, Cynodon, Elaeis, Festuca, Festulolium, Hemerocallis, Hordeum, Lemna, Lolium, Musa, Oryza, Panicum, Pennisetum, Phleum, Poa, Secale, Sorghum, Triticum, Zea, Abies, Cunninghamia, Ephedra, Picea, Pinus and Pseudotsuga.
In one embodiment, the target plants and plant cells for engineering include monocotyledonous and dicotyledonous plants, such as crops, including cereal crops (e.g., wheat, corn, rice, millet, barley), fruit crops (e.g., tomato, apple, pear, strawberry, orange), forage crops (e.g., alfalfa), root vegetable crops (e.g., carrot, potato, beet, yam), leafy vegetable crops (e.g., lettuce, spinach); flowering plants (e.g., petunia, rose, chrysanthemum), conifers and pine trees (e.g., pine, spruce); plants used in phytoremediation (e.g., plants that accumulate heavy metals); oil crops (e.g., sunflower, rapeseed) and plants used for experimental purposes (e.g., Arabidopsis). In particular, plants are intended to include, but are not limited to, angiosperms and gymnosperms, such as acacia, alfalfa, amaranth, apple, apricot, artichoke, ash, asparagus, avocado, banana, barley, beans, beet, birch, beech, blackberry, blueberry, broccoli, Brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, cedar, cereal, celery, chestnut, cherry, Chinese cabbage, citrus, clementine, clover, coffee, corn, cotton, cowpea, cucumber, cypress, eggplant, elm, endive, eucalyptus, fennel, fig, fir, geranium, grape, grapefruit, groundnut, ground cherry, gum, hemlock, hickory, kale, kiwifruit, kohlrabi, larch, lettuce, leek, lemon, lime, locust, pine, maidenhair, maize, mango, maple, melon, millet, mushroom, mustard, nut, oak, oat, oil palm, okra, onion, orange, ornamental plant or tree, papaya, palm, parsley, parsnip, pea, peach, peanut, pear, peat, pepper, persimmon, pigeon pea, pine, pineapple, plantain, plum, pomegranate, potato, pumpkin, radicchio, radish, rapeseed, raspberry, rice, rye, sorghum, safflower, sallow, soybean, spinach, spruce, squash, strawberry, sugar beet, sugarcane, sunflower, sweet potato, sweet corn, tangerine, tea, tobacco, tomato, trees, triticale, turf grass, turnip, vine, walnut, watercress, watermelon, wheat, yam, yew and zucchini.
The term plant also encompasses algae, which are mainly photoautotrophs unified primarily by their lack of roots, leaves and other organs that characterize higher plants. The compositions, systems, and methods are useful for a wide range of "algae" or "algal cells." Examples of algae include eukaryotic phyla, including Rhodophyta (red algae), Chlorophyta (green algae), Phaeophyta (brown algae), Bacillariophyta (diatoms), Eustigmatophyta and dinoflagellates, as well as the prokaryotic phylum Cyanobacteria (blue-green algae). Examples of algal species include those of the genera Amphora, Anabaena, Ankistrodesmus, Botryococcus, Chaetoceros, Chlamydomonas, Chlorella, Chlorococcum, Cyclotella, Cylindrotheca, Dunaliella, Emiliania, Euglena, Haematococcus, Isochrysis, Monochrysis, Monoraphidium, Nannochloris, Nannochloropsis, Navicula, Nephrochloris, Nephroselmis, Nitzschia, Nodularia, Nostoc, Ochromonas, Oocystis, Oscillatoria, Pavlova, Phaeodactylum, Platymonas, Pleurochrysis, Porphyra, Pseudoanabaena, Pyramimonas, Stichococcus, Synechococcus, Synechocystis, Tetraselmis, Thalassiosira and Trichodesmium.
Plant promoters
To ensure proper expression in plant cells, the components of the compositions and systems herein may be placed under the control of a plant promoter. A plant promoter is a promoter operable in a plant cell. Plant promoters are capable of initiating transcription in plant cells, whether or not their origin is a plant cell. The use of different types of promoters is envisaged.
In some examples, the plant promoter is a constitutive plant promoter, which is a promoter capable of expressing the Open Reading Frame (ORF) that it controls in all or nearly all plant tissues during all or nearly all developmental stages of the plant (referred to as "constitutive expression"). An example of a constitutive promoter is the cauliflower mosaic virus 35S promoter. In some examples, the plant promoter is a regulated promoter, which directs gene expression not constitutively but in a temporally and/or spatially regulated manner, and includes tissue-specific, tissue-preferred, and inducible promoters. Different promoters may direct expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. In some examples, the plant promoter is a tissue-preferred promoter that can be used to target enhanced expression in certain cell types within a particular plant tissue (e.g., vascular cells in leaves or roots, or specific cells of the seed).
Exemplary plant promoters include those obtained from plants, plant viruses, and bacteria (such as Agrobacterium or Rhizobium) that contain genes expressed in plant cells. Additional examples of promoters include those described in Kawamata et al., (1997) Plant Cell Physiol 38:792-803; Yamamoto et al., (1997) Plant J 12:255-65; Hire et al., (1992) Plant Mol Biol 20:207-18; Kuster et al., (1995) Plant Mol Biol 29:759-72; and Capana et al., (1994) Plant Mol Biol 25:681-91.
In some examples, the plant promoter may be an inducible promoter, which allows spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include acoustic energy, electromagnetic radiation, chemical energy, and/or thermal energy. Examples of inducible systems include tetracycline-inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcriptional activation systems (FKBP, ABA, etc.), or light-inducible systems (phytochrome, LOV domains, or cryptochrome), such as a Light Inducible Transcriptional Effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. In particular examples, the components of a light-inducible system include a TnpB polypeptide, a light-responsive cytochrome heterodimer (e.g., from Arabidopsis thaliana), and a transcriptional activation/repression domain.
In some examples, the promoter may be a chemically regulated promoter (in which application of an exogenous chemical induces gene expression) or a chemically repressible promoter (in which application of a chemical represses gene expression). Examples of chemically inducible promoters include the maize In2-2 promoter (activated by benzenesulfonamide herbicide safeners), the maize GST promoter (activated by hydrophobic electrophilic compounds used as pre-emergence herbicides), the tobacco PR-1a promoter (activated by salicylic acid), and promoters regulated by antibiotics (such as tetracycline-inducible and tetracycline-repressible promoters).
Stable integration into plant genome
In one embodiment, polynucleotides encoding components of the compositions and systems may be introduced for stable integration into the genome of a plant cell. In some cases, vectors or expression systems may be used for such integration. The design of the vector or expression system may be adjusted depending on when, where and under what conditions the nucleic acid component molecule and/or the TnpB polypeptide gene is to be expressed. In some cases, the polynucleotide may be integrated into an organelle of the plant (such as a plastid, mitochondrion, or chloroplast). The elements of the expression system may be located on one or more expression constructs that are either circular, such as plasmids or transformation vectors, or non-circular, such as linear double-stranded DNA.
In one embodiment, the integration method generally comprises the steps of: selecting a suitable host cell or host tissue, introducing the construct into the host cell or host tissue, and regenerating a plant cell or plant therefrom. In some examples, the expression system for stable integration into the plant cell genome may contain one or more of the following elements: a promoter element that can be used to express the RNA and/or TnpB polypeptide in a plant cell; a 5' untranslated region that enhances expression; an intron element that further enhances expression in certain cells (such as monocot cells); a multiple cloning site that provides convenient restriction sites for inserting the nucleic acid component molecule and/or TnpB polypeptide gene sequences and other desired elements; and a 3' untranslated region that provides efficient termination of the expressed transcript.
Transient expression in plants
In one embodiment, the components of the compositions and systems may be transiently expressed in plant cells. In some examples, the compositions and systems can modify the target nucleic acid only when both the nucleic acid component molecule and the TnpB polypeptide are present in the cell, such that genomic modification can be further controlled. Plants regenerated from such plant cells are typically free of foreign DNA because expression of the TnpB polypeptide is transient. In certain examples, the TnpB polypeptide is stably expressed and the nucleic acid component molecule sequence is transiently expressed.
DNA and/or RNA (e.g., mRNA) can be introduced into plant cells for transient expression. In such cases, the introduced nucleic acid may be provided in a sufficient amount to modify the cell, but not persist after a desired period of time has elapsed or after one or more cell divisions.
Transient expression may be achieved using a suitable vector. Exemplary vectors useful for transient expression include the pEAQ vector (which may be adapted for Agrobacterium-mediated transient expression) and cabbage leaf curl virus (CaLCuV), as well as those described in Sainsbury F et al., Plant Biotechnol J. 2009 Sep;7(7):682-93; and Yin K et al., Scientific Reports, volume 5, article number 14926 (2015).
Combinations of the different methods described above are also contemplated.
Translocation to and/or expression in specific plant organelles
The compositions and systems herein may comprise elements for translocation to and/or expression in a particular plant organelle.
Chloroplast targeting
In one embodiment, compositions and systems are contemplated for specifically modifying a chloroplast gene or ensuring expression in a chloroplast. Compositions and systems (e.g., TnpB polypeptides, nucleic acid components, or polynucleotides encoding the same) can be transformed, compartmentalized, and/or targeted to chloroplasts. In one example, introducing genetic modifications in the plastid genome can reduce biosafety concerns, such as gene flow through pollen.
Examples of chloroplast transformation methods include particle bombardment, PEG treatment, and microinjection, as well as translocation of the transformation cassette from the nuclear genome to the plastid. In some examples, targeting of chloroplasts may be achieved by incorporating into the expression construct a chloroplast localization sequence and/or a sequence encoding a chloroplast transit peptide (CTP) or plastid transit peptide, operably linked to the 5' region of the sequence encoding components of the compositions and systems. Additional examples of chloroplast transformation, targeting and localization include those described in WO2010061186; Protein Transport into Chloroplasts, 2010, Annual Review of Plant Biology, volume 61:157-180; and US20040142476, which are incorporated herein by reference in their entirety.
Exemplary applications in plants
The compositions, systems, and methods can be used to create genetic variations in a plant of interest (e.g., a crop). One or more nucleic acid components, e.g., a library of nucleic acid components, targeted to one or more locations in the genome may be provided and introduced into a plant cell along with the TnpB polypeptide. For example, a collection of genome-scale point mutations and gene knockouts can be generated. In some examples, the compositions, systems, and methods can be used to produce plant parts or plants from the cells so obtained and to screen them for a trait of interest. The target genes may include coding and non-coding regions. In some cases, the trait is stress tolerance and the method is a method for producing a stress-tolerant crop variety.
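As an illustration of how a library of candidate target sequences might be enumerated computationally, the following is a minimal sketch and not a method described in this document. It assumes, purely for illustration, that a usable target is a fixed-length stretch lying immediately 3' of a short recognition motif on the forward strand; the motif string, the target length, and the toy sequence are hypothetical placeholders.

```python
def candidate_targets(sequence, motif, target_len):
    """Return (position, target) pairs for each full-length target
    that lies immediately 3' of an occurrence of `motif`.

    Only the forward strand is scanned; `motif` and `target_len`
    are illustrative assumptions, not values from the document.
    """
    hits = []
    start = sequence.find(motif)
    while start != -1:
        target_start = start + len(motif)
        target = sequence[target_start:target_start + target_len]
        # Skip motifs too close to the end to leave a full-length target.
        if len(target) == target_len:
            hits.append((target_start, target))
        # Continue searching one base further so overlapping motifs are found.
        start = sequence.find(motif, start + 1)
    return hits


# Toy sequence (made up for the example, not a real genomic sequence).
toy_sequence = "ACGTTGATCCGGAATTCCGGAAGGTTGATAAACCCGGGTTTAAAC"
library = candidate_targets(toy_sequence, motif="TTGAT", target_len=10)
for position, target in library:
    print(position, target)
```

A practical pipeline would also scan the reverse complement and filter candidates for uniqueness in the genome; those steps are omitted here for brevity.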
In one embodiment, the compositions, systems and methods are used to modify endogenous genes or modify their expression. Expression of the components may induce targeted modification of the genome, either by direct activity of the TnpB polypeptide and optionally introduction of recombination template DNA, or by modification of the targeted genes. The different strategies described above allow TnpB polypeptide-mediated targeted genome editing without the need to introduce the components into the plant genome.
In some cases, modifications can be made without permanently introducing any foreign genes (including those encoding the composition components herein) into the plant genome, so that foreign DNA is not present in the genome of the plant. This may be of interest because regulatory requirements for non-transgenic plants are less stringent. Components that are transiently introduced into plant cells are typically removed upon crossing.
For example, the modification may be performed by transient expression of components of the compositions and systems. Transient expression may be achieved by delivering components of the compositions and systems with viral vectors, or by delivering them into protoplasts with the aid of particulate molecules (such as nanoparticles or cell-penetrating peptides (CPPs)).
Producing plants with desirable traits
The compositions, systems, and methods herein can be used to introduce a desired trait into a plant. The methods include introducing one or more foreign genes to confer a trait of interest, and editing or modulating endogenous genes to confer the trait of interest.
Agronomic traits
In one embodiment, crop plants may be improved by affecting specific plant traits. Examples of traits include improved agronomic traits such as herbicide resistance, disease resistance, abiotic stress tolerance, high yield and excellent quality, pesticide resistance, insect and nematode resistance, resistance to parasitic weeds, drought tolerance, nutritional value, stress tolerance, self-pollination inefficiency, feed digestibility, biomass, and grain yield.
In one embodiment, genes conferring resistance to pests or diseases can be introduced into plants. Where a plant has endogenous genes that confer such resistance, their expression and function may be enhanced (e.g., by introducing additional copies or modifications that enhance expression and/or activity).
Examples of genes conferring resistance include plant disease resistance genes (e.g., Cf-9, Pto, RPS2, SlDMR6-1), genes conferring pest resistance (e.g., those described in WO 96/305117), Bacillus thuringiensis proteins, lectins, vitamin-binding proteins (e.g., avidin), enzyme inhibitors (e.g., protease or proteinase inhibitors or amylase inhibitors), insect-specific hormones or pheromones (e.g., ecdysteroids or juvenile hormones, variants thereof, mimetics based thereon, or antagonists or agonists thereof) or genes involved in the production and regulation of such hormones and pheromones, insect-specific peptides or neuropeptides, insect-specific venoms (e.g., venom produced by snakes, wasps, etc., or analogs thereof), enzymes responsible for hyperaccumulation of monoterpenes, sesquiterpenes, steroids, hydroxamic acids, phenylpropanoid derivatives, or other non-protein molecules having pesticidal activity, enzymes involved in modification of biologically active molecules (e.g., glycolytic enzymes, proteolytic enzymes, lipolytic enzymes, nucleases, cyclases, transaminases, esterases, hydrolases, phosphatases, kinases, phosphorylases, polymerases, elastases, chitinases, and glucanases, whether natural or synthetic), molecules that stimulate signal transduction, viral-invasive proteins or complex toxins derived therefrom, development-arresting proteins produced in nature by pathogens or parasites, development-arresting proteins produced in nature by plants, or any combination thereof.
The compositions, systems, and methods can be used to identify, screen, introduce, or remove mutations or sequences that result in genetic variability that causes susceptibility to certain pathogens (e.g., host-specific pathogens). Such methods may result in plants that have non-host resistance (e.g., the host and pathogen are incompatible), that have partial resistance to all races of a pathogen (typically controlled by many genes), and/or that have complete resistance to some races of a pathogen but not to others.
In one embodiment, the compositions, systems and methods can be used to modify genes involved in plant diseases. Such genes may be removed, inactivated or otherwise regulated or modified. Examples of plant diseases include those described in [0045] - [0080] of US20140213619A1, which is incorporated herein by reference in its entirety.
In one embodiment, genes conferring resistance to herbicides can be introduced into plants. Examples of genes conferring herbicide resistance include: genes conferring resistance to herbicides that inhibit the growing point or meristem (such as an imidazolinone or a sulfonylurea); genes conferring glyphosate tolerance (e.g., resistance conferred by mutant 5-enolpyruvylshikimate-3-phosphate synthase genes, aroA genes, and glyphosate acetyltransferase (GAT) genes, respectively) or resistance to other phosphono compounds such as glufosinate (phosphinothricin acetyltransferase (PAT) genes from Streptomyces species, including Streptomyces hygroscopicus and Streptomyces viridochromogenes), and resistance to pyridinoxy or phenoxy propionic acids and cyclohexanones conferred by ACCase inhibitor-encoding genes; genes conferring resistance to herbicides that inhibit photosynthesis, such as a triazine (psbA and gs+ genes) or a benzonitrile (nitrilase gene), and glutathione S-transferase; genes encoding enzymes that detoxify the herbicide or a mutant glutamine synthase enzyme that is resistant to inhibition; genes encoding a phosphinothricin acetyltransferase (such as the bar or pat protein from Streptomyces species); and genes conferring resistance to hydroxyphenylpyruvate dioxygenase (HPPD) inhibitors, e.g., genes encoding naturally occurring HPPD-resistant enzymes or mutated or chimeric HPPD enzymes.
In one embodiment, genes involved in abiotic stress tolerance may be introduced into plants. Examples of such genes include genes capable of reducing expression and/or activity of the poly(ADP-ribose) polymerase (PARP) gene, transgenes capable of reducing expression and/or activity of PARG-encoding genes, genes encoding plant-functional enzymes of the nicotinamide adenine dinucleotide salvage synthesis pathway (including nicotinamidase, nicotinate phosphoribosyltransferase, nicotinic acid mononucleotide adenylyltransferase, nicotinamide adenine dinucleotide synthetase, or nicotinamide phosphoribosyltransferase), enzymes involved in carbohydrate biosynthesis, and enzymes involved in the production of polyfructose (e.g., inulin and levan type), alpha-1,6 branched alpha-1,4-glucans, alternan, or hyaluronan.
In one embodiment, genes that improve drought tolerance may be introduced into plants. Examples of such genes include ubiquitin protein ligase (UPL) protein UPL3, DR02, DR03, ABC transporters, and DREB1A.
Nutritionally improved plants
In one embodiment, the compositions, systems and methods can be used to produce nutritionally improved plants. In some examples, such plants may provide functional foods, such as modified foods or food ingredients that may provide health benefits beyond the traditional nutrients they contain. In certain examples, such plants may provide nutraceuticals, e.g., substances that may be considered foods or parts of foods and that provide health benefits, including prevention and treatment of diseases. Nutraceuticals are useful for the prevention and/or treatment of diseases in animals and humans, such as cancer, diabetes, cardiovascular diseases and hypertension.
The modified plants may naturally produce one or more desired compounds, and the modification may enhance the level or activity or quality of the compounds. In some cases, the modified plant may not naturally produce the compound, and the modification enables the plant to produce such compound. In some cases, the compositions, systems, and methods are used to indirectly modify the endogenous synthesis of such compounds, for example by modifying one or more transcription factors that control the metabolism of such compounds.
Examples of nutritionally modified plants include plants comprising: altered protein quality, content and/or amino acid composition, essential amino acid content, oils and fatty acids, carbohydrates, vitamins and carotenoids, functional secondary metabolites and minerals. In some examples, the modified plant may comprise or produce a compound having a health benefit. Examples of nutritionally modified plants include those described in Newell-McGloughlin, Plant Physiology, July 2008, volume 147, pages 939-953.
Examples of compounds that may be produced include carotenoids (e.g., alpha-carotene or beta-carotene), lutein, lycopene, zeaxanthin, dietary fibers (e.g., insoluble fibers, beta-glucan, soluble fibers), fatty acids (e.g., omega-3 fatty acids, conjugated linoleic acid, GLA), flavonoids (e.g., hydroxycinnamates, flavonols, catechins, and tannins), glucosinolates, indoles, isothiocyanates (e.g., glucoraphanin), phenols (e.g., stilbenes, caffeic acid and ferulic acid, epicatechin), phytostanols/sterols, levan, inulin, fructooligosaccharides, saponins, soy proteins, phytoestrogens (e.g., isoflavones, lignans), sulfides and thiols (such as diallyl sulfide, allyl methyl trisulfide, and dithiolthiones), tannins (such as proanthocyanidins), or any combination thereof.
The compositions, systems and methods may also be used to alter protein/starch functionality, shelf life, taste/aesthetics, and fiber quality, and to confer allergen-, anti-nutrient-, and toxin-reducing traits.
Examples of genes and nucleic acids that can be modified to introduce traits include stearoyl-ACP desaturase, DNA associated with a single allele that may lead to a maize mutant characterized by low phytic acid levels, the transcription factor (Tf) RAP2.2 and its interaction partner SINAT2, Tf Dof1, and the Dof Tf AtDof1.1 (OBP2).
Modification of polyploid plants
Compositions, systems, and methods can be used to modify polyploid plants. Polyploid plants carry duplicate copies of their genomes (e.g., as many as six, as in wheat). In some cases, the compositions, systems, and methods may be multiplexed to affect all copies of a gene or to target dozens of genes simultaneously. For example, the compositions, systems and methods may be used to simultaneously ensure loss-of-function mutations in different genes responsible for suppressing defenses against a disease. The modification may simultaneously inhibit expression of the TaMLO-A1, TaMLO-B1 and TaMLO-D1 nucleic acid sequences in a wheat plant cell, after which a wheat plant is regenerated from the cell, in order to ensure that the wheat plant is resistant to powdery mildew (e.g., as described in WO 2015109752).
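One way to affect all copies of a gene with a single targeting sequence is to look for stretches that are identical across every copy. The sketch below is illustrative only and is not taken from this document: it lists the subsequences of a chosen length shared by all supplied copies. The copy names and sequences are made-up toy data, not real TaMLO sequences, and the length is an arbitrary placeholder.

```python
def shared_kmers(sequences, k):
    """Return the sorted list of length-k subsequences present in
    every sequence of `sequences` (a dict of name -> DNA string)."""
    kmer_sets = []
    for seq in sequences.values():
        # Collect every window of length k from this copy.
        kmer_sets.append({seq[i:i + k] for i in range(len(seq) - k + 1)})
    if not kmer_sets:
        return []
    return sorted(set.intersection(*kmer_sets))


# Toy homoeologous copies (hypothetical sequences for illustration).
toy_copies = {
    "copy_A": "ATGGCGGAGGACGACGGGTAC",
    "copy_B": "ATGGCAGAGGACGACGGGTAT",
    "copy_D": "ATGGCGGAGGACGACGGCTAC",
}
common = shared_kmers(toy_copies, k=8)
print(common)
```

Candidates found this way would still need to be checked against whatever sequence requirements the chosen TnpB polypeptide imposes on its targets, and against the rest of the genome for off-target matches.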
Regulation of fruit ripening
The compositions, systems and methods can be used to regulate the ripening of fruit. Ripening is a normal phase in the maturation of fruits and vegetables. Only a few days after it starts, ripening may render a fruit or vegetable inedible, which can cause significant losses to farmers and consumers.
In one embodiment, the compositions, systems, and methods are used to reduce ethylene production. In some examples, the compositions, systems, and methods can be used to inhibit expression and/or activity of ACC synthase, insert an ACC deaminase gene or functional fragment thereof, insert a SAM hydrolase gene or functional fragment thereof, or inhibit ACC oxidase gene expression.
Alternatively or additionally, the compositions, systems, and methods can be used to modify ethylene receptors (e.g., inhibit ETR 1) and/or Polygalacturonase (PG). Inhibition of a gene may be achieved by introducing mutations, antisense sequences and/or truncated copies of the gene into the genome.
Prolonging the shelf life of plants
In one embodiment, the compositions, systems and methods are used to modify genes involved in the production of compounds that affect the shelf life of plants or plant parts. The modification may be in a gene that prevents accumulation of reducing sugars in potato tubers. After high temperature processing, these reducing sugars react with free amino acids, producing brown, bitter tasting products, and leading to elevated levels of acrylamide, a potential carcinogen. In particular embodiments, the methods provided herein are used to reduce or inhibit expression of a vacuolar invertase gene (VInv) encoding a protein that breaks down sucrose into glucose and fructose.
Reduction of allergens in plants
In one embodiment, the compositions, systems and methods are used to produce plants with reduced allergen levels, making them safer for consumers. To this end, the compositions, systems, and methods can be used to identify and modify (e.g., inhibit) one or more genes responsible for plant allergen production. Examples of such genes include Lol p5, as well as genes in peanuts, soybeans, lentils, peas, lupins, green beans, and mung beans, such as those described in Nicolaou et al., Current Opinion in Allergy and Clinical Immunology 2011;11(3):222, which is incorporated herein by reference in its entirety.
Production of Male sterile plants
The compositions, systems, and methods can be used to produce male sterile plants. Hybrid plants generally have advantageous agronomic traits compared to inbred plants. However, for self-pollinating plants, the generation of hybrid varieties can be challenging. Genes important for plant fertility, more particularly male fertility, have been identified in different plant types (e.g., maize and rice). Plants genetically engineered in this way may be used in hybrid breeding programs.
The compositions, systems and methods may be used to modify genes involved in male fertility, e.g., to inactivate genes required for male fertility (such as by introducing mutations). Examples of genes involved in male fertility include the cytochrome P450-like gene (MS26) or the meganuclease gene (MS45), and those described in Wan X et al., Mol Plant. 2019 Mar 4;12(3):321-342; and Kim YJ et al., Trends Plant Sci. 2018 Jan;23(1):53-65.
Extending the fertility stage of plants
In one embodiment, the compositions, systems and methods can be used to prolong the fertility stage of plants (such as rice). For example, a rice fertility stage gene, such as Ehd3, can be targeted to produce a mutation in the gene, and regenerated plantlets with a prolonged fertility stage can be selected.
Early production of product
In one embodiment, the compositions, systems, and methods can be used to produce early yields of product. For example, the flowering process may be regulated by mutating a flowering repressor gene such as SP5G. Examples of such methods include those described in Soyk S et al., Nat Genet. 2017 Jan;49(1):162-168.
Oil and biofuel production
The compositions, systems and methods can be used to produce plants for the production of oil and biofuels. Biofuel includes fuels made from plants and plant-derived sources. Biofuel may be extracted from organic matter whose energy is obtained by a carbon fixation process or made by the use or conversion of biomass. Such biomass may be used directly in biofuels or may be converted into convenient energetic materials by thermal, chemical and biochemical conversion. Such biomass conversion may produce fuel in solid, liquid or gaseous form. Biofuels include bioethanol and biodiesel. Bioethanol can be produced by a sugar fermentation process of cellulose (starch), which can be derived from corn and sugar cane. Biodiesel can be produced from oil crops such as rapeseed, palm and soybean. The biofuel may be used for transportation.
Production of plants for the production of vegetable oils and biofuels
The compositions, systems, and methods can be used to produce algae (e.g., diatoms) and other plants (e.g., grapes) that express or over-express high levels of oil or biofuel.
In some cases, the compositions, systems, and methods can be used to modify genes involved in modifying lipid quantity and/or lipid quality. Examples of such genes include genes involved in the fatty acid synthesis pathway, such as those encoding acetyl-CoA carboxylase, fatty acid synthase, 3-ketoacyl-acyl carrier protein synthase III, glycerol-3-phosphate dehydrogenase (G3PDH), enoyl-acyl carrier protein reductase (enoyl-ACP reductase), glycerol-3-phosphate acyltransferase, lysophosphatidic acyltransferase or diacylglycerol acyltransferase, phospholipid:diacylglycerol acyltransferase, phosphatidic acid phosphatase, fatty acid thioesterases such as palmitoyl protein thioesterase, or malic enzyme activity.
In other embodiments, it is contemplated to produce diatoms with increased lipid accumulation. This can be achieved by targeting genes that reduce lipid catabolism. Examples of such genes include genes involved in activation of triacylglycerols and free fatty acids and in beta-oxidation of fatty acids, such as genes for acyl-CoA synthetase, 3-ketoacyl-CoA thiolase, acyl-CoA oxidase activity, and phosphoglucomutase.
In some examples, the algae can be modified to produce oils and biofuels, including fatty acids (e.g., fatty acid esters such as fatty acid methyl esters (FAME) and fatty acid ethyl esters (FAEE)). Examples of methods of modifying microalgae include those described in Stovicek et al., Metab. Eng. Comm., 2015;2:1; U.S. Patent No. 8,945,839; and International Patent Publication No. WO 2015/086795.
In some examples, one or more genes can be introduced into (e.g., overexpressed in) a plant (e.g., an alga) to produce oil and biofuel (e.g., fatty acids) from a carbon source (e.g., an alcohol). Examples of such genes include genes encoding: acyl-CoA synthases, ester synthases, thioesterases (e.g., tesA, 'tesA, tesB, fatB, fatB2, fatB3, fatA1 or fatA), acyl-CoA synthases (e.g., fadD, jadK, BH3103, pfl-4354, EAV15023, fadD1, fadD2, RPC_4074, fadDD35, fadDD22, faa39), and ester synthases (e.g., synthase/acyl-CoA:diacylglycerol acyltransferase from jojoba (Simmondsia chinensis), Acinetobacter species ADP, Alcanivorax borkumensis, Pseudomonas aeruginosa, Fundibacter jadensis, Arabidopsis thaliana or Alkaligenes eutrophus, or variants thereof).
Additionally or alternatively, one or more genes in a plant (e.g., an alga) may be inactivated (e.g., expression of the genes is reduced). For example, one or more mutations may be introduced into a gene. Examples of such genes include genes encoding: acyl-CoA dehydrogenases (e.g., fadE), outer membrane protein receptors, transcriptional regulators (e.g., repressors) of fatty acid biosynthesis (e.g., fabR), pyruvate formate lyases (e.g., pflB), and lactate dehydrogenases (e.g., ldhA).
Organic acid production
In one embodiment, the plant may be modified to produce an organic acid, such as lactic acid. Plants can use sugars, such as pentose or hexose sugars, to produce organic acids. To this end, one or more genes may be introduced into (e.g., and overexpressed in) a plant. Examples of such genes include LDH genes.
In some examples, one or more genes may be inactivated (e.g., expression of the genes is reduced). For example, one or more mutations may be introduced into a gene. Genes may include those encoding proteins involved in endogenous metabolic pathways that produce metabolites other than the organic acid of interest and/or wherein the endogenous metabolic pathway consumes the organic acid.
Examples of genes that can be modified or introduced include those encoding: pyruvate decarboxylase (pdc), fumaric acid reductase, alcohol dehydrogenase (adh), acetaldehyde dehydrogenase, phosphoenolpyruvate carboxylase (ppc), D-lactate dehydrogenase (D-ldh), L-lactate dehydrogenase (L-ldh), lactate 2-monooxygenase, lactate dehydrogenase, cytochrome-dependent lactate dehydrogenase (e.g., cytochrome B2-dependent L-lactate dehydrogenase).
Enhancing plant properties of biofuel production
In one embodiment, the compositions, systems and methods are used to alter the properties of plant cell walls to facilitate access by key hydrolyzing agents, so that sugars are released more efficiently for fermentation. By reducing the proportion of lignin in the plant, the proportion of cellulose can be increased. In certain embodiments, lignin biosynthesis in plants may be down-regulated to increase fermentable carbohydrates.
In some examples, one or more lignin biosynthesis genes may be down-regulated. Examples of such genes include those encoding 4-coumarate 3-hydroxylase (C3H), phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), hydroxycinnamoyl transferase (HCT), caffeic acid O-methyltransferase (COMT), caffeoyl-CoA 3-O-methyltransferase (CCoAOMT), ferulate 5-hydroxylase (F5H), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl-CoA reductase (CCR), 4-coumarate-CoA ligase (4CL), monolignol-lignin-specific glycosyltransferase and aldehyde dehydrogenase (ALDH), and those described in WO 2008064289.
In some examples, plant mass that produces lower levels of acetic acid during fermentation may be produced. To this end, genes involved in polysaccharide acetylation (e.g., Cas1L and those described in WO 2010096488) can be inactivated.
Other microorganisms for oil and biofuel production
In one embodiment, microorganisms other than plants may be used to produce oil and biofuels using the compositions, systems, and methods herein. Examples of microorganisms include those of the following genera: Escherichia, Bacillus, Lactobacillus, Rhodococcus, Synechococcus, Synechocystis, Pseudomonas, Aspergillus, Trichoderma, Neurospora, Fusarium, Humicola, Rhizomucor, Kluyveromyces, Pichia, Mucor, Myceliophthora, Penicillium, Phanerochaete, Pleurotus, Trametes, Chrysosporium, Saccharomyces, Stenotrophomonas, Schizosaccharomyces or Yarrowia.
Plant cultivation and regeneration
In one embodiment, the modified plant or plant cell may be cultured to regenerate a whole plant having the transformed or modified genotype and thus the desired phenotype. Examples of regeneration techniques include those that rely on manipulation of certain plant hormones in tissue culture growth medium, those that rely on biocide and/or herbicide markers introduced together with the desired nucleotide sequences, and regeneration from cultured protoplasts, plant calli, explants, organs, pollen, embryos or parts thereof.
Detection of modifications in the plant genome: selectable markers
When the compositions, systems and methods are used to modify plants, the modifications made in the plants can be identified and detected using suitable methods. In some examples, when multiple modifications are made, one or more desired modifications or traits resulting from the modifications may be selected and detected. Detection and validation can be performed by biochemical and molecular biological techniques such as Southern analysis, PCR, Northern blotting, S1 RNase protection, primer extension or reverse transcriptase-PCR, enzyme assays, ribozyme activity, gel electrophoresis, Western blotting, immunoprecipitation, enzyme-linked immunoassay, in situ hybridization, enzyme staining, and immunostaining.
In some cases, one or more markers (such as a selectable marker and a detectable marker) may be introduced into the plant. Such markers can be used to select, monitor, and isolate cells and plants having desired modifications and traits. The selectable marker may confer positive or negative selection and may or may not be conditional on the presence of an external substrate. Examples of such markers include genes and proteins conferring resistance to antibiotics such as hygromycin (hpt) and kanamycin (nptII), genes conferring resistance to herbicides such as glufosinate (bar) and chlorsulfuron (als), and enzymes capable of producing or processing colored substances (e.g., beta-glucuronidase, luciferase, B or C1 genes).
Use in fungi
The compositions, systems and methods described herein can be used to perform efficient and cost-effective genetic or genomic interrogation or editing or manipulation in fungi or fungal cells (such as yeast). Methods and applications in plants can also be applied to fungi.
The fungal cell may be any type of eukaryotic cell within the kingdom Fungi, which includes the phyla Ascomycota, Basidiomycota, Blastocladiomycota, Chytridiomycota, Glomeromycota, Microsporidia and Neocallimastigomycota. Examples of fungi or fungal cells include yeasts, molds and filamentous fungi.
In one embodiment, the fungal cell is a yeast cell. A yeast cell refers to any fungal cell within the phyla Ascomycota and Basidiomycota. Examples of yeasts include budding yeast, fission yeast and mold, Saccharomyces cerevisiae (S. cerevisiae), Kluyveromyces marxianus or Issatchenkia orientalis, Candida species (e.g., Candida albicans), Yarrowia species (e.g., Yarrowia lipolytica), Pichia species (e.g., Pichia pastoris), Kluyveromyces species (e.g., Kluyveromyces lactis and Kluyveromyces marxianus), Neurospora species (e.g., Neurospora crassa), Fusarium species (e.g., Fusarium oxysporum) and Issatchenkia species (e.g., Issatchenkia orientalis, also known as Pichia kudriavzevii and Candida acidothermophilum).
In one embodiment, the fungal cell is a filamentous fungal cell that grows in filaments, such as hyphae or mycelia. Examples of filamentous fungal cells include Aspergillus species (e.g., Aspergillus niger), Trichoderma species (e.g., Trichoderma reesei), Rhizopus species (e.g., Rhizopus oryzae) and Mortierella species (e.g., Mortierella isabellina).
In one embodiment, the fungal cell is an industrial strain. "Industrial strains" include any fungal cell strain used in or isolated from an industrial process (e.g., producing a product on a commercial or industrial scale). An industrial strain may refer to a fungal species commonly used in industrial processes, or it may refer to an isolate of a fungal species that may also be used for non-industrial purposes (e.g., laboratory research). Examples of industrial processes include fermentation (e.g., in the production of food or beverage products), distillation, biofuel production, production of compounds, and production of polypeptides. Examples of industrial strains include, but are not limited to, JAY270 and ATCC4124.
In one embodiment, the fungal cell is a polyploid cell whose genome is present in more than one copy. Polyploid cells include cells naturally occurring in a polyploid state as well as cells that have been induced to exist in a polyploid state (e.g., by specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). A polyploid cell may be a cell whose entire genome is polyploid, or a cell that is polyploid at a particular genomic locus of interest. In some examples, the abundance of nucleic acid component molecules may more often be a rate-limiting factor in genome engineering of polyploid cells than of haploid cells, and thus methods using the compositions described herein may benefit from the use of certain fungal cell types.
In one embodiment, the fungal cell is a diploid cell whose genome is present in two copies. Diploid cells include cells naturally occurring in a diploid state as well as cells that have been induced to exist in a diploid state (e.g., by specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). A diploid cell may refer to a cell whose entire genome is diploid, or it may refer to a cell that is diploid in a particular genomic locus of interest.
In one embodiment, the fungal cell is a haploid cell, the genome of which is present in one copy. Haploid cells include cells that naturally occur in a haploid state and cells that have been induced to exist in a haploid state (e.g., by specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). A haploid cell may refer to a cell whose entire genome is haploid, or it may refer to a cell that is haploid in a particular genomic locus of interest.
The compositions and systems, as well as nucleic acids encoding them, can be introduced into fungal cells using the delivery systems and methods herein. Examples of delivery systems include lithium acetate treatment, bombardment, electroporation, and those described in Kawai et al., 2010, Bioeng Bugs. 2010 Nov-Dec; 1(6):395-403.
In some examples, yeast expression vectors (e.g., those having one or more regulatory elements) may be used. Such vectors may include centromere (CEN) sequences, autonomously replicating sequences (ARS), promoters (such as an RNA polymerase III promoter operably linked to a sequence or gene of interest), terminators (such as an RNA polymerase III terminator), origins of replication, and marker genes (e.g., auxotrophic, antibiotic, or other selectable markers). Examples of expression vectors for yeast may include plasmids, yeast artificial chromosomes, 2 μ plasmids, yeast integrative plasmids, yeast replicative plasmids, shuttle vectors, and episomal plasmids.
Fungal biofuel and material production
In one embodiment, the compositions, systems, and methods can be used to produce modified fungi for biofuel and material production. For example, the modified fungi are useful for producing biofuels or biopolymers from fermentable sugars and are optionally capable of degrading plant-derived lignocellulose from agricultural waste as a source of fermentable sugars. Foreign genes required for biofuel production and synthesis can be introduced into fungi. In some examples, the gene may encode an enzyme involved in the conversion of pyruvate to ethanol or another product of interest, the degradation of cellulose (e.g., a cellulase), or an endogenous metabolic pathway that competes with the biofuel production pathway.
In some examples, the compositions, systems, and methods can be used to produce and/or select yeast strains with improved xylose or cellobiose utilization, isoprenoid biosynthesis, and/or lactic acid production. One or more genes involved in the metabolism and synthesis of these compounds may be modified and/or introduced into yeast cells. Examples of methods and genes include lactate dehydrogenase, PDC1 and PDC5, and those described in Ha, S.J. et al. (2011) Proc. Natl. Acad. Sci. USA 108(2):504-9; Galazka, J.M. et al. (2010) Science 330(6000):84-6; T et al., Metab Eng. 2015 Mar; 28:213-222; and Stovick V, et al., FEMS Yeast Res. 2017 Aug 1; 17(5).
Improved plant and yeast cells
The present disclosure also provides improved plants and fungi. The improved fungi may comprise one or more genes introduced by and/or modified by the compositions, systems, and methods herein. The improved plants and fungi may have increased food or feed yield (e.g., higher protein, carbohydrate, nutrient, or vitamin levels), oil and biofuel yield (e.g., methanol, ethanol), tolerance to pests, herbicides, drought, low or high temperatures, excess water, and the like.
The plant or fungus may have one or more modified parts, such as leaves, stems, roots, tubers, seeds, endosperm, ovules and pollen. The parts may be viable, non-viable, regenerable and/or non-regenerable.
The modified plants and fungi may include gametes, seeds, embryos, zygotes or somatic cells, offspring and/or hybrids of the modified plants and fungi. The offspring may be clones of the plant or fungus produced, or may result from sexual reproduction by crossing with other individuals of the same species to introgress the further desired trait into the offspring. In the case of multicellular organisms, in particular plants, the cells may be in vivo or ex vivo.
Further application in plants
Further applications of the compositions, systems and methods in plants and fungi include visualization of genetic element dynamics (e.g., as described in Chen B, et al., Cell. 2013 Dec 19; 155(7):1479-91), targeted gene disruption in vivo and in vitro (e.g., as described in Malina A et al., Genes Dev. 2013 Dec 1; 27(23):2602-14), epigenetic modifications such as those using a fusion of a TnpB polypeptide and a histone-modifying enzyme (e.g., as described in Rusk N, Nat Methods. 2014 Jan; 11(1):28), identification of transcriptional regulators (e.g., as described in Waldrip ZJ, Epigenetics. 2014; 9(9):1207-11), antiviral treatment of RNA and DNA viruses (e.g., as described in Price AA et al., Proc Natl Acad Sci U S A. 2015; 112(19):6164-9; Raman V et al., Sci Rep. 2015; 5:10833), alteration of genomic complexity such as chromosome number (e.g., as described in Karimi-ash iyani R et al., Proc Natl Acad Sci U S A. 2019; 112(36):11211-6; Anton T et al., Nucleus. 2014 Mar-Apr; 5(2):163-72), self-cleavage of the compositions for controlled inactivation/activation (e.g., as described in Sugano et al., Plant Physiol. 2014), development of kits for multiplex genome editing (as described in Xing HL et al., BMC Plant Biol. 2014 Nov 29; 14:327), starch production (as described in KH et al., Front Plant Sci. 2015; 6:247), targeting of multiple genes in families or pathways (as described in Ma X et al., Mol Plant. 2015; 8(8):1274-84), regulation of non-coding genes and sequences (as described in Lowder LG et al., Plant Physiol. 2015; 169(2):971-85), gene editing in trees (as described in Belhaj K et al., Plant Methods. 2013; 9(1):39; Harrison et al., Genes Dev. 2014 Sep 1; 28(17):1859-72; phytoplasm X et al, 2015, and 298.301), and introduction of specific mutations to obtain resistance to host-specific pests.
Additional examples of plant and fungal modifications that may be performed using the compositions, systems and methods include similar modifications described in International patent publication Nos. WO2016/099887, WO2016/025131, WO2016/073433, WO2017/066175, WO2017/100158, WO 2017/105991, WO2017/106414, WO2016/100272, WO2016/100571, WO 2016/100568, WO 2016/100562 and WO 2017/019867.
Use in non-human animals
The compositions, systems and methods are useful for studying and modifying non-human animals, e.g., to introduce desirable traits and disease resilience, treat diseases, facilitate breeding, etc. In one embodiment, the compositions, systems and methods can be used to improve breeding and introduce desirable traits, e.g., by increasing the frequency of trait-associated alleles, introgressing alleles from other breeds/species without linkage drag, and creating favorable alleles de novo. Genes and other genetic elements that can be targeted can be screened and identified. Examples of applications and methods include those described in Tait-Burkard C et al., Livestock 2.0 - genome editing for fitter, healthier, and more productive farmed animals. Genome Biol. 2018 Nov 26; 19:204; Lillico S, Agricultural applications of genome editing in farmed animals. Transgenic Res. 2019 Aug; 28(Suppl 2):57-60; and Houston RD et al., Harnessing genomics to fast-track genetic improvement in aquaculture. Nat Rev Genet. 2020 Apr 16. doi:10.1038/s41576-020-0227-y, which are incorporated herein by reference in their entirety. Applications described in other sections (such as therapeutic applications, diagnostic applications, etc.) may also be used with the animals herein.
The compositions, systems and methods are useful for animals, such as fish, amphibians, reptiles, mammals and birds. The animals may be farm and agricultural animals or pets. Examples of farm and agricultural animals include horses, goats, sheep, pigs, cattle, llamas, alpacas and birds, such as chickens, turkeys, ducks and geese. The animal may be a non-human primate such as a baboon, pigtail, chimpanzee, marmoset, macaque, gold lion marmoset, spider monkey, squirrel monkey, or black long tail monkey. Examples of pets include dogs, cats, horses, wolves, rabbits, ferrets, gerbils, hamsters, chinchillas, guinea pigs, canaries, parakeets, and parrots.
In one embodiment, one or more genes can be introduced into an animal (e.g., overexpressed therein) to obtain or enhance one or more desired traits. Growth hormone or insulin-like growth factor 1 (IGF-1) may be introduced to increase growth in animals (e.g., pigs or salmon) (such as described in Pursel VG et al., J Reprod Fertil Suppl. 1990; 40:235-45; Waltz E, Nature. 2017; 548:148). The Fat-1 gene (e.g., from Caenorhabditis elegans) can be introduced, for example, in pigs to produce a higher ratio of n-3 to n-6 fatty acids (such as described in Li M et al., Genetics. 2018; 8:1747-54). Phytase (e.g., from Escherichia coli), xylanase (e.g., from Aspergillus niger), or beta-glucanase (e.g., from Bacillus licheniformis) may be introduced, for example, in pigs to reduce environmental impact by reducing phosphorus and nitrogen release (such as described in Golovan SP et al., Nat Biotechnol. 2001; 19:741-5; Zhang X et al., eLife. 2018). Nucleic acid component decoys may be introduced, for example, in chickens to induce avian influenza resilience (such as described in Lyall et al., Science. 2011; 331:223-6). Lysozyme or lysostaphin can be introduced, for example, in goats and cows to induce mastitis resilience (such as described in Maga EA et al., Foodborne Pathog Dis. 2006; 3:384-92; Wall RJ et al., Nat Biotechnol. 2005; 23:445-51). Histone deacetylases such as HDAC6 can be introduced, for example, in pigs to induce PRRSV resilience (such as described in Lu T. et al., PLoS One. 2017; 12:e0169317). CD163 can be modified (e.g., inactivated or removed) in pigs to introduce PRRSV resilience (such as described in Prather RS et al., Sci Rep. 2017 Oct 17; 7(1):13371).
Similar methods can be used to inhibit or remove viruses and bacteria that may be transmitted from animals to humans (e.g., swine influenza virus (SIV) strains, which include influenza C and the influenza A subtypes known as H1N1, H1N2, H2N1, H3N2, and H2N3, as well as agents that cause pneumonia, meningitis, and oedema).
In one embodiment, one or more genes may be modified or edited to obtain disease resistance and production traits. Myostatin (e.g., GDF8) can be modified to increase muscle growth, for example, in cows, sheep, goats, catfish and pigs (such as described in Crispo M et al., PLoS One. 2015; 10:e0136690; Wang X et al., Anim Genet. 2018; 49:43-51; Khalil K et al., Sci Rep. 2017; 7:7301; Kang J-D et al., RSC Adv. 2017; 7:12541-9). Pc POLLED can be modified, for example, in cows to induce polledness (such as described in Carlson DF et al., Nat Biotechnol. 2016; 34:479-81). KISS1R may be modified, for example, in pigs to prevent boar taint (hormone release during sexual maturation, resulting in undesirable meat taste). Dead-end protein (dnd) can be modified, for example, in salmon to induce sterility (such as described in Wargelius A et al., Sci Rep. 2016; 6:21284). Nano2 and DDX can be modified, for example, in pigs and chickens to induce sterility (e.g., in a surrogate host) (such as described in Park K-E et al., Sci Rep. 2017; 7:40176; Taylor L et al., Development. 2017; 144:928-34). CD163 may be modified, for example, in pigs to induce PRRSV resistance (such as described in Whitworth KM et al., Nat Biotechnol. 2015; 34:20-2). RELA may be modified, for example, in pigs to induce ASFV resilience (such as described in Lillico SG et al., Sci Rep. 2016; 6:21645). CD18 may be modified, for example, in cows to induce resilience to Mannheimia (Pasteurella) haemolytica (such as described in Shanthalingam S et al., Proc Natl Acad Sci U S A. 2016; 113:13186-90). NRAMP1 may be modified, for example, in cows to induce tuberculosis resilience (such as described in Gao Y et al., Genome Biol. 2017; 18:13). Endogenous retroviral genes can be modified or deleted for xenotransplantation (such as described in Yang L et al., Science. 2015; 350:1101-4; Niu D et al., Science. 2017; 357:1303-7).
Negative regulators of muscle mass (e.g., myostatin) can be modified (e.g., inactivated) in dogs, for example, to increase muscle mass (such as described in Zou Q et al., J Mol Cell Biol. 2015 Dec; 7(6):580-3).
Animals such as pigs that have severe combined immunodeficiency (SCID) can be generated (e.g., by modification of RAG2) to provide a useful model for regenerative medicine, xenotransplantation (also discussed elsewhere herein), and tumor development. Examples of methods and approaches include those described in Lee K et al., Proc Natl Acad Sci U S A. 2014 May 20; 111:7260-5; and Schomberg et al., FASEB Journal. 2016 Apr; 30(1) Suppl:571.1.
SNPs in animals can be modified. Examples of methods and approaches include those described in Tan W. et al., Proc Natl Acad Sci U S A. 2013 Oct 8; 110(41):16526-31; and Mali P et al., Science. 2013 Feb 15; 339(6121):823-6.
Stem cells (e.g., induced pluripotent stem cells) can be modified and differentiated into desired progeny cells, e.g., as described in Heo YT et al., Stem Cells Dev. 2015 Feb 1; 24(3):393-402.
A profiling analysis (such as Igenity) may be performed on the animals to screen and identify genetic variations associated with economic traits. Genetic variations may be modified to introduce or improve traits such as carcass composition, carcass quality, maternal and reproductive traits, and average daily gain.
Detection composition and detection method
In another aspect, embodiments disclosed herein relate to polynucleotide detection compositions, systems, and methods. The detection composition can comprise any of the TnpB polypeptides discussed above and any one or more omega RNAs. In addition, the compositions and systems may comprise a detection construct. In one exemplary embodiment, the detection construct comprises at least a portion of a single stranded polynucleotide. The one or more omega RNAs are configured to bind a target sequence on a target polynucleotide. Binding of the TnpB complex to the target sequence activates the TnpB cleavage activity and may further activate a collateral cleavage activity of the TnpB, whereby the TnpB subsequently cleaves non-target single stranded polynucleotides in an omega RNA independent manner. Thus, the detection construct may be configured such that a detectable signal is generated upon cleavage of the single stranded portion of the detection construct, thereby indicating the presence of the target sequence in the sample. Exemplary detection constructs are discussed in more detail below. In other exemplary embodiments, the composition may further comprise an amplification reagent. The amplification reagents may comprise the primers and the polymerase and/or reverse transcriptase necessary to amplify the target sequence. In one exemplary embodiment, the amplification reagent is an isothermal amplification reagent. In other exemplary embodiments, the compositions and systems may further comprise a rapid extraction solution that allows target sequences to be detected in a crude sample, or with minimal purification, prior to amplification and/or detection.
Detection construct
The systems and methods described herein comprise a detection construct. As used herein, a "detection construct" refers to a molecule that can be cleaved or otherwise inactivated by an activated TnpB system protein described herein. The term "detection construct" may alternatively be referred to as a "masking construct". Depending on the nuclease activity of the TnpB protein and the method utilized, the masking construct may be an RNA-based masking construct or a DNA-based masking construct. The nucleic acid-based masking construct comprises a nucleic acid element cleavable by a TnpB protein. Cleavage of the nucleic acid element releases an agent, or produces a conformational change, that allows a detectable signal to be produced. Exemplary constructs demonstrating how nucleic acid elements can be used to prevent or mask the generation of a detectable signal are described below, and embodiments of the invention include variants thereof. The masking construct blocks the generation or detection of a positive detectable signal prior to cleavage, or when the masking construct is in an "active" state. In one embodiment, the detection construct is designed with a cleavage motif for a particular TnpB protein.
It should be appreciated that in certain exemplary embodiments, minimal background signal may be generated in the presence of the active masking construct. The positive detectable signal may be any signal that can be detected using optical, fluorescent, chemiluminescent, electrochemical, or other detection methods known in the art. The term "positive detectable signal" is used to distinguish it from other detectable signals that may be detectable in the presence of the masking construct. For example, in one embodiment, a first signal (i.e., a negative detectable signal) can be detected when a masking agent is present or when the TnpB system has not been activated, which is then converted to a second signal (e.g., a positive detectable signal) upon detection of the target molecule and cleavage or inactivation of the masking agent or upon activation of the TnpB protein. The positive detectable signal, then, is the signal detected upon activation of the TnpB protein and, in a colorimetric or fluorometric assay, may be a decrease in fluorescence or color relative to the control or an increase in fluorescence or color relative to the control, depending on the configuration of the lateral flow matrix, and as further described herein.
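As a minimal sketch of how a positive detectable signal might be called relative to a control, the following Python snippet compares a sample reading with a no-target control and reports a positive result when the reading departs from the control by at least a chosen fold-change in the expected direction. The function name, the default threshold of 2.0, and the direction labels are illustrative assumptions and are not taken from this disclosure.

```python
def call_positive(sample, control, direction="increase", min_fold=2.0):
    """Return True if the sample reading differs from the control reading
    by at least min_fold in the expected direction.

    direction is "increase" when activation raises the signal (e.g., a
    fluorophore released from its quencher) and "decrease" when activation
    lowers it. All names and defaults here are hypothetical.
    """
    if sample <= 0 or control <= 0:
        raise ValueError("readings must be positive")
    if direction == "increase":
        return sample / control >= min_fold
    if direction == "decrease":
        return control / sample >= min_fold
    raise ValueError("direction must be 'increase' or 'decrease'")
```

For example, with hypothetical readings, call_positive(900.0, 100.0) reports a positive result for an assay in which activation increases the signal, whereas call_positive(40.0, 100.0, direction="decrease") does so for an assay in which activation decreases it.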
In certain exemplary embodiments, the masking construct may comprise an HCR initiation sequence and a cleavage motif, or a cleavable structural element, such as a loop or hairpin, that prevents the initiator from eliciting an HCR reaction. The cleavage motif may be preferentially cleaved by one of the activated TnpB effector proteins. When the cleavage motif or structural element is cleaved by the activated TnpB protein, the initiator is then released to trigger the HCR reaction, detection of which indicates the presence of one or more targets in the sample. In certain exemplary embodiments, the masking construct comprises a hairpin having an RNA loop. When the activated TnpB protein cleaves RNA loops, an initiator can be released to trigger the HCR reaction.
In certain exemplary embodiments, the masking construct can inhibit the production of a gene product. The gene product may be encoded by a reporter construct added to the sample. The masking construct may be an interfering RNA, such as a short hairpin RNA (shRNA) or a small interfering RNA (siRNA), that is involved in an RNA interference pathway. The masking construct may also comprise a microRNA (miRNA). When present, the masking construct inhibits expression of the gene product. The gene products may be fluorescent proteins or other RNA transcripts or proteins that can be detected by labeled probes, aptamers, or antibodies if no masking construct is present. After effector protein activation, the masking construct is cleaved or otherwise silenced, allowing expression and detection of the gene product as a positive detectable signal. In a preferred embodiment, the masking construct comprises two or more detectable signals, such as fluorescent signals that can be read on different channels of a fluorometer.
In particular embodiments, the masking construct comprises a silencing RNA that inhibits the production of a gene product encoded by the reporter construct, wherein the gene product produces a detectable positive signal upon expression.
In certain exemplary embodiments, the masking construct may sequester one or more reagents required to generate a detectable positive signal, such that release of the one or more reagents from the masking construct results in the generation of a detectable positive signal. The one or more reagents may be combined to produce a colorimetric signal, a chemiluminescent signal, a fluorescent signal, or any other detectable signal, and may comprise any reagent known to be suitable for such purposes. In certain exemplary embodiments, the one or more reagents are sequestered by an RNA or DNA aptamer that binds to the one or more reagents. When the effector protein is activated after detection of the target molecule and the RNA or DNA aptamer is degraded, the one or more reagents are released.
In certain exemplary embodiments, the masking construct can be immobilized on a solid substrate in an individual discrete volume (further defined below) and sequester a single reagent. For example, the reagent may be a bead comprising a dye. When sequestered by the immobilized reagent, the individual beads are too dispersed to produce a detectable signal, but are capable of producing a detectable signal upon release from the masking construct, for example by aggregation or a simple increase in solution concentration. In certain exemplary embodiments, the immobilized masking agent is an RNA or DNA-based aptamer that can be cleaved by an activated effector protein upon detection of the target molecule.
In certain other exemplary embodiments, the masking construct binds to the immobilized reagent in solution, thereby blocking the ability of the reagent to bind to the free, individually labeled binding partner in solution. Thus, after the washing step is applied to the sample, the labeled binding partner may be washed away from the sample in the absence of the target molecule. However, if the effector protein is activated, the masking construct is cleaved to a degree sufficient to interfere with the ability of the masking construct to bind the reagent, thereby allowing the labeled binding partner to bind to the immobilized reagent. Thus, the labeled binding partner remains present after the washing step, indicating the presence of the target molecule in the sample. In certain aspects, the masking construct that binds the immobilization agent is a DNA or RNA aptamer. The immobilization reagent may be a protein and the labeled binding partner may be a labeled antibody. Alternatively, the immobilization reagent may be streptavidin and the labeled binding partner may be labeled biotin. The label on the binding partner used in the above embodiments may be any detectable label known in the art. Furthermore, other known binding partners may be used in accordance with the overall designs described herein.
In certain exemplary embodiments, the masking construct may comprise a ribozyme. Ribozymes are RNA molecules that have catalytic properties. Natural and engineered ribozymes comprise or consist of RNAs that can be targeted by effector proteins disclosed herein. The ribozyme may be selected or engineered to catalyze a reaction that produces a negative detectable signal or that prevents the production of a positive control signal. When the ribozyme is inactivated by the activated effector protein, the reaction that generates the negative control signal, or that prevents the generation of a positive detectable signal, is removed, thereby allowing the generation of a positive detectable signal. In one exemplary embodiment, the ribozyme may catalyze a colorimetric reaction, thereby causing the solution to assume a first color. When the ribozyme is deactivated, the solution changes to a second color, which is a detectable positive signal. Examples of how ribozymes can be used to catalyze colorimetric reactions are described in Zhao et al., "Signal amplification of glucosamine-6-phosphate based on ribozyme glmS," Biosens Bioelectron. 2014; 16:337-42, which also provides examples of how such systems may be modified to function in the context of the embodiments disclosed herein. Alternatively, when present, the ribozyme may produce cleavage products of, for example, RNA transcripts. Thus, detecting a positive detectable signal may comprise detecting uncleaved RNA transcripts that are only produced in the absence of ribozymes.
In one embodiment, the masking construct may be a ribozyme that produces a negative detectable signal, and wherein a positive detectable signal is produced when the ribozyme is inactivated.
In certain exemplary embodiments, the one or more reagents are proteins, such as enzymes, that are capable of promoting the generation of a detectable signal (such as a colorimetric, chemiluminescent, or fluorescent signal) and that are inhibited or sequestered by the binding of one or more DNA or RNA aptamers, such that the protein is unable to generate the detectable signal. Upon activation of the effector proteins disclosed herein, the DNA or RNA aptamers are cleaved or degraded to the point that they no longer inhibit the ability of the protein to produce a detectable signal. In certain exemplary embodiments, the aptamer is a thrombin inhibitor aptamer. In certain exemplary embodiments, the thrombin inhibitor aptamer has the sequence GGGAACAAAGCUGAAGUACUUACCC (SEQ ID NO:64,308). When this aptamer is cleaved, thrombin becomes active and cleaves a peptide colorimetric or fluorogenic substrate. In certain exemplary embodiments, the colorimetric substrate is p-nitroaniline (pNA) covalently linked to a peptide substrate of thrombin. After cleavage by thrombin, pNA is released and turns yellow, which is readily visible to the naked eye. In certain exemplary embodiments, the fluorogenic substrate is 7-amino-4-methylcoumarin, a blue fluorophore that can be detected using a fluorescence detector. Inhibitory aptamers can also be used with horseradish peroxidase (HRP), beta-galactosidase, or calf alkaline phosphatase (CAP), in accordance with the general principles described above.
In one embodiment, RNase or DNase activity is detected colorimetrically via cleavage of an enzyme-inhibiting aptamer. One potential mode of converting DNase or RNase activity into a colorimetric signal is to couple cleavage of a DNA or RNA aptamer with reactivation of an enzyme capable of producing a colorimetric output. In the absence of RNA or DNA cleavage, the intact aptamer will bind to the enzyme target and inhibit its activity. The advantage of this readout system is that the enzyme provides an additional amplification step: once released from the aptamer via a collateral activity (e.g., TnpB collateral activity), the colorimetric enzyme will continue to produce a colorimetric product, resulting in signal multiplication.
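The enzyme-mediated amplification step described above can be illustrated with a simple kinetic sketch. The Python snippet below assumes, purely for illustration, that the inhibited enzyme is released by first-order aptamer cleavage and that each released enzyme turns over a saturating substrate at a constant rate; the function names, rate constants, and units are hypothetical and are not values from this disclosure.

```python
import math

def free_enzyme(e_total, k_release, t):
    """Amount of enzyme released from the aptamer after time t, assuming
    first-order cleavage of the aptamer with rate constant k_release."""
    return e_total * (1.0 - math.exp(-k_release * t))

def product_formed(e_total, k_release, k_cat, t):
    """Colorimetric product accumulated by time t, assuming saturating
    substrate so that each free enzyme turns over at rate k_cat.
    This is the integral of k_cat * free_enzyme(...) from 0 to t."""
    return k_cat * e_total * (t - (1.0 - math.exp(-k_release * t)) / k_release)
```

Under these assumptions, with e_total = 1.0, k_release = 0.1 per minute and k_cat = 10 per minute, the product accumulated after 60 minutes is several hundred times the amount of enzyme released, which is the signal multiplication referred to above.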
In one embodiment, an existing aptamer that inhibits an enzyme with a colorimetric readout is used. Several aptamer/enzyme pairs with colorimetric readouts exist, such as those for thrombin, protein C, neutrophil elastase and subtilisin. These proteases have pNA-based colorimetric substrates and are commercially available. In one embodiment, a novel aptamer that targets a common colorimetric enzyme is used. Common and robust enzymes, such as β-galactosidase, horseradish peroxidase, or calf intestinal alkaline phosphatase, can be targeted by engineered aptamers designed by selection strategies such as SELEX. Such strategies allow for rapid selection of aptamers with nanomolar binding affinities and can be used to develop additional enzyme/aptamer pairs for colorimetric readout.
In one embodiment, the masking construct may be a DNA or RNA aptamer and/or may comprise a DNA- or RNA-tethered inhibitor.
In one embodiment, the masking construct may comprise a DNA or RNA oligonucleotide linked to a detectable ligand and a masking component.
In one embodiment, RNase or DNase activity is detected colorimetrically via cleavage of an RNA-tethered inhibitor. Many common colorimetric enzymes have competitive, reversible inhibitors: for example, beta-galactosidase can be inhibited by galactose. Many of these inhibitors are weak, but their effect can be increased by increasing the local concentration. By linking the local concentration of the inhibitor to DNase or RNase activity, colorimetric enzyme and inhibitor pairs can be engineered into DNase and RNase sensors. Colorimetric DNase or RNase sensors based on small molecule inhibitors involve three components: a colorimetric enzyme, an inhibitor, and a bridging RNA or DNA that is covalently linked to both the inhibitor and the enzyme, thereby tethering the inhibitor to the enzyme. In the uncleaved configuration, the enzyme is inhibited by the increased local concentration of the small molecule; when the DNA or RNA is cleaved (e.g., by TnpB collateral cleavage), the inhibitor is released and the colorimetric enzyme is activated.
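The effect of local inhibitor concentration described above can be sketched with the standard Michaelis-Menten rate law for competitive inhibition, v = Vmax·S / (Km·(1 + I/Ki) + S). The Python snippet below is an illustration under assumed values only: the function name, the kinetic constants, and the effective local and bulk inhibitor concentrations are hypothetical and are not taken from this disclosure.

```python
def relative_activity(substrate, km, inhibitor, ki):
    """Enzyme velocity relative to the uninhibited velocity for a
    competitive, reversible inhibitor (all concentrations in mol/L)."""
    return (km + substrate) / (km * (1.0 + inhibitor / ki) + substrate)

# Hypothetical values: a weak inhibitor (Ki = 1 mM) held at an effective
# local concentration of 10 mM by the tether, versus the same inhibitor
# diluted to 10 nM in bulk solution after the bridging nucleic acid is cut.
tethered = relative_activity(substrate=1e-4, km=1e-4, inhibitor=1e-2, ki=1e-3)
released = relative_activity(substrate=1e-4, km=1e-4, inhibitor=1e-8, ki=1e-3)
```

With these assumed numbers, the tethered enzyme retains roughly one sixth of its uninhibited activity, whereas the enzyme is essentially fully active once the inhibitor has been released.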
In one embodiment, the aptamer or the DNA- or RNA-tethered inhibitor may sequester an enzyme, wherein the enzyme, upon release from the aptamer or the DNA- or RNA-tethered inhibitor, produces a detectable signal by acting on a substrate. In one embodiment, the aptamer may be an inhibitory aptamer that inhibits the enzyme and prevents the enzyme from catalyzing the generation of a detectable signal from a substrate. In one embodiment, the DNA- or RNA-tethered inhibitor may inhibit the enzyme and prevent the enzyme from catalyzing the generation of a detectable signal from a substrate.
In one embodiment, RNase activity is detected colorimetrically via the formation and/or activation of G-quadruplexes. A G-quadruplex in DNA can complex with heme (iron (III)-protoporphyrin IX) to form a deoxyribozyme (DNAzyme) with peroxidase activity. When a peroxidase substrate (e.g., ABTS (2,2'-azinobis[3-ethylbenzothiazoline-6-sulfonic acid]-diammonium salt)) is provided, the G-quadruplex-heme complex causes oxidation of the substrate in the presence of hydrogen peroxide, which then forms a green color in solution. An exemplary G-quadruplex forming DNA sequence is: GGGTAGGGCGGGTTGGGA (SEQ ID NO:64,309). By hybridizing an additional DNA or RNA sequence (referred to herein as a "staple") to this DNA aptamer, the formation of the G-quadruplex structure will be limited. After collateral activation, the staple will be cleaved, allowing the G-quadruplex to form and bind heme. This strategy is particularly attractive because color formation is enzymatic, which means that there is an additional amplification effect beyond collateral activation.
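As a hedged illustration of the staple concept, the Python sketch below checks a DNA sequence against a commonly used heuristic pattern for putative G-quadruplex-forming sequences (four runs of at least three guanines separated by loops of one to seven nucleotides) and derives an RNA strand that is the full reverse complement of the sequence. Both the pattern and the choice of a full-length reverse complement as the staple are assumptions made for illustration; this disclosure does not specify the staple length or position.

```python
import re

# Heuristic motif for a putative G-quadruplex: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")

def has_g4_motif(dna):
    """Return True if the DNA sequence contains the heuristic G4 motif."""
    return G4_PATTERN.search(dna.upper()) is not None

def rna_staple(dna):
    """Return the RNA reverse complement of a DNA sequence (5' to 3'),
    used here as a hypothetical full-length staple."""
    complement = {"A": "U", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))

g4_sequence = "GGGTAGGGCGGGTTGGGA"  # exemplary sequence given above
staple = rna_staple(g4_sequence)
```

For the exemplary sequence above, has_g4_motif returns True and the hypothetical staple is UCCCAACCCGCCCUACCC.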
In one embodiment, the masking construct may comprise an RNA oligonucleotide designed to bind to a G-quadruplex formation sequence, wherein the G-quadruplex structure is formed from the G-quadruplex formation sequence after cleavage of the masking construct, and wherein the G-quadruplex structure produces a detectable positive signal.
In one exemplary embodiment, the masking construct comprises a detection agent that changes color depending on whether the detection agent is aggregated or dispersed in the solution. For example, certain nanoparticles (such as colloidal gold) undergo a visible purple-to-red color change when moving from an aggregate to a dispersed particle. Thus, in certain exemplary embodiments, such detection agents may remain aggregated by one or more bridging molecules. At least a portion of the bridging molecule comprises RNA or DNA. Upon activation of the effector proteins disclosed herein, the RNA or DNA portion of the bridging molecule is cleaved, allowing the detection agent to disperse and resulting in a corresponding color change. In certain exemplary embodiments, the detection agent is a colloidal metal. The colloidal metal material may comprise water insoluble metal particles or metal compounds dispersed in a liquid, hydrosol or metal sol. The colloidal metal may be selected from the metals of groups IA, IB, IIB and IIIB of the periodic table of elements, and transition metals, particularly those of group VIII. Preferred metals include gold, silver, aluminum, ruthenium, zinc, iron, nickel, and calcium. Other suitable metals also include metals in all of the various oxidation states: lithium, sodium, magnesium, potassium, scandium, titanium, vanadium, chromium, manganese, cobalt, copper, gallium, strontium, niobium, molybdenum, palladium, indium, tin, tungsten, rhenium, platinum, and gadolinium. The metal is preferably provided in ionic form, derived from suitable metal compounds, such as Al3+, Ru3+, Zn2+, Fe3+, Ni2+ and Ca2+ ions.
The foregoing color change is observed when the RNA or DNA bridge is cleaved by the activated TnpB polypeptide. In certain exemplary embodiments, the particles are colloidal metals. In certain other exemplary embodiments, the colloidal metal is colloidal gold. In certain exemplary embodiments, the colloidal nanoparticle is a 15 nm gold nanoparticle (AuNP). Due to the unique surface properties of colloidal gold nanoparticles, maximum absorbance is observed at 520 nm when they are fully dispersed in solution, and they appear red to the naked eye. Upon aggregation, AuNPs exhibit a red shift in maximum absorbance and darken in color, eventually precipitating out of solution as dark purple aggregates. In certain exemplary embodiments, the nanoparticles are modified to include DNA linkers extending from the surface of the nanoparticle. The individual particles are linked together by single-stranded RNA (ssRNA) or single-stranded DNA (ssDNA) bridges, each end of which hybridizes to at least a portion of a DNA linker. Thus, the nanoparticles will form a network of linked particles and aggregate, appearing as a dark precipitate. Upon activation of the TnpB polypeptides disclosed herein, the ssRNA or ssDNA bridges will be cleaved, releasing the AuNPs from the linked network and producing a visible red color. Exemplary DNA linkers and bridging sequences are listed below. Thiol linkers at the ends of the DNA linkers can be used to conjugate them to the surface of the AuNPs. Other forms of conjugation may be used. In certain exemplary embodiments, two AuNP populations may be generated, one for each DNA linker. This helps promote proper binding of the ssRNA bridges in the correct orientation. In certain exemplary embodiments, the first DNA linker is conjugated through its 3' end and the second DNA linker is conjugated through its 5' end.
In certain other exemplary embodiments, the masking construct may comprise an RNA or DNA oligonucleotide to which are attached a detectable label and a masking agent for that detectable label. An example of such a detectable label/masking agent pair is a fluorophore and a quencher of the fluorophore. Quenching of a fluorophore may occur due to the formation of a non-fluorescent complex between the fluorophore and another fluorophore or non-fluorescent molecule. This mechanism is referred to as ground state complex formation, static quenching or contact quenching. Thus, the RNA or DNA oligonucleotides can be designed such that the fluorophore and quencher are close enough for contact quenching to occur. Fluorophores and cognate quenchers thereof are known in the art and may be selected for this purpose by one of ordinary skill in the art. In the context of the present invention, the particular fluorophore/quencher pair is not critical, provided that the chosen fluorophore/quencher pair ensures masking of the fluorophore. Upon activation of the effector proteins disclosed herein, the RNA or DNA oligonucleotides are cleaved, thereby eliminating the proximity between the fluorophore and the quencher required to maintain the contact quenching effect. Thus, detection of the fluorophore can be used to determine the presence of a target molecule in a sample.
In certain other exemplary embodiments, the masking construct may comprise one or more RNA oligonucleotides that have one or more metal nanoparticles, such as gold nanoparticles, attached thereto. In one embodiment, the masking construct comprises a plurality of metal nanoparticles crosslinked by a plurality of RNA or DNA oligonucleotides forming a closed loop. In one embodiment, the masking construct comprises three gold nanoparticles crosslinked by three RNA or DNA oligonucleotides forming a closed loop. In one embodiment, cleavage of the RNA or DNA oligonucleotide by the TnpB protein results in a detectable signal from the metal nanoparticle.
In certain other exemplary embodiments, the masking construct may comprise one or more RNA or DNA oligonucleotides having one or more quantum dots attached thereto. In one embodiment, cleavage of the RNA or DNA oligonucleotide by the TnpB protein results in the generation of a detectable signal by a quantum dot.
In one exemplary embodiment, the masking construct may comprise quantum dots. Quantum dots can have multiple linker molecules attached to the surface. At least a portion of the linker molecule comprises RNA or DNA. The linker molecule is attached to the quantum dot at one end and to one or more quenchers along the length of the linker or at its other end, such that the quenchers remain sufficiently close for quenching of the quantum dot to occur. The linker may be branched. As mentioned above, the particular quantum dot/quencher pair is not critical, provided that the chosen quantum dot/quencher pair ensures masking of the fluorophore. Quantum dots and their cognate quenchers are known in the art and can be selected for this purpose by one of ordinary skill in the art. Upon activation of the effector proteins disclosed herein, the RNA or DNA portion of the linker molecule is cleaved, thereby eliminating the proximity between the quantum dot and one or more quenchers required to maintain the quenching effect. In certain exemplary embodiments, the quantum dots are streptavidin conjugated. RNA or DNA is attached via biotin linkers and recruits quenching molecules, with the sequence /5Biosg/UCUCGUACGUUC/3IAbRQSp/ (SEQ ID NO: 64,310) or /5Biosg/UCUCGUACGUUCUCUCGUACGUUC/3IAbRQSp/ (SEQ ID NO: 64,311), wherein /5Biosg/ is a biotin tag and /3IAbRQSp/ is an Iowa Black quencher (Iowa Black RQ). Upon cleavage by the activated effectors disclosed herein, the quantum dots will fluoresce visibly.
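The slash-delimited notation used for the quencher-bearing oligonucleotides above can be split into its 5' modification, sequence, and 3' modification. The following is a minimal sketch, assuming the notation always has the form /5'-modification/sequence/3'-modification/; the function name `parse_modified_oligo` and the returned field names are hypothetical and are not part of the disclosure.

```python
import re

# Hypothetical helper (not from the disclosure): splits a slash-delimited
# modified-oligonucleotide string into its 5' modification, sequence, and
# 3' modification. Assumes exactly one modification at each end.
_OLIGO_PATTERN = re.compile(r"^/(5[^/]+)/([ACGUT]+)/(3[^/]+)/$")


def parse_modified_oligo(notation: str) -> dict:
    # Remove any stray whitespace introduced by text extraction.
    compact = "".join(notation.split())
    match = _OLIGO_PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unrecognized oligo notation: {notation!r}")
    five_prime, sequence, three_prime = match.groups()
    return {
        "five_prime": five_prime,
        "sequence": sequence,
        "three_prime": three_prime,
        "length": len(sequence),
    }


short_linker = parse_modified_oligo("/5Biosg/UCUCGUACGUUC/3IAbRQSp/")
long_linker = parse_modified_oligo("/5Biosg/UCUCGUACGUUCUCUCGUACGUUC/3IAbRQSp/")
```

Parsed this way, the second linker is seen to be a tandem repeat of the 12-nucleotide sequence of the first, carrying the same biotin tag and quencher.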
In particular embodiments, the detectable ligand may be a fluorophore and the masking component may be a quencher molecule.
In a similar manner, fluorescence resonance energy transfer (FRET) can be used to generate a detectable positive signal. FRET is a non-radiative process by which energy from an energetically excited fluorophore (i.e., a "donor fluorophore") raises the energy state of an electron in another molecule (i.e., an "acceptor") to higher vibrational levels of the excited singlet state. The donor fluorophore returns to the ground state without emitting the fluorescence characteristic of that fluorophore. The acceptor may be another fluorophore or a non-fluorescent molecule. If the acceptor is a fluorophore, the transferred energy is emitted as fluorescence characteristic of that fluorophore. If the acceptor is a non-fluorescent molecule, the absorbed energy is lost as heat. Thus, in the context of embodiments disclosed herein, the fluorophore/quencher pair is replaced by a donor fluorophore/acceptor pair attached to an oligonucleotide molecule. When intact, the masking construct produces a first signal (negative detectable signal), detected as the fluorescence or heat emitted from the acceptor. Upon activation of the effector proteins disclosed herein, the RNA oligonucleotide is cleaved and FRET is disrupted, such that fluorescence of the donor fluorophore (a positive detectable signal) is now detected.
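The distance dependence that makes cleavage detectable can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the pair. This relation is general background rather than something stated in this disclosure, and the 5 nm Förster radius and the distances in the sketch below are assumed values chosen only for illustration.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency from the standard Forster relation.

    E = 1 / (1 + (r / R0)**6), where r is the donor-acceptor distance and
    R0 is the distance at which transfer is 50% efficient.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Assumed, illustrative values: R0 = 5 nm, donor and acceptor held 2.5 nm
# apart on the intact oligonucleotide and 10 nm apart after cleavage.
intact_efficiency = fret_efficiency(2.5, 5.0)
cleaved_efficiency = fret_efficiency(10.0, 5.0)
```

Because of the sixth-power dependence, a modest increase in separation after cleavage is enough to switch the readout from acceptor emission to donor emission.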
In certain exemplary embodiments, the masking construct comprises an intercalating dye that changes its absorbance in response to cleavage of long RNA or DNA into short nucleotides. Several such dyes exist. For example, pyronine-Y will complex with RNA and form a complex with absorbance at 572 nm. Cleavage of the RNA results in loss of absorbance and a color change. Methylene blue can be used in a similar manner, with absorbance at 688 nm changing upon RNA cleavage. Thus, in certain exemplary embodiments, the masking construct comprises an RNA and intercalating dye complex whose absorbance changes after cleavage of the RNA by an effector protein disclosed herein.
In certain exemplary embodiments, the masking construct may comprise an initiator of a hybridization chain reaction (HCR). See, e.g., Dirks and Pierce. PNAS 101, 15275-15278 (2004). HCR reactions utilize the potential energy stored in two hairpin species. When a single-stranded initiator having a portion complementary to a corresponding region on one of the hairpins is released into the previously stable mixture, it opens a hairpin of one species. This in turn exposes a single-stranded region that opens a hairpin of the other species, which in turn exposes a single-stranded region identical to the original initiator. The resulting chain reaction may lead to the formation of a nicked duplex that grows until the hairpin supply is exhausted. Detection of the resulting product may be performed on a gel or by colorimetry. Exemplary colorimetric detection methods include, for example, those disclosed in Lu et al., "Ultra-sensitive colorimetric assay system based on the hybridization chain reaction-triggered enzyme cascade amplification" ACS Appl Mater Interfaces, 2017, 9(1):167-175; Wang et al., "An enzyme-free colorimetric assay using hybridization chain reaction amplification and split aptamers" Analyst 2015, 150, 7657-7662; and Song et al., "Non-covalent fluorescent labeling of hairpin DNA probe coupled with hybridization chain reaction for sensitive DNA detection." Applied Spectroscopy, 70(4): 686-694 (2016).
In certain exemplary embodiments, the masking construct inhibits the generation of a detectable positive signal until cleaved or modified by the activated TnpB protein. In one embodiment, the masking construct can inhibit the generation of a detectable positive signal by masking the detectable positive signal or alternatively generating a detectable negative signal.
Amplification reagent
In certain exemplary embodiments, target RNA and/or DNA can be amplified prior to activating a CRISPR effector protein. Any suitable RNA or DNA amplification technique may be used. In certain exemplary embodiments, the RNA or DNA amplification is isothermal amplification. In certain exemplary embodiments, isothermal amplification may be nucleic acid sequence-based amplification (NASBA), recombinase polymerase amplification (RPA), loop-mediated isothermal amplification (LAMP), strand displacement amplification (SDA), helicase-dependent amplification (HDA), or nicking enzyme amplification reaction (NEAR). In certain exemplary embodiments, non-isothermal amplification methods may be used, including, but not limited to, PCR, multiple displacement amplification (MDA), rolling circle amplification (RCA), ligase chain reaction (LCR), or ramification amplification method (RAM).
In certain exemplary embodiments, the RNA or DNA amplification is NASBA, which is initiated from reverse transcription of the target RNA by a sequence-specific reverse primer to produce an RNA/DNA duplex. RNase H is then used to degrade the RNA template, allowing a forward primer containing a promoter, such as the T7 promoter, to bind and initiate extension of the complementary strand, thereby producing a double stranded DNA product. RNA polymerase promoter-mediated transcription of the DNA template then creates copies of the target RNA sequence. Importantly, each new target RNA can be detected by the guide RNA, thereby further increasing the sensitivity of the assay. Binding of the guide RNA to the target RNA then results in activation of the CRISPR effector protein and the method proceeds as described above. Another advantage of NASBA reactions is the ability to be performed under moderate isothermal conditions, e.g., at about 41 ℃, making them suitable for deployment in systems and devices for early direct detection at sites remote from the clinical laboratory.
In certain other exemplary embodiments, a recombinase polymerase amplification (RPA) reaction can be used to amplify the target nucleic acid. The RPA reaction employs a recombinase that is able to pair sequence-specific primers with homologous sequences in duplex DNA. If target DNA is present, DNA amplification is initiated and no other sample manipulation, such as thermal cycling or chemical melting, is required. The entire RPA amplification system is stable as a dry formulation and can be safely transported without refrigeration. The RPA reaction may also be carried out at isothermal temperatures, with an optimum reaction temperature of 37-42 ℃. Sequence-specific primers are designed to amplify a sequence comprising the target nucleic acid sequence to be detected. In certain exemplary embodiments, an RNA polymerase promoter, such as a T7 promoter, is added to one of the primers. This produces an amplified double-stranded DNA product comprising the target sequence and the RNA polymerase promoter. After or during the RPA reaction, an RNA polymerase is added that will produce RNA from the double-stranded DNA template. The amplified target RNA can then in turn be detected by the CRISPR effector system. In this way, the embodiments disclosed herein can be used to detect target DNA. The RPA reaction can also be used to amplify target RNA. The target RNA is first converted to cDNA using reverse transcriptase, followed by second-strand DNA synthesis, at which point the RPA reaction proceeds as described above.
In embodiments of the invention, nicking enzyme-based amplification may be included. The nicking enzyme may be a CRISPR protein. Thus, the introduction of nicks into dsDNA can be programmable and sequence specific. FIG. 115 depicts an embodiment of the invention that starts with two guides designed to target opposite strands of a dsDNA target. According to the invention, the nicking enzyme may be Cpf1, C2c1, Cas9 or any ortholog or CRISPR protein that cleaves or is engineered to cleave a single strand of a DNA duplex. The nicked strands may then be extended by a polymerase. In embodiments, the positions of the nicks are selected such that the polymerase extends the strands toward the central portion of the target duplex DNA between the nick sites. In certain embodiments, primers capable of hybridizing to the extended strands are included in the reaction, followed by further polymerase extension of the primers to regenerate two dsDNA fragments: a first dsDNA comprising the first-strand Cpf1 guide site or both the first- and second-strand Cpf1 guide sites, and a second dsDNA comprising the second-strand Cpf1 guide site or both the first- and second-strand Cpf1 guide sites. These fragments continue to be nicked and extended in a cycling reaction that exponentially amplifies the target region between the nicking sites.
Amplification may be isothermal, and the temperature may be selected. In one embodiment, the amplification is performed rapidly at 37 degrees Celsius. In other embodiments, the temperature of isothermal amplification may be chosen by selecting a polymerase (e.g., Bsu, Bst, Phi29, Klenow fragment, etc.) that operates at a different temperature.
Thus, whereas conventional nicking isothermal amplification techniques use a nicking enzyme with a fixed sequence preference (e.g., in a nicking enzyme amplification reaction, or NEAR), which requires denaturation of the original dsDNA target to allow annealing and extension of primers that add the nicking substrate to the ends of the target, use of a CRISPR nicking enzyme, whose nicking sites can be programmed via guide RNAs, means that no denaturation step is required, so the entire reaction can be truly isothermal. This also simplifies the reaction, because the primers that add the nicking substrate are different from the primers used later in the reaction, meaning that NEAR requires two sets of primers (i.e., 4 primers) whereas Cpf1 nicking amplification requires only one set of primers (i.e., 2 primers). This makes Cpf1 nicking amplification simpler and easier to operate, without the need for complex instrumentation to perform denaturation followed by cooling to the isothermal temperature.
Thus, in certain exemplary embodiments, the systems disclosed herein can include amplification reagents. Described herein are different components or reagents that can be used for nucleic acid amplification. For example, an amplification reagent as described herein may include a buffer, such as Tris buffer. Tris buffers may be used at any concentration suitable for the desired application or use, including for example, but not limited to, concentrations of 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 11mM, 12mM, 13mM, 14mM, 15mM, 25mM, 50mM, 75mM, 1M, etc. One skilled in the art will be able to determine the appropriate concentration of buffer such as Tris for use with the present invention.
Other components of biological or chemical reactions may include cell lysis components to disrupt or lyse cells so that the materials therein can be analyzed. The cell lysis component may include, but is not limited to, detergents and salts as described above, such as NaCl, KCl, ammonium sulfate [(NH4)2SO4] or others. Detergents suitable for use in the present invention may include Triton X-100, sodium dodecyl sulfate (SDS), CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), ethyltrimethylammonium bromide, and nonylphenoxy polyethoxyethanol (NP-40). The concentration of the detergent may depend on the particular application and may be specific to the reaction in some cases. The amplification reaction may include dNTPs and nucleic acid primers used at any concentration suitable for the present invention, such as concentrations including, but not limited to, 100nM, 150nM, 200nM, 250nM, 300nM, 350nM, 400nM, 450nM, 500nM, 550nM, 600nM, 650nM, 700nM, 750nM, 800nM, 850nM, 900nM, 950nM, 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 20mM, 30mM, 40mM, 50mM, 60mM, 70mM, 80mM, 90mM, 100mM, 150mM, 200mM, 250mM, 300mM, 350mM, 400mM, 450mM, 500mM, etc. Likewise, the polymerases useful according to the present invention can be any specific or universal polymerase known in the art and useful in the present invention, including Taq polymerase, Q5 polymerase, and the like.
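Reaching one of the listed working concentrations from a more concentrated stock is a matter of the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the function name `stock_volume` and the example stock concentrations and reaction volume are assumptions, not values taken from the disclosure.

```python
def stock_volume(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2.

    Both concentrations must be in the same unit (e.g., mM); the result is
    in the same unit as final_volume (e.g., microliters).
    """
    if stock_conc <= 0 or final_volume <= 0 or final_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    return final_conc * final_volume / stock_conc


# Assumed example: 10 mM Tris in a 50 uL reaction from a 1 M (1000 mM) stock.
tris_ul = stock_volume(stock_conc=1000.0, final_conc=10.0, final_volume=50.0)

# Assumed example: 500 nM primer in 50 uL from a 10 uM (10000 nM) stock.
primer_ul = stock_volume(stock_conc=10000.0, final_conc=500.0, final_volume=50.0)
```

With these assumed inputs the calculation gives 0.5 µL of Tris stock and 2.5 µL of primer stock per 50 µL reaction; the remaining volume is made up by the other reaction components and water.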
In some embodiments, amplification reagents as described herein may be suitable for hot-start amplification. In some embodiments, hot-start amplification may be beneficial to reduce or eliminate dimerization of adapter molecules or oligonucleotides, or otherwise prevent unwanted amplification products or artifacts and obtain optimal amplification of the desired product. Many of the components described herein for amplification may also be used for hot-start amplification. In some embodiments, reagents or components suitable for use with hot-start amplification may be used in place of one or more of the composition components. For example, a polymerase or other reagent that exhibits the desired activity at a particular temperature or under other reaction conditions may be used. In some embodiments, reagents designed or optimized for hot-start amplification may be used; e.g., the polymerase may be activated after transposition or after reaching a specific temperature. Such polymerases may be antibody-based or aptamer-based. Polymerases as described herein are known in the art. Examples of such reagents may include, but are not limited to, hot-start polymerases, hot-start dNTPs, and photocaged dNTPs. Such reagents are known and available in the art. One skilled in the art will be able to determine the optimal temperature for each reagent.
Amplification of nucleic acids may be performed using a particular thermal cycling machine or apparatus, and may be performed in a single reaction or in batches, such that any desired number of reactions may be performed simultaneously. In some embodiments, amplification may be performed using a microfluidic or robotic device, or amplification may be performed using a manual change in temperature to achieve a desired amplification. In some embodiments, optimization may be performed to obtain optimal reaction conditions for a particular application or material. Those skilled in the art will understand and be able to optimize the reaction conditions to obtain sufficient amplification.
In certain embodiments, detection of DNA with the methods or systems of the invention entails transcription of (amplified) DNA into RNA prior to detection.
It is apparent that the detection methods of the present invention may involve various combinations of nucleic acid amplification and detection procedures. The nucleic acid to be detected may be any naturally occurring or synthetic nucleic acid, including but not limited to DNA and RNA, which may be amplified by any suitable method to provide an intermediate that can be detected. Detection of the intermediate may be performed by any suitable method, including but not limited to binding and activation of a CRISPR protein that produces a detectable signal moiety through direct or collateral activity.
LAMP-based isothermal amplification
In certain exemplary embodiments, the LAMP amplification reagents may comprise primers for SARS-CoV-2. LAMP reagents may also include colorimetric and/or fluorescent detection reagents such as hydroxynaphthol blue (see, e.g., Goto, M. et al., Colorimetric detection of loop-mediated isothermal amplification reaction by using hydroxy naphthol blue. Biotechniques, 2009. 46(3): pages 167-72), leuco triphenylmethane dyes (see, e.g., Miyamoto, S. et al., Method for colorimetric detection of double-stranded nucleic acid using leuco triphenylmethane dye. Anal Biochem, 2015. 473: pages 28-33), and pH-sensitive dyes (see, e.g., Tanner, N.A., Y. Zhang, and T.C. Evans, Jr., Visual detection of isothermal nucleic acid amplification using pH-sensitive dyes. Biotechniques, 2015. 58(2): pages 59-68); and fluorescence detection (see, e.g., Yu et al., Clinical Chemistry, hvaa102, doi:10.1093/clinchem/hvaa102, May 12, 2020), including the use of quenching probes (see, e.g., Shirato et al., J Virol Methods. 2018 Aug; 258:41-48. doi:10.1016/j.jviromet.2018.05.006). A review of LAMP methods (including OSD-LAMP) for sequence-specific detection is provided in Becherer et al., Anal. Methods, 2020, 12, 717-746, doi:10.1039/C9AY02246E, incorporated herein by reference.
In embodiments, the LAMP amplification reagents may comprise an oligonucleotide strand displacement (OSD) probe. As used herein, an oligonucleotide strand displacement probe is also referred to as an OSD probe or a one-step strand displacement probe. The general concept of using OSD strand exchange is depicted in FIG. 1 of Bhadra et al., High-surety isothermal amplification and detection of SARS-CoV-2, including with crude enzymes, doi:10.1101/2020.04.13.039941. OSD probes rely on the binding enthalpy between the target-binding probe and the amplicon of the LAMP reaction to drive a strand-exchange reaction, which produces an easily readable change in fluorescent signal. Thus, the results of the LAMP reaction can be read visually or optically by means of a fluorescent OSD probe.
In one aspect, the OSD probe comprises a sequence specific for a target molecule. OSD probes may comprise pre-hybridized nucleic acid strands, wherein one strand is 5, 6, 7, 8, 9, 10, 11, 12 or 13 nucleotides longer than the strand to which it is hybridized, allowing sequence-specific interaction with a complementary target, whereupon the OSD probe undergoes strand exchange and produces a change in fluorescent signal.
In one aspect, the OSD probe is provided at the following concentrations: about 50nM to 200nM, about 75nM to 150nM, or less than or equal to 200nM, 190nM, 180nM, 170nM, 160nM, 150nM, 140nM, 130nM, 120nM, 110nM, 100nM, 90nM, 80nM, 75nM, 65nM or 50nM. The probe may be designed to be complementary to the loop region between the F1c and F2 primer binding sites of the LAMP primers, which may be referred to as the long toehold region. The length of the complementary portion may be between about 9 and 14 nucleotides, more preferably 11-12 nucleotides. In one aspect, the longer strand of the OSD probe is labeled with a fluorescent molecule at the 5' or 3' end of the strand. In one aspect, the label is disposed on the end opposite the designed complementary target region (the long toehold region). The short strand is prepared with a quencher at one end and can be designed to contain a region complementary to a portion of the long strand. OSD probes may be provided as part of a LAMP reagent as described herein, which may include their use on any device or cartridge, or in any composition, as provided herein, including in some cases as a lyophilized reagent.
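The two-strand layout described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the probe design procedure of the disclosure: it assumes a DNA target, places the single-stranded toehold at the 3' end of the long strand (so the fluorophore would sit at its 5' end and the quencher at the 3' end of the short strand), and uses an arbitrary 30-nucleotide example sequence that is not taken from any real amplicon. The names `design_osd_probe` and `example_target` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_osd_probe(target_region: str, toehold_len: int = 11) -> dict:
    """Sketch of an OSD probe pair for a single-stranded target region.

    Assumptions (hypothetical layout): the long strand is fully complementary
    to the target, and its last `toehold_len` bases (3' end) are left
    single-stranded as the toehold; the short strand pairs with the rest.
    """
    target = target_region.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    if not 0 < toehold_len < len(target):
        raise ValueError("toehold must be shorter than the target region")
    long_strand = reverse_complement(target)
    duplex_part = long_strand[: len(long_strand) - toehold_len]
    return {
        "long_strand": long_strand,                         # fluorophore assumed at 5' end
        "short_strand": reverse_complement(duplex_part),    # quencher assumed at 3' end
        "toehold": long_strand[-toehold_len:],
    }


# Arbitrary illustrative sequence, not a real amplicon loop region.
example_target = "ATGGCTAGCTTAGGCATCGATCCGTAAGCT"
probe = design_osd_probe(example_target, toehold_len=11)
```

In this layout the short strand has the same sequence as the part of the target that it competes with, so an amplicon that binds the toehold can displace the short strand and separate the quencher from the fluorophore.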
Extraction solution
In certain aspects, embodiments disclosed herein relate to compositions and kits that combine extraction-free lysis and amplification of target nucleic acids into a single reaction volume. In certain exemplary embodiments, an extraction-free lysis reagent may be used to extract nucleic acids from cells and/or viral particles. In contrast to prior art solutions, the extraction-free lysis solution eliminates the need to isolate nucleic acids prior to further amplification. The extraction-free lysis reagent may be mixed with an amplification reagent, such as a standard RT-PCR amplification reaction.
In one embodiment, the extraction-free lysis solution and isothermal amplification reagents may be lyophilized in a single reaction volume for reconstitution by addition of the sample to be assayed. In certain other embodiments, the extraction-free lysis solution and isothermal amplification reagents may be lyophilized and stored on a cartridge or lateral flow strip, as discussed in further detail below.
In certain exemplary embodiments, the single lysis reaction compositions and kits may further comprise one or more TnpB proteins having collateral activity and detection constructs. Pairing with one or more TnpB proteins may increase the sensitivity or specificity of the assay. In certain exemplary embodiments, the one or more TnpB proteins may be thermostable TnpB proteins. Exemplary TnpB proteins are disclosed in more detail below.
In certain exemplary embodiments, single lysis amplification reaction compositions and kits may comprise optimized primers and/or one or more additives. In one aspect, the primers used in the amplification are optimized by design. In particular aspects, isothermal amplification is used alone. In another aspect, isothermal amplification is used with the TnpB system. In either approach, design considerations may follow a rational design to optimize the reaction. In one example, varying additives together with specific primers, target, TnpB protein, temperature, and other additive concentrations in the reaction can be identified. Optimization can be performed to reduce the number of steps and buffer exchanges that must occur in the reaction, simplify the reaction, and reduce the risk of contamination in the transfer steps. In one aspect, it is contemplated that inhibitors, such as proteinase K, may be added so that buffer exchange may be reduced. Similarly, optimizing salt levels and the type of salt used may further facilitate and optimize the one-pot detection disclosed herein. In one aspect, when such an amplification method is used with bead concentration in the lysis step, potassium chloride may be used instead of sodium chloride.
In one embodiment, the compositions and kits may further comprise nucleic acid binding beads. The beads may be used to capture, concentrate or otherwise enrich a particular material. The beads may be magnetic and may be provided to capture nucleic acid material. In another aspect, the beads are silica beads. Beads can be used in the extraction steps of the methods disclosed herein. The beads may optionally be used with the methods described herein, including one-pot methods, which allow for concentration of viral nucleic acids from a large volume of sample (such as saliva or a swab sample) to allow for a single one-pot reaction method. The concentration of the desired target molecule can be increased by about 10-fold, 50-fold, 100-fold, 200-fold, 500-fold, 800-fold, 1000-fold, 1500-fold, 2000-fold, 2500-fold, 3000-fold or more.
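The fold increases listed above follow from simple volume arithmetic: capturing target from a large sample volume and releasing it into a small reaction volume concentrates it by the ratio of the two volumes, scaled by the fraction recovered. The sketch below is illustrative only; the function name, the volumes, and the recovery fraction are assumptions rather than values from the disclosure.

```python
def fold_concentration(sample_volume_ul: float,
                       final_volume_ul: float,
                       recovery: float = 1.0) -> float:
    """Fold increase in target concentration after bead capture.

    recovery is the fraction (0..1] of target molecules retained on the
    beads and carried into the final volume.
    """
    if sample_volume_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("volumes must be positive")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in (0, 1]")
    return recovery * sample_volume_ul / final_volume_ul


# Assumed example: 1000 uL of sample concentrated into a 10 uL reaction.
ideal_fold = fold_concentration(1000.0, 10.0)         # full recovery
partial_fold = fold_concentration(1000.0, 10.0, 0.8)  # 80% recovery (assumed)
```

Under these assumed numbers the target is concentrated about 100-fold at full recovery and about 80-fold at 80% recovery.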
Magnetic beads in PEG and salt solution are preferred in one aspect and, in embodiments, bind viral RNA and/or DNA, allowing for simultaneous concentration and lysis. Silica beads may be used in another aspect. Capture moieties such as oligonucleotide-functionalized beads are also contemplated. The beads may be used with extraction reagents, allowing incubation with the sample and lysis/extraction buffer and thereby concentrating the target molecules on the beads. When used with the cartridge device described in detail elsewhere herein, the magnet may be activated and the beads collected, optionally with flushing of the extraction buffer and one or more washes. Advantageously, the beads can be used in a one-pot process and system without the need for additional washing of the beads, allowing for a more efficient process without increasing the risk of contamination associated with multi-step processes. The beads may be utilized with isothermal amplification as detailed herein, and the beads may flow into the amplification chamber of the cartridge or be maintained in the pot for the amplification step. Upon heating, the nucleic acid may be released from the beads.
Diagnostic device
The systems described herein may be included on a diagnostic device. A variety of substrates and configurations may be used. The device may be capable of defining a plurality of separate discrete volumes within the device. As used herein, a "separate discrete volume" refers to a discrete space, such as a container, receptacle, or other defined volume or space, that may be defined by properties that prevent and/or inhibit migration of target molecules, for example, a volume or space defined by physical properties such as walls (e.g., the walls of a well or tube, or the surface of a droplet), which may be impermeable or semi-permeable, or defined by other means, such as chemical, diffusion rate limited, electromagnetic, or light illumination, or any combination thereof, that can contain a sample within a defined space. Separate discrete volumes can be identified by molecular tags such as nucleic acid barcodes. By "diffusion rate limited" (e.g., diffusion-defined volumes) is meant spaces that are accessible only to certain molecules or reactions because diffusion constraints effectively define a space or volume, as would be the case for two parallel laminar streams, where diffusion limits migration of a target molecule from one stream to the other. By "chemically" defined volume or space is meant a space where only certain target molecules can exist because of their chemical or molecular properties (such as size); for example, gel beads may exclude certain species from entering the beads but not others, such as by the surface charge of the bead, the matrix size, or other physical properties of the bead that allow selection of species that may enter the interior of the bead.
By "electromagnetically" defined volume or space is meant a space in which the electromagnetic properties (such as charge or magnetism) of the target molecule or its support can be used to define certain regions in space (such as capture of magnetic particles within a magnetic field or directly on a magnet). By "optically" defined volume is meant any region of space that can be defined by illuminating it with visible, ultraviolet, infrared or other wavelengths of light such that only target molecules within the defined space or volume can be labeled. One advantage of using wall-free or semi-permeable discrete volumes is that some reagents (such as buffers, chemical activators, or other agents) can pass through the discrete volume, while other materials such as target molecules can remain in the discrete volume or space. Typically, the discrete volume will comprise a fluid medium (e.g., an aqueous solution, oil, buffer, and/or medium capable of supporting cell growth) suitable for labeling the target molecule with the indexable nucleic acid identifier under conditions that allow for labeling. Exemplary discrete volumes or spaces that can be used in the disclosed methods include droplets (e.g., microfluidic droplets and/or emulsion droplets), hydrogel beads or other polymer structures (e.g., polyethylene glycol diacrylate beads or agarose beads), tissue slides (e.g., formalin-fixed paraffin-embedded tissue slides having specific areas, volumes or spaces defined by chemical, optical, or physical means), microscope slides whose areas are defined by depositing reagents in an ordered array or random pattern, tubes (such as centrifuge tubes, microcentrifuge tubes, test tubes, cuvettes, conical tubes, etc.), bottles (such as glass bottles, plastic bottles, ceramic bottles, Erlenmeyer flasks, scintillation vials, etc.), wells (such as wells in a plate), plates, pipettes, or pipette tips, etc.
In certain embodiments, the compartments are water droplets in a water-in-oil emulsion. In particular embodiments, any application, method, or system described herein that requires precise or uniform volumes may employ an acoustic liquid dispenser.
In some embodiments, the separate discrete volumes may be droplets.
In certain exemplary embodiments, the device comprises a flexible material substrate on which a plurality of spots may be defined. Flexible substrate materials suitable for diagnostics and biosensing are known in the art. The flexible substrate material may be made from plant-derived fibers such as cellulose fibers, or may be made from flexible polymers (such as flexible polyester films and other polymer types). Within each defined spot, the reagents of the system described herein are applied to the respective spot. Each spot may contain the same reagents, except for a different guide RNA or set of guide RNAs, or, where applicable, different detection aptamers to screen multiple targets at once. Thus, the systems and devices herein may be capable of screening samples from multiple sources (e.g., multiple clinical samples from different individuals) for the presence of the same target, or a limited number of targets, or screening aliquots of a single sample (or multiple samples from the same source) for the presence of multiple different targets in the sample. In certain exemplary embodiments, elements of the systems described herein are freeze-dried onto a paper or cloth substrate. Exemplary flexible material-based substrates that can be used in certain exemplary devices are disclosed in Pardee et al., Cell 2016, 165(5):1255-66 and Pardee et al., Cell 2014, 159(4):950-54. Suitable flexible material-based substrates for use with biological fluids, including blood, are disclosed in International Patent Application Publication No. WO/2013/071301 to Shevkoplyas et al. entitled "Paper based diagnostic test", U.S. Patent Application Publication No. 2011/0111517 to Siegel et al. entitled "Paper-based microfluidic systems", and Shafiee et al., "Paper and Flexible Substrates as Materials for Biosensing Platforms to Detect Multiple Biotargets" Scientific Reports 5:8719 (2015).
Other flexible base materials, including those suitable for use in wearable diagnostic devices, are disclosed in Wang et al., "Flexible Substrate-Based Devices for Point-of-Care Diagnostics" Trends in Biotechnology 34(11):909-21 (2016). Other flexible base materials may include nitrocellulose, polycarbonate, methyl ethyl cellulose, polyvinylidene fluoride (PVDF), polystyrene, or glass (see, e.g., US 20120238008). In certain embodiments, the discrete volumes are separated by a hydrophobic surface, such as, but not limited to, a wax, a photoresist, or a solid ink.
In some embodiments, a dosimeter or badge may be provided that acts as a sensor or indicator so that the wearer is informed of exposure to certain microorganisms or other agents. For example, the systems described herein may be used to detect a particular pathogen. Likewise, the aptamer-based embodiments disclosed above may be used to detect polypeptides as well as other agents, such as chemical agents, to which a particular aptamer may bind. Such devices may be used to monitor soldiers or other military personnel, as well as clinicians, researchers, hospital staff, etc. to provide information related to exposure to potentially dangerous agents as quickly as possible, such as for biological or chemical warfare agent detection. In other embodiments, such surveillance badges may be used to prevent exposure of immunocompromised patients, burn patients, patients receiving chemotherapy, children or elderly individuals to dangerous microorganisms or pathogens.
In particular embodiments, each individual discrete volume further comprises one or more detection aptamers comprising a masked RNA polymerase promoter binding site or a masked primer binding site. Thus, each individual discrete volume may also contain nucleic acid amplification reagents.
In particular embodiments, the target molecule may be a target DNA and the separate discrete volumes further comprise primers that bind to the target DNA and comprise an RNA polymerase promoter.
Sample sources that can be analyzed using the systems and devices described herein include biological or environmental samples of a subject. The environmental sample may include a surface or a fluid. Biological samples may include, but are not limited to, saliva, blood, plasma, serum, stool, urine, sputum, mucus, lymph, synovial fluid, spinal fluid, cerebrospinal fluid, swabs from skin or mucous membranes, or combinations thereof. In an exemplary embodiment, the environmental sample is taken from a solid surface, such as a surface used to prepare food or other sensitive compositions and materials.
In other exemplary embodiments, elements of the systems described herein may be placed on a disposable substrate, such as a swab or cloth for wiping a surface or sample fluid. For example, the system may be used to test for the presence of pathogens on a food product by wiping the surface of the food product (such as fruit or vegetables). Similarly, the disposable substrate may be used to wipe other surfaces to detect certain microorganisms or agents, such as for security screening. The disposable substrate may also have application in forensic science, where the CRISPR system is designed for detection, e.g., to identify DNA SNPs that can be used to identify suspects, or to identify certain tissue or cell markers to determine the type of biological substance present in a sample. Likewise, the disposable substrate may be used to collect samples from a patient, such as saliva samples in the oral cavity, or skin swabs. In other embodiments, a sample or swab may be collected from the meat product to detect the presence of contaminants on or within the meat product.
Food, clinical, industrial and other environmental settings require near real-time microbiological diagnosis (see, e.g., Lu TK, Bowers J, and Koeris MS, Trends Biotechnol. 2013 Jun;31(6):325-7). In certain embodiments, the invention is used to rapidly detect a food-borne pathogen using guide RNAs specific for pathogens (e.g., Campylobacter jejuni, Clostridium perfringens, Salmonella spp., Escherichia coli, Bacillus cereus, Listeria monocytogenes, Shigella spp., Staphylococcus aureus, staphylococcal enteritis, Streptococcus, Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, Yersinia enterocolitica and Yersinia pseudotuberculosis, Brucella spp., Corynebacterium ulcerans, Coxiella burnetii, or Shigella dysenteriae).
In certain embodiments, the device is or includes a flow test strip. For example, a lateral flow test strip allows detection by color. The reporter molecule is modified to have a first molecule attached to the 5' end (such as, for example, FITC) and a second molecule attached to the 3' end (such as, for example, biotin) (or vice versa). The lateral flow test strip is designed to have two capture lines, with an anti-first molecule (e.g., anti-FITC) antibody immobilized at the first line and an anti-second molecule (e.g., anti-biotin) antibody immobilized at the second, downstream line. As the reaction flows along the strip, uncleaved reporter will bind to the anti-first molecule antibody at the first capture line, while cleaved reporter will release the second molecule and allow the second molecule to bind at the second capture line. A second molecule sandwich antibody, e.g., an antibody conjugated to a nanoparticle (such as a gold nanoparticle), will bind any second molecule at the first or second line and produce a strong readout/signal (e.g., color). As more reporter molecules are cleaved, more signal will accumulate at the second capture line and less signal will appear at the first line. In certain aspects, the invention relates to the use of a flow test strip as described herein for detecting a nucleic acid or polypeptide. In certain aspects, the invention relates to a method of detecting a nucleic acid or polypeptide with a flow test strip as defined herein, e.g., a (lateral) flow test or a (lateral) flow immunochromatographic assay.
Embodiments disclosed herein relate to lateral flow test devices comprising a TnpB system. The device may include a lateral flow substrate for detecting the TnpB cleavage reaction. Substrates suitable for use in lateral flow assays are known in the art. These may include, but are not necessarily limited to, membranes or pads made of cellulose and/or glass fibers, polyesters, nitrocellulose, or absorbent pads (J Saudi Chem Soc 19(6):689-705; 2015). The TnpB system (i.e., one or more TnpB systems and corresponding reporter constructs) is added to the lateral flow substrate at a defined reagent portion of the lateral flow substrate (typically at one end of the lateral flow substrate). The reporter construct used in the context of the present invention comprises a first molecule and a second molecule linked by a DNA linker. The lateral flow substrate further includes a sample portion. The sample portion may be the same as, continuous with, or adjacent to the reagent portion. The lateral flow test strip also includes a first capture line, typically a horizontal line running across the device, although other configurations are possible. The first capture region is proximate to the sample loading portion and located on the same end of the lateral flow substrate. A first binding agent that specifically binds to the first molecule of the reporter construct is fixed or otherwise immobilized to the first capture region. The second capture region is located toward the end of the lateral flow substrate opposite the first capture region. A second binding agent is fixed or otherwise immobilized to the second capture region. The second binding agent specifically binds to the second molecule of the reporter construct, or the second binding agent may bind to a detectable ligand. For example, the detectable ligand may be a particle that can be visually detected when it aggregates, such as a colloidal particle.
The particles may be modified with antibodies that specifically bind to the second molecule on the reporter construct. If the reporter construct is not cleaved, it will facilitate accumulation of the detectable ligand at the first capture region. If the reporter construct is cleaved, the detectable ligand is released to flow to the second capture region. In such embodiments, the second binding agent is an agent capable of specifically or non-specifically binding the detectable ligand, such as by binding an antibody on the detectable ligand. Examples of binding agents suitable for use in such embodiments include, but are not limited to, protein A and protein G.
The lateral flow substrate may be located within a housing (see, e.g., "Rapid Lateral Flow Test Strips" Merck Millipore 2013). The housing may comprise at least one opening for loading the sample and a second single opening or separate openings allowing reading of the detectable signal generated at the first capture region and the second capture region.
The TnpB system may be lyophilized onto the lateral flow substrate and packaged as a ready-to-use device, or the TnpB system may be added to the reagent portion of the lateral flow substrate at the time of use of the device. The sample to be screened is loaded into the sample loading portion of the lateral flow substrate. The sample must be a liquid sample or a sample dissolved in a suitable solvent (typically aqueous). The liquid sample reconstitutes the TnpB reagents so that the TnpB reaction can occur. The liquid sample begins to flow from the sample portion of the substrate toward the first capture region and the second capture region. The intact reporter construct is bound at the first capture region by binding between the first binding agent and the first molecule. Likewise, the detection agent will begin to collect at the first capture region by binding to the second molecule on the intact reporter construct. If the target molecule is present in the sample, the TnpB collateral effect is activated. When the activated TnpB comes into contact with the bound reporter construct, the reporter construct is cleaved, releasing the second molecule to flow further along the lateral flow substrate to the second capture region. The released second molecule is then captured at the second capture region by binding to the second binding agent, where additional detection agent may also accumulate by binding to the second molecule. Thus, if no target molecule is present in the sample, a detectable signal will appear at the first capture region, whereas if a target molecule is present in the sample, a detectable signal will appear at the second capture region.
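The two-region readout described above can be summarized as a simple decision rule. The following Python sketch is illustrative only: the function name, the normalized band intensities, and the 0.2 visibility threshold are assumptions introduced here for explanation and do not come from the disclosure.

```python
def interpret_strip(first_region: float, second_region: float,
                    threshold: float = 0.2) -> str:
    """Classify a two-region lateral flow readout.

    first_region and second_region are hypothetical normalized signal
    intensities (0.0 to 1.0) measured at the first and second capture
    regions. The threshold is an assumed cut-off for a visible band.
    """
    first_visible = first_region >= threshold
    second_visible = second_region >= threshold

    if second_visible:
        # Cleaved reporter released the second molecule, which was
        # captured at the second capture region: target present.
        return "target detected"
    if first_visible:
        # Only intact reporter accumulated at the first capture region.
        return "target not detected"
    # No band at either region: the strip did not run correctly.
    return "invalid"


if __name__ == "__main__":
    print(interpret_strip(0.9, 0.05))
    print(interpret_strip(0.3, 0.8))
```

Because partial cleavage can leave signal at both regions, the sketch treats any visible signal at the second capture region as a positive result, regardless of the first region.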
Specific binding integration molecules include any member of the binding pairs useful in the present invention. Such binding pairs are known to those of skill in the art and include, but are not limited to, antibody-antigen pairs, enzyme-substrate pairs, receptor-ligand pairs, and streptavidin-biotin. In addition to such known binding pairs, new binding pairs may be specifically designed. One feature of a binding pair is the binding between the two members of the binding pair.
If TnpB has DNA cleavage activity, the oligonucleotide linker with molecules at either end may comprise DNA. Oligonucleotide linkers may be single-stranded or double-stranded, and in certain embodiments, they may contain RNA and DNA regions. The oligonucleotide linkers may be of different lengths, such as 5-10 nucleotides, 10-20 nucleotides, 20-50 nucleotides or more.
In some embodiments, the polypeptide identifier element includes an affinity tag, such as a hemagglutinin (HA) tag, Myc tag, FLAG tag, V5 tag, chitin binding protein (CBP) tag, maltose binding protein (MBP) tag, GST tag, poly-His tag, and fluorescent proteins (e.g., green fluorescent protein (GFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), dsRed, mCherry, Kaede, Kindling, and derivatives thereof), AU1 tag, T7 tag, OLLAS tag, Glu-Glu tag, VSV tag, or combinations thereof.
In certain exemplary embodiments, a lateral flow device comprises a lateral flow substrate comprising a first end for applying a sample. The first region is loaded with a detectable ligand, such as those disclosed herein, e.g., gold nanoparticles. The gold nanoparticles may be modified with a primary antibody, such as an anti-FITC antibody. The first region further comprises a detection construct. In one exemplary embodiment, a DNA detection construct and a TnpB system as disclosed herein are used. In one exemplary embodiment, and for further illustration purposes, the DNA construct may comprise a FAM molecule on a first end of the detection construct and biotin on a second end of the detection construct. The first test band is positioned downstream, in the direction of solution flow, of the first end of the lateral flow substrate. The test band may comprise a biotin ligand. Thus, when the DNA detection construct is present in its initial state, i.e., in the absence of target, the FAM molecule on the first end will bind to the anti-FITC antibody on the gold nanoparticle and biotin on the second end of the DNA construct will bind to the biotin ligand, allowing the detectable ligand to accumulate at the first test band, producing a detectable signal. The generation of a detectable signal at the first band indicates the absence of the target. In the presence of the target, a TnpB complex is formed and TnpB is activated, resulting in cleavage of the DNA detection construct. In the absence of intact DNA detection construct, the colloidal gold will flow past the first test band. The lateral flow device may include a second band downstream of the first band. The second band may comprise molecules capable of binding to antibody-labeled colloidal gold, such as anti-rabbit antibodies that can bind to rabbit anti-FITC antibodies on the colloidal gold.
Thus, in the presence of one or more targets, the detectable ligand will accumulate at the second band, which indicates the presence of one or more targets in the sample.
In certain exemplary embodiments, the device is a microfluidic device that produces and/or combines different droplets (i.e., individual discrete volumes). For example, a first set of droplets may be formed that contain the sample to be screened, and a second set of droplets may be formed that contain elements of the systems described herein. The first set of droplets and the second set of droplets are then combined, and the combined set of droplets is then subjected to a diagnostic method as described herein. The microfluidic devices disclosed herein may be silicon-based chips and may be fabricated using a variety of techniques including, but not limited to, thermoforming, elastomer molding, injection molding, LIGA, soft lithography, silicon fabrication, and related thin film processing techniques. Suitable materials for fabricating the microfluidic device include, but are not limited to, cyclic olefin copolymer (COC), polycarbonate, poly(dimethylsiloxane) (PDMS), and poly(methyl methacrylate) (PMMA). In one embodiment, soft lithography in PDMS may be used to fabricate microfluidic devices. For example, photolithography may be used to fabricate a mold that defines the locations of flow channels, valves, and filters within a substrate. The substrate material is poured into the mold and allowed to set to create a stamp. The stamp is then sealed to a solid support such as, but not limited to, glass. Because of the hydrophobic nature of some polymers, such as PDMS, which absorbs some proteins and may inhibit certain biological processes, a passivating agent may be required (Shoffner et al., Nucleic Acids Research, 1996, 24:375-379). Suitable passivating agents are known in the art and include, but are not limited to, silanes, parylene, n-dodecyl-β-D-maltoside (DDM), Pluronic, Tween-20, other similar surfactants, polyethylene glycol (PEG), albumin, collagen, and other similar proteins and peptides.
In certain exemplary embodiments, the system and/or apparatus may be adapted for conversion to a flow cytometry readout, allowing sensitive and quantitative measurements of millions of cells in a single experiment and improving upon existing flow-based methods, such as the PrimeFlow assay. In certain exemplary embodiments, the cells may be cast in droplets containing unpolymerized gel monomer, which may then be cast into single-cell droplets suitable for analysis by flow cytometry. A detection construct comprising a fluorescent detectable label may be cast into the droplet comprising unpolymerized gel monomer. The gel monomer polymerizes to form a bead within the droplet. Since gel polymerization proceeds through free radical formation, the fluorescent reporter becomes covalently bound to the gel. The detection construct may be further modified to include a linker, such as an amine. A quencher may be added after gel formation and will bind to the reporter construct via the linker. Thus, the quencher is not bound to the gel and is free to diffuse away when the reporter is cleaved by TnpB. Amplification of the signal in the droplet can be achieved by coupling the detection construct to hybridization chain reaction (HCR) amplification. DNA/RNA hybrid hairpins may be incorporated into the gel, and may comprise a hairpin loop with an RNase-sensitive domain. By protecting a strand displacement toehold within a hairpin loop that has an RNase-sensitive domain, the HCR initiators can be selectively deprotected after cleavage of the hairpin loop by the TnpB system. After deprotection of the HCR initiators via toehold-mediated strand displacement, fluorescent HCR monomers can be washed into the gel to enable signal amplification where the initiators are deprotected.
Examples of microfluidic devices that may be used in the context of the present invention are described in Hou et al., "Direct Detection and drug-resistance profiling of bacteremias using inertial microfluidics" Lab Chip 15(10):2297-2307 (2015).
The systems described herein may be further incorporated into a wearable medical device that evaluates a biological sample of a subject, such as a biological fluid, outside of a clinical setting and reports the results of the assay remotely to a central server accessible to a medical care professional. The device may include the ability to self-sample blood, such as the devices disclosed in U.S. Patent Application Publication No. 2015/0342509 to Peeters et al. entitled "Needle-free Blood Draw," and U.S. Patent Application Publication No. 2015/0065821 to Andrew Conrad entitled "Nanoparticle Phoresis."
In some embodiments, the individual discrete volumes are microwells.
In certain exemplary embodiments, the device can include individual wells, such as microplate wells. The size of the microplate wells may be the size of standard 6, 24, 96, 384, 1536, 3456 or 9600 sized wells. In certain exemplary embodiments, elements of the systems described herein may be freeze-dried and applied to the surface of the well prior to dispensing and use.
The devices disclosed herein may also include inlet and outlet ports or openings, which in turn may be connected to valves, tubes, channels, compartments, and syringes and/or pumps for introducing fluids into and extracting fluids from the devices. The device may be connected to a fluid flow actuator that allows directional movement of fluid within the microfluidic device. Exemplary actuators include, but are not limited to, syringe pumps, mechanically actuated recirculation pumps, electroosmotic pumps, bulbs, bellows, diaphragms, or air bubbles intended to force movement of a fluid. In certain exemplary embodiments, the device is connected to a controller having programmable valves that work together to move fluid through the device. In certain exemplary embodiments, the device is connected to a controller, discussed in further detail below. The device may be connected to the flow actuator, controller and sample loading device by tubing that terminates in a metal pin for insertion into an inlet port on the device.
As shown herein, the elements of the system are stable upon lyophilization, thus embodiments are also contemplated that do not require a support device, i.e., the system can be applied to any surface or fluid that will support the reactions disclosed herein, and allow for detection of a positive detectable signal from the surface or solution. In addition to freeze drying, the system can also be stored and utilized stably in particulate form. Polymers useful in forming suitable particulate forms are known in the art.
In some embodiments, individual discrete volumes are defined on a solid substrate. In some embodiments, the individual discrete volumes are points defined on the substrate. In some embodiments, the substrate may be a flexible material substrate, for example, including but not limited to a paper substrate, a fabric substrate, or a flexible polymer-based substrate. In particular embodiments, the flexible material substrate is a paper substrate or a flexible polymer-based substrate.
In certain embodiments, TnpB is bound to each discrete volume in the device. Each discrete volume may contain a different ωRNA specific for a different target molecule. In certain embodiments, the sample is exposed to a solid substrate comprising more than one discrete volume, each discrete volume comprising an ωRNA specific for a target molecule. Without being bound by theory, each ωRNA will capture its target molecule from the sample, and the sample need not be divided into separate assays. Thus, valuable samples can be preserved. The effector protein may be a fusion protein comprising an affinity tag. Affinity tags are well known in the art (e.g., HA tag, Myc tag, FLAG tag, His tag, biotin). Effector proteins may be linked to biotin molecules and discrete volumes may contain streptavidin. In other embodiments, the CRISPR effector protein is bound by an antibody specific for the effector protein. Methods of binding CRISPR enzymes have been previously described (see, e.g., US20140356867 A1).
The devices disclosed herein may also include elements of point-of-care (POC) devices known in the art for analyzing samples by other methods. See, e.g., St John and Price, "Existing and Emerging Technologies for Point-of-Care Testing" (Clin Biochem Rev. 2014 Aug;35(3):155-167).
The present invention may be used with a wireless lab-on-a-chip (LOC) diagnostic sensor system (see, e.g., U.S. Patent No. 9,470,699, "Diagnostic radio frequency identification sensors and applications thereof"). In certain implementations, the invention is performed in a LOC controlled by a wireless device (e.g., mobile phone, personal digital assistant (PDA), tablet computer) and the results are reported to the device.
Radio frequency identification (RFID) tag systems include RFID tags that transmit data for receipt by an RFID reader (also referred to as an interrogator). In a typical RFID system, a single object (e.g., a store commodity) is equipped with a relatively small tag that contains a transponder. The transponder has a memory chip with a unique electronic product code. The RFID reader transmits a signal that activates the transponder within the tag using a communication protocol. Thus, the RFID reader is able to read data from and write data to the tag. In addition, the RFID tag reader processes the data according to the RFID tag system application. Currently, there are RFID tags of both passive and active types. Passive RFID tags do not contain an internal power source, but are powered by a radio frequency signal received from an RFID reader. In contrast, an active RFID tag contains an internal power source that gives it a larger transmission range and storage capacity. The choice between passive and active tags depends on the particular application.
Lab-on-a-chip technology is described in detail in the scientific literature; a chip consists of a plurality of microfluidic channels and input or chemical wells. The reaction in the wells may be measured using radio frequency identification (RFID) tag technology because the conductive leads of the RFID electronic chip may be directly connected to each test well. The antenna may be printed or mounted in another layer of the electronic chip or directly on the back of the device. In addition, the leads, the antenna, and the electronic chip may be embedded in the LOC chip, thereby preventing the electrodes or the electronics from being shorted. Since LOC allows complex sample separation and analysis, this technique allows LOC testing to be performed independently of complex or expensive readers. Instead, a simple wireless device, such as a mobile phone or PDA, may be used. In one embodiment, the wireless device also controls the separation and the microfluidic channels to enable more complex LOC analysis. In one embodiment, an LED and other electronic measurement or sensing devices are included in the LOC-RFID chip. Without being bound by theory, this technology is disposable and allows complex tests that require separation and mixing to be performed outside of the laboratory.
In a preferred embodiment, the LOC may be a microfluidic device. The LOC may be a passive type chip, wherein the chip is powered and controlled by wireless means. In certain embodiments, the LOC includes a microfluidic channel for containing reagents and a channel for introducing a sample. In certain embodiments, a signal from the wireless device transmits power to the LOC and activates the mixing of the sample with the assay reagent. In particular, in the case of the present invention, the system may include a masking agent, a CRISPR effector protein, and a guide RNA specific for the target molecule. When the LOC is activated, the microfluidic device can mix the sample and assay reagents. After mixing, the sensor detects the signal and transmits the result to the wireless device. In certain embodiments, the unmasking agent is a conductive RNA molecule. The conductive RNA molecules may be attached to a conductive material. The conductive molecules may be conductive nanoparticles, conductive proteins, metal particles attached to proteins or latex or other conductive beads. In certain embodiments, if DNA or RNA is used, the conductive molecule may be directly attached to the matching DNA or RNA strand. The release of the conductive molecule may be detected by a sensor. The assay may be a one-step process.
Since the conductivity of the surface area can be accurately measured, a one-time wireless RFID radiometry can yield quantitative results. Furthermore, the test area may be very small, allowing more tests to be performed in a given area, thereby saving costs. In certain embodiments, multiple target molecules are detected using separate sensors each associated with a different CRISPR effector protein and guide RNAs immobilized to the sensors. Without being bound by theory, the wireless device may distinguish between activation of different sensors.
In addition to the conductive methods described herein, other methods that rely on RFID or bluetooth as a basic low cost communication and power platform for disposable RFID assays may also be used. For example, optical means may be used to assess the presence and level of a given target molecule. In certain embodiments, the optical sensor detects exposure of the fluorescent masking agent.
In certain embodiments, the device of the present invention may comprise a hand-held portable device for diagnostic reading of assays (see, e.g., Vashist et al., Commercial Smartphone-Based Devices and Smart Applications for Personalized Healthcare Monitoring and Management, Diagnostics 2014, 4(3), 104-128; mReader from Mobile Assay; and the Holomic rapid diagnostic test reader).
As noted herein, certain embodiments allow detection via colorimetric changes, which have certain attendant benefits when used in POC situations and/or in resource-poor environments where access to more complex detection devices to read signals may be limited. However, the portable embodiments disclosed herein may also be coupled with a handheld spectrophotometer capable of detecting signals outside the visible range. Examples of hand-held spectrophotometer devices that may be used in combination with the present invention are described in Das et al., "Ultra-portable, wireless smartphone spectrophotometer for rapid, non-destructive testing of fruit ripeness," Nature Scientific Reports 2016, 6:32504, DOI:10.1038/srep32504. Finally, in certain embodiments utilizing quantum dot-based masking constructs, hand-held UV light or other suitable means may be successfully used to detect the signal due to the near-complete quantum yield provided by the quantum dots.
Method for detecting nucleic acid
The low cost and adaptability of the detection platform makes it suitable for a variety of applications including (i) general RNA/DNA quantification, (ii) rapid, multiplex RNA/DNA and protein expression detection, and (iii) sensitive detection of target nucleic acids, peptides and proteins in clinical and environmental samples. In addition, the systems disclosed herein may be suitable for detecting transcripts in biological environments such as cells. Given the high specificity of CRISPR effectors described herein, allele-specific expression of transcripts or disease-associated mutations in living cells can be tracked.
In some embodiments, the method comprises detecting a target nucleic acid in a sample, comprising partitioning the sample or sample set into one or more separate discrete volumes comprising a TnpB system as described herein. The sample or group of samples can then be incubated under conditions sufficient to allow binding of the one or more omega RNAs to the one or more target molecules, and the TnpB protein can be activated via binding of the one or more omega RNAs to the one or more target molecules, wherein activating the CRISPR effector protein results in modification of the detection construct such that a detectable positive signal is generated. One or more detectable positive signals may then be detected, the detection indicating the presence of one or more target molecules in the sample.
In some embodiments, the methods of the invention comprise detecting a polypeptide in a sample, comprising partitioning the sample or set of samples into a set of separate discrete volumes comprising a peptide detection aptamer as described herein and TnpB. The sample or group of samples may then be incubated under conditions sufficient to allow the peptide detection aptamer to bind to one or more target molecules, wherein binding of the aptamer to the corresponding target molecule exposes an RNA polymerase binding site or primer binding site, resulting in the generation of trigger RNA (trigger RNA). The TnpB can then be activated via binding of one or more omega RNAs to the trigger RNA, wherein activation of the TnpB protein results in modification of the detection construct such that a detectable positive signal is generated. A detectable positive signal may then be detected, detection of the detectable positive signal indicating the presence of one or more target molecules in the sample.
In certain exemplary embodiments, a single guide sequence specific for a single target is placed in a separate volume. Each volume may then receive a different sample or an aliquot of the same sample. In certain exemplary embodiments, multiple ωRNAs, each directed to a separate target, can be placed in a single well, such that multiple targets can be screened in that well. To detect the activity of multiple ωRNAs in a single volume, in certain exemplary embodiments, multiple TnpB proteins with different specificities may be used.
In embodiments, different TnpB orthologs with different sequence specificities may be used. Cleavage motifs can be used to exploit the sequence specificity of different orthologs. The detection construct may comprise a cleavage motif that is preferentially cleaved by a given TnpB ortholog. The cleavage motif sequence may be a specific nucleotide base, a repeating nucleotide base in a homopolymer, or a heteropolymer of bases. The cleavage motif may be a dinucleotide sequence, a trinucleotide sequence, or a more complex motif comprising 4, 5, 6, 7, 8, 9 or 10 nucleotides. For example, one ortholog may preferentially cleave A, while other orthologs preferentially cleave C, G, or U/T. Thus, detection constructs may be generated that consist entirely of, or comprise a substantial portion of, a single nucleotide, each construct having a different fluorophore that can be detected at a different wavelength. In this way, up to four different targets can be screened in a single, discrete volume. In certain other exemplary embodiments, different orthologs with different nucleotide editing preferences, such as a combination of TnpB with Cas13 or Cas12, may be used.
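The single-base multiplexing scheme described above can be illustrated as a lookup from fluorescence channel to target. The following is only a minimal sketch under stated assumptions: the ortholog labels, fluorophore assignments, target names, and the fold-change threshold are hypothetical placeholders chosen for illustration and are not taken from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    ortholog: str     # hypothetical ortholog label
    base: str         # nucleotide the ortholog preferentially cleaves
    fluorophore: str  # hypothetical label on the matching homopolymer construct
    target: str       # hypothetical target recognized by this ortholog's guide


# Hypothetical four-plex panel: one preferred base and one fluorophore per ortholog.
PANEL = [
    Channel("ortholog_1", "A", "FAM", "target_1"),
    Channel("ortholog_2", "C", "HEX", "target_2"),
    Channel("ortholog_3", "G", "Cy5", "target_3"),
    Channel("ortholog_4", "U", "TEX615", "target_4"),
]


def check_panel(panel):
    """Reject panels in which two orthologs share a preferred base or a fluorophore."""
    bases = [c.base for c in panel]
    fluors = [c.fluorophore for c in panel]
    if len(set(bases)) != len(bases) or len(set(fluors)) != len(fluors):
        raise ValueError("each ortholog needs a unique base and a unique fluorophore")


def call_targets(signals, panel=PANEL, threshold=2.0):
    """Return the targets whose channel signal meets the threshold.

    signals maps fluorophore name -> fold change over a no-target control.
    The threshold value is an arbitrary assumption.
    """
    check_panel(panel)
    return sorted(
        c.target for c in panel if signals.get(c.fluorophore, 0.0) >= threshold
    )
```

For instance, a reading with elevated FAM and Cy5 channels and a background-level HEX channel would be called as `target_1` and `target_3` under these assumptions. The uniqueness check reflects the constraint in the text that each construct carries a different fluorophore detected at a different wavelength.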
In addition to single base editing preferences, additional detection constructs can be designed based on other motif cleavage preferences of the TnpB, Cas12, and Cas13 orthologs. For example, Cas13 or Cas12 orthologs may preferentially cleave dinucleotide, trinucleotide, or more complex motifs comprising 4, 5, 6, 7, 8, 9, or 10 nucleotides. For example, LwaCas13a shows a strong preference for one hexanucleotide motif sequence, while CcaCas13b shows a strong preference for other hexanucleotide motifs. Thus, the upper limit of multiplex assays using embodiments disclosed herein is limited primarily by the number of distinguishable detectable labels and the detection channels required to detect them. In certain exemplary embodiments, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 29, or 30 different targets are detected. Exemplary methods for identifying such motifs are further disclosed in the working examples below.
In particular embodiments, the target molecule may be a target DNA and the method may further comprise binding the target DNA to a primer comprising an RNA polymerase site, as described herein.
In particular embodiments, one or more omega RNAs can be designed to detect single nucleotide polymorphisms in a target RNA or DNA, or splice variants of an RNA transcript.
The sample used with the present invention may be a biological or environmental sample, such as a food sample (fresh fruit or vegetables, meat), a beverage sample, a paper surface, a fabric surface, a metal surface, a wood surface, a plastic surface, a soil sample, a fresh water sample, a wastewater sample, a brine sample, a sample exposed to atmospheric air or another gas, or a combination thereof. For example, household/commercial/industrial surfaces made of any material including, but not limited to, metal, wood, plastic, rubber, etc. may be swabbed and tested for contaminants. Soil samples may be tested for the presence of pathogenic bacteria or parasites or other microorganisms for environmental purposes and/or for human, animal or plant disease testing. The cleanliness and safety and/or potability of a water sample, such as a fresh water sample, a wastewater sample or a brine sample, can be evaluated, for example, by detecting the presence of Cryptosporidium parvum, Giardia lamblia, or other microbial contaminants. In further embodiments, the biological sample may be obtained from sources including, but not limited to: tissue samples, saliva, blood, plasma, serum, stool, urine, sputum, mucus, lymph, synovial fluid, cerebrospinal fluid, ascites, pleural effusion, seroma, pus, bile, aqueous humor or vitreous humor, transudate, exudate, or swabs of skin or mucosal surfaces. In some particular embodiments, the environmental or biological sample may be a crude sample and/or one or more target molecules may not be purified or amplified from the sample prior to application of the method. Identification of microorganisms may be useful and/or desirable for any number of applications, and thus any type of sample from any source deemed suitable by those skilled in the art may be used in accordance with the present invention.
In some embodiments, one or more omega RNAs can be designed to bind cell-free nucleic acids. In some embodiments, one or more omega RNAs can be designed to detect single nucleotide polymorphisms in a target RNA or DNA, or splice variants of an RNA transcript. In some embodiments, one or more guide RNAs are designed to bind to one or more target molecules that are diagnostic of a disease state, as described herein.
In some embodiments, the disease state may be an infection, an organ disease, a blood disease, an immune system disease, a cancer, a brain and nervous system disease, an endocrine disease, a pregnancy or labor related disease, a genetic disease, or an environmentally acquired disease.
In certain exemplary embodiments, the systems, devices, and methods disclosed herein relate to detecting the presence of one or more microbial agents in a sample (such as a biological sample obtained from a subject). In certain exemplary embodiments, the microorganism may be a bacterium, fungus, yeast, protozoan, parasite, or virus. Thus, the methods disclosed herein may be adapted for use in, or used in combination with, other methods requiring rapid identification of microbial species, monitoring of the presence of microbial proteins (antigens), antibodies, or antibody genes, detection of certain phenotypes (e.g., bacterial resistance), monitoring of disease progression and/or outbreaks, and antibiotic screening. Because the embodiments disclosed herein provide rapid and sensitive diagnostics, can distinguish microbial species types down to single nucleotide differences, and can be deployed as POC devices, they can be used to guide therapeutic regimens, such as the selection of appropriate antibiotics or antiviral drugs. Embodiments disclosed herein may also be used to screen environmental samples (air, water, surfaces, food, etc.) for the presence of microbial contamination.
A method of identifying a microbial species (such as a bacterial, viral, fungal, yeast or parasitic species, etc.) is disclosed. Particular embodiments disclosed herein describe methods and systems that will identify and differentiate microbial species within a single sample or within multiple samples, allowing for the identification of many different microorganisms. The methods of the invention allow detection of pathogens by detecting the presence of a target nucleic acid sequence in a biological or environmental sample and distinguishing between two or more species of one or more organisms in the sample, such as bacteria, viruses, yeasts, protozoa and fungi, or combinations thereof. A positive signal obtained from the sample indicates the presence of the microorganism. By using more than one effector protein, multiple microorganisms can be identified simultaneously using the methods and systems of the present invention, wherein each effector protein targets a particular microbial target sequence. In this way, a multi-level analysis can be performed on a particular subject, wherein any number of microorganisms can be detected at a time. In some embodiments, simultaneous detection of multiple microorganisms may be performed using a set of probes that identify one or more species of microorganisms.
Multiplex analysis of samples allows for large-scale detection, thereby reducing the time and cost of the analysis. However, multiplex assays are often limited by the availability of biological samples. According to the present invention, an alternative form of multiplex analysis can be performed in which multiple effector proteins are added to a single sample and each masking construct is combined with a separate quencher dye. In this case, a positive signal can be obtained from each quencher dye separately, allowing multiple assays in a single sample.
Disclosed herein are methods for distinguishing between two or more species of one or more organisms in a sample. The method is also suitable for detecting one or more species of one or more organisms in the sample.
In some embodiments, the methods provide for detection of a disease state characterized by the presence or absence of an antibiotic or drug resistance or susceptibility gene or transcript or polypeptide, preferably in a pathogen or cell.
Device for detecting and measuring
In one embodiment, the detection assay may be provided on a cartridge or chip. In one aspect, the cartridge may include one or more ampoules and one or more wells that are communicatively coupled so that reagents and samples can be transferred, exchanged or moved through the compartments of the cartridge, with or without the use of beads, and so that a detection assay can be run on the cartridge using a system/device designed for that purpose.
Cartridge
A cartridge according to the present invention, also referred to herein as a chip, includes a series of ampoules and compartments communicatively coupled to one or more other components on the cartridge. The coupling is typically fluid communication, for example via a channel. The cartridge may comprise a membrane sealing one or more of the compartments and/or ampoules. In one aspect, the membrane covers and seals the cartridge, allowing for storage of reagents, buffers, and other solid or fluid components. The membrane may be configured to be punctured, pierced, or otherwise released from sealing or covering one or more components of the cartridge by a means for releasing the reagents.
As described above, certain embodiments enable the use of nucleic acid binding beads to concentrate target nucleic acids, but do not require elution of the isolated nucleic acids. Thus, in certain exemplary embodiments, the cartridge may further comprise an activatable magnet, such as an electromagnet. The means for activating the magnet may be located on the device, or the means for supplying a magnet or activating a magnet on the cartridge may be provided by a second device such as those disclosed in further detail below.
Ampoule
Ampoules, also known as blisters, allow for the storage and release of reagents throughout the cartridge. The ampoule may include liquid or solid reagents, for example, a lysis reagent in one ampoule and a reaction reagent in another ampoule. Reagents may be as described elsewhere herein, and may be adapted for use in a cartridge. The ampoule may be sealed by a membrane that can be ruptured, pierced, or otherwise opened to release the contents of the ampoule. See, e.g., Becker, H. & Gärtner, C., Microfluidics-enabled diagnostic systems: markets, challenges, and examples, in Microchip Diagnostics: Methods and Protocols (Taly, V. et al.) (Springer, New York, 2017); Czurratis et al., DOI 10.1088/0960-1317/25/4/045002. Considerations for ampoules may include, for example, those discussed in Smith, S. et al., Blister pouches for effective reagent storage on microfluidic chips for blood cell counting. Microfluid Nanofluid 20, 163 (2016), DOI 10.1007/s10404-016-1830-2. In one aspect, the seal is a frangible seal formed from a composite layer film assembled to the cartridge body. Although referred to herein as an ampoule, the ampoule may include a cavity on a chip that includes a sealing membrane that is opened by a release member.
Compartments
The compartments on the chip may be positioned and sized to be in fluid communication with the ampoule and/or other compartments on the chip via channels or other communication means.
Means for reading the results of the assay
Means for reading the assay results may be provided in the system. The means for reading the assay result will depend in part on the type of detectable signal generated by the assay. In certain embodiments, the assay produces a detectable fluorescence or color reading. In these cases, the means for reading the assay result will be an optical means, for example a single-channel or multi-channel optical means, such as a fluorometer, colorimeter or other spectroscopic sensor.
Combinations of means for reading the assay results may be utilized and may include readings such as turbidity, temperature, magnetic, radio or electrical properties and/or optical properties (including scattering, polarization effects, etc.).
The system may also include a user interface for programming the device and/or reading the assay results. The user interface may include an LED screen. The system may be further configured with USB ports that allow docking of four or more devices.
In one aspect, the system includes means for activating a magnet disposed within or on the cartridge.
Lateral flow device
In one embodiment, the detection assay may be provided on a lateral flow device. An exemplary lateral flow device is described, for example, in international publication WO 2019/071051, which is incorporated herein by reference. The lateral flow device may be adapted to detect one or more coronaviruses and/or other viruses in combination with coronaviruses. The lateral flow device may comprise a flexible substrate, such as a paper substrate or a flexible polymer-based substrate, which may include lyophilized reagents for the detection assay and provide a visual readout of the assay results. See WO 2019/071051, [0145] - [0151] and example 2, which are expressly incorporated herein by reference. In one aspect, the lyophilized reagents may include preferred excipients that aid in the rate of reaction, specificity, or other variables, such as trehalose, histidine, and/or glycine. In one embodiment, coronavirus assays may be utilized with isothermal amplification reagents, allowing amplification to be performed without complex instrumentation that may not be available in the field. Thus, the assay may be suitable for field diagnostics, including rapid, sensitive detection with a visual readout on a lateral flow device, and may be deployed for early and direct detection. Colorimetric detection may be utilized and may be particularly suitable for field-deployable applications, as described in international application PCT/US2019/015726, published as WO2019/148206. In particular, colorimetric detection may be as described in figure 102, figure 105, figures 107 to 111 and [00306] - [00324] in WO2019/148206, which is incorporated herein by reference, and may be used with a TnpB system.
In one embodiment, the present invention provides a lateral flow device comprising a substrate comprising a first end and a second end. The first end may include a sample loading portion; a first region comprising a detectable ligand, two or more TnpB systems, two or more detection constructs, and one or more first capture regions, each first capture region comprising a first binding agent. The substrate may further comprise two or more second capture areas between the first and second ends, each second capture area comprising a different binding agent. Each of the two or more TnpB systems may comprise a TnpB protein and one or more nucleic acid component molecules, each nucleic acid component molecule sequence configured to bind to one or more target molecules.
The device may include a lateral flow substrate for detecting a reaction between the TnpB polypeptide and the target molecule that triggers collateral, non-specific cleavage of the detection construct. Substrates suitable for lateral flow assays are known in the art. These may include, but are not necessarily limited to, films or pads or absorbent pads made of cellulose and/or glass fibers, polyesters, nitrocellulose (J Saudi Chem Soc 19 (6): 689-705; 2015) and other embodiments described further herein. The detection system (i.e., one or more TnpB systems and corresponding reporter constructs) is added to the lateral flow substrate at a defined reagent portion of the lateral flow substrate (typically at one end of the lateral flow substrate). A reporter construct as used in the context of the present invention may comprise a first molecule and a second molecule linked by an RNA or DNA linker. The lateral flow substrate further includes a sample portion. The sample portion may be the same as, continuous with, or adjacent to the reagent portion. In one aspect, the lateral flow substrate may be contained within another device. In one aspect, the lateral flow substrate may be used to visually read out the detectable signal in a one-pot reaction, for example, where the extraction, amplification, and detection steps are performed in a single discrete volume.
Lateral flow substrate
In certain exemplary embodiments, the lateral flow device comprises a lateral flow substrate upon which detection may be performed. Substrates suitable for lateral flow assays are known in the art. These may include, but are not necessarily limited to, films or pads or absorbent pads made of cellulose and/or glass fibers, polyesters, nitrocellulose (J Saudi Chem Soc 19 (6): 689-705; 2015).
The lateral flow substrate includes a first end and a second end, and one or more capture areas each containing a binding agent. The first end may include a sample loading portion; a first region comprising a detectable ligand, two or more TnpB systems, two or more detection constructs, and one or more first capture regions, each first capture region comprising a first binding agent. The substrate may further comprise two or more second capture areas between the first and second ends, each second capture area comprising a different binding agent. Each of the two or more TnpB systems may comprise a TnpB protein and one or more nucleic acid component molecules, each nucleic acid component configured to bind to one or more target molecules. The lateral flow substrate may be configured to detect a reaction in which collateral, non-specific cleavage is triggered when a target molecule is bound and cleaved by the TnpB protein.
The lateral flow substrate may be located within a housing (see, e.g., "Rapid Lateral Flow Test Strips" Merck Millipore 2013). The housing may comprise at least one opening for loading the sample and a second single opening or separate openings allowing reading of the detectable signal generated at the first capture area and the second capture area.
Embodiments disclosed herein may be prepared in a freeze-dried form to facilitate distribution and point of care (POC) applications. Such embodiments are useful in a variety of situations in human health, including, for example, viral detection, bacterial strain typing, sensitive genotyping, and detection of disease-associated cell-free DNA. Thus, one or more of the system elements (including the detectable ligand, the TnpB system, the detection construct, and the binding agent) can be freeze-dried onto the lateral flow substrate, which is then packaged as a ready-to-use device. Alternatively, all or part of the elements of the system may be added to the reagent portion of the lateral flow substrate at the time the device is used.
First and second ends of the substrate
The substrate of the lateral flow device includes a first end and a second end. The detection system (i.e., one or more TnpB systems and corresponding reporter constructs) is added to the lateral flow substrate at a defined reagent portion of the lateral flow substrate (typically at a first end of the lateral flow substrate). A reporter construct as used in the context of the present invention comprises a first molecule and a second molecule linked by an RNA or DNA linker. The lateral flow substrate further includes a sample portion. The sample portion may be the same as, continuous with, or adjacent to the reagent portion.
In certain exemplary embodiments, the first end comprises a first region. The first region includes a detectable ligand, two or more TnpB systems, two or more detection constructs, and one or more first capture regions, each first capture region comprising a first binding agent.
Capture area
The lateral flow substrate may include one or more capture areas. In embodiments, the first end of the lateral flow substrate comprises one or more first capture areas, with two or more second capture areas between the first area of the first end of the substrate and the second end of the substrate. The capture area may be provided as a capture line, typically a horizontal line across the device, although other configurations are possible. The first capture area is proximate to the sample loading portion and is located on the same end of the lateral flow substrate as the sample loading portion.
Binding agent
Specific binding-integrating molecules include any member of a binding pair useful in the present invention. Such binding pairs are known to those of skill in the art and include, but are not limited to, antibody-antigen pairs, enzyme-substrate pairs, receptor-ligand pairs, and streptavidin-biotin. In addition to such known binding pairs, new binding pairs may be specifically designed. A characteristic feature of a binding pair is the binding between its two members.
A first binding agent that specifically binds to a first molecule of the reporter construct is fixed or otherwise immobilized to the first capture region. The second capture area is located toward the end of the lateral flow substrate opposite the first capture area. The second binding agent is fixed or otherwise immobilized to the second capture region. The second binding agent specifically binds to the second molecule of the reporter construct, or the second binding agent may bind to a detectable ligand. For example, the detectable ligand may be a particle, such as a colloidal particle, that can be visually detected and that produces a detectable positive signal when it aggregates. The particles may be modified with antibodies that specifically bind to the second molecule on the reporter construct. If the reporter construct is not cleaved, it will facilitate accumulation of the detectable ligand at the first binding region. If the reporter construct is cleaved, the detectable ligand is released to flow to the second binding region. In such embodiments, the second binding region comprises a second binding agent capable of specifically or non-specifically binding the detectable ligand, for example by binding an antibody on the detectable ligand. The binding agent may be, for example, an antibody that recognizes a particular affinity tag. Such binding agents may also contain, for example, a detectable label, such as an isotopic label and/or a nucleic acid barcode. A barcode is a short nucleotide sequence (e.g., DNA, RNA, or a combination thereof) that serves as an identifier. The nucleic acid barcode may have a length of 4-100 nucleotides and may be single-stranded or double-stranded. Methods for identifying cells using barcodes are known in the art. Thus, the nucleic acid component molecules of the TnpB system described herein can be used to detect barcodes.
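The barcode constraints stated above (a DNA, RNA, or mixed sequence of 4-100 nucleotides used as an identifier) can be expressed as a simple check. This is only an illustrative sketch: the function names, the registry structure, and the example barcodes are hypothetical, and restricting the alphabet to unmodified A, C, G, T and U is an assumption made for simplicity.

```python
import re
from typing import Optional

# 4-100 nucleotides; T and U are both allowed because a barcode may be
# DNA, RNA, or a combination of the two (alphabet restriction is an assumption).
_BARCODE_PATTERN = re.compile(r"[ACGTU]{4,100}")


def is_valid_barcode(seq: str) -> bool:
    """Return True if seq is 4-100 nt long and uses only A, C, G, T or U."""
    return _BARCODE_PATTERN.fullmatch(seq.upper()) is not None


def identify(seq: str, registry: dict) -> Optional[str]:
    """Look up the identifier assigned to a barcode in a hypothetical registry."""
    if not is_valid_barcode(seq):
        raise ValueError("barcode must be 4-100 nucleotides of A, C, G, T or U")
    return registry.get(seq.upper())


# Hypothetical registry mapping barcode sequences to the binding agents they tag.
EXAMPLE_REGISTRY = {"ACGTACGT": "binding_agent_1", "GGAUCCUA": "binding_agent_2"}
```

With this registry, `identify("acgtacgt", EXAMPLE_REGISTRY)` would return `"binding_agent_1"`, while a three-nucleotide input would be rejected as too short.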
Detectable ligands
The first region is loaded with a detectable ligand, such as those disclosed herein, e.g., gold nanoparticles. The detectable ligand may be a particle that can be visually detected when it aggregates, such as a colloidal particle. The particles may be modified with antibodies that specifically bind to the second molecule on the reporter construct. If the reporter construct is not cleaved, it will facilitate accumulation of the detectable ligand at the first binding region. If the reporter construct is cleaved, the detectable ligand is released to flow to the second binding region. In such embodiments, the second binding agent is an agent capable of specifically or non-specifically binding the detectable ligand, for example by binding an antibody on the detectable ligand. Examples of binding agents suitable for use in such embodiments include, but are not limited to, protein A and protein G. In some examples, the detectable ligand is a gold nanoparticle, which may be modified with a primary antibody, such as an anti-FITC antibody.
Lateral flow test constructs
The first region further comprises a detection construct. In one exemplary embodiment, an RNA detection construct and a TnpB system (a TnpB protein and one or more nucleic acid component molecules configured to bind to one or more target sequences) are used. In one exemplary embodiment, and for further illustration purposes, the RNA detection construct may comprise a FAM molecule on a first end of the detection construct and biotin on a second end of the detection construct. A first test band is positioned upstream in the flow of solution, at the first end of the lateral flow substrate. The first test band may comprise a biotin ligand. Thus, when the RNA detection construct is present in its initial state, i.e., in the absence of target, the FAM molecule on the first end will bind to the anti-FITC antibody on the gold nanoparticle and the biotin on the second end of the RNA detection construct will bind to the biotin ligand, allowing the detectable ligand to accumulate at the first test band and produce a detectable signal. The generation of a detectable signal at the first band indicates the absence of the target. In the presence of the target, a TnpB complex is formed and the TnpB protein is activated, resulting in cleavage of the detection construct. In the absence of the intact RNA detection construct, the colloidal gold will flow past the first band. The lateral flow device may include a second band downstream of the first band. The second band may comprise a molecule capable of binding to the antibody-labeled colloidal gold, such as an anti-rabbit antibody that binds a rabbit anti-FITC antibody on the colloidal gold. Thus, in the presence of one or more targets, the detectable ligand will accumulate at the second band, which indicates the presence of one or more targets in the sample.
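The two-band readout logic described above can be summarized as a small decision function. This is a hedged sketch rather than a prescribed interpretation scheme: the function and enum names are hypothetical, and two rules are assumptions that the text does not state, namely that a strip showing no band at all is treated as invalid and that a strip showing both bands (for example after partial cleavage) is treated as positive.

```python
from enum import Enum


class StripResult(Enum):
    NEGATIVE = "target not detected"
    POSITIVE = "target detected"
    INVALID = "invalid run"


def interpret_strip(first_band: bool, second_band: bool) -> StripResult:
    """Interpret a single-target lateral flow strip.

    first_band:  signal at the first band, where the intact FAM/biotin
                 detection construct retains the gold nanoparticles.
    second_band: signal at the second band, where nanoparticles released
                 by cleavage of the detection construct are captured.
    """
    if second_band:
        # Cleaved construct released the detectable ligand: target present.
        # Both bands visible is treated as positive (assumption).
        return StripResult.POSITIVE
    if first_band:
        # Intact construct held the detectable ligand at the first band.
        return StripResult.NEGATIVE
    # No visible band anywhere: treated as a failed run (assumption).
    return StripResult.INVALID
```

For example, `interpret_strip(first_band=True, second_band=False)` yields `StripResult.NEGATIVE`, matching the statement that a signal at the first band indicates the absence of the target.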
In one embodiment, the first end of the lateral flow device comprises two detection constructs, and each of the two detection constructs comprises an RNA or DNA oligonucleotide comprising a first molecule on the first end and a second molecule on the second end. The first molecule and the second molecule may be linked by an RNA or DNA linker.
In one embodiment, the first molecule on the first end of the first detection construct may be FAM and the second molecule on the second end of the first detection construct may be biotin, or vice versa. In one embodiment, the first molecule on the first end of the second detection construct may be FAM and the second molecule on the second end of the second detection construct may be digoxigenin (DIG), or vice versa.
In one embodiment, the first end may comprise three detection constructs, wherein each of the three detection constructs comprises an RNA or DNA oligonucleotide comprising a first molecule on the first end and a second molecule on the second end. In particular embodiments, the first and second molecules on the detection constructs comprise, respectively, Tye 665 and Alexa 488; Tye 665 and FAM; and Tye 665 and digoxigenin (DIG).
In one embodiment, the first end of the lateral flow device comprises two or more TnpB systems. In one embodiment, such a TnpB system may include a TnpB protein and one or more nucleic acid component molecules configured to bind to one or more target sequences.
Sample
When using a detection system with a lateral flow substrate, the sample to be screened is loaded to the sample loading portion of the lateral flow substrate. The sample must be a liquid sample or a sample dissolved in a suitable solvent (typically aqueous). The liquid sample reconstitutes the detection reagent so that a detection reaction can occur. The liquid sample begins to flow from the sample portion of the substrate to the first capture area and the second capture area.
The sample used with the present invention may be a biological or environmental sample, such as a surface sample, a fluid sample or a food sample (fresh fruits or vegetables, meats). Samples may also include a beverage sample, a paper surface, a fabric surface, a metal surface, a wood surface, a plastic surface, a soil sample, a fresh water sample, a wastewater sample, a brine sample, a sample exposed to atmospheric air or another gas, or a combination thereof. For example, household/commercial/industrial surfaces made of any material including, but not limited to, metal, wood, plastic, rubber, etc. may be swabbed and tested for contaminants. Soil samples may be tested for the presence of pathogenic bacteria or parasites or other microorganisms for environmental purposes and/or for human, animal or plant disease testing. The cleanliness and safety and/or potability of a water sample (such as a fresh water sample, a wastewater sample or a brine sample) can be evaluated, for example, by detecting the presence of Cryptosporidium parvum, Giardia lamblia, or other microbial contaminants. In further embodiments, the biological sample may be obtained from sources including, but not limited to: tissue samples, saliva, blood, plasma, serum, stool, urine, sputum, mucus, lymph, synovial fluid, cerebrospinal fluid, ascites, pleural effusion, seroma, pus, bile, aqueous or vitreous humor, transudate, exudate, or swabs of skin or mucosal surfaces. In some particular embodiments, the environmental or biological sample may be a crude sample and/or one or more target molecules may not be purified or amplified from the sample prior to application of the method. Identification of microorganisms may be useful and/or desirable for any number of applications, and thus any type of sample from any source deemed suitable by those skilled in the art may be used in accordance with the present invention.
In certain embodiments, the methods and systems can be used for direct detection from patient samples. In one aspect, the methods and systems may also allow direct detection from patient samples using visual readings to further facilitate field deployability. In one aspect, the field-deployable version may include, for example, a lateral flow device and system and/or colorimetric detection as described herein. The methods and systems can be used to differentiate between various viral species and strains and to identify clinically relevant mutations, which is important during viral outbreaks such as the coronavirus (2019-nCoV) outbreak. In one aspect, the sample is from a nasopharyngeal swab or a saliva sample. See, e.g., Wyllie et al., "Saliva is more sensitive for SARS-CoV-2 detection in COVID-19 patients than nasopharyngeal swabs," DOI: 10.1101/2020.04.16.20067835.
Method for detecting and/or quantifying target nucleic acid
In one embodiment, the invention provides a method for detecting a target nucleic acid in a sample. Such methods may include contacting a sample with a first end of a lateral flow device as described herein. The first end of the lateral flow device may include a sample loading portion, wherein the sample flows from the sample loading portion of the substrate to the first capture zone and the second capture zone and generates a detectable signal.
The positive detectable signal may be any signal that can be detected using optical, fluorescent, chemiluminescent, electrochemical, or other detection methods known in the art, as described elsewhere herein.
In one embodiment, the lateral flow device may be capable of detecting two different target nucleic acid sequences. In one embodiment, such detection of two different target nucleic acid sequences can occur simultaneously.
In one embodiment, the absence of a target nucleic acid sequence in the sample triggers a detectable fluorescent signal at each capture region. In such cases, the absence of any target nucleic acid sequence in the sample may cause a detectable signal to appear at the first and second capture regions.
In one embodiment, a lateral flow device as described herein is capable of detecting three different target nucleic acid sequences. In particular embodiments, when the target nucleic acid sequence is not present in the sample, a fluorescent signal may be generated at each of the three capture regions. In such exemplary embodiments, when the sample contains one or more target nucleic acid sequences, there may be no fluorescent signal at the capture region corresponding to the target nucleic acid sequences.
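By way of non-limiting illustration, the signal-off readout described above for a multi-target lateral flow device can be sketched in Python as follows. The function name, zone labels, and target names are hypothetical and serve only as placeholders; the sketch assumes that each capture region has already been scored as showing or lacking a fluorescent signal.

```python
from typing import Dict, List


def targets_detected(zone_signal: Dict[str, bool],
                     zone_target: Dict[str, str]) -> List[str]:
    """Return the targets whose capture region shows NO signal.

    In the signal-off format described in the text, a fluorescent signal
    at a capture region means the corresponding target is absent, and a
    missing signal means the target is present.
    """
    detected = []
    for zone, target in zone_target.items():
        if not zone_signal[zone]:  # no fluorescence -> target present
            detected.append(target)
    return detected


# Hypothetical three-zone device (labels are placeholders).
zone_target = {"zone1": "target_A", "zone2": "target_B", "zone3": "target_C"}

all_lit = {"zone1": True, "zone2": True, "zone3": True}
one_dark = {"zone1": True, "zone2": False, "zone3": True}

print(targets_detected(all_lit, zone_target))   # []
print(targets_detected(one_dark, zone_target))  # ['target_B']
```

The mapping from capture region to target is kept separate from the scored signals so that the same interpretation logic could be reused for two-target or three-target devices.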
The sample to be screened is loaded into the sample loading portion of the lateral flow substrate. The sample must be a liquid sample or a sample dissolved in a suitable solvent (typically aqueous). The liquid sample reconstitutes the system reagents so that a detection reaction can occur. The intact reporter construct is bound at the first capture region through binding between the first binding agent and the first molecule. Likewise, the detection agent begins to collect at the first capture region by binding to the second molecule on the intact reporter construct. If the target molecule is present in the sample, the cleavage activity of the TnpB protein is activated. When the activated TnpB protein comes into contact with the bound reporter construct, the reporter construct is cleaved, releasing the second molecule to flow further along the lateral flow substrate to the second capture region. The released second molecule is then captured at the second capture region by binding to the second binding agent, where additional detection agent may also accumulate by binding to the second molecule. Thus, if no target molecule is present in the sample, a detectable signal will appear at the first capture region, whereas if a target molecule is present in the sample, a detectable signal will appear at the second capture region.
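The two-region readout logic just described can be summarized in a minimal Python sketch. The function name and the returned strings are hypothetical. Treating a strip with no signal at either region as invalid is an assumption made for illustration and is not stated in the text.

```python
def read_two_region_strip(first_region_signal: bool,
                          second_region_signal: bool) -> str:
    """Interpret a two-capture-region lateral flow strip.

    Per the text: intact reporter is held at the first capture region
    (signal there -> no target), while cleaved reporter releases the
    second molecule, which is captured at the second capture region
    (signal there -> target present).
    """
    if second_region_signal:
        return "target present"
    if first_region_signal:
        return "target absent"
    # Assumption: no signal anywhere suggests the strip did not run.
    return "invalid"


print(read_two_region_strip(True, False))   # target absent
print(read_two_region_strip(False, True))   # target present
print(read_two_region_strip(False, False))  # invalid
```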
In one embodiment, the invention provides a method for quantifying a target nucleic acid in a sample comprising partitioning the sample or set of samples into one or more individual discrete volumes comprising two or more TnpB systems as described herein. The method may comprise amplifying one or more target molecules in the sample or set of samples using HDA, as described herein. The method may further comprise incubating the sample or set of samples under conditions sufficient to allow binding of the nucleic acid component molecules to one or more target molecules. The method may further comprise activating the TnpB protein via binding of the nucleic acid component molecule to one or more target molecules. Activation of the TnpB protein can result in modification of the detection construct such that a detectable positive signal is produced. The method may further comprise detecting one or more detectable positive signals, wherein detection indicates the presence of one or more target molecules in the sample. The method may further comprise comparing the intensity of the one or more signals to a control to quantify nucleic acids in the sample. The steps of amplifying, incubating, activating and detecting may all be performed in the same individual discrete volume.
An "individual discrete volume" is a discrete volume or space, such as a container, receptacle, or other defined volume or space, that can be defined by properties that prevent and/or inhibit migration of nucleic acids and reagents necessary to perform the methods disclosed herein, e.g., a volume or space defined by physical properties (such as walls, for example the walls of a well or tube, or the surface of a droplet), which may be impermeable or semi-permeable, or as defined by other means, such as chemical, diffusion rate limited, electromagnetic, or light illumination, or any combination thereof. By "diffusion rate limited" (e.g., diffusion-defined volumes) is meant spaces that are accessible only to certain molecules or reactions because diffusion constraints effectively define a space or volume, as is the case with two parallel laminar streams, where diffusion limits the migration of a target molecule from one stream to the other. By "chemically" defined volume or space is meant a space where only certain target molecules can exist because of their chemical or molecular properties (such as size); for example, gel beads may exclude certain substances from entering the beads but not others, such as by the surface charge of the bead, the matrix size, or other physical properties of the bead that allow selection of the substances that may enter the interior of the bead. By "electromagnetically" defined volume or space is meant a space in which the electromagnetic properties (such as charge or magnetism) of the target molecule or its support can be used to define certain regions in space (such as capture of magnetic particles within a magnetic field or directly on a magnet). By "optically" defined volume is meant any region of space that can be defined by illuminating it with visible, ultraviolet, infrared or other wavelengths of light such that only target molecules within the defined space or volume can be labeled.
One advantage of using non-walled or semi-permeable discrete volumes is that some reagents (such as buffers, chemical activators or other agents) can pass through the discrete volume, while other materials such as target molecules can be retained in the discrete volume or space. Typically, the discrete volume will comprise a fluid medium (e.g., an aqueous solution, oil, buffer, and/or medium capable of supporting cell growth) suitable for labeling the target molecule with an indexable nucleic acid identifier under conditions that allow for labeling. Exemplary discrete volumes or spaces that can be used in the disclosed methods include droplets (e.g., microfluidic droplets and/or emulsion droplets), hydrogel beads or other polymer structures (e.g., polyethylene glycol diacrylate beads or agarose beads), tissue slides (e.g., formalin-fixed paraffin-embedded tissue slides having specific areas, volumes or spaces defined by chemical, optical, or physical means), microscope slides whose areas are defined by depositing reagents in an ordered array or random pattern, tubes (such as centrifuge tubes, microcentrifuge tubes, test tubes, cuvettes, conical tubes, etc.), vials (such as glass vials, plastic vials, ceramic vials, conical flasks, scintillation vials, etc.), wells (such as wells in a plate), plates, pipettes, or pipette tips, etc. In certain exemplary embodiments, the individual discrete volumes are wells of a microplate. In certain exemplary embodiments, the microplate is a 96-well, 384-well, or 1536-well microplate.
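The comparison of signal intensity to a control could be carried out in many ways. The following Python sketch shows one possibility under stated assumptions: it assumes that controls of known concentration are run alongside the sample and that signal intensity varies linearly with the logarithm of concentration over the working range. Neither assumption is stated in the text, and the function names and numerical values are hypothetical placeholders rather than measured data.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(concentrations: Sequence[float],
                       intensities: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of intensity = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(intensity: float, slope: float,
                           intercept: float) -> float:
    """Invert the fitted curve to estimate an unknown concentration."""
    return 10 ** ((intensity - intercept) / slope)


# Placeholder control series (arbitrary units), not experimental data.
control_conc = [1.0, 10.0, 100.0, 1000.0]
control_signal = [100.0, 200.0, 300.0, 400.0]

slope, intercept = fit_standard_curve(control_conc, control_signal)
unknown = estimate_concentration(250.0, slope, intercept)
print(round(unknown, 2))  # 31.62
```

Estimates are only meaningful within the concentration range spanned by the controls; extrapolating beyond that range is not supported by this sketch.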
Sample incubation in an amplification step or extraction step as described herein may be performed using a heat source known in the art. Advantageously, the heat source may be a readily commercially available heat source that does not require complex instrumentation. Exemplary heating systems may include heating blocks, incubators, and/or water baths, the temperature of which is maintained by commercially available sous vide cookers. In this way, sample diagnostics can be performed without the need for expensive and proprietary equipment that is primarily found in diagnostic laboratories and hospital environments.
In certain exemplary embodiments, paper-based microfluidics may be used for transfer of samples or reagents. For example, a test strip printed with a wax barrier at a defined distance from the end of a paper dipstick may be used to define the volume of reagent or sample to be transferred. For example, a wax barrier may be printed on a paper dipstick to define a microliter volume such that when the dipstick is dipped into a volume of reagent or sample, only a microliter of the reagent or sample is absorbed onto the dipstick. The dipstick may then be placed in a second reagent mixture, into which the reagent or sample will diffuse. Such an assembly allows the assay to be prepared and used without special equipment such as pipettes.
Optical means may be used to assess the presence and level of a given target molecule. In one embodiment, the optical sensor detects unmasking of the fluorescent masking agent. In one embodiment, the device of the present invention may comprise a hand-held portable device for diagnostic reading of assays (see, e.g., Vashist et al., Commercial Smartphone-Based Devices and Smart Applications for Personalized Healthcare Monitoring and Management, Diagnostics 2014, 4(3), 104-128; mReader from Mobile Assay; and Holomic rapid diagnostic test reader).
As noted herein, certain embodiments allow detection via colorimetric changes, which have certain attendant benefits when used in POC situations and/or in resource-poor environments where access to more complex detection devices to read signals may be limited. However, the portable embodiments disclosed herein may also be coupled with a handheld spectrophotometer capable of detecting signals outside the visible range. Examples of hand-held spectrophotometer devices that may be used in combination with the present invention are described in Das et al., "Ultra-portable, wireless smartphone spectrophotometer for rapid, non-destructive testing of fruit ripeness," Nature Scientific Reports. 2016, 6:32504, DOI:10.1038/srep32504. Finally, in one embodiment utilizing quantum dot-based masking constructs, hand-held UV light or other suitable means may be used to detect the signal, owing to the near-complete quantum yield provided by the quantum dots.
Amplification of target molecules
The step of amplifying the one or more target molecules may comprise an amplification system known in the art. In one embodiment, the amplification is isothermal. In certain exemplary embodiments, the target RNA and/or DNA may be amplified prior to activating the TnpB protein. Any suitable RNA or DNA amplification technique may be used. In certain embodiments, the amplification step may take less than about 1 hour, 50 minutes, 40 minutes, 30 minutes, 25 minutes, 20 minutes, or 15 minutes, depending on the sample, starting concentration, and nature of the amplification used.
In one embodiment, the amplification of the target molecule and the detection of the target molecule may be performed in a single reaction, e.g., a "one-pot" method. General guidance on one-pot methods can be found in: Gootenberg et al., Science. 2018 Apr 27; 360(6387):439-444 (detecting multiple targets in a single reaction, generally using Cas13, Cas12a, and Csm6, and in particular performing DNA extraction in the sample for use as input for direct detection in Fig. S33); and Ding et al., "All-in-One Dual CRISPR-Cas12a (AIOD-CRISPR) Assay: A Case for Rapid, Ultrasensitive and Visual Detection of Novel Coronavirus SARS-CoV-2 and HIV Virus," doi:10.1101/2020.03.19.998724, bioRxiv preprint (one-pot method for target-specific nucleic acid detection using a pair of crRNAs and dual CRISPR-Cas12a detection).
In certain exemplary embodiments, the RNA or DNA amplification is isothermal amplification. In certain exemplary embodiments, isothermal amplification may be nucleic acid sequence-based amplification (NASBA), recombinase polymerase amplification (RPA), loop-mediated isothermal amplification (LAMP), strand displacement amplification (SDA), helicase-dependent amplification (HDA), or nicking enzyme amplification reaction (NEAR). In certain exemplary embodiments, non-isothermal amplification methods may be used, including, but not limited to, PCR, multiple displacement amplification (MDA), rolling circle amplification (RCA), ligase chain reaction (LCR), or ramification amplification method (RAM).
Amplification of the target molecule may be optimized by methods as described in detail herein. In one aspect, the design of the primers used in the amplification is optimized. In a particular aspect, isothermal amplification is used with the TnpB system. In either case, design considerations may follow a rational design to optimize the reaction. Optimization of the methods as disclosed herein may include first screening primers to identify one or more sets of primers that function well for a particular target, TnpB protein, and/or reaction. Once the primers are screened, a magnesium concentration titration can be performed to determine the optimal magnesium concentration for higher signal-to-noise readings. In one example, additives suited to the specific primers, target, TnpB protein, temperature, and concentrations of other additives in the reaction can be identified. Optimization can be performed to reduce the number of steps and buffer exchanges that must occur in the reaction, simplify the reaction, and reduce the risk of contamination in the transfer step. Similarly, optimizing salt levels and the type of salt used may further facilitate and optimize the one-pot assays disclosed herein.
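As a non-limiting illustration of the magnesium titration step, the following Python sketch selects the magnesium concentration that gives the highest ratio of signal with target to signal without target. The function name, the use of a simple ratio as the signal-to-noise measure, and all numerical values are assumptions introduced here for illustration; the values are placeholders and not experimental results.

```python
from typing import Dict, Tuple


def best_magnesium_mM(titration: Dict[float, Tuple[float, float]]) -> float:
    """Pick the Mg2+ concentration with the highest signal-to-noise ratio.

    `titration` maps a magnesium concentration (mM) to a pair
    (signal with target, signal in the no-target control).
    """
    def signal_to_noise(mg_mM: float) -> float:
        with_target, no_target = titration[mg_mM]
        return with_target / no_target

    return max(titration, key=signal_to_noise)


# Placeholder readings (arbitrary fluorescence units), not measured data.
titration = {
    6.0: (1200.0, 300.0),   # ratio 4.0
    8.0: (2000.0, 250.0),   # ratio 8.0
    10.0: (2200.0, 550.0),  # ratio 4.0
}

print(best_magnesium_mM(titration))  # 8.0
```

Note that the concentration with the highest raw signal (10 mM in the placeholder data) is not necessarily the one with the best signal-to-noise ratio, which is why the no-target control is included in the comparison.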
Loop-mediated isothermal amplification
In certain exemplary embodiments, loop-mediated isothermal amplification (LAMP) reactions can be used to amplify target nucleic acids; such reactions encompass both LAMP and RT-LAMP reactions. LAMP can be performed using a four-primer system in combination with a polymerase to carry out isothermal nucleic acid amplification. Notomi et al., Nucleic Acids Res. 2000, 28, 12; Nagamine et al., Molecular and Cellular Probes (2002) 16, 223-229, doi:10.1006/mcpr.2002.0415. When LAMP is performed using the four-primer system, two loop-forming inner primers (denoted FIP and BIP) and two outer primers (F3 and B3) are provided. The inner primers each contain two different sequences, one for priming in the first stage of amplification and the other for self-priming in subsequent stages of amplification. The two outer primers initiate strand displacement of the nucleic acid strands initiated from the FIP and BIP primers, thereby generating loop formation and strand displacement nucleic acid synthesis using the provided polymerase. LAMP can be performed with two to six primers, ranging from only the two loop-forming primers to the addition of two further primers, LF and LB, together with the two outer primers and two inner primers. The LAMP technique advantageously has high specificity and can operate over a range of pH and temperatures. In a preferred aspect, LAMP is an isothermal reaction between about 45 ℃ to 75 ℃, 55 ℃ to 70 ℃, or 60 ℃ to 65 ℃. Colorimetric LAMP (Y. Zhang et al., doi:10.1101/2020.02.26.20028373) and RT-LAMP (Lamb et al., doi:10.1101/2020.02.19.20025155; and Yang et al., doi:10.1101/2020.03.02.20030130) assays have been developed for detection of COVID-19; these references are incorporated herein by reference in their entirety.
In one embodiment, the LAMP reagents may include Bst 2.0 + RTx or Bst 3.0 from New England Biolabs. In one embodiment, the LAMP reagents may provide colorimetric or fluorescent detection. Detection of LAMP products can be accomplished using colorimetric tools such as hydroxynaphthol blue (see, e.g., Goto, M. et al., Colorimetric detection of loop-mediated isothermal amplification reaction by using hydroxy naphthol blue. Biotechniques, 2009. 46(3): pages 167-72), leuco triphenylmethane dyes (see, e.g., Miyamoto, S. et al., Method for colorimetric detection of double-stranded nucleic acid using leuco triphenylmethane dye. Anal Biochem, 2015. 473: pages 28-33), and pH-sensitive dyes (see, e.g., Tanner, N.A., Y. Zhang, and T.C. Evans, Jr., Visual detection of isothermal nucleic acid amplification using pH-sensitive dyes. Biotechniques, 2015. 58(2): pages 59-68); and fluorescence detection (see, e.g., Yu et al., Clinical Chemistry, hvaa102, doi:10.1093/clinchem/hvaa102, May 12, 2020), including the use of quenching probes (see, e.g., Shirato et al., J Virol Methods. 2018 Aug; 258:41-48. doi:10.1016/j.jviromet.2018.05.006).
In one aspect, the LAMP primer set is designed to amplify one or more target sequences, thereby generating an amplicon comprising the one or more target sequences. Optionally, the primers may comprise a barcode that may be designed as described elsewhere herein. The polymerase and, optionally, a reverse transcriptase (in the case of RT-LAMP) are added, and the reaction is incubated at a temperature sufficient for LAMP amplification, for example 50 ℃ to 72 ℃, and more preferably 55 ℃ to 65 ℃. Preferably, the enzyme utilized in the LAMP reaction is thermostable. LAMP primer sites have been designed; see, e.g., Park et al., "Development of Reverse Transcription Loop-Mediated Isothermal Amplification Assays Targeting SARS-CoV-2," J. of Mol. Diag. (2020). Optionally, a control template is further provided with the sample, which control template may differ from the target sequence but share a primer binding site. In an exemplary embodiment, visual readout of the test results may be accomplished using a commercially available lateral flow substrate (e.g., a commercially available paper substrate).
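The nested temperature ranges recited above for LAMP can be encoded in a short Python sketch that reports the narrowest recited range containing a planned incubation temperature. The function and constant names are hypothetical, and the sketch treats the word "about" as exact range bounds, which is a simplification.

```python
from typing import Optional, Tuple

# Ranges recited in the text, ordered from narrowest to broadest (degrees C).
LAMP_RANGES_C = [(60.0, 65.0), (55.0, 70.0), (45.0, 75.0)]


def narrowest_lamp_range(temp_c: float) -> Optional[Tuple[float, float]]:
    """Return the narrowest recited range containing temp_c, or None."""
    for low, high in LAMP_RANGES_C:
        if low <= temp_c <= high:
            return (low, high)
    return None


print(narrowest_lamp_range(63.0))  # (60.0, 65.0)
print(narrowest_lamp_range(50.0))  # (45.0, 75.0)
print(narrowest_lamp_range(37.0))  # None
```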
NASBA
In certain exemplary embodiments, the RNA or DNA amplification is NASBA, which is initiated from reverse transcription of the target RNA by a sequence-specific reverse primer to produce an RNA/DNA duplex. RNase H is then used to degrade the RNA template, allowing a forward primer containing a promoter, such as the T7 promoter, to bind and initiate extension of the complementary strand, thereby producing a double stranded DNA product. RNA polymerase promoter-mediated transcription of the DNA template then creates copies of the target RNA sequence. Importantly, each new target RNA can be detected by the nucleic acid component molecules, thereby further increasing the sensitivity of the assay. Binding of the nucleic acid component molecule to the target RNA then results in activation of the TnpB protein and the method proceeds as described above. Another advantage of NASBA reactions is the ability to be performed under moderate isothermal conditions, e.g., at about 41 ℃, making them suitable for deployment in systems and devices for early direct detection at sites remote from the clinical laboratory.
RPA
In certain other exemplary embodiments, a Recombinase Polymerase Amplification (RPA) reaction can be used to amplify the target nucleic acid. The RPA reaction employs a recombinase that is able to pair sequence-specific primers with homologous sequences in duplex DNA. If target DNA is present, DNA amplification is initiated and no other sample manipulation, such as thermal cycling or chemical melting, is required. The entire RPA amplification system is stable as a dry formulation and can be safely transported without refrigeration. The RPA reaction may also be carried out at isothermal temperatures, with an optimum reaction temperature of 37-42 ℃. Sequence specific primers are designed to amplify a sequence comprising the target nucleic acid sequence to be detected. In certain exemplary embodiments, an RNA polymerase promoter, such as a T7 promoter, is added to one of the primers. This produces an amplified double stranded DNA product comprising the target sequence and the RNA polymerase promoter. After or during the RPA reaction, an RNA polymerase is added that will produce RNA from the double stranded DNA template. The amplified target RNA can then in turn be detected by the TnpB system. In this way, the embodiments disclosed herein can be used to detect target DNA. The RPA reaction can also be used to amplify target RNA. The target RNA is first converted to cDNA using reverse transcriptase, and then second strand DNA synthesis is performed, at which point the RPA reaction proceeds as described above.
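The addition of an RNA polymerase promoter to one of the RPA primers can be illustrated with a minimal Python sketch. The promoter string below is a commonly cited minimal T7 promoter sequence followed by a GGG transcription start and is supplied here as an assumption, since the text does not recite a specific sequence. The function name and the example primer are hypothetical placeholders.

```python
# Assumption: commonly cited minimal T7 promoter plus GGG start, 5'->3'.
T7_PROMOTER = "TAATACGACTCACTATAGGG"


def add_promoter_to_primer(primer: str, promoter: str = T7_PROMOTER) -> str:
    """Prepend an RNA polymerase promoter to the 5' end of a DNA primer."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty DNA sequence (A/C/G/T)")
    return promoter + primer


# Placeholder target-specific primer, not taken from the text.
forward_primer = "ATGCGTACCGTTAGCATCGGATTACGCTAGC"
tagged = add_promoter_to_primer(forward_primer)

print(tagged.startswith(T7_PROMOTER))  # True
print(len(tagged))                     # 51
```

Placing the promoter at the 5' end leaves the 3' end of the primer free to anneal to the target, so that the amplified double stranded product carries the promoter upstream of the target sequence.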
Transposase-based amplification
Embodiments disclosed herein provide systems and methods for isothermal amplification of a target nucleic acid sequence by contacting an oligonucleotide containing the target nucleic acid sequence with a transposon complex. The oligonucleotide may be a single- or double-stranded RNA, DNA or RNA/DNA hybrid oligonucleotide. The transposon complex comprises a transposase and a transposon sequence comprising one or more RNA polymerase promoters. The transposase facilitates insertion of the one or more RNA polymerase promoters into the oligonucleotide. An RNA polymerase can then transcribe the target nucleic acid sequence from the inserted one or more RNA polymerase promoters. One advantage of this system is that there is no need to heat or melt the double stranded DNA template, as the RNA polymerase requires a double stranded template. Such isothermal amplification is rapid and simple, requiring no complex and expensive denaturing and cooling equipment. In certain exemplary embodiments, the RNA polymerase promoter is a native or modified T7 RNA polymerase promoter.
As used herein, the term "transposon" refers to a nucleic acid segment recognized by a transposase or integrase, and is an essential component of a functional nucleic acid-protein complex (i.e., a transposome) that is capable of transposition. The term "transposase" as used herein refers to an enzyme that is a component of a functional nucleic acid-protein complex capable of transposition and mediates transposition. The term "transposase" also refers to integrases from retrotransposons or retroviral sources. Transposon complexes are formed between a transposase and a double stranded DNA fragment containing specific binding sequences for the enzyme, referred to as "transposon ends". The sequence of the transposon binding site may be modified with other bases at certain positions without affecting the ability of the transposon complex to form a stable structure that can be efficiently transposed into the target DNA.
In embodiments provided herein, a transposon complex may comprise a transposase and a transposon sequence comprising one or more RNA polymerase promoters. The term "promoter" refers to a region of DNA that is involved in binding RNA polymerase to initiate transcription. In particular embodiments, the RNA polymerase promoter may be a T7 RNA polymerase promoter. The T7 RNA polymerase promoter may be inserted into a double stranded polynucleotide using a transposase. In one embodiment, insertion of the T7 RNA polymerase promoter into the oligonucleotide may be random.
The transposition frequency of most transposons is very low, and they use complex mechanisms to limit activity. For example, Tn5 transposase utilizes suboptimal DNA binding sequences, and the C-terminal end of the transposase interferes with DNA binding. Reznikoff and colleagues have carefully characterized the mechanisms involved in Tn5 transposition. Tn5 transposes by a cut-and-paste mechanism. The transposon has two pairs of 19 bp elements that are utilized by the transposase: outer elements (OE) and inner elements (IE). One transposase monomer binds to each of the two elements utilized. Once a monomer is bound to each end of the transposon, the two monomers dimerize, thereby forming a synaptic complex. Vectors with a donor backbone of at least 200 bp but less than 1000 bp are most transposable in bacteria. Transposon cleavage occurs by trans catalysis and only occurs when the monomer bound to each DNA end is in the synaptic complex. Tn5 transposes with relaxed target site selection and can therefore be inserted into target DNA with little target sequence specificity.
Natural downregulation of Tn5 transposition can be overcome by selection of hyperactive transposases and by optimization of transposase binding elements [York et al. 1998]. A mosaic element (ME), made by modification of three bases of the wild-type OE, resulted in a 50-fold increase in transposition events in bacterial and cell-free systems. The combined effect of the optimized ME and a hyperactive mutant transposase was estimated to result in a 100-fold increase in transposition activity. Goryshin et al. showed that preformed Tn5 transposition complexes can be functionally introduced into bacteria or yeast by electroporation [Goryshin et al. 2000]. Linearizing the DNA so that the inverted repeats are precisely positioned at both ends of the transposon allowed Goryshin and colleagues to bypass the cleavage step of transposition, thereby enhancing transposition efficiency.
In one embodiment, a transposase may be used to tagment an oligonucleotide sequence that includes a target sequence. The term "tagmentation" refers to a step in the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) as described (see Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y., Greenleaf, W.J., Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature Methods 2013; 10(12): 1213-1218). In particular, a hyperactive Tn5 transposase loaded in vitro with adaptors for high-throughput DNA sequencing can simultaneously fragment and tag a genome with sequencing adaptors. In one embodiment, the adaptors are compatible with the methods described herein.
In one embodiment, the transposase may be a Tn5 transposase. In one embodiment, the transposase may be a variant of a Tn5 transposase or an engineered transposase. The transposase may be engineered using any method known in the art. The engineered transposase may be optimized to function at a temperature in the range of 30 ℃ to 45 ℃, 35 ℃ to 40 ℃, or at any temperature in between. The engineered transposase can be optimized to release from the oligonucleotide at a faster rate than the wild type transposase.
In one embodiment, the transposase may be a Tn5 transposase, Mu transposase, or Tn7 transposase. In vitro transposition efficiency may vary depending on the transposon system used. Generally, Tn5 and Mu transposases achieve higher levels of transposition efficiency. In one embodiment, the insertion may be random. In one embodiment, the insertion may occur in a GC-rich region of the target sequence.
In one embodiment, the transposon sequence may comprise two 19 base pair mosaic end (ME) Tn5 transposase recognition sequences. Tn5 transposases will typically transpose any DNA sequence contained between such short 19 base pair ME Tn5 transposase recognition sequences.
In one embodiment, the use of a transposase allows for the separation of double stranded polynucleotides in the absence of heating or melting. The method may be adapted according to the method described in PCT/US2019/039195, which is incorporated herein by reference.
Nicking enzyme dependent amplification
In embodiments of the invention, nicking enzyme-based amplification may be used. The nicking enzyme may be a TnpB protein. Thus, the introduction of nicks into double stranded DNA can be programmable and sequence specific. In embodiments of the invention, two guides can be designed to target opposite strands of a dsDNA target. According to the invention, the nicking enzyme may be TnpB, or may be any CRISPR protein (such as Cpf1, C2c1, or Cas9), or any ortholog thereof, that cleaves or is engineered to cleave a single strand of a DNA duplex. In particular embodiments, TnpB is used for nicking enzyme dependent amplification. The nicked strand may then be extended by a polymerase. In one embodiment, the position of the nicks is selected such that the polymerase extends the strand toward the central portion of the target duplex DNA between the nick sites. In one embodiment, primers capable of hybridizing to the extended strands are included in the reaction, followed by further polymerase extension of the primers to regenerate two dsDNA fragments: a first dsDNA comprising a first strand TnpB guide site or both a first strand and a second strand TnpB guide site, and a second dsDNA comprising a second strand TnpB guide site or both a first strand and a second strand TnpB guide site. These fragments continue to be nicked and extended in a cycling reaction that exponentially amplifies the target region between the nicking sites. Alternatively, CRISPR-Cas proteins may be used instead of TnpB for nickase-based amplification, and such methods are known in the art.
Amplification may be isothermal, and the temperature may be selected. In one embodiment, the amplification is performed rapidly at 37 degrees Celsius. In other embodiments, the temperature of isothermal amplification may be chosen by selecting a polymerase (e.g., Bsu, Bst, Phi29, Klenow fragment, etc.) that operates at a different temperature.
Thus, whereas nicking isothermal amplification techniques that use a nicking enzyme with fixed sequence preference (e.g., in a nicking enzyme amplification reaction, or NEAR) require denaturation of the original dsDNA target to allow annealing and extension of primers that add the nicking substrate to the ends of the target, using a reprogrammable nicking enzyme (where the nicking site can be programmed via an RNA molecule) means that no denaturation step is required, so that the entire reaction can be truly isothermal. This also simplifies the reaction, since the primers that add the nicking substrate are different from the primers used later in the reaction, meaning that NEAR requires two sets of primers (i.e., 4 primers) whereas Cpf1 nicking amplification requires only one set of primers (i.e., two primers). This makes Cpf1 nicking amplification simpler and easier to operate, without the need for complex instrumentation to perform denaturation and then cooling to the isothermal temperature.
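The exponential character of the nick-and-extend cycling reaction can be illustrated with an idealized Python sketch. The model below assumes that each round multiplies the number of target copies by a constant factor of (1 + efficiency); this simple growth model, the function names, and the numerical values are assumptions introduced for illustration and do not describe measured kinetics of any TnpB or CRISPR nickase.

```python
import math


def copies_after(initial_copies: float, rounds: int,
                 efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of nick-and-extend rounds."""
    return initial_copies * (1.0 + efficiency) ** rounds


def rounds_to_reach(initial_copies: float, threshold_copies: float,
                    efficiency: float = 1.0) -> int:
    """Smallest number of rounds needed to reach a copy-number threshold."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    if threshold_copies <= initial_copies:
        return 0
    ratio = threshold_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


# Placeholder numbers: 10 starting copies, perfect doubling per round.
print(copies_after(10, 3))        # 80.0
print(rounds_to_reach(10, 1e6))   # 17
```

Under perfect doubling, 16 rounds give 655,360 copies from 10 starting copies and 17 rounds give 1,310,720, which is why 17 rounds is the first to exceed one million in this idealized model.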
In one aspect, isothermal amplification reagents may be utilized with the thermostable TnpB protein. The combination of thermostable proteins and isothermal amplification reagents can be used to further improve the reaction time for detection and diagnosis.
Thus, in certain exemplary embodiments, the systems disclosed herein can include amplification reagents. Described herein are different components or reagents that can be used for nucleic acid amplification. For example, an amplification reagent as described herein may include a buffer, such as Tris buffer. Tris buffers may be used at any concentration suitable for the desired application or use, including for example, but not limited to, concentrations of 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 11mM, 12mM, 13mM, 14mM, 15mM, 25mM, 50mM, 75mM, 1M, etc. One skilled in the art will be able to determine the appropriate concentration of buffer such as Tris for use with the present invention.
Salts such as magnesium chloride (MgCl 2), potassium chloride (KCl) or sodium chloride (NaCl) may be included in amplification reactions such as PCR in order to improve the amplification of the nucleic acid fragments. Although salt concentration will depend on the particular reaction and application, in one embodiment, a particular size of nucleic acid fragment may produce optimal results at a particular salt concentration. Larger products may require varying salt concentrations, typically lower salts, in order to produce the desired results, while amplification of smaller products may produce better results at higher salt concentrations. Those skilled in the art will appreciate that the presence and/or concentration of salts and changes in salt concentration can alter the stringency of biological or chemical reactions, and thus any salt that provides suitable conditions for the reactions of the invention and described herein can be used. In certain preferred embodiments, when polynucleotide extraction beads such as magnetic beads are utilized, plant QuickExtract solution can be combined with KCl buffer for use in an optimized detection method according to the present disclosure.
Other components of biological or chemical reactions may include cell lysis components to disrupt or lyse cells so that the materials therein can be analyzed. The cell lysis component may include, but is not limited to, detergents and salts as described above, such as NaCl, KCl, ammonium sulfate [(NH4)2SO4] or others. Detergents suitable for use in the present invention may include Triton X-100, sodium dodecyl sulfate (SDS), CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), ethyltrimethylammonium bromide, and nonylphenoxy polyethoxyethanol (NP-40). The concentration of the detergent may depend on the particular application and may be specific to the reaction in some cases. The amplification reaction may include dNTPs and nucleic acid primers used at any concentration suitable for the present invention, such as concentrations including, but not limited to, 100nM, 150nM, 200nM, 250nM, 300nM, 350nM, 400nM, 450nM, 500nM, 550nM, 600nM, 650nM, 700nM, 750nM, 800nM, 850nM, 900nM, 950nM, 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 20mM, 30mM, 40mM, 50mM, 60mM, 70mM, 80mM, 90mM, 100mM, 150mM, 200mM, 250mM, 300mM, 350mM, 400mM, 450mM, 500mM, etc. Likewise, the polymerases useful according to the present invention can be any specific or universal polymerase known in the art and useful in the present invention, including Taq polymerase, Q5 polymerase, and the like.
In one embodiment, the amplification reagents as described herein may be suitable for hot-start amplification. In one embodiment, hot-start amplification may be useful to reduce or eliminate dimerization of adapter molecules or oligonucleotides, or otherwise prevent unwanted amplification products or artifacts and obtain optimal amplification of the desired product. Many of the components described herein for amplification may also be used for hot-start amplification. In one embodiment, reagents or components suitable for use with hot-start amplification may be used in place of one or more of the composition components. For example, a polymerase or other reagent that exhibits the desired activity at a particular temperature or other reaction conditions may be used. In one embodiment, reagents designed or optimized for hot-start amplification may be used, e.g., the polymerase may be activated after transposition or after reaching a specific temperature. Such polymerases may be antibody-based or aptamer-based. The polymerases as described herein are known in the art. Examples of such reagents may include, but are not limited to, hot-start polymerases, hot-start dNTPs, and photocaged dNTPs. Such agents are known and available in the art. One skilled in the art will be able to determine the optimal temperature for each reagent.
Amplification of nucleic acids may be performed using a particular thermal cycling machine or apparatus, and may be performed in a single reaction or in batches, such that any desired number of reactions may be performed simultaneously. In one embodiment, amplification may be performed using a microfluidic or robotic device, or amplification may be performed using a manual change in temperature to achieve the desired amplification. In one embodiment, optimization may be performed to obtain optimal reaction conditions for a particular application or material. Those skilled in the art will understand and be able to optimize the reaction conditions to obtain sufficient amplification.
In one embodiment, detection of DNA with the methods or systems of the invention entails transcription of (amplified) DNA into RNA prior to detection.
It is apparent that the detection methods of the present invention may involve various combined nucleic acid amplification and detection procedures. The nucleic acid to be detected may be any naturally occurring or synthetic nucleic acid, including but not limited to DNA and RNA, which may be amplified by any suitable method to provide an intermediate that can be detected. Detection of the intermediate may be performed by any suitable method, including, but not limited to, binding and activation of the TnpB protein, which produces a detectable signal moiety by direct or collateral activity.
Helicase dependent amplification
In helicase-dependent amplification, a helicase is used to unwind double stranded nucleic acids to produce templates for primer hybridization and subsequent primer extension. This procedure utilizes two oligonucleotide primers, each hybridized to the 3' end of the sense strand containing the target sequence or the antisense strand containing the reverse complement target sequence. The HDA reaction is a general method of helicase-dependent nucleic acid amplification.
When this method is combined with a TnpB detection system, the target nucleic acid can be amplified by opening R-loops in the target nucleic acid using first and second TnpB complexes. The first strand and the second strand of the target nucleic acid can thus be unwound using a helicase, allowing the primers and polymerase to bind and extend the DNA under isothermal conditions.
The term "helicase" herein refers to any enzyme capable of enzymatically unwinding double stranded nucleic acids. For example, helicases are enzymes that are present in all organisms and in all processes involving nucleic acids, such as replication, recombination, repair, transcription, translation, and RNA splicing (Kornberg and Baker, DNA Replication, W.H. Freeman and Company (2nd edition (1992)), in particular chapter 11). Any helicase that translocates in the 5' to 3' direction or the opposite 3' to 5' direction along DNA or RNA can be used in this embodiment of the invention. This includes recombinant forms of helicases or naturally occurring enzymes obtained from prokaryotes, viruses, archaea and eukaryotes, and analogues or derivatives having the specified activity. Examples of naturally occurring DNA helicases described by Kornberg and Baker in chapter 11 of DNA Replication, W.H. Freeman and Company (2nd edition (1992)), include Escherichia coli helicases I, II, III and IV, Rep, DnaB, PriA, PcrA, T4 Gp41 helicase, T4 Dda helicase, T7 Gp4 helicase, SV40 large T antigen, and yeast RAD. Additional helicases that may be used in HDA include RecQ helicase (Harmon and Kowalczykowski, J. Biol. Chem. 276:232-243 (2001)), thermostable UvrD helicases from Thermoanaerobacter tengcongensis (T. tengcongensis) (disclosed in example XII of the present invention) and Thermus thermophilus (T. thermophilus) (Collins and McCarthy, Extremophiles 7:35-41 (2003)), thermostable DnaB helicase from Thermus aquaticus (T. aquaticus) (Kaplan and Steitz, J. Biol. Chem. 274:6889-6897 (1999)), and MCM helicases from archaea and eukaryotes (Grainge et al., Nucleic Acids Res. 31:4888-4898 (2003)).
The traditional definition of a helicase is an enzyme that catalyzes the separation/unzipping/unwinding of the helical structure of a nucleic acid duplex (DNA, RNA, or hybrid) into single stranded components, using the hydrolysis of a nucleoside triphosphate (NTP), such as ATP, as an energy source. However, it should be noted that not all helicases meet this definition. A more general definition is that helicases are motor proteins that move along single-stranded or double-stranded nucleic acids (typically in a certain direction, 3' to 5' or 5' to 3', or both), i.e., translocases, which may or may not unwind the duplex nucleic acid they encounter. Furthermore, some helicases only bind and "melt" duplex nucleic acid structures, without significant translocase activity.
Helicases are present in all organisms and play a role in all aspects of nucleic acid metabolism. Helicases are classified based on amino acid sequence, directionality, oligomeric state, and nucleic acid type and structural preference. The most common classification method was developed based on the presence of certain amino acid sequences (called motifs). According to this classification, helicases are divided into 6 superfamilies: SF1, SF2, SF3, SF4, SF5, and SF6. SF1 and SF2 helicases do not form a ring structure around the nucleic acid, whereas SF3 through SF6 helicases do. Superfamily classification is not dependent on classical taxonomies.
DNA helicases are responsible for catalyzing the unwinding of double stranded DNA (dsDNA) molecules into their corresponding single stranded DNA (ssDNA) forms. Although structural and biochemical studies have demonstrated how various helicases can translocate directionally on ssDNA, consuming one ATP per nucleotide, the mechanism of nucleic acid unwinding and how unwinding activity is regulated remain unclear and controversial (T.M. Lohman, E.J. Tomko, C.G. Wu, "Non-hexameric DNA helicases and translocases: mechanisms and regulation," Nat Rev Mol Cell Biol 9:391-401 (2008)). Since helicases can potentially unwind all nucleic acids they encounter, understanding how their unwinding activity is regulated may make it possible to harness helicase function for biotechnology applications.
The term "HDA" refers to helicase-dependent amplification, an in vitro method of amplifying nucleic acids by unwinding double stranded nucleic acids using a helicase preparation to create templates for primer hybridization and subsequent primer extension. This procedure utilizes two oligonucleotide primers, each hybridized to the 3' end of the sense strand containing the target sequence or the antisense strand containing the reverse complement target sequence. The HDA reaction is a general method of helicase-dependent nucleic acid amplification.
The present invention includes the use of any suitable helicase known in the art. These include, but are not necessarily limited to, UvrD helicase, CRISPR-Cas3 helicase, Escherichia coli helicase I, Escherichia coli helicase II, Escherichia coli helicase III, Escherichia coli helicase IV, Rep helicase, DnaB helicase, PriA helicase, PcrA helicase, T4 Gp41 helicase, T4 Dda helicase, SV40 large T antigen, yeast RAD helicase, RecD helicase, RecQ helicase, thermostable Thermus flavus UvrD helicase, thermostable Thermus DnaB helicase, Dda helicase, papillomavirus E1 helicase, archaeal MCM helicase, eukaryotic MCM helicase, and T7 Gp4 helicase.
In a particularly preferred embodiment, the helicase comprises a super mutation. In certain embodiments, although the mutation has been described for Escherichia coli, the mutations are generated by sequence alignment (e.g., D409A/D410A of TteUvrD) and result in thermophilic enzymes that operate at lower temperatures, such as 37 ℃, which is advantageous for the amplification methods and systems described herein. In one embodiment, the super mutation is an aspartic acid to alanine mutation, the position of which is based on sequence alignment. In one embodiment, the super mutant helicase is selected from the group consisting of WP_003870487.1 Thermoanaerobacter ethanolicus 403/404, WP_049660019.1 Bacillus species FJAT-27231 407/408, WP_034654680.1 Bacillus megaterium 415/416, WP_095390358.1 Bacillus simplex 407/408, and WP_055343022.1 Paeniclostridium sordellii 402/403.
Incubation
Detection and/or extraction methods using the systems disclosed herein may include incubating the sample or set of samples under conditions sufficient to allow binding of the nucleic acid component molecules to one or more target molecules. Extraction may include incubating the sample under conditions sufficient to allow release of viral RNA present in the sample, which may include incubating at 22 ℃ to 60 ℃ for 30 to 70 minutes or at 90 ℃ to 100 ℃ for about 10 minutes.
In certain exemplary embodiments, the incubation time for amplification and detection in the present invention may be reduced. The assay may be performed within the period of time required for the enzymatic reaction to occur. One skilled in the art can perform biochemical reactions within 5 minutes (e.g., a 5 minute ligation). The incubation may be performed at one or more temperatures for a time between about 10 minutes and 90 minutes, preferably less than 90 minutes, 75 minutes, 60 minutes, 45 minutes, 30 minutes, 25 minutes, 20 minutes, 15 minutes or 10 minutes, depending on the sample, reagents and components of the system. In one embodiment, incubation for amplification is performed at one or more temperatures between about 20 ℃ and 80 ℃, in one embodiment at about 37 ℃. In one embodiment, incubation for amplification is performed at one or more temperatures between about 55 ℃ and 65 ℃, or between about 59 ℃ and 61 ℃, in one embodiment at about 60 ℃.
Activation
In certain exemplary embodiments, activation of the TnpB protein occurs through binding of the TnpB complex to one or more target molecules via a nucleic acid component molecule, wherein activation of the TnpB protein results in modification of the detection construct such that a detectable signal is generated.
Detecting a signal
Detection may include visual observation of a positive signal relative to a control. Detection may include signal loss or signal presence at one or more capture areas, such as colorimetric or fluorescent detection. In certain exemplary embodiments, further modifications may be introduced that further amplify the detectable positive signal. For example, collateral activation of the activated TnpB protein can be used to generate a second target or additional nucleic acid component molecule sequences or both. In one exemplary embodiment, the reaction solution will contain a second target that is spiked in at a high concentration. The second target may be different from the first target (i.e., the target that the assay is designed to detect) and may be common to all reaction volumes in some cases. The second nucleic acid component molecule sequence for the second target may be protected, for example, by a secondary structural feature (such as a hairpin with an RNA loop), so that it is unable to bind the second target or the TnpB protein. The protecting group is cleaved by the activated TnpB protein (i.e., after activation by formation of a complex with the first target in solution); the released sequence then forms a complex with free TnpB protein in solution, which is activated by the spiked-in second target. In certain other exemplary embodiments, a similar concept is used with a free second nucleic acid component molecule sequence and a protected second target. Cleavage of the protecting group from the second target then allows additional complexes of TnpB protein, nucleic acid component sequence, and second target sequence to form. In yet another exemplary embodiment, activation of the TnpB protein by the first target can be used to cleave a protected or circularized primer, which is then released to perform an isothermal amplification reaction, such as those disclosed herein, on a template encoding the second nucleic acid component sequence, the second target, or both.
Subsequent transcription of this amplified template will produce additional second nucleic acid component molecule sequences and/or second target sequences, followed by additional collateral activation of the TnpB protein.
Quantification
In a particular method, the intensity of one or more signals is compared to a control to quantify nucleic acids in a sample. The term "control" refers to any reference standard suitable for providing a comparison with the expression product in the test sample. In one embodiment, the control comprises obtaining a "control sample" from which the expression product level is detected and compared to the expression product level from the test sample. Such control samples may include any suitable sample, including but not limited to a sample of a control patient whose results are known (which may be a stored sample or previous sample measurements); normal tissue, fluid, or cells isolated from a subject (such as a normal patient or a patient having a condition of interest).
The signal intensity is "significantly" higher or lower than the normal intensity if the amount by which the signal is greater or less than the normal or control level, respectively, exceeds the standard error of the assay used for the assessment, and preferably by at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 300%, 350%, 400%, 500%, 600%, 700%, 800%, 900%, 1000% or more. Alternatively, the signal may be considered "significantly" above or below the normal and/or control signal if the amount is at least about 2%, preferably at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 155%, 160%, 165%, 170%, 175%, 180%, 185%, 190%, 195%, 2-fold, 3-fold, 4-fold, 5-fold or more, or any range therebetween (such as 5%-100%), higher or lower, respectively. Such significant modulation values can be applied to any of the metrics described herein, such as altered expression levels, altered activity, altered biomarker inhibition, altered binding of the test agent, and the like.
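By way of non-limiting illustration, the comparison described above may be sketched in Python as follows. The function name `is_significant`, its parameters, and the default 20% threshold are assumptions chosen for this sketch (the 20% value is simply one of the listed thresholds); the disclosure does not prescribe any particular implementation.

```python
def is_significant(signal, control, standard_error, min_percent_change=20.0):
    """Illustrative check of whether a signal differs 'significantly' from a control.

    A signal is flagged when its difference from the control exceeds the
    assay's standard error AND the relative change meets a chosen threshold.
    All names and the default threshold are assumptions for this sketch.
    """
    if control == 0:
        raise ValueError("control signal must be non-zero to compute a percent change")
    difference = signal - control
    percent_change = 100.0 * difference / abs(control)
    exceeds_error = abs(difference) > standard_error
    exceeds_threshold = abs(percent_change) >= min_percent_change
    return exceeds_error and exceeds_threshold, percent_change


# Hypothetical readings in arbitrary fluorescence units.
flag_up, change_up = is_significant(signal=150.0, control=100.0, standard_error=5.0)
flag_small, change_small = is_significant(signal=110.0, control=100.0, standard_error=5.0)
flag_down, change_down = is_significant(signal=70.0, control=100.0, standard_error=5.0)
```

Because the function reports a signed percent change, the same check may be applied whether the positive readout is a gain or a loss of signal relative to the control.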
In one embodiment, the detectable positive signal may be a loss of fluorescent signal or colorimetric signal relative to a control, as described herein. In one embodiment, a detectable positive signal may be detected on a lateral flow device, as described herein.
Application of detection method
Systems and methods for detecting and diagnosing microorganisms, including bacterial, fungal and viral microorganisms, can be devised. In one aspect, the system may include multiplex detection of multiple variants of a viral infection, including coronaviruses, different viruses that may be related coronaviruses or respiratory viruses, or combinations thereof. In embodiments, assays can be performed on a variety of viruses and viral infections (including acute respiratory infections) using the disclosure detailed herein. The system may comprise two or more TnpB systems as described elsewhere herein to effect multiplexing to detect a variety of respiratory tract infections or viral infections, including coronaviruses. Coronaviruses are a family of positive-sense single-stranded RNA viruses that infect a wide variety of animals and humans. SARS-CoV is one type of coronavirus, as is MERS-CoV. Detection of one or more coronaviruses is contemplated, including 2019-nCoV. The sequences of 2019-nCoV are available from GISAID accession numbers EPI_ISL_402124 and EPI_ISL_402127-402130 and are described in DOI 10.1101/2020.01.22.914952. Further SARS-CoV-2 sequences deposited in the GISAID platform include EPI_ISL_402119-402121 and EPI_ISL_402123-402124; see also GenBank accession number MN908947.3.
Target molecule detection may include two or more detection systems utilizing the TnpB protein. The TnpB protein may preferably be thermostable, allowing multiplexed designs that use different TnpB proteins with different sequence specificities, operating temperatures or cleavage preferences.
Multiplex embodiments can be designed to track one or more variants of a coronavirus, or one or more variants of coronaviruses including SARS-CoV-2, in combination with other viruses such as human respiratory syncytial virus, Middle East respiratory syndrome (MERS) coronavirus, severe acute respiratory syndrome-related (SARS) coronavirus, and influenza. In embodiments, the assay may be performed in multiplex to detect multiple variants of coronaviruses, different viruses that may be related coronaviruses or respiratory viruses, or combinations thereof. In one aspect, each assay may be performed in an individual discrete volume. An "individual discrete volume" is a discrete volume or space, such as a container, receptacle, or other defined volume or space that may be defined by features that prevent and/or inhibit migration of nucleic acids and reagents necessary to perform the methods disclosed herein, e.g., a volume or space defined by physical features (such as walls, e.g., the walls of a well, or the surface of a droplet), which may be impermeable or semi-permeable, or as defined by other means, such as chemical, diffusion rate limited, electromagnetic, or light illumination, or any combination thereof. By "diffusion rate limited" (e.g., diffusion limited volume) is meant a space that is accessible only to certain molecules or reactions because diffusion constraints effectively define the space or volume, as is the case with two parallel laminar flows, where diffusion will limit migration of target molecules from one flow to the other.
By "chemically" defined volume or space is meant a space where only certain target molecules may be present due to their chemical or molecular properties (such as size), wherein for example a gel bead may exclude certain substances from entering the bead, such as by the surface charge of the bead, the size of the matrix, or other physical properties of the bead that allow selection of the substances that may enter the interior of the bead. By "electromagnetically" defined volume or space is meant a space in which the electromagnetic properties (such as charge or magnetism) of the target molecule or its support can be used to define certain regions in space (such as capture of magnetic particles within a magnetic field or directly on a magnet). By "optically" defined volume is meant any region of space that can be defined by illuminating it with visible, ultraviolet, infrared or other wavelengths of light such that only target molecules within the defined space or volume can be labeled. One advantage of using non-walled or semi-permeable discrete volumes is that some reagents (such as buffers, chemical activators or other agents) can be passed through the discrete volume, while other materials such as target molecules can be retained in the discrete volume or space. Typically, the discrete volume will comprise a fluid medium (e.g., an aqueous solution, oil, buffer, and/or medium capable of supporting cell growth) suitable for labeling the target molecule with the indexable nucleic acid identifier under conditions that allow for labeling.
Exemplary discrete volumes or spaces that can be used in the disclosed methods include droplets (e.g., microfluidic droplets and/or emulsion droplets), hydrogel beads or other polymer structures (e.g., polyethylene glycol diacrylate beads or agarose beads), tissue slides (e.g., formalin-fixed paraffin-embedded tissue slides having specific areas, volumes or spaces defined by chemical, optical, or physical means), microscope slides whose areas are defined by depositing reagents in an ordered array or random pattern, tubes (such as centrifuge tubes, microcentrifuge tubes, test tubes, cuvettes, conical tubes, etc.), bottles (such as glass bottles, plastic bottles, ceramic bottles, Erlenmeyer flasks, scintillation vials, etc.), wells (such as wells in a plate), plates, pipettes, or pipette tips, etc. In certain exemplary embodiments, the individual discrete volumes are wells of a microplate. In certain exemplary embodiments, the microplate is a 96-well, 384-well, or 1536-well microplate.
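By way of non-limiting illustration, the assignment of samples to the wells of a 96-well, 384-well, or 1536-well microplate may be sketched in Python as follows. The row and column layouts (8 x 12, 16 x 24, 32 x 48) and the row labels A through Z followed by AA through AF reflect common laboratory convention rather than anything stated in this disclosure, and the names `PLATE_SHAPES`, `row_label`, `well_ids`, and `assign_samples` are hypothetical.

```python
import string

# Assumed plate geometries (rows, columns); common convention, not from the disclosure.
PLATE_SHAPES = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def row_label(index):
    """Convert a zero-based row index to a row label (A..Z, then AA..AF)."""
    letters = string.ascii_uppercase
    if index < 26:
        return letters[index]
    return letters[index // 26 - 1] + letters[index % 26]


def well_ids(n_wells):
    """List well identifiers in row-major order for a supported plate size."""
    rows, cols = PLATE_SHAPES[n_wells]
    return [f"{row_label(r)}{c}" for r in range(rows) for c in range(1, cols + 1)]


def assign_samples(samples, n_wells=96):
    """Map each sample to its own well, one individual discrete volume per assay."""
    wells = well_ids(n_wells)
    if len(samples) > len(wells):
        raise ValueError("more samples than wells on the plate")
    return dict(zip(wells, samples))


# Hypothetical sample names.
layout = assign_samples(["swab_01", "swab_02", "no_template_control"], n_wells=96)
```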
In certain exemplary embodiments, the systems, devices, and methods disclosed herein relate to detecting the presence of one or more microbial agents in a sample (such as a biological sample obtained from a subject). In certain exemplary embodiments, the microorganism may be a bacterium, fungus, yeast, protozoan, parasite, or virus. Thus, the methods disclosed herein may be adapted for use in, or used in combination with, other methods requiring rapid identification of microbial species, monitoring of the presence of microbial proteins (antigens), antibodies, or antibody genes, detection of certain phenotypes (e.g., bacterial resistance), monitoring of disease progression and/or outbreaks, and antibiotic screening. Because the embodiments disclosed herein offer rapid and sensitive diagnostics, can detect microbial species types down to a single nucleotide difference, and can be deployed as POC devices, they can be used to guide therapeutic regimens, such as the selection of appropriate antibiotics or antiviral drugs. Embodiments disclosed herein may also be used to screen environmental samples (air, water, surfaces, food, etc.) for the presence of microbial contamination.
A method of identifying a microbial species (such as a bacterial, viral, fungal, yeast or parasitic species, etc.) is disclosed. Particular embodiments disclosed herein describe methods and systems that will identify and differentiate microbial species within a single sample or within multiple samples, allowing for the identification of many different microorganisms. The methods of the invention allow detection of pathogens by detecting the presence of a target nucleic acid sequence in a biological or environmental sample and distinguishing between two or more species of one or more organisms in the sample, such as bacteria, viruses, yeasts, protozoa and fungi, or combinations thereof. A positive signal obtained from the sample indicates the presence of the microorganism. By using more than one effector protein, multiple microorganisms can be identified simultaneously using the methods and systems of the present invention, wherein each effector protein targets a particular microbial target sequence. In this way, a multi-level analysis can be performed on a particular subject, such as a subject suffering from an unknown respiratory infection, suffering from a symptom of coronavirus, or an individual at risk of or having been exposed to coronavirus, in which any number of microorganisms can be detected at once. In one embodiment, simultaneous detection of multiple microorganisms may be performed using a set of probes that identify one or more species of microorganisms.
Microorganism detection
In one embodiment, a method for detecting a microorganism in a sample is provided, comprising dispensing a sample or a set of samples into one or more individual discrete volumes comprising a TnpB system as described herein; incubating the sample or set of samples under conditions sufficient to allow binding of the one or more nucleic acid component molecules to the one or more microorganism-specific targets; activating the TnpB protein via binding of one or more nucleic acid component molecules to one or more target molecules, wherein activating the TnpB protein results in modification of the RNA-based masking construct such that a detectable positive signal is generated; and detecting the detectable positive signal, wherein detection of the detectable positive signal indicates the presence of one or more target molecules in the sample. The one or more target molecules may be mRNA, gDNA (coding or non-coding), tRNA or rRNA, comprising target nucleotide sequences that can be used to distinguish two or more microbial species/strains from each other. The nucleic acid component molecules can be designed to detect a target sequence. Embodiments disclosed herein may also utilize certain steps to improve hybridization between nucleic acid component molecules and target RNA sequences. Methods for enhancing ribonucleic acid hybridization are disclosed in WO 2015/085194 entitled "Enhanced Methods of Ribonucleic Acid Hybridization", which is incorporated herein by reference. The microorganism-specific target may be RNA, DNA, or a protein. The DNA methods may also include the use of DNA primers incorporating an RNA polymerase promoter as described herein. If the target is a protein, the method will utilize the aptamers and steps specific to protein detection described herein.
Detection of single nucleotide variants
In one embodiment, one or more identified target sequences can be detected using a nucleic acid component molecule specific for and binding to a target sequence described herein. The systems and methods of the present invention can even distinguish single nucleotide polymorphisms present between different microbial species, and thus the use of multiple nucleic acid component molecules according to the present invention can further expand or improve the number of target sequences that can be used to distinguish species. For example, in one embodiment, one or more nucleic acid component molecules can distinguish between a species, genus, family, order, class, phylum, kingdom, or phenotype of a microorganism, or a combination thereof.
rRNA sequence-based detection
In certain exemplary embodiments, the devices, systems, and methods disclosed herein can be used to distinguish between multiple microorganism species in a sample. In certain exemplary embodiments, the identification can be based on ribosomal RNA sequences, including 16S, 23S, and 5S subunits. Methods for identifying related rRNA sequences are disclosed in U.S. patent application publication No. 2017/0029872. In certain exemplary embodiments, a set of nucleic acid component molecules can be designed to distinguish each species by a variable region that is unique to each species or strain. The nucleic acid component molecules can also be designed to target RNA genes that differentiate microorganisms at the genus, family, order, class, phylum, kingdom level, or a combination thereof. In certain exemplary embodiments using amplification, a set of amplification primers can be designed to the constant regions flanking a variable region of the ribosomal RNA sequence, and the nucleic acid component molecules are designed to distinguish each species by the variable internal region. In certain exemplary embodiments, the primers and nucleic acid component molecules can be designed to conserved and variable regions in the 16S subunit, respectively. Other genes or genomic regions that uniquely vary between species or subsets of species, such as the RecA gene family or RNA polymerase β subunits, may also be used. Other suitable phylogenetic markers and methods for their identification are discussed, for example, in Wu et al., arXiv:1307.8690 [q-bio.GN].
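By way of non-limiting illustration, locating a variable internal region flanked by conserved regions in a set of pre-aligned sequences may be sketched in Python as follows. The short sequences are toy data invented for this sketch and are not real 16S sequences; the function names `conserved_mask`, `variable_window`, and `species_signatures`, and the flank length, are assumptions. Primers would be designed to the conserved flanks and nucleic acid component molecules to the variable window.

```python
def conserved_mask(aligned):
    """Return one boolean per alignment column: True if all sequences share the same base."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        len({seq[i] for seq in aligned}) == 1 and aligned[0][i] != "-"
        for i in range(length)
    ]


def variable_window(aligned, flank=6):
    """Return (start, end) spanning all variable columns, or None.

    The window is accepted only if at least `flank` conserved columns lie on
    each side, leaving room for primers in the constant regions.
    """
    mask = conserved_mask(aligned)
    variable = [i for i, conserved in enumerate(mask) if not conserved]
    if not variable:
        return None
    start, end = variable[0], variable[-1] + 1
    if start < flank or end + flank > len(mask):
        return None
    return start, end


def species_signatures(named_sequences, flank=6):
    """Map each name to its sequence within the variable window."""
    window = variable_window(list(named_sequences.values()), flank)
    if window is None:
        return {}
    start, end = window
    return {name: seq[start:end] for name, seq in named_sequences.items()}


# Toy aligned sequences (invented for illustration only).
toy = {
    "species_A": "ACGTACTTGAGGATCC",
    "species_B": "ACGTACTCGAGGATCC",
    "species_C": "ACGTACTTGCGGATCC",
}
signatures = species_signatures(toy)
```

In this sketch a set of signatures is useful for discrimination only if every species receives a distinct signature, which can be checked by comparing the number of unique signature strings with the number of species.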
In certain exemplary embodiments, the method or diagnostic is designed to screen microorganisms at multiple phylogenetic and/or phenotypic levels simultaneously. For example, a method or diagnostic may include using a plurality of TnpB systems with different nucleic acid component molecules. The first set of nucleic acid component molecules can distinguish, for example, between mycobacteria, gram positive bacteria, and gram negative bacteria. These general categories may be even further subdivided. For example, the nucleic acid component molecules may be designed and used in a method or diagnostic to distinguish between enteric and non-enteric bacteria within gram-negative bacteria. The second set of nucleic acid component molecules can be designed to distinguish microorganisms at the genus or species level. Thus, a matrix can be generated that identifies all mycobacteria, gram positive bacteria, and gram negative bacteria (further divided into enteric and non-enteric), with each genus or species of bacteria falling into one of these categories being identified in a given sample. The foregoing is for exemplary purposes only. Other ways of classifying other microorganism types are also contemplated and will follow the general structure described above.
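By way of non-limiting illustration, combining positive signals from a tiered panel into a classification may be sketched in Python as follows. The guide names in `PANEL` are hypothetical placeholders, the two genera are included only as familiar examples, and the function `classify` is an assumption of this sketch rather than a method taken from the disclosure.

```python
# Hypothetical panel: each nucleic acid component (guide) name maps to the
# classification path it reports, from the broadest category to the narrowest.
PANEL = {
    "g_myco": ("mycobacteria",),
    "g_gram_pos": ("gram-positive",),
    "g_gram_neg": ("gram-negative",),
    "g_enteric": ("gram-negative", "enteric"),
    "g_non_enteric": ("gram-negative", "non-enteric"),
    "g_escherichia": ("gram-negative", "enteric", "Escherichia"),
    "g_staphylococcus": ("gram-positive", "Staphylococcus"),
}


def classify(positive_guides):
    """Return the most specific classification paths supported by the positive guides.

    A path is dropped when a longer positive path extends it, so that only the
    deepest level reached in each branch of the matrix is reported.
    """
    paths = [PANEL[g] for g in positive_guides if g in PANEL]
    deepest = [
        p for p in paths
        if not any(q != p and q[:len(p)] == p for q in paths)
    ]
    return sorted(set(deepest))


single = classify({"g_gram_neg", "g_enteric", "g_escherichia"})
mixed = classify({"g_gram_neg", "g_gram_pos", "g_staphylococcus"})
```

In the second call the sample is reported as containing an unresolved gram-negative organism together with a gram-positive organism resolved to the genus level, reflecting the matrix structure described above.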
Drug resistance screening
In certain exemplary embodiments, the devices, systems, and methods disclosed herein can be used to screen for microbial genes of interest, such as antibiotic and/or antiviral resistance genes. The nucleic acid component molecules can be designed to distinguish between known genes of interest. The embodiments disclosed herein for detecting such genes can then be used to screen samples, including clinical samples. The ability to screen for drug resistance at POC would be of great benefit for the selection of appropriate therapeutic regimens. In certain exemplary embodiments, the antibiotic resistance gene is a carbapenemase, including KPC, NDM1, CTX-M15, OXA-48. Other antibiotic resistance genes are known and can be found, for example, in the Comprehensive Antibiotic Resistance Database (Jia et al., "CARD 2017: expansion and model-centric curation of the Comprehensive Antibiotic Resistance Database." Nucleic Acids Research, 45, D566-573).
Ribavirin is a potent antiviral drug that acts against a variety of RNA viruses. Several clinically important viruses have evolved ribavirin resistance, including foot-and-mouth disease virus (doi:10.1128/JVI.03594-13); poliovirus (Pfeiffer and Kirkegaard, PNAS 100(12):7289-7294, 2003); and hepatitis C virus (Pfeiffer and Kirkegaard, J. Virol. 79(4):2346-2355, 2005). Many other persistent RNA viruses, such as hepatitis viruses and HIV, have evolved resistance to existing antiviral drugs: hepatitis B virus (lamivudine, tenofovir, entecavir) doi:10.1002/hep.22900; hepatitis C virus (telaprevir, BILN2061, ITMN-191, SCh, boceprevir, AG-021541, ACH-806) doi:10.1002/hep.22549; and HIV (many drug resistance mutations) hivb. The embodiments disclosed herein can be used to detect such variants, among others.
In addition to drug resistance, there are many clinically relevant mutations that can be detected with the embodiments disclosed herein, such as mutations associated with persistent versus acute infection in LCMV (doi:10.1073/pnas.1019304108) and with increased infectivity of Ebola virus (Diehl et al., Cell 167(4):1088-1098, 2016).
As described elsewhere herein, closely related microbial species (e.g., having only a single nucleotide difference in a given target sequence) can be distinguished by introducing a synthetic mismatch in the nucleic acid component molecules.
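The single-nucleotide discrimination described above can be illustrated with a toy model. The sketch below assumes, purely for illustration, that detection tolerates one mismatch between the nucleic acid component and the target but not two; that tolerance, the sequences, and the function names are hypothetical and are not specified by the disclosure. Under that assumption, a deliberately introduced mismatch leaves the intended target at one mismatch while pushing the single-nucleotide variant to two.

```python
def count_mismatches(guide, target):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(g != t for g, t in zip(guide, target))


def add_synthetic_mismatch(guide, position, alphabet="ACGT"):
    """Replace the base at `position` with a different base."""
    original = guide[position]
    replacement = next(base for base in alphabet if base != original)
    return guide[:position] + replacement + guide[position + 1:]


def is_detected(guide, target, max_tolerated=1):
    """Assumed toy rule: signal only if mismatches <= max_tolerated."""
    return count_mismatches(guide, target) <= max_tolerated


if __name__ == "__main__":
    species_a = "ACGTACGTAC"   # intended target (toy sequence)
    species_b = "ACGTACGTTC"   # differs from species_a at one position
    guide = add_synthetic_mismatch(species_a, position=5)
    print(is_detected(guide, species_a), is_detected(guide, species_b))
```

Without the synthetic mismatch, a guide identical to the first sequence would have zero and one mismatches against the two targets, and both would pass the assumed tolerance.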
Monitoring microbial outbreaks
In one embodiment, the TnpB system or methods of use thereof as described herein can be used to determine the evolution of pathogen outbreaks. The method may comprise detecting one or more target sequences from a plurality of samples from one or more subjects, wherein the target sequences are sequences from a microorganism causing an outbreak. Such methods may further include determining a pattern of pathogen transmission, or a mechanism involving an outbreak of disease caused by the pathogen.
The pattern of pathogen transmission may include continued new transmissions from the natural reservoir of the pathogen, or subject-to-subject transmission (e.g., human-to-human transmission) following a single transmission from the natural reservoir, or a mixture of both. In one embodiment, the pathogen transmission may be bacterial or viral transmission, in which case the target sequence is preferably a microbial genome or fragment thereof. In one embodiment, the pattern of pathogen transmission is the early pattern of pathogen transmission, i.e., the pattern at the beginning of a pathogen outbreak. Determining the pattern of pathogen transmission at the beginning of an outbreak increases the likelihood of stopping the outbreak as early as possible, thereby reducing the likelihood of local and international spread.
Determining a pattern of pathogen transmission may include detecting pathogen sequences according to the methods described herein. Determining the pattern of pathogen transmission may also include detecting shared intra-host variations of the pathogen sequence between subjects and determining whether the shared intra-host variations show temporal patterns. Patterns in observed intra-host and inter-host variation provide important insights into transmission and epidemiology (Gire et al., 2014).
Detection of shared intra-host variations between subjects that show temporal patterns indicates a transmission link between subjects (particularly between humans), because it can be explained by infection of a subject from multiple sources (superinfection), sample contamination, recurrent mutation (with or without balancing selection to reinforce the mutation), or co-transmission of slightly divergent viruses that arose by mutation earlier in the transmission chain (Park et al., Cell 161(7):1516-1526, 2015). Detection of shared intra-host variations between subjects may include detection of intra-host variants located at common single nucleotide polymorphism (SNP) positions. Positive detection of intra-host variants at common SNP positions suggests that superinfection and contamination are the primary explanations for the intra-host variants. Superinfection and contamination can be distinguished from each other based on the frequency with which the SNPs appear as inter-host variants (Park et al., 2015). Otherwise, superinfection and contamination may be ruled out. In the latter case, detection of shared intra-host variations between subjects may also include assessing the frequencies of synonymous and non-synonymous variants and comparing them to each other. A non-synonymous mutation is a mutation that alters an amino acid of a protein, potentially resulting in a biological change in the microorganism that is subject to natural selection. A synonymous substitution does not alter the amino acid sequence. Equal frequencies of synonymous and non-synonymous variants indicate that the intra-host variants are evolving neutrally. If the frequencies of synonymous and non-synonymous variants diverge, the intra-host variants are likely maintained by balancing selection. If the frequencies of synonymous and non-synonymous variants are both low, recurrent mutation is indicated. If the frequencies of synonymous and non-synonymous variants are both high, co-transmission is indicated (Park et al., 2015).
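The interpretive rules in the preceding paragraph can be written as a small decision function. This is a hedged sketch rather than a validated classifier: the numeric thresholds (`low`, `high`, `tolerance`) and the order in which the rules are checked are assumptions introduced here, since the text gives only qualitative criteria.

```python
def interpret_shared_variants(syn_freq, nonsyn_freq, at_common_snp,
                              low=0.05, high=0.5, tolerance=0.1):
    """Map variant frequencies to the qualitative explanations in the text.

    syn_freq / nonsyn_freq: frequencies of synonymous and non-synonymous
    intra-host variants (0..1). Thresholds are illustrative assumptions.
    """
    if at_common_snp:
        # Variants at common SNP positions point to these two explanations.
        return "superinfection or contamination"
    if abs(syn_freq - nonsyn_freq) > tolerance:
        return "balancing selection"
    if syn_freq < low and nonsyn_freq < low:
        return "recurrent mutation"
    if syn_freq > high and nonsyn_freq > high:
        return "co-transmission"
    # Similar frequencies that are neither both low nor both high.
    return "neutral evolution"


if __name__ == "__main__":
    print(interpret_shared_variants(0.30, 0.32, at_common_snp=False))
```

Because "equal", "low", and "high" overlap in the qualitative description, a different ordering of the checks would be an equally defensible reading.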
Like Ebola virus, Lassa virus (LASV) can cause hemorrhagic fever with high case fatality rates. Andersen et al. generated a genomic catalog of almost 200 LASV sequences from clinical and rodent reservoir samples (Andersen et al., Cell 162(4):738-750, August 13, 2015). Andersen et al. showed that whereas the 2013-2015 Ebola virus disease (EVD) epidemic was driven by human-to-human transmission, LASV infections result mainly from reservoir-to-human infections. Andersen et al. elucidated the spread of LASV across West Africa and showed that this migration was accompanied by changes in LASV genome abundance, fatality rates, codon adaptation, and translational efficiency. The method may further comprise phylogenetically comparing a first pathogen sequence to a second pathogen sequence and determining whether there is a phylogenetic link between the first and second pathogen sequences. The second pathogen sequence may be an earlier reference sequence. If there is a phylogenetic link, the method can further comprise rooting the phylogeny of the first pathogen sequence to the second pathogen sequence. Thus, the lineage of the first pathogen sequence can be constructed (Park et al., 2015).
The method may further comprise determining whether a mutation is deleterious or adaptive. Deleterious mutations indicate transmission-impaired viruses and dead-end infections and are therefore typically present only in an individual subject. Mutations unique to an individual subject are those that occur on the external branches of the phylogenetic tree, whereas internal branch mutations are those present in multiple samples (i.e., in multiple subjects). A higher rate of non-synonymous substitution is characteristic of the external branches of the phylogenetic tree (Park et al., 2015).
On the internal branches of the phylogenetic tree, selection has had more opportunity to filter out deleterious mutants. By definition, internal branches have produced multiple descendant lineages and are therefore less likely to include mutations with fitness costs. Thus, a lower rate of non-synonymous substitution is indicative of internal branches (Park et al., 2015).
Synonymous mutations, which may have less impact on fitness, occur at a more comparable frequency on the inner and outer branches (Park et al 2015).
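The contrast between external and internal branches can be illustrated by tallying the non-synonymous fraction in each group. The sketch below uses a simplification stated in the text, namely that a mutation seen in a single sample is treated as external and one seen in multiple samples as internal; a real analysis would assign mutations to branches of an inferred tree. The input format and function name are hypothetical.

```python
def nonsynonymous_fraction_by_branch(mutations):
    """Return the non-synonymous fraction for external and internal mutations.

    mutations: iterable of (sample_ids, is_nonsynonymous) pairs.
    A mutation carried by exactly one sample is counted as external;
    one carried by several samples is counted as internal.
    """
    counts = {"external": [0, 0], "internal": [0, 0]}  # [nonsyn, total]
    for sample_ids, is_nonsynonymous in mutations:
        carriers = set(sample_ids)
        if not carriers:
            raise ValueError("each mutation needs at least one sample")
        branch = "external" if len(carriers) == 1 else "internal"
        counts[branch][0] += int(is_nonsynonymous)
        counts[branch][1] += 1
    return {
        branch: (nonsyn / total if total else None)
        for branch, (nonsyn, total) in counts.items()
    }


if __name__ == "__main__":
    toy = [
        (["s1"], True), (["s2"], True), (["s3"], False),
        (["s1", "s2"], False), (["s2", "s3"], True),
        (["s1", "s2", "s3"], False),
    ]
    print(nonsynonymous_fraction_by_branch(toy))
```

The toy input is invented for the example; it is arranged so that the external group has the higher non-synonymous fraction, matching the pattern the text describes.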
By analyzing sequenced target sequences such as viral genomes, it is possible to discover the mechanisms responsible for the severity of an epidemic, such as during the 2014 Ebola outbreak. For example, Gire et al. made a phylogenetic comparison of the genomes from the 2014 outbreak with all 20 genomes from earlier outbreaks, which suggested that the 2014 West African virus likely spread from central Africa within the past decade. Rooting the phylogeny using divergence from other ebolavirus genomes was problematic (6, 13). However, rooting the tree on the oldest outbreak revealed a strong correlation between sampling date and root-to-tip distance, with a substitution rate of 8 × 10⁻⁴ per site per year (13). This suggests that the lineages of the three most recent outbreaks all diverged from a common ancestor at roughly the same time, around 2004, which supports the hypothesis that each outbreak represents an independent zoonotic event from the same genetically diverse viral population in its natural reservoir. They also found that the 2014 EBOV outbreak might have been caused by a single transmission from the natural reservoir, followed by human-to-human transmission during the outbreak. Their results also suggested that the epidemic in Sierra Leone might have stemmed from the introduction of two genetically distinct viruses from Guinea at about the same time (Gire et al., 2014).
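The correlation between sampling date and root-to-tip distance mentioned above is commonly summarized by a straight-line fit, in which the slope estimates the substitution rate and the x-intercept estimates the date of the root. The following is a minimal ordinary least-squares sketch in plain Python; it is not the method used by Gire et al., and the numbers in the usage example are synthetic values chosen only so the arithmetic is easy to follow.

```python
def root_to_tip_regression(dates, distances):
    """Fit distance = rate * (date - root_date) by ordinary least squares.

    dates: sampling dates as decimal years.
    distances: root-to-tip distances in substitutions per site.
    Returns (rate, root_date); root_date is None if the slope is zero.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    if sxx == 0:
        raise ValueError("sampling dates must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    root_date = -intercept / rate if rate != 0 else None
    return rate, root_date


if __name__ == "__main__":
    # Synthetic data lying exactly on a line (not real measurements).
    rate, root_date = root_to_tip_regression(
        [2005.0, 2010.0, 2014.0], [0.005, 0.010, 0.014]
    )
    print(rate, root_date)
```

In practice the fit is only meaningful when the tree is rooted sensibly, which is the point the passage makes about rooting on the oldest outbreak.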
It was also possible to determine how Lassa virus spread from its point of origin, in particular through human-to-human transmission, and even to retrace the history of this spread back 400 years (Andersen et al., Cell 162(4):738-750, 2015).
In view of the work required during the 2013-2015 EBOV outbreak and the difficulties encountered by medical staff at the site of the outbreak, and more generally, the method of the present invention allows sequencing to be performed using fewer selected probes, so that sequencing can be accelerated and the time from sample collection to results shortened. Furthermore, kits and systems may be designed for use in the field, so that a patient can be diagnosed readily without the need to send or ship samples to another part of the country or the world.
In any of the above methods, the target sequence or fragment thereof may be sequenced using any of the sequencing methods described above. Furthermore, sequencing the target sequence or fragment thereof may be near real-time sequencing. The target sequence or fragment thereof may be sequenced according to previously described methods (Experimental Procedures: Matranga et al., 2014; and Gire et al., 2014). Sequencing a target sequence or fragment thereof may include parallel sequencing of multiple target sequences. Sequencing the target sequence or fragment thereof may include Illumina sequencing.
Analyzing the target sequence or fragment thereof hybridized to one or more selected probes may be an identification analysis, wherein hybridization of the selected probes to the target sequence or fragment thereof indicates the presence of the target sequence within the sample.
Currently, primary diagnosis is based on the patient's symptoms. However, various diseases may share the same symptoms, so that diagnosis relies heavily on statistics. For example, malaria triggers flu-like symptoms: headache, fever, shivering, joint pain, vomiting, hemolytic anemia, jaundice, hemoglobin in the urine, retinal damage, and convulsions. These symptoms are also common in septicemia, gastroenteritis, and viral diseases. Among the latter, Ebola hemorrhagic fever has the following symptoms: fever, sore throat, muscle pain, headache, vomiting, diarrhea, rash, decreased liver and kidney function, and internal and external bleeding.
When a patient presents at a medical facility, for example in tropical Africa, the basic diagnosis will be malaria, because malaria is statistically the most probable disease in that region of Africa. The patient is consequently treated for malaria even though the patient may not actually have contracted the disease, and the patient ends up not being treated correctly. This lack of correct treatment can be life-threatening, especially when the disease the patient has contracted progresses rapidly. It may be too late by the time medical staff realize that the treatment is ineffective, reach the correct diagnosis, and administer the appropriate treatment.
The method of the present invention provides a solution to this situation. Because the number of nucleic acid component molecules can be significantly reduced, selected probes can be provided on a single chip divided into groups, each group being specific for one disease, so that a plurality of diseases, such as viral infections, can be diagnosed simultaneously. With the invention, more than 3 diseases, preferably more than 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 diseases, preferably the diseases that occur most frequently in the population of a given geographical area, can be diagnosed simultaneously on a single chip. Since each group of selected probes is specific for one of the diagnosed diseases, a more accurate diagnosis can be made, thereby reducing the risk of administering the wrong treatment to the patient.
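The grouping of probes by disease on a single chip can be sketched as a simple readout routine. This is an illustration under stated assumptions: the panel contents, the probe identifiers, and the rule that a disease is called when at least half of its probes are positive (`min_fraction`) are hypothetical and are not specified in the disclosure.

```python
def read_chip(panel, positive_probes, min_fraction=0.5):
    """Return {disease: fraction of its probes that were positive}.

    panel: {disease: [probe_id, ...]}, one probe group per disease.
    positive_probes: probe ids that gave a signal on the chip.
    Only diseases whose positive fraction reaches min_fraction are returned.
    """
    positives = set(positive_probes)
    calls = {}
    for disease, probes in panel.items():
        group = set(probes)
        if not group:
            continue  # an empty probe group cannot support a call
        fraction = len(group & positives) / len(group)
        if fraction >= min_fraction:
            calls[disease] = fraction
    return calls


if __name__ == "__main__":
    # Hypothetical three-disease panel with made-up probe ids.
    panel = {
        "malaria": ["m1", "m2", "m3", "m4"],
        "ebola": ["e1", "e2"],
        "lassa": ["l1", "l2"],
    }
    print(read_chip(panel, ["m1", "m2", "m3", "e1"]))
```

Because every group is evaluated in the same pass, a co-infection appears as two calls from one sample rather than requiring a second test.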
In other cases, a disease such as a viral infection may occur without any symptoms, or may have caused symptoms that faded before the patient presented to medical staff. In such cases, either the patient does not seek any medical assistance, or the diagnosis is complicated by the absence of symptoms on the day of presentation.
The invention can also be used with other methods for diagnosing diseases, identifying pathogens, and optimizing therapy based on nucleic acid (e.g., mRNA in crude, unpurified samples) detection.
The method of the present invention also provides a powerful tool for addressing this situation. Because multiple sets of selected nucleic acid component molecules, each set specific for one of the most common diseases occurring in the population of a given area, are contained in a single diagnostic, medical staff need only bring a biological sample collected from the patient into contact with the chip. Reading the chip reveals the diseases the patient has contracted.
In some cases, the patient presents to medical staff for diagnosis of particular symptoms. The method of the invention makes it possible not only to identify which disease causes these symptoms, but also to determine at the same time whether the patient is suffering from another disease of which he or she is unaware.
This information may be critical when searching for the mechanisms of an outbreak. Indeed, groups of patients carrying identical viruses also show temporal patterns, suggesting subject-to-subject transmission links.
Exemplary microorganisms
Embodiments disclosed herein can be used to detect a variety of different microorganisms. The term microorganism as used herein includes bacteria, fungi, protozoa, parasites and viruses.
Bacteria
An example list of the types of microorganisms that can be detected using embodiments disclosed herein is provided below. In certain exemplary embodiments, the microorganism is a bacterium. Examples of bacteria that can be detected according to the disclosed methods include, but are not limited to, any one or more (or any combination thereof) of the following: Acinetobacter baumannii, Actinobacillus species, Actinomycetes, Actinomyces species (such as Actinomyces israelii and Actinomyces naeslundii), Aeromonas species (such as Aeromonas hydrophila, Aeromonas veronii and Aeromonas caviae), Anaplasma phagocytophilum, Anaplasma marginale, Alcaligenes xylosoxidans, Acinetobacter baumannii, Actinobacillus actinomycetemcomitans, Bacillus species (such as Bacillus anthracis, Bacillus cereus, Bacillus subtilis, Bacillus thuringiensis and Bacillus stearothermophilus), Bacteroides species (such as Bacteroides fragilis), Bartonella species (such as Bartonella bacilliformis and Bartonella henselae), Bifidobacterium species, Bordetella species (such as Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica), Borrelia species (such as Borrelia recurrentis and Borrelia burgdorferi), Brucella species (such as Brucella abortus, Brucella canis, Brucella melitensis and Brucella suis), Burkholderia species (such as Burkholderia pseudomallei and Burkholderia cepacia), Campylobacter species (such as Campylobacter jejuni, Campylobacter coli, Campylobacter lari and Campylobacter fetus), Capnocytophaga species, Cardiobacterium hominis, Chlamydia trachomatis, Chlamydia pneumoniae 
(Chlamydophila pneumoniae), chlamydia psittaci (Chlamydophila psittaci), citrobacter species (Citrobacter sp.) Chlamydia, corynebacterium species (such as Corynebacterium diphtherium (Corynebacterium diphtheriae), corynebacterium jejuni and Corynebacterium), clostridium species (such as Clostridium perfringens ()), clostridium difficile, clostridium botulinum and Clostridium tetani), clostridium Ai Kenjun (Eikenella corrodens), enterobacter species (such as Enterobacter gas (Enterobacter aerogenes), enterobacter agglomerans (2), enterobacter cloacae and Escherichia coli, including Escherichia coli (opportunistic Escherichia coli) of opportunity such as enterotoxigenic Escherichia coli, enteroinvasive Escherichia coli, enteropathogenic Escherichia coli, enterohemorrhagic Escherichia coli, enteroaggregating Escherichia coli and urethropathogenic Escherichia coli), enterococcus such as enterococcus faecalis and enterococcus faecium, ehrlichia sp such as QIAGEN Ehrlichia (Ehrlichia chafeensia) and Ehrlichia canis, epidermophyton floccosum (Epidermophyton floccosum), the genus Erythrocco, eubacterium (Francisella tularensis), francisella tularensis (Francisella tularensis), fusobacterium nucleatum, gardnerella vaginalis (Gardnerella vaginalis), bicoccus measles (Gemella morbillorum), haemophilus (such as Haemophilus influenzae, leptospira duciflorae (Haemophilus ducreyi), haemophilus aegypti (Haemophilus aegyptius), haemophilus parainfluenza (Haemophilus parainfluenzae), haemophilus haemolyticus and Haemophilus parahaemolyticus (Haemophilus parahaemolyticus)), helicobacter (such as helicobacter pylori, helicobacter homosamalbe and Fennali (Helicobacter fennelliae)), jin Shijin (Kingella gii), klebsiella (such as Klebsiella pneumoniae, klebsiella granulomatosis (Klebsiella granulomatis) and Klebsiella oxytoca), lactobacillus (such as Leidella monocytogenes, leidella (Leptospira interrogans), leidella pneumophila (Legionella pneumophila), leidella (such as Leidella, 
streptococcus haemolyticus (Mannheimia hemolytica), mycobacterium equi (such as Mycobacterium sp, mycobacterium crispatus (3929), mycobacterium sp, mycobacterium tuberculosis (such as Mycobacterium sp), mycobacterium tuberculosis (such as Mycobacterium sp.4, mycobacterium sp, mycobacterium sp.3, mycobacterium sp.4, mycobacterium tuberculosis (such as Mycobacterium sp), mycoplasmal species (mycoplasmal sp.) (such as mycoplasma pneumoniae (Mycoplasma pneumoniae), mycoplasma hominis (Mycoplasma hominis) and mycoplasma genitalium (Mycoplasma genitalium)), nocardia species (such as nocardia stella, church nocardia sanguinea and nocardia brasiliensis (Nocardia brasiliensis)), neisseria species (Neisseria sp.) (such as Neisseria gonorrhoeae (Neisseria gonorrhoeae) and Neisseria meningitidis), pasteurella spinosa, furribbon sporo (Pityrosporum orbiculare) (furfur malassezia (Malassezia furfur)), shigella sp, prasuvorax species, porphyrinomonas species (poryomonas sp.), black producing prasuvorax (Prevotella melaninogenica), proteus species (such as Proteus vulgaris and Provinia mirabilis), providencia species (such as Provenia gonorrhoeae, torula, such as Provensis, torula rhodospori (Providencia rettgeri) and Neisseria meningitidis), rhodosporum (Pityrosporum orbiculare) (furfur mala, shigella pseudomona, rhodospori (26) and rhodospori (38), rhodospori (P.sp.) (P., salmonella (such as Salmonella enterica, salmonella typhi (Salmonella typhi), salmonella paratyphi (Salmonella paratyphi), salmonella enteritidis (Salmonella enteritidis), salmonella choleraesuis (Salmonella cholerasuis) and Salmonella typhimurium (Salmonella typhimurium)), serratia (such as serratia marcescens and serratia liquefaciens), shigella (Shigella sp.) 
(such as Shigella dysenteriae, Shigella flexneri, Shigella boydii and Shigella sonnei), Staphylococcus species (such as Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus haemolyticus and Staphylococcus saprophyticus), Streptococcus species (such as Streptococcus pneumoniae (e.g., chloramphenicol-resistant serotype 4 Streptococcus pneumoniae, spectinomycin-resistant serotype 6B Streptococcus pneumoniae, streptomycin-resistant serotype 9V Streptococcus pneumoniae, erythromycin-resistant serotype 14 Streptococcus pneumoniae, optochin-resistant serotype 14 Streptococcus pneumoniae, rifampicin-resistant serotype 18C Streptococcus pneumoniae, tetracycline-resistant serotype 19F Streptococcus pneumoniae, penicillin-resistant serotype 19F Streptococcus pneumoniae and trimethoprim-resistant serotype 23F Streptococcus pneumoniae), Streptococcus agalactiae, Streptococcus mutans, Streptococcus pyogenes, Group A streptococci, Streptococcus pyogenes, Group B streptococci, Streptococcus agalactiae, Group C streptococci, Streptococcus anginosus, Streptococcus equismilis, Group D streptococci, Streptococcus bovis, Group F streptococci, and Group G streptococci of Streptococcus anginosus (Streptococcus anginosus Group G 
streptococci)), Spirillum minus, Streptobacillus moniliformi, Treponema species (such as Treponema carateum, Treponema petenue, Treponema pallidum and Treponema endemicum), Trichophyton rubrum, Trichophyton mentagrophytes, Tropheryma whippelii, Ureaplasma urealyticum, Veillonella species, Vibrio species (such as Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, Vibrio alginolyticus, Vibrio mimicus, Vibrio hollisae, Vibrio fluvialis, Vibrio metschnikovii, Vibrio damsela and Vibrio furnissii), Yersinia species (such as Yersinia enterocolitica, Yersinia pestis and Yersinia pseudotuberculosis), and Xanthomonas maltophilia, among others.
Fungi
In certain exemplary embodiments, the microorganism is a fungus or fungal species. Examples of fungi that can be detected according to the disclosed methods include, but are not limited to, any one or more (or any combination thereof) of the following: Aspergillus, Blastomyces, Candidiasis, Coccidioidomycosis, Cryptococcus neoformans, Cryptococcus gattii, Histoplasma species (such as Histoplasma capsulatum), Pneumocystis species (such as Pneumocystis jirovecii), Stachybotrys (such as Stachybotrys chartarum), Mucormycosis, Sporothrix, fungal eye infections, ringworm, Exserohilum, and Cladosporium.
In certain exemplary embodiments, the fungus is a yeast. Examples of yeasts that can be detected according to the disclosed methods include, but are not limited to, one or more of the following (or any combination thereof): Aspergillus species (such as Aspergillus fumigatus, Aspergillus flavus and Aspergillus clavatus), Cryptococcus species (such as Cryptococcus neoformans, Cryptococcus gattii, Cryptococcus laurentii and Cryptococcus albidus), Geotrichum species, Saccharomyces species, Hansenula species, Candida species (such as Candida albicans), Kluyveromyces species, Debaryomyces species, Pichia species, or combinations thereof. In certain exemplary embodiments, the fungus is a mold. Exemplary molds include, but are not limited to, Penicillium species, Cladosporium species, Byssochlamys species, or combinations thereof.
Protozoa
In certain exemplary embodiments, the microorganism is a protozoan. Examples of protozoa that may be detected according to the disclosed methods and devices include, but are not limited to, any one or more (or any combination thereof) of the following: Euglenozoa, Heterolobosea, Diplomonadida, Amoebozoa, Blastocystis, and Apicomplexa. Exemplary Euglenozoa include, but are not limited to, Trypanosoma cruzi (Chagas disease), T. brucei gambiense, T. brucei rhodesiense, Leishmania braziliensis, L. infantum, L. mexicana, L. major, L. tropica, and L. donovani. Exemplary Heterolobosea include, but are not limited to, Naegleria fowleri. Exemplary Diplomonadida include, but are not limited to, Giardia intestinalis (G. lamblia, G. duodenalis). Exemplary Amoebozoa include, but are not limited to, Acanthamoeba castellanii, Balamuthia mandrillaris, and Entamoeba histolytica. Exemplary Blastocystis include, but are not limited to, Blastocystis hominis. Exemplary Apicomplexa include, but are not limited to, Babesia microti, Cryptosporidium parvum, Cyclospora cayetanensis, Plasmodium falciparum, P. vivax, P. ovale, P. malariae, and Toxoplasma gondii.
Parasites
In certain exemplary embodiments, the microorganism is a parasite. Examples of parasites that can be detected according to the disclosed methods include, but are not limited to, one or more (or any combination thereof) of the following: Onchocerca species and Plasmodium species.
Viruses
In certain example embodiments, the systems, devices, and methods disclosed herein relate to detecting a virus in a sample. Embodiments disclosed herein can be used to detect viral infection (e.g., of a subject or plant), or to determine strains, including strains that differ in single nucleotide polymorphisms. The virus may be a DNA virus, RNA virus or retrovirus. Non-limiting examples of viruses that may be used in the present invention include, but are not limited to, ebola, measles, SARS, chikungunya, hepatitis, marburg virus, yellow fever, MERS, dengue fever, rasagile, influenza, rhabdovirus, or HIV. The hepatitis virus may include hepatitis a, b or c. Influenza viruses may include, for example, influenza a or influenza b. HIV may include HIV 1 or HIV 2. In certain exemplary embodiments, the viral sequence may be human respiratory syncytial virus, sudan ebola virus (Sudan ebola virus), present Dibula Jiao Bingdu (Bundibugyo virus), large forest ebola virus (Tai Forest ebola virus), redston ebola virus (Reston ebola virus), azomota (Achimota), aegea flaviviridae (Aedes flaviviruses), ai Gu Kate virus (Aguacate virus), acarban virus (Akabane virus), adfaldoxye arenavirus (Alethinophid reptarenavirus), apasylus arenavirus (Allpahuayo mammarenavirus), A Ma Pali arenavirus (Amapari mmarenavirus), andisvirus (Andesvirus), apaoyi virus (Apoi), aravan virus (Aravan virus), aroav virus (Aroav), aroav Mo Wa virus (Aruwot virus), large paramyxovirus (Atlantic salmonparamyoxivirus), australian island virus (Azodiac virus), barbary virus (BK) or Bazafivela virus (Bear Canonmammarenavirus), bazafivela virus (BK), bazafivela virus (Bear Canonmammarenavirus), bazafivela virus (Brua), bazafivela virus (Brua virus (Bear Canonmammarenavirus), bazafia virus (Brua virus), bazafivela virus (Bruses) (Bruson virus), bazafia virus (Bruses) or a virus (Bruses) Betapapulomavirus 1-6 (Betapapulomavirus 1-6), ban Jie virus (Bhanja virus), bokrolo bat rabies virus (Bokeloh bat lyssavirus), 
bourn disease virus (Bourdon virus), niu Bing hepatitis virus (Bovine hepacivirus), bovine parainfluenza virus 3, bovine respiratory syncytial virus, ba Zong Bingdu (Brapiran virus), benyamawere virus (Bunyamye virus), california encephalitis virus (California encephalitis virus), candirus (Candriu virus), canine distemper virus (Caninedistemper virus), canine pneumovirus (Canaine pneumovirus), pine bay virus (Cedar virus), cell fusion factor virus (Cell fusing agent virus), whale measles virus (Cetacean morbillivirus), and combinations thereof the virus types include Chandiprara virus (Chandiprara virus), chaoyang virus (Chaoyang virus), cha Palei mammalian arenavirus (Chaparemammarenavirus), chikungunya virus (Colobus monkey papillomavirus), colorado tick fever virus (Colorado tick fever virus), vaccinia virus, cremilia-Congo hemorrhagic fever virus, culex yellow virus (Culex flaviviruses), kun Pi Kesi mammalian arenavirus (Cupixi mammarenavirus), dengue virus, dulbrava-Belgide virus (Dobrva-Belgrade virus), dong harbor virus (Donggang virus), du Bei virus (Dugbe virus), duchenhan virus (Duvehage virus), eastern equine encephalitis virus (Eastern equineencephalitis virus), enrobe bat virus (Entebbe bat virus), enterovirus A-D, european bat rabies virus 1-2, eyach virus, feline measles virus (Feline morbillivirus), spearhead snake paramyxovirus (Fer-de-Lance paramyxovirus), fischer's river virus (Fitzroy River virus), flaviviridae virus, flekecord mammal-arenavirus (Flexal mammarenavirus), GB virus C, gairo virus, gomeyeri (Gemycerularvirus), goose paramyxovirus SF02, island virus (Great Island virus), guanartolal mammal-like arenavirus (Guanarito mammarenavirus), hantavirus Z10, hartled virus (Heartand virus), hendela virus hepatitis A/B/C/E, hepatitis B virus (Hepatitis delta virus), human bocavirus, human coronavirus, human endogenous retrovirus K, human enterovirus, human genital-related circular DNA virus-1, human herpesvirus 1-8, human immunodeficiency virus 
1/2, human mammalian adenovirus A-G (Human mastadenovirus A-G), human papillomavirus, human parainfluenza virus 1-4, human paraenteric virus (Human paraechovirus), human picornavirus (Human picornavirus), human Sitting horse virus (Human smaacovirus), ai Kema rabies virus (Ikoma lysavirus), ilheisis virus, influenza A-C, the virus types may include, for example, the type of Is Pi Buru arenavirus (Ippy mammarenavirus), the type of Irkut virus, J-virus, JC polyomavirus, japanese encephalitis virus, huning mammal arenavirus (Junin mammarenavirus), the type of KI polyomavirus, cadi Pi Luo virus (Kadipiro virus), the type of Kami river virus (Kamiti River virus), the type of Kadouu virus (Kedou virus), the type of ku Gu De virus (Khujand virus), the type of Kekobera virus (Kokobera virus), the type of Koknoop forest disease virus (Kyasanur forest disease virus), the type of Lagos bat virus, the type of Langat virus (Langaat virus), the type of La Sha Buru arenavirus (Lassa mammarenavirus), the type of Latin mammal arenavirus (Latino mammarenavirus) Luo Pade mountain virus (Leopards Hill virus), liao ning virus, yongan river virus (Ljungan virus), ralovovirus (Llovir virus), jumping virus (Louping ill virus), lu Yao mammalian arenavirus (Lujo mammarenavirus), runa mammalian arenavirus (Luna mammarenavirus), lunkavirus (Lunk virus), lymphocytic choriomeningitis mammalian arenavirus (Lymphocytic choriomeningitis mammarenavirus), european rabies virus (Lyssavirus Ozernoe), MSSi2\225 virus, ma Qiubo mammalian arenavirus (Machupo mammarenavirus), mammalian astrovirus 1 (Mamastrov irus 1), manzania virus (Manzanella virus), ma Puai Lavirus (Mapuera virus), marburg virus, ma Yaluo virus (Mayaro virus), measles virus, mei Nagao virus (Menangle virus), morsikadi virus (Mercoadoo virus), mexicodeoxycytoma virus (Merkel cell polyomavirus), middle east respiratory syndrome coronavirus, mo Bala mammal arenavirus (Mobala mammarenavirus), modoc virus (Modoc virus), mo Jiang virus 
(Moijang virus), mo Keluo virus (Mokolo virus), monkey pox virus, monte-Carnis-arma bat leukosis virus (Montana myotis leukoenchalitis virus), mo Peiya Laxavirus rearrangement 29 (Mopeia lassa virus reassortant), mo Peiya mammal arenavirus (Mopeia mammarenavirus), morogoro virus (Morogoro virus) Mossman virus, mumps virus, murine pneumovirus, murray Valley encephalitis virus (Murray Valley encephalitis virus), nariva virus, newcastle disease virus, nipa virus, norwalk virus, norway mouse hepatitis C virus (Norway rat hepacivirus), entaya virus, orniton-Niang virus (O' nyong-nyong virus), ortho Li Weiluo S mammal arenavirus (Oliveros mammarenavirus), murray hemorrhagic fever virus (Omsk hemorrhagic fever virus), orophe virus, parainfluenza virus 5, barana mammal arenavirus (Parana mammarenavirus), para Ma Dahe virus (Parramatta River virus), peste-des-petits-rum virus, bi Chide mammal arenavirus (Pichande mammarenavirus), picornaviridae Tao Buru arenavirus (Pirital mammarenavirus), fish hepatitis E virus A (Piscihepevirus A), porcine parainfluenza virus 1, porcine mumps virus (porcine rubulavirus), powassan virus, primate T-lymphotropic virus 1-2, primate erythroid parvovirus 1 (Primate erythroparvovirus), poantrum virus (Punta Torula virus), puumala virus (Puumala virus), guangdong flat virus (Quantum Binh virus), rabies virus, lewy dan virus (Razdenvirus) reptiles Boerna virus (Reptile bornavirus 1), rhinovirus A-B, rifut mountain valley fever virus (Rift Valley fever virus), rinderpest virus, rebauv virus (Rio Bravo virus), rodent ringlet virus (Rodent Torque Teno virus), rodent hepatitis C virus (Rodent hepacivirus), ross river virus, rotavirus A-I, royal Farm virus (Royal Farm virus), rubella virus, sabia mammal arenavirus (Sabia mammarenavirus), selemm virus (Salem virus), najalous sand fly fever virus (Sandfly fever Naples virus), sicily sand fly fever virus (Sandfly fever Sicilian virus), sapone virus (saporo virus), sathre virus (Sathreri virus), seal ring 
virus (Seal anellovirus), semliki forest virus (Semliki Forest virus), sendai virus (Sendai virus), hancheng virus (Seoul virus), sepick virus (Sepik virus), severe acute respiratory syndrome associated coronavirus, severe fever with thrombocytopenia syndrome virus, salmenda virus (Shamonda virus), simoney bat virus (Shimoni bat virus), suny virus (Shuni virus), simmondsia virus (Simmons virus), simian thin ring virus (Simiantorque teno virus), simian virus 40-41, xin Nuobai virus (Sin Nombre virus), sindbis virus, small finger ring virus (Small anellovirus) Sosuga virus (Kleba virus), spanish goat encephalitis virus (Spanish goat encephalitis virus), stokes Wen Ni virus (Spondweni virus), st.Louis encephalitis virus (St.Louis encephalitis virus), send Xia En virus (SUNSHINE virus), TTV-like minute virus (TTV-like mini virus), takara-type mammal arenavirus (Tacaribe mammarenavirus), taiyi virus (Taila virus), tama bata virus (Tamana bat virus), tami Emi mammal arenavirus (Tamiami mammarenavirus), tembusu virus (Tembusu virus), to Gao Tu virus (Thooto virus), soto Para virus (Thottapalayam virus), tick borne encephalitis virus (Tick-borne encephalitisvirus), cynancan virus (Tioman virus), takara-type mammal arenavirus (Tamana bat virus), the viral sequences may be selected from the group consisting of togaviridae, canine ringvirus (Torque teno canis virus), night monkey ringvirus (Torque teno douroucouli virus), cat ringvirus (Torque tenofelis virus), medium ringvirus (Torque teno midi virus), porcine ringvirus (Torque teno susvirus), marmoset ringvirus (Torque teno tamarin virus), ringvirus (Torque teno virus), sea lion ringvirus (Torque teno zalophus virus), toyokov virus (Tuhoko virus), tula virus (Tula virus), tree shrew paramyxovirus, urso virus (us virus), you Kuni m virus (uukunimi virus), vaccinia virus, smallpox virus, venezuelan equine encephalitis virus (Venezuelan equine encephalitis virus), indian vesicular stomatitis virus (Vesicular stomatitis Indiana 
virus), WU polyomavirus, Wesselsbron virus, West Caucasian bat virus, West Nile virus, Western equine encephalitis virus, Whitewater Arroyo mammarenavirus, yellow fever virus, Yokose virus, or Zika virus. Examples of detectable RNA viruses include one or more (or any combination thereof) of the following: Coronaviridae, Picornaviridae, Caliciviridae, Flaviviridae, Togaviridae, Bornaviridae, Filoviridae, Paramyxoviridae, Pneumoviridae, Rhabdoviridae, Arenaviridae, Bunyaviridae, Orthomyxoviridae, or Deltavirus. In certain exemplary embodiments, the virus is a coronavirus, SARS, poliovirus, rhinovirus, hepatitis A, Norwalk virus, yellow fever virus, West Nile virus, hepatitis C virus, dengue virus, Zika virus, rubella virus, Ross River virus, Sindbis virus, chikungunya virus, Borna disease virus, Ebola virus, Marburg virus, measles virus, mumps virus, Nipah virus, Hendra virus, Newcastle disease virus, human respiratory syncytial virus, rabies virus, Lassa virus, hantavirus, Crimean-Congo hemorrhagic fever virus, influenza, or hepatitis delta virus.
In certain exemplary embodiments, the virus may be a plant virus selected from the group consisting of: Tobacco mosaic virus (TMV), Tomato spotted wilt virus (TSWV), Cucumber mosaic virus (CMV), Potato virus Y (PVY), the RT virus Cauliflower mosaic virus (CaMV), Plum pox virus (PPV), Brome mosaic virus (BMV), Potato virus X (PVX), Citrus tristeza virus (CTV), Barley yellow dwarf virus (BYDV), Potato leafroll virus (PLRV), Tomato bushy stunt virus (TBSV), Rice tungro spherical virus (RTSV), Rice yellow mottle virus (RYMV), Rice hoja blanca virus (RHBV), Maize rayado fino virus (MRFV), Maize dwarf mosaic virus (MDMV), Sugarcane mosaic virus (SCMV), Sweet potato feathery mottle virus (SPFMV), Sweet potato sunken vein closterovirus (SPSVV), Grapevine fanleaf virus (GFLV), Grapevine virus A (GVA), Grapevine virus B (GVB), Grapevine fleck virus (GFkV), Grapevine leafroll-associated virus-1, -2, and -3 (GLRaV-1, GLRaV-2, and GLRaV-3), Arabis mosaic virus (ArMV), or Rupestris stem pitting-associated virus (RSPaV). In a preferred embodiment, the target RNA molecule is part of the pathogen or is transcribed from a DNA molecule of the pathogen. For example, the target sequence may be contained in the genome of an RNA virus. It is further preferred that the TnpB protein hydrolyzes the target RNA molecule of the pathogen in the plant if the pathogen infects or has infected the plant. It is therefore preferred that the TnpB system (or the parts thereof needed to complete it) is capable of cleaving a target RNA molecule from a plant pathogen whether the TnpB system is applied therapeutically (i.e., after an infection has occurred) or prophylactically (i.e., before an infection has occurred).
In certain exemplary embodiments, the virus may be a retrovirus. Exemplary retroviruses that can be detected using embodiments disclosed herein include one or more, or any combination, of viruses of the genera Alpharetrovirus, Betaretrovirus, Gammaretrovirus, Deltaretrovirus, Lentivirus, and Spumavirus, or of the families Metaviridae, Pseudoviridae, and Retroviridae (including HIV), Hepadnaviridae (including hepatitis B virus), and Caulimoviridae (including cauliflower mosaic virus).
In certain exemplary embodiments, the virus is a DNA virus. Exemplary DNA viruses that can be detected using embodiments disclosed herein include one or more (or any combination thereof) from the following viral families: Myoviridae, Podoviridae, Siphoviridae, Alloherpesviridae, Herpesviridae (including human herpesvirus and varicella zoster virus), Malacoherpesviridae, Lipothrixviridae, Rudiviridae, Adenoviridae, Ampullaviridae, Ascoviridae, Asfarviridae (including African swine fever virus), Baculoviridae, Cicaudaviridae, Clavaviridae, Corticoviridae, Fuselloviridae, Globuloviridae, Guttaviridae, Hytrosaviridae, Iridoviridae, Marseilleviridae, Mimiviridae, Nudiviridae, Nimaviridae, Pandoraviridae, Papillomaviridae, Phycodnaviridae, Plasmaviridae, Polydnaviruses, Polyomaviridae (including simian virus 40, JC virus, and BK virus), Poxviridae (including vaccinia and smallpox), Sphaerolipoviridae, Tectiviridae, Turriviridae, Dinodnavirus, Salterprovirus, Rhizidiovirus, and the like.
In some embodiments, a method of diagnosing a species-specific bacterial infection in a subject suspected of having a bacterial infection comprises: obtaining a sample comprising bacterial ribosomal ribonucleic acid from the subject; contacting the sample with one or more of the probes; and detecting hybridization between a bacterial ribosomal ribonucleic acid sequence present in the sample and the probe, wherein detection of hybridization indicates that the subject is infected with Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Staphylococcus aureus, Acinetobacter baumannii, Candida albicans, Enterobacter cloacae, Enterococcus faecalis, Enterococcus faecium, Proteus mirabilis, Staphylococcus agalactiae, or Staphylococcus maltophilia, or a combination thereof.
Novel coronavirus
The presently disclosed systems and methods can be designed to detect coronaviruses. In one aspect, the target sequence is from 2019-nCoV, also referred to herein as SARS-CoV-2, which causes COVID-19. Coronaviruses are a family of positive-sense single-stranded RNA viruses that infect a wide variety of animals and humans. SARS-CoV and MERS-CoV are each a type of coronavirus. It is contemplated that one or more coronaviruses can be detected, including SARS-CoV-2. Sequences of SARS-CoV-2 are available under GISAID accession numbers EPI_ISL_402124 and EPI_ISL_402127-402130 and are described in DOI 10.1101/2020.01.22.914952. Other SARS-CoV-2 deposits are available on the GISAID platform, including EPI_ISL_402119-402121 and EPI_ISL_402123-402124; see also GenBank accession number MN908947.3. In one aspect, non-redundant alignments can be generated using known SARS and SARS-associated coronaviruses or other viruses from one or more hosts. The related viruses may be found in, for example, bats.
In one embodiment, the system is designed to comprise at least one high activity nucleic acid component polynucleotide designed according to the methods disclosed herein. In a preferred embodiment, the nucleic acid component polynucleotide binds to at least one target sequence that is a unique coronavirus genomic sequence, thereby identifying the presence of coronavirus to the exclusion of other viruses. The systems and methods may be designed to detect a variety of respiratory tract infections or viral infections, including coronaviruses.
In one aspect, at least one nucleic acid component polynucleotide binds to a coronavirus sequence encoding a polypeptide that has an immunostimulatory effect on the host immune system. Immunostimulatory polypeptides can enhance, stimulate, or increase an immune system response, typically by inducing activation or activity of components of the immune system (e.g., immune cells). In embodiments, the immunostimulatory polypeptide results in an immune-mediated disease in the host. In one aspect, the host is a mammal that may be infected with a coronavirus, such as a human, bat, or pangolin. See Cyranoski, D., "Did pangolins spread the China coronavirus to people?" Nature, February 7, 2020. In one embodiment, the nucleic acid component polynucleotides can be designed to detect SARS-CoV-2 or variants thereof in meats, living animals, and humans, so that testing can be performed, for example, in markets and other public places where sources of contamination may arise.
The gene target may comprise ORF1ab, the N protein, RNA-dependent RNA polymerase (RdRP), the E protein, ORF1b-nsp14, the spike glycoprotein (S), or a pan-corona target. Molecular assays have been developed and can be used as a starting point for developing nucleic acid component molecules for use in the methods and systems described herein. See "Diagnostic detection of 2019-nCoV by real-time RT-PCR," Charité, Berlin, Germany (17 January 2020); "Detection of 2019 novel coronavirus (2019-nCoV) in suspected human cases by RT-PCR," Hong Kong University (23 January 2020); "PCR and sequencing protocol for 2019-nCoV," Department of Medical Sciences, Ministry of Public Health, Thailand (updated 28 January 2020); "PCR and sequencing protocols for 2019-nCoV," National Institute of Infectious Diseases, Japan (24 January 2020); "US CDC panel primer and probes," U.S. CDC, USA (28 January 2020); and "China CDC Primers and probes for detection 2019-nCoV" (24 January 2020), each of which is incorporated herein by reference in its entirety. In addition, the design of nucleic acid component molecules may exploit differences from, or similarities to, SARS-CoV. Researchers have recently identified similarities and differences between 2019-nCoV and SARS-CoV. See "Coronavirus Genome Annotation Reveals Amino Acid Differences with Other SARS Viruses," GenomeWeb, February 10, 2020. For example, nucleic acid component molecules based on the 8a protein (present in SARS-CoV but not in SARS-CoV-2) can be used to distinguish the viruses. Similarly, the 8b and 3b proteins have different lengths in SARS-CoV and SARS-CoV-2, and can be used to design nucleic acid component molecules that detect non-overlapping proteins or nucleotides encoded in the two viruses.
See Wu et al., "Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China," Cell Host & Microbe (2020), DOI: 10.1016/j.chom.2020.02.001, which is incorporated herein by reference, including all supplementary information, in particular Table S1. Mutations can also be detected using nucleic acid components and/or primers specifically designed to detect changes in, for example, the SARS-CoV-2 virus. In embodiments, the nucleic acid component or primer can be designed to detect the D614G mutation in the SARS-CoV-2 spike protein. See Korber et al., Cell 182, 812-827 (2020); doi: 10.1016/j.cell.2020.06.043. Nucleic acid components and primers for other mutations in the spike protein can be designed using the COVID-19 viral genome analysis pipeline available at cov.lanl.gov. Further resources for primers and nucleic acid components designed for detection of coronaviruses or coronavirus mutations can be found in Starr et al., "Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding," Cell 182, 1-16 (2020); doi: 10.1016/j.cell.2020.08.012.
The detection systems and methods can be used for identification of single nucleotide variants, rRNA sequence-based detection, drug resistance screening, monitoring of microbial outbreaks, genetic perturbations, and screening of environmental samples, as described in paragraphs [0183]-[0327] of PCT/US2018/054472 (October 22, 2018).
In certain exemplary embodiments, the systems, devices, and methods disclosed herein can be used for biomarker detection. For example, they can be used for SNP detection and/or genotyping. They can also be used to detect any disease state or condition characterized by abnormal gene expression. Abnormal gene expression includes abnormalities in which gene is expressed, where it is expressed, and at what level. A variety of transcript or protein markers associated with cardiovascular disease, immune disease, and cancer, among other diseases, may be detected. In certain exemplary embodiments, embodiments disclosed herein can be used for cell free DNA detection of diseases involving lysis. In certain exemplary embodiments, the embodiments can be used for faster and more portable detection of cell free DNA for prenatal testing. The embodiments disclosed herein can be used to screen different SNP sets associated with different coronaviruses, evolving SARS-CoV-2, and other related respiratory viral infections, and the like. As described elsewhere herein, closely related genotypes/alleles or biomarkers (e.g., differing by only a single nucleotide in a given target sequence) can be distinguished by introducing synthetic mismatches into the nucleic acid component molecules.
In one aspect, the invention relates to a method for detecting a target nucleic acid in a sample, comprising: dispensing a sample or a set of samples into one or more separate discrete volumes comprising a TnpB system according to the invention as described herein; incubating the sample or group of samples under conditions sufficient to allow binding of the one or more nucleic acid component molecules to the one or more target molecules; activating the TnpB protein via binding of one or more nucleic acid component molecules to one or more target molecules, wherein activating the TnpB protein results in modification of the RNA-based masking construct such that a detectable positive signal is generated; and detecting a detectable positive signal, wherein detection of the detectable positive signal indicates the presence of one or more target molecules in the sample.
The sensitivity of the assays described herein makes them well suited for detecting target nucleic acids in a variety of biological sample types, including sample types in which the target nucleic acid is dilute or in which sample material is limited. The methods for field-deployable and rapid diagnostic assays may be optimized for the type of sample material used and may be adjusted according to methods known in the art for other assays. See Myhrvold et al., 2018. Biomarker screens can be performed on a variety of sample types, including, but not limited to, saliva, urine, blood, stool, sputum, and cerebrospinal fluid. Embodiments disclosed herein may also be used to detect up-regulation and/or down-regulation of genes. For example, the sample may be serially diluted such that only overexpressed genes remain above the detection limit threshold of the assay.
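The serial-dilution idea in the preceding paragraph can be sketched as follows. This is a minimal illustration only: the function name, the transcript names, the concentration values, and the detection limit are hypothetical and are not taken from the present disclosure.

```python
def detectable_after_dilution(concentrations, dilution_factor, steps, detection_limit):
    """Return the sorted names of targets still at or above the detection limit.

    concentrations: mapping of target name -> starting concentration
                    (arbitrary units; hypothetical values).
    dilution_factor: fold dilution applied at each step (e.g. 10 for 1:10).
    steps: number of successive dilution steps.
    detection_limit: lowest concentration the assay can detect (assumed).
    """
    total_dilution = dilution_factor ** steps
    return sorted(
        name
        for name, conc in concentrations.items()
        if conc / total_dilution >= detection_limit
    )


# Hypothetical example: one overexpressed and one baseline transcript.
levels = {"overexpressed_gene": 1000.0, "baseline_gene": 10.0}
remaining = detectable_after_dilution(levels, dilution_factor=10, steps=2,
                                      detection_limit=1.0)
```

With these assumed numbers, two 1:10 dilutions leave only the overexpressed transcript above the detection limit.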
In one embodiment, the invention provides the steps of obtaining a biological fluid sample (e.g., urine, plasma or serum, sputum, cerebrospinal fluid) and extracting DNA or RNA. The mutant nucleotide sequence to be detected may be part of a larger molecule or may initially exist as a discrete molecule.
In embodiments, DNA is isolated from the plasma/serum of a cancer patient. For comparison, one DNA sample may be isolated from tumor tissue and a second sample may be isolated from non-tumor tissue (e.g., lymphocytes) of the same patient as a control. The non-tumor tissue may be of the same type as the tumor tissue or from a different organ source. In one embodiment, a blood sample is collected and plasma is immediately separated from the blood cells by centrifugation. Serum may be filtered and stored frozen until DNA/RNA extraction.
In one aspect, sample preparation may include methods as disclosed herein that circumvent other RNA extraction methods and that may be used with standard amplification techniques such as RT-PCR and with the TnpB detection methods disclosed herein. In one aspect, the method may comprise a one-step, extraction-free RNA preparation method that may be used with a sample to be tested for coronavirus, where the test may in one aspect be an RT-qPCR test method, a lateral flow test method, or another TnpB test method disclosed herein. Advantageously, the extraction-free RNA preparation method can be used directly with other test protocols. In one aspect, the method includes using a nasopharyngeal swab, nasal saline lavage, or other nasal sample (e.g., an anterior nasal swab) with QuickExtract™ DNA Extraction Solution (QE09050), Lucigen, or QuickExtract Plant DNA Extraction Solution, Lucigen. In one aspect, the sample is diluted at a 2:1, 1:1, or 1:2 ratio of sample to DNA extraction solution. The sample is incubated at about 90 ℃ to about 98 ℃, preferably about 95 ℃. In another aspect, the incubation is performed at between about 20 ℃ and about 90 ℃, for example at about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, or 90 ℃. The incubation period may be about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 minutes, preferably about 4 to 6 minutes, or about 5 minutes. Incubation time and temperature may vary depending on sample size and quality, and the incubation time may be increased if lower temperatures are used. The current CDC real-time RT-PCR diagnostic panel is described at fda.gov/media/134922/download, "CDC 2019-Novel Coronavirus (2019-nCoV) Real-Time RT-PCR Diagnostic Panel".
In one embodiment, the DNA extraction solution may remain with the sample after incubation and be used in a subsequent step of the detection method. In one aspect, the detection method is an RT-qPCR reaction and the concentration of the extraction solution is maintained below 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3% of the reaction mixture, wherein the reaction mixture comprises the detection reagent, the sample, and the extraction solution.
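As a rough illustration of the carry-over limit just described, the following sketch computes the fraction of a reaction mixture that consists of extraction solution for a given sample-to-extraction-solution ratio. The function names, the 20 µL reaction volume, and the input volumes are illustrative assumptions and are not values from the present disclosure.

```python
def extraction_fraction(sample_parts, solution_parts, input_ul, reaction_ul):
    """Fraction of the final reaction volume that is extraction solution.

    sample_parts:solution_parts is the sample-to-extraction-solution ratio
    (e.g. 2:1, 1:1, or 1:2). input_ul is the volume of the sample/solution
    mix added to a reaction of total volume reaction_ul (assumed values).
    """
    if input_ul > reaction_ul:
        raise ValueError("input volume cannot exceed the reaction volume")
    solution_share = solution_parts / (sample_parts + solution_parts)
    return solution_share * input_ul / reaction_ul


def max_input_ul(sample_parts, solution_parts, reaction_ul, limit=0.10):
    """Largest input volume keeping extraction solution at or under `limit`."""
    solution_share = solution_parts / (sample_parts + solution_parts)
    return min(reaction_ul, limit * reaction_ul / solution_share)


# Hypothetical case: 2 uL of a 1:1 sample/solution mix in a 20 uL reaction.
fraction = extraction_fraction(1, 1, input_ul=2.0, reaction_ul=20.0)
```

Under these assumptions the extraction solution makes up 5% of the reaction, and up to 4 µL of a 1:1 mix could be added before reaching 10%.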
In one embodiment, beads are used with particular embodiments of the present invention and may be included in the extraction solution. The beads may be used to capture, concentrate or otherwise enrich a particular material. The beads may be magnetic and may be provided to capture nucleic acid material. In another aspect, the beads are silica beads. Beads can be used in the extraction steps of the methods disclosed herein. The beads may optionally be used with the methods described herein, including one-pot methods, which allow for concentration of viral nucleic acids from a large volume of sample (such as saliva or a swab sample) to allow for a single one-pot reaction method. The concentration of the desired target molecule can be increased by about 10-fold, 50-fold, 100-fold, 200-fold, 500-fold, 800-fold, 1000-fold, 1500-fold, 2000-fold, 2500-fold, 3000-fold or more.
In one aspect, magnetic beads in a PEG and salt solution are preferred; in embodiments, these bind viral RNA and/or DNA, allowing concentration and lysis to take place simultaneously. In another aspect, silica beads may be used. The use of capture moieties, such as oligonucleotide-functionalized beads, is also contemplated. The beads may be used with the extraction reagents, allowing incubation with the sample and the lysis/extraction buffer so that the target molecules are concentrated on the beads. The extraction may be performed at 22 ℃ to 60 ℃ as described elsewhere herein, followed by isothermal amplification and/or TnpB detection under conditions described elsewhere herein. When used with the cartridge device described in detail elsewhere herein, the magnet may be activated and the beads collected, with optional flushing of the extraction buffer and one or more washes. Advantageously, the beads can be used in the one-pot methods and systems without additional washing of the beads, allowing a more efficient process without the added risk of contamination associated with a multi-step process. The beads may be used with isothermal amplification as detailed herein, and the beads may flow into the amplification chamber of the cartridge or be kept in the pot for the amplification step. Upon heating, the nucleic acid may be released from the beads.
In certain exemplary embodiments, the target nucleic acid is detected directly from a crude or untreated sample (such as blood, serum, saliva, cerebrospinal fluid, sputum, or urine). In certain embodiments, the target nucleic acid is cell free DNA.
Kit for detecting a substance in a sample
In one aspect, the invention provides kits containing any one or more of the elements disclosed in the methods and compositions described above. In one aspect, the invention provides a kit comprising one or more of the components described herein. In one embodiment, the kit comprises the compositions herein and instructions for using the kit. In one embodiment, the kit comprises a vector system and instructions for using the kit. In one embodiment, the kit comprises a delivery system and instructions for using the kit. The elements may be provided individually or in combination and may be provided in any suitable container, such as a vial, bottle, or tube. The kit may comprise a nucleic acid component and, optionally, an unbound protector strand as described herein. The kit may include a nucleic acid component having a protector strand that is at least partially bound to the reprogrammable spacer portion of the nucleic acid component sequence. Thus, the kit may comprise a nucleic acid component in the form of a partially double-stranded nucleotide sequence as described herein. In one embodiment, the kit includes instructions in one or more languages, for example in more than one language. The instructions may be specific to the applications and methods described herein.
In one embodiment, the kit comprises one or more reagents for use in a method of using one or more elements described herein. The reagents may be provided in any suitable container. For example, the kit may provide one or more reaction or storage buffers. The reagents may be provided in a form that is useful for a particular assay, or in a form that requires the addition of one or more other components prior to use (e.g., in concentrate or lyophilized form). The buffer may be any buffer including, but not limited to, sodium carbonate buffer, sodium bicarbonate buffer, borate buffer, tris buffer, MOPS buffer, HEPES buffer, and combinations thereof. In one embodiment, the buffer is alkaline. In one embodiment, the buffer has a pH of about 7 to about 10. In one embodiment, the kit comprises one or more oligonucleotides corresponding to the nucleic acid component scaffold, a reprogrammable sequence for insertion into a vector for operably linking the nucleic acid component sequence and the regulatory element. In one embodiment, the kit comprises a homologous recombination template polynucleotide. In one embodiment, the kit comprises one or more vectors and/or one or more polynucleotides described herein. Kits may advantageously allow all elements of the system of the invention to be provided.
Examples
Example 1-
Applicants examined TnpBs that share inverted terminal repeat (ITR) sequences with IscB ITRs in order to infer the true RNA of TnpB. Specifically, TnpBs in the species Ktedonobacter racemifer share ITRs with IscBs of the same species. FIG. 2 provides an exemplary TnpB alignment that includes TnpBs similar to those in K. racemifer. Importantly, there is a highly conserved region that extends beyond what would be considered the typical end of the protein (FIG. 2). This conserved region marks the likely presence of an ITR and an expressed RNA. In support of this region marking an RNA, conservation is much lower at the 3' end of the TnpB locus (FIG. 3), indicating that the 3' end is not expressed as RNA. The similarity of the ITRs between TnpB and IscB includes the 5' ITR of TnpB (FIG. 4), which is homologous to the 5' ITR of IscB (FIG. 5).
The 5' ITRs of TnpB and IscB are similar at the sequence level, and based on the similarity to the IscB locus, the orientation of the TnpB guide relative to the locus would be the same. In IscB, the guide is located at the 5' end of the locus, yielding an RNA expressed as guide + (hRNA conserved region). This suggests that the region immediately upstream of the 5' ITR of TnpB may also serve as a guide. Because the end of the coding sequence lies immediately 3' of the conserved RNA region, the conserved RNA region may be expressed with the CDS as a read-through transcript, which could readily run through the ITR and include part of the surrounding region that is not part of the locus (the guide region). This suggests that the RNA species of these systems is (TnpB RNA conserved region) + guide, similar to the DR + spacer configuration used by many type V CRISPR systems.
Existing RNA-seq data from Ktedonobacter racemifer show that the proposed RNA is indeed expressed, with evidence of a 15 bp guide and the remainder of the RNA being 50-60 bp (FIG. 6). The data indicate that expression is 3'->5', meaning that the RNA reflects the bottom strand. IscB is an RNA-guided endonuclease that similarly has its guide at the same position relative to the inverted terminal repeats (immediately upstream of the 5' ITR); however, in TnpB-based transposons the orientation of the effector protein relative to the DNA transposon is opposite to that of IscB. The orientation of an IscB locus is 5' ITR/hRNA (overlapping), IscB (5' strand), and finally 3' ITR. For a TnpB that shares ITRs with IscB, it is 5' ITR/RNA (RNA on the 3' strand), TnpB (3' strand), and finally 3' ITR. This work, together with the RuvC domain of TnpB, supports the conclusion that the TnpB system acts as an RNA-guided DNA endonuclease in which the guide is RNA expressed immediately upstream of the 5' end. An annotated sequence of an exemplary TnpB locus from Ktedonobacter racemifer, including the 5' ITR and 3' ITR, is shown in FIG. 7.
TABLE 5.
Applicants identified two functional TnpB orthologs, from Actinoplanes lobatus strain DSM 43150 and Epsilonproteobacteria bacterium isolate B11. Applicants studied the TAM requirements of these functional TnpB orthologs using the experimental method shown in FIG. 8. The TAMs identified from these experiments are shown in FIG. 9: TCAG was identified as the TAM of the TnpB from Actinoplanes lobatus strain DSM 43150, and TCAT as the TAM of the TnpB from Epsilonproteobacteria bacterium isolate B11.
Example 2 plasmid cleavage assay
The TnpB protein and ωRNA were mixed in an in vitro transcription/translation reaction with two plasmids, one containing the correct cognate target and TAM sequence (exemplary TCAT in FIG. 10) and the other having an incorrect TAM or no target at all. If cleavage occurs, adaptors can be ligated to the cleaved ends and then used for amplification and next-generation sequencing. If a plasmid is not cleaved, adaptors will not be ligated and no product will be amplified. Identification of sequence therefore indicates that cleavage occurred. Applicants observed (results not shown) that adaptor ligation occurred in any appreciable amount only with the +TAM/+target plasmid, and therefore concluded that cleavage occurs only in those substrates.
Example 3 - Summary of TnpB Rd1 ortholog 5
RNA sequencing (RNA-seq) was performed on ten selected TnpB-Rd1 orthologs that appeared to be associated with non-coding RNAs (ncRNAs) having sequence similarity to IscB ncRNAs. The RNA-seq results confirmed that TnpB is associated with an ncRNA. A 173-nucleotide (nt) scaffold could be truncated to 102 nt while retaining TnpB activity (FIG. 11A). For ortholog 5 (TnpBRd1_5_fn30_tam), the analysis covered TAM enrichment across all captured TAMs using a 30 bp guide with the Fn spacer (FIG. 11B), the top 15% of depleted TAMs using the same 30 bp Fn guide (FIG. 11C), and TAM enrichment using a 30 bp guide with the 644PSP1 spacer (top 10% of captured TAMs; FIG. 11D). The results indicate that (i) TnpB activity is reprogrammable and guide-specific, in that no TAM enrichment was observed with non-targeting guides; (ii) TnpB activity is inconsistent with nicking, since NEB nicking enzymes did not result in enrichment at nicking sites; and (iii) guide lengths of 15-30 nt are acceptable.
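One way to express TAM enrichment of the kind summarized above is as a log2 ratio of each TAM's frequency among adaptor-ligated (cleaved) reads to its frequency in the input library. The sketch below is an assumption about how such a score could be computed, not the analysis actually used by Applicants; the function names, the pseudocount, and the toy read lists are hypothetical.

```python
import math
from collections import Counter


def tam_enrichment(cleaved_tams, library_tams, pseudocount=1.0):
    """log2(frequency among cleaved reads / frequency in the input library).

    cleaved_tams and library_tams are lists of TAM sequences, one per read.
    A pseudocount (assumed) avoids division by zero for unobserved TAMs.
    """
    cleaved = Counter(cleaved_tams)
    library = Counter(library_tams)
    tams = set(cleaved) | set(library)
    n_cleaved = sum(cleaved.values()) + pseudocount * len(tams)
    n_library = sum(library.values()) + pseudocount * len(tams)
    scores = {}
    for tam in tams:
        f_cleaved = (cleaved[tam] + pseudocount) / n_cleaved
        f_library = (library[tam] + pseudocount) / n_library
        scores[tam] = math.log2(f_cleaved / f_library)
    return scores


def top_fraction(scores, fraction=0.10):
    """TAMs ranked in the top `fraction` by enrichment score (at least one)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy data: a uniform four-TAM library and cleaved reads dominated by TCAG.
library_reads = ["TCAG", "TCAT", "AAAA", "CCCC"] * 10
cleaved_reads = ["TCAG"] * 30 + ["AAAA"] * 2
scores = tam_enrichment(cleaved_reads, library_reads)
best = top_fraction(scores, fraction=0.25)
```

A depletion analysis could reuse the same scores by ranking in ascending rather than descending order.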
Example 4-TnpB Rd1 validation of ortholog 1 and 4.
Rd1 orthologs 1 and 4 were validated using a TXTL cleavage assay, in which adaptor ligation reads showed TAM-specific cleavage of the plasmid substrate by actinoplanes TnpB-1 and actinoplanes macerans TnpB (FIG. 12A). For each condition, two separate plasmids at the same concentration were used so that the different substrates could be compared directly (fig. 12A). The adaptor ligation sites in the amplified products were determined by next generation sequencing and showed agreement between the TAM screen and the validation (results based on ortholog 4). The adaptor ligation positions obtained with the 8N TAM library plasmid for the non-target (NT) and target (T) strands (fig. 12B, upper bar graphs) show that the ligation sites differ slightly between the two strands. When a single TAM plasmid was used, the adaptor ligation sites on the non-target (NT) strand (fig. 12B, bottom bar graph) were similar to those observed on the non-target (NT) strand with the TAM library.
Example 5-identification of TnpB orthologs with 5' TAM.
Ortholog selection was based on experimental characterization of ten TnpB orthologs from different bacteria containing ncRNAs with sequence similarity to IscB non-coding RNAs (ncRNAs). Orthologs were derived from actinomycetes cellulolytic strain, maduralensis strain, DSM 45823, maduralensis strain, DSM 44197, two TnpBs from actinomycetes cellulolytic strain, DSM 45823, maduralensis strain, DSM 44197, Actinoplanes lobatus strain DSM 43150, Alicyclobacillus macrosporangiidus strain DSM 17980, halophila Zhang Liping strain, DSM 102030, actinomyces albophilus strain, DSM 45015, Epsilonproteobacteria bacterium, Ktedonobacter racemifer, Meiothermus silvanus strain DSM 9946, and QNF01000004 extract (reverse).
Two methods were used to determine the TAMs of the orthologs (fig. 13A-13B). The first method involves expressing together TnpB from an operably linked T7 promoter, a reprogrammable pTarget (e.g., Fn), and a T7-expressed ωRNA scaffold and guide for retargeting. The complete pTarget was sequenced to determine which TAMs allowed cleavage (fig. 13A). The second method involves in vitro transcription and translation, TAM-specific pTarget cleavage, followed by adaptor ligation, cleavage-specific amplification, and finally sequencing of the enriched TAMs. Only 5' TAMs could be identified in the cleaved sequences (fig. 13A). The TAMs identified for seven of the ten orthologs are shown, and each has the sequence TTCAN (fig. 13B).
Example 6-validation of TAMs for two TnpB orthologs.
Adaptor ligation reads obtained in the presence (+protein) or absence of the TnpB protein were used to validate two TnpB orthologs, from Actinoplanes lobatus strain DSM 43150 (TnpB-2) and Actinomadura cellulosilytica strain DSM 45823 (FIG. 14). The results showed that, for both orthologs, cleavage of the target plasmid depended on the TnpB protein and required both the target and the TAM to be present (fig. 14).
Example 7-TnpB-2 DNA cleavage.
Applicants determined the cleavage sites of Actinoplanes lobatus TnpB-2 on both the target and non-target strands (FIG. 15A). Similar to some Cas12 proteins, TnpB cleavage was observed at multiple positions within and outside the guide annealing site (fig. 15B).
Example 8-characterization of ncRNA associated with Actinoplanes lobatus TnpB.
The Actinoplanes lobatus TnpB-2 ORF, including a downstream region of about 200 bp, was expressed in Escherichia coli. The downstream region was shown to contain a 173 nt scaffold (reRNA) and an RNA guide sequence (FIG. 16A). Engineering of the scaffold sequence yielded a series of truncations ranging from 201 nt to 102 nt that maintained TAM-specific enrichment in the presence of the guide and dsDNA (fig. 16B).
Example 9-characterization of ncRNA associated with Ktedonobacter racemifer TnpB
The region immediately downstream of K. racemifer TnpB appears to encode an ncRNA and a guide sequence (fig. 17).
Example 10-IS200/IS605 elements encoding a plurality of RNA-directed nucleases
Applicants sought to determine whether IS200/IS605 transposons more generally encode RNA-guided effectors. IS200/IS605 transposons more commonly encode TnpB, a distinct family of proteins that, like IsrB, contain a RuvC domain as the sole endonuclease domain and are considered ancestral to Cas12s (type V CRISPR effectors) (fig. 18A). The TnpB family is an order of magnitude more diverse than the IscB family; HMMER searches identified more than 1 million tnpB loci in publicly available prokaryotic genomes. Sequence conservation analysis revealed conserved non-coding regions immediately downstream of the CDS of many tnpB genes, indicating the presence of associated ncRNAs that could serve as RNA guides (fig. 20). Previous work has also identified ncRNAs in archaea and bacteria that overlap the 3' end of the tnpB gene, but the function of these ncRNAs has not been characterized. Small RNA-seq of K. racemifer demonstrated native expression of ncRNAs overlapping the 3' end of the associated tnpB ORFs (fig. 18B); these ncRNAs are classified as a distinct class of ωRNAs. The reverse complement of the 3' end of the K. racemifer TnpB ωRNA was almost identical to the 5' end of the ωRNA of some K. racemifer IscBs, and this region corresponds to the predicted transposon end in each locus (FIG. 18C). Analysis of non-redundant loci containing tnpB genes related to K. racemifer TnpB showed a sudden drop in conservation at the 3' end of the locus (fig. 20), which corresponds to the IS200/IS605 transposon end. Comparison with small RNA-seq of the locus revealed expression extending beyond the drop in conservation, indicating the possible presence of a guide sequence in the transcript (fig. 18D). To investigate this possibility, Applicants recombinantly expressed and purified one of these TnpBs (from Actinoplanes lobatus) in the presence of its predicted ωRNA overlapping the ORF.
Small RNA-seq of the co-purified RNA showed that the predicted ωRNA of Actinoplanes lobatus, containing the putative guide region, interacts with the TnpB protein (fig. 21A). In vitro plasmid cleavage assays with reprogrammed guides were performed on multiple TnpB proteins in this cluster, demonstrating RNA-guided cleavage with a 5' TAM (fig. 18E). Cleavage-specific adaptor ligation and sequencing of the targeted, TAM-containing plasmid further confirmed the reprogrammable RNA-guided dsDNA endonuclease activity of TnpB (fig. 18F). TAM screens were performed on other TnpB loci, including Actinoplanes lobatus TnpB-2, Alicyclobacillus macrosporangiidus TnpB, and Epsilonproteobacteria bacterium isolate B11_G4 TnpB (FIG. 21B). A positive control for the plasmid competition assay was performed using SpCas9. SpCas9 cleaved only the plasmid containing both the TAM and the target, as indicated by the presence of cleavage-specific adaptor ligation products. Statistical significance was assessed with a two-tailed t-test comparing, between the +protein and -protein conditions, the number of adaptor-ligated reads of the first plasmid listed under each condition, normalized to the average number of adaptor-ligated reads of the second plasmid listed (fig. 21C).
Examples of RNA-guided systems are shown in fig. 19, including systems with specific guide loading, such as the OMEGA and CRISPR systems, and systems with non-specific guide loading, such as Argonaute/RNAi systems.
Materials and methods
Profile curation
The initial IscB profile was curated against the NR database using NCBI PSI-BLAST, run for 8 iterations from a starting seed sequence with up to 20000 target sequences. Stringent filtering parameters (E-value threshold 1e-5 and PSI-BLAST threshold 1e-6) were chosen to reduce the accumulation of unrelated proteins, such as restriction enzymes or homing endonucleases, that also contain HNH domains. All proteins smaller than 260 aa were discarded. The remaining proteins were then aligned using MAFFT FFT-NS-1, and partial proteins and proteins with poor coverage of the HNH domain were discarded. The filtered set was then clustered using MMseqs2 at 70% sequence identity and 70% minimum coverage. The MMseqs2 representative of each cluster was then aligned using MAFFT-einsi. The resulting alignment was further divided into multiple domains (NTD, RuvC-I, RuvC-II, HNH and RuvC-III) to create separate HHAlign profiles for the respective regions. For the HMMER profiles, RuvC-I, BH and RuvC-II were combined into a single profile to reduce false positives.
Because of the wide diversity of TnpA, PSI-BLAST searches returned more than 20000 homologous sequences. Since TnpA consists of a single contiguous catalytic domain, HHblits was used instead to identify more non-redundant homologs, using an E-value cutoff of 1e-3, a minimum hit probability of 80%, and 8 iterations against the UniRef30_2020_06 database. The resulting proteins were aligned using MAFFT-einsi, and partial proteins were removed.
Identification of IscB, IsrB and IshA
All prokaryotic genomes were downloaded from NCBI, NCBI WGS and appropriately licensed JGI projects. ORFs with a minimum size of 80 aa were predicted on all contigs, allowing the alternative start codons GTG and TTG. ORFs sharing the same stop position and strand (+/-) as an existing protein annotation were discarded in favor of the existing annotation. All ORFs were then searched using HMMER with the 6 IscB profiles (minimum bit score 18). Any ORF/protein that hit any one of the 6 domains was considered a protein of interest (POI) and retained for further analysis. To reduce redundancy, all proteins were then clustered at 90% sequence identity and 85% coverage using MMseqs2. Proteins whose start or stop sites lie within 200 bp of the edge of the contig were considered partial. Within each 90% identity cluster, proteins below the 80th percentile in length were discarded. Among the remaining proteins, those containing X (ambiguous) amino acids were discarded unless their removal would leave an empty set of sequences. Of the remaining proteins, the representative sequence was selected as the one with the greatest minimum distance from the protein start or stop site to the contig edge.
Phylogenetic analysis based on the IscB-IsrB-Cas9 RuvC/BH and RuvC/BH/HNH domains
Cas9 proteins identified from the IscB domain-based search formed a monophyletic clade in the RuvC-based tree, but not all Cas9 proteins were identified by this search. To expand the Cas9 space included in the analysis, Cas9 profiles from the Koonin laboratory and TIGRFAM, together with profiles built from MAFFT alignments of Cas9 proteins from CRISPRDisco, were used with HMMER (minimum bit score 25, minimum protein length 500 aa) to identify additional Cas9 proteins not found in the initial IscB domain-based search. Cas9 proteins from the IscB domain search were combined with Cas9 proteins from the Cas9 search, and duplicates were removed. The Cas9 ORF start sites were then corrected using GLIMMER (8). The superset of Cas9 proteins was then clustered at 90% sequence identity and 85% coverage, and redundancy was reduced in the same manner as for the IscB search. The non-redundant superset of Cas9 proteins was then reclustered at 50% sequence identity and 60% coverage. The choice of 50% sequence identity reflects the difference in size of the conserved regions between IscB and Cas9, with about 100 aa in the key regions (RuvC, BH, and HNH) evolving more slowly. A clustering criterion of 65% minimum sequence identity for a 400 aa protein is functionally similar to 65% - 100/400 + 100/1000 = 50% sequence identity for a larger 1000 aa Cas9 with the same conserved domains spanning about 100 aa. The Cas9 clusters were then combined with the IscB clusters for phylogenetic analysis.
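The identity-threshold conversion stated above can be checked with exact fractions. The following is only a worked restatement of the arithmetic given in the text; the function and variable names are illustrative and do not come from the original analysis.

```python
from fractions import Fraction

# Values taken from the text: a 65% identity threshold for a ~400 aa protein,
# a ~100 aa slowly evolving conserved core, and a ~1000 aa Cas9.
short_threshold = Fraction(65, 100)
conserved_aa = 100
short_length = 400
long_length = 1000


def equivalent_threshold(threshold, conserved, short_len, long_len):
    """Rescale an identity threshold as written in the text:
    subtract the conserved fraction of the short protein and
    add the conserved fraction of the long protein."""
    return threshold - Fraction(conserved, short_len) + Fraction(conserved, long_len)


cas9_threshold = equivalent_threshold(
    short_threshold, conserved_aa, short_length, long_length
)
print(float(cas9_threshold))  # 0.5
```

With these inputs the conversion gives 65% - 25% + 10% = 50%, matching the reclustering threshold used for Cas9.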
The IscB, IsrB, and Cas9 proteins obtained from the above filtering criteria were aligned using MAFFT-x2 (two iterations) with BLOSUM62 scoring (the default unless otherwise noted). Because of the different domain architectures and sizes of these protein families, the RuvC-I and BH regions of Cas9 did not align with those of IscB and IsrB. RuvC-I and BH were therefore manually grouped and realigned using MAFFT-x2. Sequences with poor coverage in the aligned RuvC-I domain region were removed if their HHAlign score against the IscB-RuvC-I or Cas9-RuvC-I profile was less than 21. Because of their hybrid nature, small proteins with both Cas9-like and IscB-like sequences that had HHAlign hits to RuvC-I and BH were typically not aligned correctly in RuvC-I and BH relative to all other proteins. For such proteins, all amino acids between the N-terminus and RuvC-II were grouped as RuvC-I and BH for alignment. The RuvC-I and BH regions were then realigned using MAFFT-linsi. IsrB residues aligned to the HNH domain were moved to the RuvC-II column group. The RuvC-II, RuvC-III, HNH and NTD domains were then aligned in order using MAFFT-linsi. Excess regions between the BH domain and RuvC-II were then aligned using MAFFT-einsi with BLOSUM30 to identify REC-like insertions. Sequences that were unaligned or poorly aligned to any of the RuvC or BH domains were removed. The resulting alignment was then used for phylogenetic analysis.
All domains not common to all 3 types of proteins, namely the NTD, REC1, REC2, PI domain and IscB/IsrB C-terminal domain, were removed from the alignment, leaving a trimmed alignment containing only the highly conserved portions of the RuvC-I, BH, RuvC-II and RuvC-III domains. This yielded the RuvC/BH alignment, containing IscB, IsrB and Cas9. A second alignment of only the RuvC-I, BH, RuvC-II, HNH and RuvC-III domains, containing only IscB and Cas9, was created and is called the RuvC/BH/HNH alignment. For both alignments, clusters with catalytically dead representative sequences (sequences with mutations at key catalytic sites) were removed. Specifically, the positions filtered on were the RuvC-I conserved D, the RuvC-II conserved E, the HNH conserved H (where applicable), and the RuvC-III conserved D and H. The symmetry tests implemented in IQ-TREE 2 were used to identify potential violations of phylogenetic assumptions in the alignments (figure s#7fu2). The RuvC/BH/HNH alignment showed a severe violation of the stationarity assumption, one of the three main assumptions (reversibility, stationarity, homogeneity) used in typical phylogenetic analysis. Since the alignment contained too many taxa for the different speed model to be used in IQ-TREE 2, we used a subtractive approach to identify the source of the stationarity violation. Preliminary analysis revealed that the main monophyletic clade of II-B Cas9 always split from the remaining Cas9 with a branch length > 1, suggesting that its accurate placement in the tree may be difficult. We removed the main monophyletic clade of II-B Cas9 from the RuvC/BH/HNH alignment, which significantly reduced the stationarity violation as determined by the marginal symmetry test p-value (figure s#7fu 2). We also created another alignment consisting only of IscB and Cas9 from the early stage of Cas9 evolution, which completely eliminated any stationarity violation. For each alignment, substitution model selection was performed using the ModelFinder tool implemented in IQ-TREE 2.
The best model was selected using the Akaike information criterion corrected for small sample sizes (AICc). In most cases, the AICc best model differed from the best model under the Bayesian information criterion (BIC) or the standard Akaike information criterion (AIC), and some analyses were run on both sets of models; however, AICc is generally preferred because of its small sample size correction. Then, for each alignment, phylogenetic trees were constructed using several methods (IQ-TREE 2, RAxML, MrBayes) for cross-comparison. Although FastTree2 was used for rapid visualization of phylogenetic information, the likelihood scores obtained with this method were much worse than those obtained using IQ-TREE 2, RAxML or MrBayes. FastTree2 was therefore not used for the comprehensive cross-comparisons.
For the main-text phylogeny figure, a hybrid tree approach was used in order to condense the Cas9-related information while maintaining phylogenetic accuracy. For this approach, a subsample of the Cas9 clusters was selected from the alignment in addition to the complete IscB and IsrB groups. The resulting sub-alignment was then used for phylogenetic inference using IQ-TREE 2 with the same parameters. Since the Cas9-related information in the sub-alignment may be biased, the position of the Cas9 lineage was taken from the tree inferred from the original alignment. This was accomplished by pruning the Cas9 branch from the sub-alignment tree and replacing the original Cas9 branch on the original tree with this smaller pruned branch. The branch selected for grafting was the branch between Cas9_849 and all other Cas9 proteins, as this region shares the same topology in the two trees. The order of Cas9 subtype evolution was checked for consistency between the trees to ensure compatibility after the extensive downsampling of Cas9 proteins.
IscB/IsrB ωRNA discovery, curation and analysis
Using only the results of the IscB domain search, all representatives of clusters with at least 3 proteins and at least 2 HHAlign hits to any of RuvC-I, RuvC-II or RuvC-III with a bit score of at least 17 were collected. The upstream (from -300 bp to +200 bp relative to the start codon) and downstream (from -200 bp to +300 bp relative to the stop codon) regions of all IscB and IsrB proteins were aligned separately using MAFFT-einsi. The upstream alignment revealed a large conserved region outside the typical CDS boundary of IscB/IsrB. The downstream alignment did not contain any large conserved regions and was excluded from further analysis. Each sequence in the conserved upstream region was folded using ViennaRNA RNAfold (9). Sequences in the alignment were divided into separate groups based on conservation of key distinct regions in the alignment. The main group was labeled G1a and covers a large number of IscB ωRNAs. Covariation-supported RNA structures were inferred using R-scape to correct for phylogenetic correlation, with an E-value threshold of 1e-2 and a gap threshold of 0.75 for all profiles. cmbuild from Infernal was used to refine the RNA alignments and build covariance models (CMs). The resulting RNA structures were visualized using R2R. Additional alignment groups (G1b, G1c, etc.) of IscB/IsrB ωRNAs were created iteratively from the conserved upstream regions (relative to the ORF) of IscB/IsrB clades in the RuvC tree that had no strong association with existing ωRNA groups. The models for these groups were built in the same way, except that the consensus secondary structure identified by ViennaRNA was used instead of the R-scape structure when the sample size was too small to obtain an accurate covariation-based structure. No pseudoknots are explicitly encoded in the Infernal covariance models. The structure of the hybrid CRISPR/ωRNA associated with CRISPR-associated IscBs was inferred in the same way, using the consensus secondary structure instead of the covariation-based structure because of the small sample size.
Cluster annotation
All 10 kb genomic neighborhoods (the 10 kb region around each protein of interest (POI)) were collected. CRISPR arrays were identified using CRT. All TnpA ORFs within 10 kb of a POI were predicted using HMMER. All ωRNA profiles were searched against genomic DNA using cmsearch, the covariance model-based nucleic acid search tool of Infernal 1.1. RNA profile hits with scores below 35 were discarded. For each genomic neighborhood, all RNA profile hits were grouped into overlapping groups, in which each ωRNA hit overlaps on the genome with at least one other ωRNA hit in the group. Each overlapping group was then assigned to the ωRNA hit with the highest bit score in the group. The resulting information was visualized on the different phylogenetic trees using GraPhlAn.
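The grouping of overlapping ωRNA profile hits described above can be sketched as follows. This is a minimal illustration and not the script used in the study: the `Hit` record, its field names and the example model labels are hypothetical, and strand is ignored because the text only requires that hits overlap on the genome.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    model: str    # hypothetical covariance model label, e.g. "G1a"
    start: int    # 0-based inclusive start on the contig
    end: int      # exclusive end on the contig
    score: float  # bit score reported for the hit


def best_hit_per_overlap_group(hits, min_score=35.0):
    """Discard hits scoring below min_score, merge hits into groups of
    transitively overlapping intervals, and return the highest-scoring
    hit of each group."""
    kept = sorted((h for h in hits if h.score >= min_score),
                  key=lambda h: (h.start, h.end))
    groups = []
    group_end = None
    for hit in kept:
        if groups and hit.start < group_end:
            # Overlaps the current group: extend it.
            groups[-1].append(hit)
            group_end = max(group_end, hit.end)
        else:
            # No overlap: start a new group.
            groups.append([hit])
            group_end = hit.end
    return [max(group, key=lambda h: h.score) for group in groups]


example_hits = [
    Hit("G1a", 100, 300, 80.0),
    Hit("G1b", 250, 400, 60.0),
    Hit("G1c", 390, 500, 90.0),
    Hit("G1d", 1000, 1200, 40.0),
    Hit("G1e", 1100, 1300, 20.0),  # below the score cutoff, discarded
]
print([h.model for h in best_hit_per_overlap_group(example_hits)])
```

Sorting by start coordinate and sweeping once is sufficient because groups of overlapping intervals are contiguous in sorted order.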
Identification of the guide encoding mechanism of IscB/IsrB ωRNAs
After complete classification and curation of all major ωRNA types associated with IscB and IsrB, all examples of IscB and IsrB in the genome of c. All examples of IscB/IsrB ωRNAs in the genome were also searched for using cmsearch and the G1a-G1i RNA covariance models. Occurrences of multiple nearly identical IscBs associated with nearly identical ωRNAs were identified by BLASTn and classified as transposon amplification. ωRNAs with no detectable IscB or IsrB within 500 bp on the same strand were classified as standalone trans-acting ωRNAs. In some cases, an ωRNA and its corresponding IscB/IsrB are separated by an unrelated transposon inserted between them. In such cases, the ωRNA was not considered a trans-acting ωRNA.
All covariance models were searched against our prokaryotic genome database. Examples with multiple omega RNAs within 300bp on the same strand were kept for further analysis and categorized as omega RNA arrays.
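One way to flag candidate ωRNA arrays from a list of covariance model hits is sketched below. This is an illustration only: the tuple layout is hypothetical, and "within 300 bp" is interpreted here as the gap between the end of one hit and the start of the next on the same contig and strand, which is an assumption not stated in the text.

```python
from collections import defaultdict


def find_omega_rna_arrays(hits, max_gap=300):
    """Group hits given as (contig, strand, start, end) into runs on the
    same contig and strand in which consecutive hits are separated by at
    most max_gap bp, and return the runs containing two or more hits."""
    by_strand = defaultdict(list)
    for contig, strand, start, end in hits:
        by_strand[(contig, strand)].append((start, end))

    arrays = []
    for key in sorted(by_strand):
        run, run_end = [], None
        for start, end in sorted(by_strand[key]):
            if run and start - run_end <= max_gap:
                run.append((start, end))
                run_end = max(run_end, end)
            else:
                if len(run) > 1:
                    arrays.append((key, run))
                run, run_end = [(start, end)], end
        if len(run) > 1:
            arrays.append((key, run))
    return arrays


example = [
    ("contig1", "+", 0, 150),
    ("contig1", "+", 400, 550),    # 250 bp after the previous hit
    ("contig1", "+", 2000, 2150),  # too far away, not part of the array
    ("contig1", "-", 500, 650),    # different strand
]
print(find_omega_rna_arrays(example))
```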
Identification of eukaryotic IscB orthologs
All eukaryotic genomes were downloaded from NCBI. To capture all possible IscBs, existing gene models were disregarded in this analysis. All DNA sequences were translated in all 6 reading frames and divided into ORFs by splitting at stop codons (*). The HMMER profiles generated in the IscB profile curation step were then used to search each ORF for IscB domains. ORFs that hit both the IscB HNH and RuvC domains were retained for further analysis. cmsearch and the IscB-associated ωRNA covariance models created in this study were then used to search for IscB-associated ωRNAs in the region surrounding each ORF.
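The 6-frame translation and stop-codon splitting step can be sketched as below. This is a minimal illustration using the standard genetic code; the function names and the `min_len` parameter are hypothetical, and the segments returned are stop-to-stop stretches with no start codon requirement, following the description above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(dna, min_len=1):
    """Return (strand, frame, peptide) for every stop-free stretch of at
    least min_len residues in the six reading frames."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for segment in protein.split("*"):
                if len(segment) >= min_len:
                    orfs.append((strand, frame, segment))
    return orfs


for orf in six_frame_orfs("ATGAAATAGGGGCCC"):
    print(orf)
```

In practice a larger `min_len` would be used before passing the peptides to a profile search, so that very short fragments are not searched.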
Discovery of IshB
A power set of all possible combinations of the PLMP, RuvC-I, BH, RuvC-II, HNH and RuvC-III domains was generated. For each domain combination, the number of clusters from the IscB domain search that hit each domain in the combination with a minimum bit score of 21 was counted. Combinations containing a high number of proteins with homology to the domains in the combination were retained for further analysis. Domain combinations corresponding to N-terminal or C-terminal truncations of IscB, Cas9 or IsrB were discarded. Among the remaining combinations, PLMP+HNH showed a higher cluster count than other domain combinations (such as RuvC-II+PLMP only), indicating that this combination corresponds to a true protein family. These proteins were subsequently named IshB owing to the presence of the HNH domain; they also contain the PLMP domain present in both IscB and IsrB.
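The domain-combination count can be illustrated with the sketch below. It assumes, as a simplification not stated in the text, that each cluster is counted once under the exact set of domains it hits at or above the bit score cutoff; the input dictionary layout and the example clusters are hypothetical.

```python
from itertools import combinations

DOMAINS = ("PLMP", "RuvC-I", "BH", "RuvC-II", "HNH", "RuvC-III")


def power_set(items):
    """All subsets of items, including the empty set."""
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def count_clusters_per_combination(cluster_hits, min_bits=21.0):
    """cluster_hits maps a cluster id to {domain: best bit score}.
    Returns a count of clusters for every non-empty domain combination."""
    counts = {combo: 0 for combo in power_set(DOMAINS) if combo}
    for scores in cluster_hits.values():
        present = frozenset(
            d for d, s in scores.items() if d in DOMAINS and s >= min_bits
        )
        if present:
            counts[present] += 1
    return counts


example_clusters = {
    "cluster1": {"PLMP": 40.0, "HNH": 30.0},
    "cluster2": {"PLMP": 25.0, "HNH": 22.0, "BH": 10.0},  # BH below cutoff
    "cluster3": {"RuvC-II": 50.0, "PLMP": 30.0},
}
counts = count_clusters_per_combination(example_clusters)
print(counts[frozenset({"PLMP", "HNH"})])
```

Six domains give 64 subsets, of which 63 are non-empty, so the full table of counts remains small enough to inspect by hand.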
TnpB curation and ωRNA analysis
Examples of IscB-associated ωRNAs from K. racemifer were searched against the K. racemifer genome using BLASTn. Hits near IscB or IsrB were discarded. Multiple partial hits were found near TnpB, always located downstream of the tnpB gene. Exploration of these hits revealed that multiple TnpBs share transposon ends with IscB. As was done for IscB, upstream and downstream locus conservation analyses were performed on the relevant TnpB loci. Because TnpB is highly diverse, only TnpBs identifiable by high similarity in MMseqs2 searches were included.
Sequencing of Small RNAs
Heterologous expression in Escherichia coli
Stbl3 chemically competent E. coli were transformed with a plasmid containing the locus of interest. A single colony was used to inoculate a 5 mL overnight culture. After overnight growth, the culture was centrifuged, resuspended in 750 uL of TRI reagent (Zymo), and incubated for 5 min at room temperature. 0.5 mm zirconia/silica beads (BioSpec Products) were added and the culture was vortexed for approximately 1 minute to mechanically lyse the cells. 200 uL of chloroform (Sigma Aldrich) was then added, and the culture was gently inverted to mix and incubated for 3 min at room temperature, followed by centrifugation for 15 min at 12,000 x g at 4 ℃. The aqueous phase was used as input for RNA extraction using the Direct-zol RNA Miniprep Plus kit (Zymo). The extracted RNA was treated with 10 units of DNase I (NEB) at 37 ℃ for 30 min to remove residual DNA and was purified again using the RNA Clean & Concentrator-25 kit (Zymo). Ribosomal RNA was removed using a bacterial RiboMinus transcriptome isolation kit (Thermo Fisher Scientific) in a half-volume reaction according to the manufacturer's protocol. The purified samples were then treated with 20 units of T4 polynucleotide kinase (NEB) at 37 ℃ for 6 h and purified again using the RNA Clean & Concentrator-25 kit (Zymo). The purified RNA was treated with 20 units of 5' RNA polyphosphatase (Lucigen) at 37 ℃ for 30 min and purified again using the RNA Clean & Concentrator-5 kit (Zymo). The purified RNA was used as input for NEBNext Small RNA Library Prep for Illumina (NEB) according to the manufacturer's protocol, with an extension time of 60 s and 16 cycles in the final PCR. Amplified libraries were gel extracted, quantified by qPCR using the KAPA Library Quantification Kit for Illumina (Roche) on a StepOne Plus machine (Applied Biosystems/Thermo Fisher Scientific), and sequenced on an Illumina NextSeq with 42 cycles for read 1, 46 cycles for read 2, and 6 cycles for index 1.
Adaptors were trimmed using Cutadapt and reads were mapped to the loci of interest using Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml). Fill readings were obtained and visualized using custom Python scripts for fill readings greater than 200bp in length.
Ribonucleoprotein
RNP was purified as described. 100 uL of concentrated RNP was used as input. The protocol above was modified as follows: RNA extraction was performed using 300 uL of TRI reagent (Zymo) and 60 uL of chloroform (Sigma Aldrich).
Ktedonobacter racemifer
Freeze-dried K. racemifer SOSP1-21 (DSM 44963) was obtained from DSMZ (dsmz.de/collection/analysis/culture/DSM-44963), resuspended in Streptomyces GYM medium (4 g glucose, 10 g malt extract, 4 g yeast extract in 1 L water) and grown in a shaking incubator at 28 ℃. After 76 days, the culture was centrifuged and the protocol described above was followed with these modifications: mechanical lysis was performed with zirconia beads and vortexing for approximately 30 min. Ribosomal RNA was removed using the NEBNext rRNA Depletion Kit (Bacteria) (NEB) according to the manufacturer's protocol, and rRNA-depleted samples were purified using Agencourt RNAClean XP beads (Beckman Coulter) prior to T4 PNK treatment. T4 PNK treatment was performed for 1.5 h, followed by purification using the RNA Clean & Concentrator-5 kit (Zymo). The final PCR of the small RNA library preparation used 15 cycles.
Cloning of TAM library
Target sequences with 8N degenerate flanking sequences were synthesized by IDT and amplified by PCR using NEBNext High-Fidelity 2X PCR Master Mix (NEB). The backbone plasmids were digested with restriction enzymes (pACYC: EcoRV; pUC19: Eco88I and HindIII; Thermo Fisher Scientific) and treated with FastAP alkaline phosphatase (Thermo Fisher Scientific). Amplified library fragments were inserted into the backbone plasmids by Gibson assembly at 50 ℃ for 1 hour using 2X Gibson Assembly Master Mix (NEB) at an insert to vector molar ratio of 8:1. The Gibson assembly reaction was then isopropanol precipitated by adding an equal volume of isopropanol (Sigma Aldrich), NaCl to a final concentration of 50 mM, and 1 uL of GlycoBlue nucleic acid co-precipitant (Thermo Fisher Scientific). After incubation for 15 min at room temperature, the solution was centrifuged at maximum speed for 15 min at 4 ℃, the supernatant was aspirated, and the precipitated DNA was resuspended in 12 uL of TE and incubated for 10 min at 50 ℃ to dissolve. 2 uL was then transformed into Endura electrocompetent E. coli (Lucigen) by electroporation according to the manufacturer's instructions, recovered by shaking for 1 h at 37 ℃, and then plated on five 22.7 cm x 22.7 cm bioassay plates containing the appropriate antibiotic. After 12-16 hours of growth at 37 ℃, the cells were scraped from the plates and plasmid DNA was prepared by midi- or maxiprep using a NucleoBond Midi- or Maxi-prep kit (Macherey Nagel).
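The 8:1 insert to vector molar ratio used for Gibson assembly above translates into DNA masses as sketched below. The calculation relies only on the fact that, for double-stranded DNA, mass is proportional to length at a fixed number of molecules. The fragment sizes and vector mass in the example are hypothetical and are not taken from the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=8.0):
    """Mass of insert (ng) that gives a molar_ratio:1 insert:vector ratio.
    For dsDNA, moles are proportional to mass divided by length, so
    insert_ng = molar_ratio * vector_ng * insert_bp / vector_bp."""
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 100 ng of a 4000 bp digested backbone and a
# 200 bp amplified library fragment.
print(insert_mass_ng(vector_ng=100, vector_bp=4000, insert_bp=200))  # 40.0
```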
Escherichia coli TAM screening
100 ng each of the plasmid containing the locus of interest and the target 8N degenerate flanking library plasmid were transformed into 30 uL of Endura electrocompetent E. coli (Lucigen) by electroporation according to the manufacturer's protocol, with 3 replicates per locus of interest and 3 replicates of an empty control. After recovery by shaking at 37 ℃ for 1 hour, the cells were plated on one 22.7 cm x 22.7 cm bioassay plate containing the appropriate antibiotics and grown at 37 ℃ for 12-16 hours. The cells were scraped from the plate and mixed thoroughly, and 2 mL of scraped cells was used as input for a miniprep (Qiagen). 100 ng of the miniprepped plasmid was used as input for 12 cycles of PCR with NEBNext High-Fidelity 2X PCR Master Mix (NEB) at an annealing temperature of 63 ℃ to amplify the TAM-containing region (supplementary Table XXX), followed by a second round of 18 cycles of PCR to add Illumina adaptors and barcodes. Amplified libraries were gel extracted, quantified with the Qubit dsDNA HS assay (Thermo Fisher Scientific), and subjected to single-end sequencing on an Illumina NextSeq with 75 cycles for read 1, 8 cycles for index 1, and 8 cycles for index 2. TAMs were extracted, and WebLogos depicting the depleted TAMs were visualized using custom Python scripts.
In vitro cleavage assay
Double stranded DNA (dsDNA) substrates were generated by PCR amplification of pUC19 plasmid containing the target site and TAM sequence. Cy3 and Cy5 conjugated DNA oligonucleotides (IDT) were used as primers to generate labeled dsDNA substrates. Single stranded DNA (ssDNA) substrates were ordered as Cy5.5 conjugated oligonucleotides (IDT). All ωRNAs used in biochemical assays were transcribed in vitro from DNA templates purchased from Twist Biosciences using the HiScribe T7 Quick High Yield RNA Synthesis Kit (New England Biolabs). Target cleavage assays performed with AwaIscB contained 10 nM DNA substrate, 1 μM protein and 4 μM ωRNA in a final 1X reaction buffer of 20 mM HEPES pH 7.5, 50 mM NaCl and 5 mM MgCl2. Assays were run at 37 ℃ for 1 hour, then briefly transferred to 50 ℃ for 5 min and immediately placed on ice to help relax the RNA structure prior to RNA digestion. Reactions were then treated with RNase A (Qiagen) and Proteinase K (New England Biolabs) and purified using a PCR purification kit (Qiagen). DNA was resolved by gel electrophoresis on Novex 10% TBE (dsDNA substrate), 6% TBE-urea (dsDNA substrate) and 15% TBE-urea (ssDNA substrate) polyacrylamide gels (Thermo Fisher Scientific). Target cleavage assays with CRISPR-associated IscB RNPs were performed similarly, except that the protein and ωRNA were replaced with 450 nM RNP and the reaction was incubated for 1.5 hours at 37 ℃. The cleaned-up reactions were then run on a 4% agarose E-Gel (Thermo Fisher Scientific).
For kinetic analysis of AwaIscB activity, cleavage reactions were quenched with 11 mM EDTA at each time point prior to cleanup. For metal screening, MgCl2 was omitted from the reaction buffer, and 2 mM EDTA and 7 mM of the indicated metal were added simultaneously. Collateral cleavage assays were performed using 10 nM unlabeled ds/ssDNA substrate and 10 nM Cy5.5-labeled collateral ssDNA substrate, and reactions were allowed to proceed for 3 hours at 37 ℃. Cleavage of the labeled non-targeted ssDNA was then assessed on a 15% TBE-urea polyacrylamide gel.
Single-stranded RNA (ssRNA) substrates were transcribed in vitro and labeled at their 3' end with pCp-Cy5 (Jena Bioscience). For 3' end labeling, 50 pmol of ssRNA was incubated with 100 pmol of pCp-Cy5 and 50 U of T4 RNA ligase 1 (New England Biolabs) in 50 mM Tris pH 7.8, 10 mM MgCl2, 10 mM DTT, 2 mM ATP and 10% DMSO at 4 ℃ for 40 hours. The labeling reaction was quenched with 20 mM EDTA and purified using an RNA Clean & Concentrator kit (Zymo). ssRNA cleavage assays were performed in the same manner as the DNA cleavage assays, quenched with 19 mM EDTA at the end of the reaction, treated with Proteinase K, and visualized on a 6% TBE-urea polyacrylamide gel.
Cell free transcription/translation TAM screening
The IscB protein sequences were human codon optimized using the GenScript codon optimization tool, and IscB genes, TnpB genes, and omega RNA scaffolds with endogenous codon optimization were synthesized by Twist Biosciences. Transcription/translation templates were generated from the custom synthesized products by PCR. Cell-free transcription/translation reactions were performed using the PURExpress In Vitro Protein Synthesis Kit (NEB) according to the manufacturer's half-volume reaction protocol, using 75 ng of template for the protein of interest, 125 ng of template for the corresponding omega RNA with a guide targeting the TAM library, and 25 ng of TAM library plasmid. Reactions were incubated at 37 °C for 4 hours, then quenched by placing at 4 °C or on ice and adding 10 µg of RNase A (Qiagen) and 8 units of proteinase K (NEB), followed by incubation at 37 °C for 5 min. DNA was extracted by PCR purification, and adaptors were ligated using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB) with the NEBNext Adaptor for Illumina (NEB) according to the manufacturer's protocol. After adaptor ligation, cleavage products were specifically amplified by 12 cycles of PCR with NEBNext High-Fidelity 2X PCR Master Mix (NEB) at an annealing temperature of 63 °C, using one primer specific to the TAM library backbone and one primer specific to the NEBNext adaptor, followed by a second round of 18 cycles of PCR to add the Illumina i5 adaptor. Amplified libraries were gel extracted, quantified by qPCR using the KAPA Library Quantification Kit for Illumina (Roche) on a StepOnePlus machine (Applied Biosystems/Thermo Fisher Scientific), and subjected to single-end sequencing on an Illumina MiSeq with read 1 at 80 cycles, index 1 at 8 cycles, and index 2 at 8 cycles.
TAMs were extracted, and the enrichment score for each TAM was calculated by filtering for all TAMs occurring more than once and normalizing to the TAM frequency in the input library, which was subjected to the same in vitro transcription/translation and quenching reactions. A position weight matrix was generated from the enrichment scores, and WebLogos based on the position weight matrix were visualized using a custom Python script.
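The enrichment calculation described above can be sketched as follows. This is not the custom script used in the source, which is not disclosed. It is a minimal illustration under stated assumptions: the score is taken to be a plain frequency ratio (cleaved sample over input library), each TAM is weighted by that ratio when building the position weight matrix, and the function names and toy reads are hypothetical.

```python
from collections import Counter


def enrichment_scores(cleaved_tams, input_tams, min_count=2):
    """Fold enrichment of each TAM relative to the input library.

    TAMs seen fewer than min_count times in the cleaved sample are dropped,
    mirroring the "occurring more than once" filter in the text.
    ASSUMPTION: score = (frequency among kept TAMs) / (frequency in input).
    """
    cleaved = Counter(cleaved_tams)
    library = Counter(input_tams)
    kept = {t: n for t, n in cleaved.items() if n >= min_count and library[t] > 0}
    total_kept = sum(kept.values())
    total_input = sum(library.values())
    return {t: (n / total_kept) / (library[t] / total_input) for t, n in kept.items()}


def position_weight_matrix(scores, alphabet="ACGT"):
    """Per-position base weights, each TAM weighted by its enrichment score."""
    if not scores:
        return []
    length = len(next(iter(scores)))
    pwm = [dict.fromkeys(alphabet, 0.0) for _ in range(length)]
    for tam, score in scores.items():
        for pos, base in enumerate(tam):
            pwm[pos][base] += score
    for column in pwm:
        total = sum(column.values())
        for base in column:
            column[base] /= total  # each column sums to 1
    return pwm


# Toy data (hypothetical): a uniform 3-member input library and a cleaved sample.
input_reads = ["TCA", "TCA", "GGA", "GGA", "ACT", "ACT"]
cleaved_reads = ["TCA"] * 6 + ["GGA"] * 2 + ["ACT"]

scores = enrichment_scores(cleaved_reads, input_reads)
pwm = position_weight_matrix(scores)
```

In this toy example the singleton TAM `ACT` is filtered out, and the first column of the matrix is dominated by T because the enriched TAM begins with T. A sequence-logo tool could then take the matrix as input.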
Expression and purification of KraIscB-1 RNP complex
Purification of KraIscB-1 in complex with the ncRNA from its native locus was performed similarly to that of the CRISPR-associated IscB-ncRNA RNP complex, with the following modifications: (1) the KraIscB-1 CDS was unchanged and human codon optimized; (2) it was co-expressed with the ncRNA in BL21(DE3) cells (New England Biolabs) in the presence of 100 µg/ml ampicillin and 25 µg/ml kanamycin; (3) the bdSENP1 protease was not used, because the KraIscB-1 protein carried no N-terminal tag and only a Twin-Strep tag at its C-terminus. After the boundaries of the co-purified RNA were defined by small RNA sequencing, the predicted omega RNA sequence was cloned downstream of a T7 promoter in the pCOLADuet-1 vector for inducible expression, and the KraIscB-1-omega RNA complex was then prepared following the same procedure.
Cell free transcription/translation cleavage assay
Omega RNA templates were amplified from the custom synthesized products as described and transcribed in vitro using the HiScribe T7 Quick High Yield RNA Synthesis Kit (NEB), with 150 ng of DNA template in 2 µL, 2 µL of T7 RNA polymerase mix (NEB), and each NTP at a final concentration of 6.67 mM in a 30 µL reaction, and purified using the RNA Clean & Concentrator-25 kit (Zymo). Protein sequences were amplified from custom synthesized products or locus plasmid templates. To generate targets, short oligonucleotides containing the target and TAM sequence with appropriate overhangs were synthesized by Genewiz and cloned into the corresponding backbone plasmid by Golden Gate or restriction-ligation cloning. Labeled primers for generating labeled linear targets were synthesized by IDT, and the targets were amplified by PCR from the target plasmids using Q5 Hot Start High-Fidelity 2X Master Mix (NEB) according to the manufacturer's protocol.
Cell-free transcription/translation reactions were performed using the PURExpress In Vitro Protein Synthesis Kit (NEB) according to the manufacturer's half-volume reaction protocol, using 75 ng of template for the protein of interest and in vitro transcribed omega RNA at a final concentration of 1 µM. Reactions were incubated at 37 °C for 4 hours and then placed on ice to quench in vitro transcription/translation. Then 50-100 ng of target substrate was added, and reactions were incubated for an additional 1 hour at the specified temperature. Reactions were then quenched by adding 10 µg of RNase A (Qiagen) and 8 units of proteinase K (NEB), followed by incubation at 37 °C for 5 min. DNA was extracted by PCR purification and run on 10% or 6% Novex TBE-urea gels or 10% Novex TBE gels (Thermo Fisher Scientific) according to the manufacturer's protocol. Gels were stained with 1X SYBR Gold (Thermo Fisher Scientific) for 10-15 min and imaged on a ChemiDoc imager (BioRad) with optimal exposure settings.
Mammalian cell culture and transfection
Mammalian cell culture experiments were performed in the HEK293FT line (American Type Culture Collection (ATCC)) grown in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher), supplemented with 1X penicillin-streptomycin (Thermo Fisher), 10 mM HEPES (Thermo Fisher), and 10% fetal bovine serum (VWR Seradigm). All cells were maintained at a confluency below 80%.
All transfections were performed using Lipofectamine 2000 (Thermo Fisher). Cells were seeded 16-20 hours prior to transfection to ensure 90% confluency at the time of transfection. Cells were seeded at 20,000 cells/well for 96-well plates and 100,000 cells/well for 24-well plates. For each well on the plate, transfection plasmids were combined with OptiMEM I reduced serum medium (Thermo Fisher) to a total of 25 µL. Separately, 23 µL of OptiMEM was combined with 2 µL of Lipofectamine 2000. The plasmid and Lipofectamine solutions were then combined and pipetted onto the cells.
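The per-well recipe above scales linearly when several wells are transfected from a master mix. A minimal sketch follows; the `overage` factor for pipetting loss and the plasmid volume per well are assumptions for illustration and are not specified in the source, and `transfection_mix` is a hypothetical helper name.

```python
def transfection_mix(n_wells, plasmid_ul_per_well, overage=0.1):
    """Master-mix volumes (uL) for n_wells, following the per-well recipe.

    Per well: plasmid brought to 25 uL with OptiMEM, and separately
    23 uL OptiMEM + 2 uL Lipofectamine 2000.
    ASSUMPTION: overage is extra volume to cover pipetting loss.
    """
    if not 0 <= plasmid_ul_per_well <= 25:
        raise ValueError("plasmid volume must fit within the 25 uL DNA mix")
    scale = n_wells * (1 + overage)
    return {
        "plasmid": plasmid_ul_per_well * scale,
        "optimem_for_dna": (25 - plasmid_ul_per_well) * scale,
        "optimem_for_lipid": 23 * scale,
        "lipofectamine": 2 * scale,
    }


# Example: 4 replicate wells, 1.5 uL of plasmid per well (hypothetical), no overage.
mix = transfection_mix(n_wells=4, plasmid_ul_per_well=1.5, overage=0.0)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} uL")
```

Without overage the four components sum to 50 µL per well, matching the 25 µL DNA mix plus the 25 µL Lipofectamine mix described in the text.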
Mammalian lysate cleavage assay
The human codon optimized IscB genes were cloned into a CMV expression backbone by Gibson assembly using 2X Gibson Assembly Master Mix (NEB) to produce the pCMV-SV40NLS-IscB protein-nucleoplasmin NLS-3xHA construct. 500 ng of each protein expression plasmid was transfected into an individual well of a 24-well plate as described. After about 48 hours, cells were washed with 500 µL of Dulbecco's phosphate buffered saline (Sigma Aldrich). µL of ice-cold lysis buffer (20 mM HEPES pH 7.5, 100 mM KCl, 5 mM MgCl2, 0.1% Triton X-100, 5% glycerol, 1 mM DTT, 1X cOmplete protease inhibitor cocktail) was added, and the cells were scraped from the plate, transferred to clean tubes, and incubated on ice for 15 min. The cells were then sonicated in a cold water bath at an amplitude of 30 for 4 cycles of 10 s each. Lysates were clarified by centrifugation at maximum speed for 20 min, and supernatants were collected and either used fresh in the assay or snap frozen in liquid nitrogen for later use. Labeled targets and in vitro transcribed RNAs were produced as described in the "cell free transcription/translation cleavage assay" above.
Omega RNA templates were amplified from the custom synthesized products as described and transcribed in vitro using the HiScribe T7 Quick High Yield RNA Synthesis Kit (NEB), with 150 ng of DNA template in 2 µL, 2 µL of T7 RNA polymerase mix (NEB), and each NTP at a final concentration of 6.67 mM in a 30 µL reaction, and purified using the RNA Clean & Concentrator-25 kit (Zymo). To perform the cleavage assay, 10 µL of cell lysate was incubated with 1 µg of in vitro transcribed omega RNA or sgRNA, or with no RNA as a negative control, and 100 ng of target substrate in 1X NEBuffer 3.1 (NEB). Reactions were incubated at 37 °C for 1 hour, then quenched by adding 10 µg of RNase A (Qiagen) and 8 units of proteinase K (NEB), followed by incubation at 37 °C for 5 min. DNA was extracted by PCR purification, run on a 4% agarose E-Gel EX (Thermo Fisher Scientific) according to the manufacturer's instructions, and visualized on a ChemiDoc imager (BioRad).
Sequencing of cleavage products
In vitro cleavage assays were performed as described. Purified reactions were subjected to the GLOE-seq library preparation protocol as described (15), using 2.5 µM adaptors as input to the proximal adaptor annealing step. Final amplification to add Illumina adaptors and barcodes was performed using NEBNext High-Fidelity 2X PCR Master Mix (NEB) at an annealing temperature of 63 °C for 15 s, for 12 cycles. Libraries were paired-end sequenced on an Illumina MiSeq with read 1 at 150 cycles, read 2 at 150 cycles, index 1 at 8 cycles, and index 2 at 8 cycles. Paired-end reads were mapped to the target substrates using BWA, and 3' ends were extracted and mapped using a custom Python script.
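The 3' end extraction step can be illustrated with a small sketch. The source does not disclose its script, so the following is only an assumed approach: alignments are taken to be already parsed from the BWA output into hypothetical `(ref_start, aligned_len, is_reverse)` tuples with 0-based coordinates, and the 3' end of a reverse-strand read is taken as its leftmost reference position.

```python
from collections import Counter


def three_prime_end(ref_start, aligned_len, is_reverse):
    """0-based reference coordinate of the read's 3'-terminal base.

    Forward-strand reads end at their rightmost aligned base;
    reverse-strand reads end at their leftmost aligned base.
    """
    return ref_start if is_reverse else ref_start + aligned_len - 1


def end_histogram(alignments):
    """Count 3' ends per (position, strand) across all alignments."""
    return Counter(
        (three_prime_end(start, length, rev), "-" if rev else "+")
        for start, length, rev in alignments
    )


# Toy alignments (hypothetical): three forward reads ending at position 49
# and one reverse read whose 3' end lies at position 50.
alignments = [(0, 50, False), (0, 50, False), (10, 40, False), (50, 60, True)]
hist = end_histogram(alignments)
```

Peaks in such a histogram on each strand would indicate candidate cleavage positions on the target substrate; how they map onto the exact break coordinate depends on the library protocol and is not modeled here.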
Enzymatic footprinting assay
A dsDNA substrate (191 bp) was generated by PCR amplification from a plasmid containing the target site and TAM sequence. 10 pmol of dAwaIscB and 40 pmol of omega RNA were incubated in reaction buffer (20 mM HEPES pH 7.5, 50 mM NaCl, 10 mM MgCl2, and 5% glycerol) for 30 min at 37 °C. Then 0.1 pmol of DNA substrate was added, and the reaction was allowed to proceed for another 30 min at 37 °C. Next, 500 U of exonuclease III (New England Biolabs) was added, and the assay was incubated at 37 °C for an additional 10 min and quenched with 20 mM EDTA. As a negative control, another reaction was run in parallel in which the omega RNA was omitted and its volume replaced with water. After quenching, both reactions were briefly transferred to 50 °C for 5 min, then immediately placed on ice, treated with RNase A (Qiagen) and proteinase K (New England Biolabs), and purified using a PCR purification kit (Qiagen). Purified reactions were subjected to the GLOE-seq library preparation protocol, using 2.5 µM adaptors as input to the proximal and distal adaptor annealing steps. Libraries were amplified as described under "Sequencing of cleavage products" in the materials and methods section and paired-end sequenced on an Illumina MiSeq with read 1 at 100 cycles, read 2 at 100 cycles, index 1 at 8 cycles, and index 2 at 8 cycles. Paired-end reads were mapped to the target substrates using BWA, and 3' ends were extracted and mapped using a custom Python script.
Mammalian genome editing
Omega RNA scaffold backbones were cloned into a pUC19-based human U6 expression backbone by Gibson assembly. For the pooled 12-guide libraries, primers adding each of the 12 guides in a given pool were mixed in equimolar ratios, and whole-plasmid amplification of the omega RNA scaffold backbone was performed using Phusion Flash High-Fidelity 2X Master Mix (Thermo Fisher Scientific), with the guide primer annealing to the U6 promoter and a second primer annealing to the start of the omega RNA scaffold. PCR products were gel extracted and eluted in 30 µL, then circularized by blunt-end ligation through addition of 5 units of T4 PNK (NEB), 200 units of T4 DNA ligase (NEB), and T4 DNA ligase buffer (NEB) to a final concentration of 1X, incubated for 1.5 h at room temperature, and transformed into Stbl3 chemically competent E. coli (NEB). For single-guide constructs, oligonucleotides with appropriate overhangs were synthesized by Genewiz, annealed and phosphorylated using T4 PNK (NEB), and cloned into the omega RNA backbone by restriction-ligation cloning. Protein expression constructs were cloned as described in the mammalian lysate cleavage assay above.
For a single guide sequence, 250 ng of the guide/omega RNA expression plasmid and 125 ng of the protein expression plasmid were transfected into each of 4 wells of a 96-well plate per guide condition as described. After 60-72 hours, genomic DNA was harvested by washing the cells once in 1X DPBS (Sigma Aldrich) and adding 50 µL of QuickExtract DNA Extraction Solution (Lucigen). Cells were scraped from the plate, suspended in QuickExtract, and cycled at 65 °C for 15 min, 68 °C for 15 min, and then 95 °C for 10 min to lyse the cells. 2.5 µL of lysed cells was used as input for each PCR reaction.
For library amplification, the target genomic region was amplified by 12 cycles of PCR using NEBNext High-Fidelity 2X PCR Master Mix (NEB) at an annealing temperature of 63 °C for 15 s, followed by a second round of 18 cycles of PCR to add Illumina adaptors and barcodes. Libraries were gel extracted and subjected to single-end sequencing on an Illumina MiSeq with read 1 at 300 cycles, index 1 at 8 cycles, and index 2 at 8 cycles. Insertion/deletion (indel) frequencies were analyzed using CRISPResso2. Given the low frequency of indel events with IscB, only indels supported by at least 2 reads, or indels with an insertion or deletion of more than 1 base, were counted toward the reported indel frequency, in order to eliminate noise from PCR and sequencing errors. For single guide/omega RNA experiments, statistical significance was assessed by a two-tailed t-test using the non-targeting guide/omega RNA condition as the negative control (see supplementary table ukreP).
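The indel counting rule above (count an indel only if it is supported by at least 2 reads or changes more than 1 base) can be written as a small filter. This sketch does not use or reproduce the CRISPResso2 interface. It assumes the allele table has already been reduced to hypothetical `(indel_size_bp, read_count)` pairs, where a size of 0 denotes an unedited allele and negative sizes denote deletions.

```python
def filtered_indel_frequency(alleles):
    """Fraction of reads carrying an indel that passes the noise filter.

    alleles: iterable of (indel_size_bp, read_count); size 0 = unedited.
    An indel allele is counted if it has >= 2 reads OR changes > 1 base.
    """
    alleles = list(alleles)
    total_reads = sum(count for _, count in alleles)
    if total_reads == 0:
        return 0.0
    counted = sum(
        count
        for size, count in alleles
        if size != 0 and (count >= 2 or abs(size) > 1)
    )
    return counted / total_reads


# Toy allele table (hypothetical counts).
alleles = [
    (0, 9990),  # unedited reads
    (-1, 1),    # single-read 1 bp deletion: treated as noise, not counted
    (-3, 1),    # single-read 3 bp deletion: counted (more than 1 base)
    (1, 5),     # 1 bp insertion seen in 5 reads: counted (at least 2 reads)
    (2, 3),     # 2 bp insertion seen in 3 reads: counted
]
freq = filtered_indel_frequency(alleles)
print(f"filtered indel frequency: {freq:.4%}")
```

Only the single-read, single-base event is excluded here, which is the class of event most likely to arise from PCR or sequencing error at low editing frequencies.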
Expression and purification of TnpB RNP Complex
To purify the TnpB protein in complex with the putative omega RNA from its native locus, the N-terminally His14-MBP-tagged TnpB CDS and the corresponding downstream locus (up to 80 bp beyond the predicted guide adaptor end) were cloned as a single piece downstream of the T7 promoter in the pET45b(+) vector. TnpB RNP was expressed and purified in a similar manner to CRISPR-associated IscB RNP, with the following modifications: (1) the lysis buffer and buffers A and B were supplemented with 40 mM imidazole and 5 mM beta-mercaptoethanol, while MgCl2 and DTT were omitted from all buffers; (2) RNP was purified on Ni Sepharose 6 Fast Flow beads (GE Healthcare) instead of Strep-Tactin resin; (3) the elution buffer was supplemented with 300 mM imidazole and 5 mM beta-mercaptoethanol, while MgCl2, DTT, and desthiobiotin were omitted; (4) the His14-MBP solubility tag was left attached to the RNP to provide stability.
In vitro transcription/translation plasmid competition assay
An in vitro assay to validate TAM-specific TnpB activity was set up as described in the "cell free transcription/translation cleavage assay", using 62.5 ng of protein template and omega RNA at a final concentration of 7 µM in 5 µL. SpCas9 was also assayed as a positive control (panel S#KnIchC). Reactions were incubated at 37 °C for 3 hours, followed by addition of the target plasmid substrates. Each reaction received 15 ng of each of 2 plasmids, which contained a TAM sequence flanking the cognate target, the cognate target only, or no target. Reactions were incubated at 37 °C for an additional 1 hour, then quenched and subjected to adaptor ligation and sequencing as described in "cell free transcription/translation TAM screening". Custom Python scripts were used to quantify the cleavage products corresponding to each plasmid substrate.
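The three substrate classes in the competition assay, and the per-substrate quantification, can be sketched as below. This is an illustration only and not the script used in the source. It assumes the TAM sits directly 5' of the target on the searched strand, uses the TCA motif recited in the claims as the example TAM, and uses a made-up target sequence and made-up read counts.

```python
def classify_substrate(plasmid_seq, target, tam):
    """Label a plasmid as 'TAM+target', 'target only', or 'no target'.

    ASSUMPTION: the TAM is immediately 5' of the target on the given strand.
    """
    seq = plasmid_seq.upper()
    if (tam + target).upper() in seq:
        return "TAM+target"
    if target.upper() in seq:
        return "target only"
    return "no target"


def cleavage_fractions(read_counts):
    """Convert cleavage-product read counts per substrate into fractions."""
    total = sum(read_counts.values())
    return {name: (n / total if total else 0.0) for name, n in read_counts.items()}


TAM = "TCA"                 # example TAM taken from the claims
TARGET = "GATTACAGATTACA"   # hypothetical target sequence

labels = [
    classify_substrate("ccTCAGATTACAGATTACAgg", TARGET, TAM),
    classify_substrate("ccGGGGATTACAGATTACAgg", TARGET, TAM),
    classify_substrate("ccggaattccgg", TARGET, TAM),
]

# Hypothetical read counts of cleavage products assigned to each substrate.
fractions = cleavage_fractions({"TAM+target": 90, "target only": 8, "no target": 2})
```

A TAM-dependent nuclease would be expected to yield cleavage products mostly from the substrate carrying both the TAM and the target, which is what the toy counts are meant to depict.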
***
Various modifications and alterations of this invention will become apparent to those skilled in the art without departing from the scope and spirit of this invention. While the invention has been described in connection with specific embodiments, it will be understood that further modifications are possible and that the claimed invention should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.
Claims (70)
1. A non-naturally occurring, engineered composition comprising a) a TnpB polypeptide comprising a RuvC nuclease domain, optionally comprising RuvC-I, RuvC-II, and RuvC-III subdomains, and b) an omega RNA molecule comprising a scaffold and a reprogrammable spacer sequence, said omega RNA molecule capable of forming a complex with said TnpB polypeptide and directing said TnpB polypeptide to a target polynucleotide.
2. The composition of claim 1, wherein the TnpB polypeptide comprises from about 200 to about 500 amino acids.
3. The composition of claim 1, wherein the reprogrammable spacer sequence comprises a spacer of 10 nucleotides to 30 nucleotides in length.
4. The composition of claim 1, wherein the omega RNA component molecule comprises a scaffold of about 80 to 200 nucleotides in length.
5. The composition of any one of the preceding claims, wherein the TnpB complex binds to a Target Adjacent Motif (TAM) 5' of the target polynucleotide.
6. The composition of claim 5, wherein the TAM sequence comprises TCA.
7. The composition of claim 5, wherein the TAM sequence comprises TTCAN.
8. The composition of any one of the preceding claims, wherein the target polynucleotide is DNA.
9. The composition of any one of the preceding claims, further comprising a homologous recombination donor template comprising a donor sequence for insertion into a target polynucleotide.
10. The composition of any one of the preceding claims, further comprising a functional domain associated with the TnpB protein.
11. The composition of claim 10, wherein the functional domain is a transposase, an integrase, a nucleobase deaminase, a reverse transcriptase, a recombinase, a topoisomerase, a retrotransposon, a phosphatase, a polymerase, a ligase, a helicase, a methylase, a demethylase, a translational activator, a translational repressor, a transcriptional activator, a transcriptional repressor, a transcriptional release factor, a chromatin modifier, a histone modifier, or a nuclease.
12. A vector system comprising one or more vectors encoding the TnpB polypeptide and omega RNA components of any of the preceding claims.
13. An engineered cell comprising the composition of any one of claims 1 to 12.
14. A method of modifying a target polynucleotide sequence in a cell comprising introducing into the cell the composition of any one of claims 1 to 12.
15. The method of claim 14, wherein the modification comprises cleavage of a DNA polynucleotide.
16. The method of claim 15, wherein the cleavage results in a 5' overhang.
17. The method of claim 16, wherein the cleavage occurs distally of a target adjacent motif.
18. The method of claim 17, wherein the cleavage occurs at the site of a spacer annealing site or 3' of the target sequence.
19. The method of claim 14, wherein the polypeptide and/or omega RNA component is provided via one or more polynucleotides encoding the polypeptide and/or one or more omega RNA component, and wherein the one or more polynucleotides are operably configured to express the TnpB polypeptide and/or the omega RNA component molecule.
20. The method of any one of the preceding claims, wherein the one or more mutations comprise substitutions, deletions, and insertions.
21. An engineered, non-naturally occurring composition comprising:
a. a TnpB polypeptide, wherein the TnpB polypeptide is catalytically inactive,
b. a nucleotide deaminase associated with or otherwise capable of forming a complex with the TnpB protein, and
c. omega RNA component molecules capable of forming complexes with the TnpB protein and directing site-specific binding at a target sequence.
22. The composition of claim 21, wherein the TnpB is selected from table 1A, 1B, 1C or fig. 1.
23. The composition of claim 21, wherein the nucleotide deaminase is an adenosine deaminase or a cytidine deaminase.
24. One or more polynucleotides encoding one or more components of the composition of any one of claims 21 or 22.
25. One or more vectors encoding one or more polynucleotides of claim 24.
26. A cell or progeny thereof genetically engineered to express one or more components of the composition of any one of claims 24 or 26.
27. A method of editing a nucleic acid in a target polynucleotide comprising delivering the composition of claim 21 or 22, the one or more polynucleotides of claim 24, or the one or more vectors of claim 25 to a cell or population of cells comprising the target polynucleotide.
28. The method of claim 27, wherein the target polynucleotide is a target sequence within genomic DNA.
29. The method of claim 27 or 28, wherein the target polynucleotide is edited at one or more bases to introduce a G→A or C→T mutation.
30. An isolated cell or progeny thereof comprising one or more base edits made using the method of any one of claims 28 to 29.
31. An engineered, non-naturally occurring composition comprising:
a. a catalytically dead TnpB polypeptide,
b. a reverse transcriptase associated with or otherwise capable of forming a complex with the TnpB polypeptide, and
c. An omega RNA component molecule capable of forming a complex with the TnpB protein and directing site-specific binding of the complex to a target sequence of a target polynucleotide, the directing molecule further comprising a donor template encoding a donor sequence for insertion into the target polynucleotide.
32. One or more polynucleotides encoding one or more components of the composition of claim 31.
33. One or more vectors encoding one or more polynucleotides of claim 32.
34. A method of modifying a target polynucleotide comprising:
delivering the composition of claim 31, the one or more polynucleotides of claim 32, or the one or more vectors of claim 33 to a cell or population of cells comprising the target polynucleotide, wherein the complex directs the reverse transcriptase to the target sequence, and the reverse transcriptase facilitates insertion of a donor sequence encoded by the donor template from the omega RNA component molecule into the target polynucleotide.
35. The method of claim 34, wherein insertion of the donor sequence:
a. introducing one or more base edits;
b. correcting or introducing a premature stop codon;
c. disruption of splice sites;
d. insertion or restoration of splice sites;
e. inserting a gene or gene fragment at one or both alleles of the target polynucleotide; or
f. A combination thereof.
36. An isolated cell or progeny thereof comprising a modification made using the method of claim 34 or 35.
37. An engineered, non-naturally occurring composition comprising:
a. a TnpB polypeptide,
b. a non-LTR retrotransposon protein associated with or otherwise capable of forming a complex with the TnpB polypeptide, and
c. An omega RNA component molecule capable of forming a complex with said TnpB protein and directing site-specific binding of said complex to a target sequence of a target polynucleotide, said omega RNA molecule further comprising a donor template encoding a donor sequence for insertion into said target polynucleotide and being located between two binding elements capable of forming a complex with said non-LTR retrotransposon protein.
38. The composition of claim 37, wherein the TnpB protein is fused to the N-terminus of the non-LTR retrotransposon protein.
39. The composition of claim 37 or 38, wherein the TnpB protein is engineered to have nickase activity.
40. The composition of claim 39, wherein the omega RNA component molecule directs the fusion protein to a target sequence 5' of the targeted insertion site, and wherein the TnpB protein produces a strand break at the targeted insertion site.
41. The composition of claim 39, wherein the omega RNA component molecule directs the fusion protein to a target sequence 3' of the targeted insertion site, and wherein the TnpB protein produces a strand break at the targeted insertion site.
42. The composition of claim 39, wherein the donor polynucleotide further comprises a polymerase processing element to facilitate 3' end processing of the donor polynucleotide sequence.
43. The composition of claim 39, wherein the donor polynucleotide further comprises a region homologous to a target sequence on the 5 'end of the donor construct, the 3' end of the donor construct, or both.
44. The composition of claim 43, wherein the homology region is 8 to 25 base pairs.
45. One or more polynucleotides encoding one or more components of the composition of any one of claims 39 to 44.
46. One or more vectors comprising one or more polynucleotides of claim 45.
47. A method of modifying a target polynucleotide comprising:
delivering the composition of any one of claims 39 to 44, the one or more polynucleotides of claim 45, or the one or more vectors of claim 46 to a cell or cell line comprising the target polynucleotide, wherein the complex directs the non-LTR retrotransposon protein to the target sequence, and the non-LTR retrotransposon protein facilitates insertion of a donor polynucleotide sequence from the donor construct into the target polynucleotide.
48. The method of claim 47, wherein the insertion of the donor sequence:
a. Introducing one or more base edits;
b. correcting or introducing a premature stop codon;
c. disruption of splice sites;
d. insertion or restoration of splice sites;
e. inserting a gene or gene fragment at one or both alleles of the target polynucleotide; or
f. A combination thereof.
49. An isolated cell or progeny thereof comprising a modification made using the method of claim 47 or 48.
50. An engineered, non-naturally occurring composition comprising:
a. a TnpB polypeptide,
b. an integrase protein associated with or otherwise capable of forming a complex with the TnpB polypeptide and optionally a reverse transcriptase, and
c. an omega RNA component molecule capable of forming a complex with the TnpB protein and directing site-specific binding of the complex to a target sequence of a target polynucleotide, the directing molecule further comprising a donor template encoding a donor sequence for insertion into the target polynucleotide and being located between two binding elements capable of forming a complex with the integrase protein.
51. The composition of claim 50, wherein the TnpB protein is fused to the integrase protein and optionally to the reverse transcriptase.
52. The composition of claim 50 or 51, wherein the TnpB protein is engineered to have nickase activity.
53. The composition of claim 52, wherein the omega RNA component molecule directs the fusion protein to a target sequence, and wherein the TnpB protein creates a nick at the targeted insertion site.
54. The composition of claim 52, wherein the donor polynucleotide further comprises a region homologous to a target sequence on the 5 'end of the donor construct, the 3' end of the donor construct, or both.
55. One or more polynucleotides encoding one or more components of the composition of any one of claims 52 to 54.
56. One or more vectors comprising one or more polynucleotides of claim 55.
57. A method of modifying a target polynucleotide comprising:
delivering the composition of any one of claims 50 to 54, the one or more polynucleotides of claim 55, or the one or more vectors of claim 56 to a cell or population of cells comprising the target polynucleotide, wherein the complex directs the integrase protein to the target sequence and the integrase protein facilitates insertion of a donor polynucleotide sequence from the donor construct into the target polynucleotide.
58. The method of claim 57, wherein insertion of the donor sequence:
a. introducing one or more base edits;
b. correcting or introducing a premature stop codon;
c. disruption of splice sites;
d. insertion or restoration of splice sites;
e. inserting a gene or gene fragment at one or both alleles of the target polynucleotide; or
f. A combination thereof.
59. An isolated cell or progeny thereof comprising a modification made using the method of claim 57 or 58.
60. A composition for detecting the presence of a target polynucleotide in a sample comprising:
one or more TnpB proteins with collateral activity;
at least one omega RNA component comprising a sequence capable of binding to a target polynucleotide and designed to form a complex with the one or more TnpB proteins;
a detection construct comprising a polynucleotide component, wherein the TnpB exhibits collateral nuclease activity and cleaves the polynucleotide component of the detection construct upon activation by the target sequence; and
optionally, isothermal amplification reagents.
61. The composition of claim 60, wherein the TnpB is selected from Table 1A, table 1B, table 1C, or FIG. 1, or comprises one or more catalytic residues corresponding to 195D, 277E, or 361D of the sequence alignment of FIG. 1.
62. The composition of claim 60, wherein the one or more TnpB proteins are selected from Table 1A, table 1B, table 1C, or FIG. 1, and are active, i.e., have nuclease activity, over a temperature range of 45℃to 60 ℃.
63. The composition of claim 60, wherein the isothermal amplification reagent is a loop-mediated isothermal amplification (LAMP) reagent.
64. The composition of claim 63, wherein the LAMP reagent comprises LAMP primers.
65. The composition of any one of claims 60 to 64, further comprising one or more additives to increase reaction specificity or kinetics.
66. The composition of any one of claims 60 to 65, further comprising a polynucleotide binding bead.
67. A method for detecting a polynucleotide in a sample, the method comprising:
contacting one or more target sequences with a TnpB, at least one omega RNA component capable of forming a complex with the TnpB and directing sequence-specific binding to one or more target polynucleotides, and a detection construct, wherein the TnpB exhibits collateral activity and cleaves the detection construct upon activation by the one or more target sequences; and
Detecting a signal from cleavage of the detection construct, thereby detecting the one or more target polynucleotides.
68. The method of claim 67, further comprising amplifying the target polynucleotide using isothermal amplification prior to the contacting step.
69. The method of claim 68, wherein detecting amplified target polynucleotides by binding of the target polynucleotides to the TnpB complex occurs at a temperature in the range of 45℃to 60 ℃.
70. The method of claim 67, wherein the target polynucleotide is detected in one hour or less.
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US63/141,371 | 2021-01-25 | ||
| US63/195,610 | 2021-06-01 | ||
| US63/210,860 | 2021-06-15 | ||
| US202163282352P | 2021-11-23 | 2021-11-23 | |
| US63/282,352 | 2021-11-23 | ||
| PCT/US2022/013710 WO2022159892A1 (en) | 2021-01-25 | 2022-01-25 | Reprogrammable tnpb polypeptides and use thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117616126A (en) | 2024-02-27 |
Family
ID=89958397
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202280024175.9A Pending CN117616126A (en) | 2021-01-25 | 2022-01-25 | Reprogrammable TNPB polypeptides and their uses |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117616126A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117947054A (en) * | 2024-03-25 | 2024-04-30 | 上海羽冠生物技术有限公司 | Method for editing pertussis Bao Te bacteria seamless genes |
| CN120310956A (en) * | 2025-06-11 | 2025-07-15 | 中国科学院东北地理与农业生态研究所 | A DNAzyme-based gene chip biosensor and its application |
-
2022
- 2022-01-25 CN CN202280024175.9A patent/CN117616126A/en active Pending
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP4281567A1 (en) | | Reprogrammable tnpb polypeptides and use thereof |
| WO2022087494A1 (en) | | Reprogrammable iscb nucleases and uses thereof |
| CN115175996A (en) | | Novel type VI CRISPR enzymes and systems |
| WO2022173830A1 (en) | | Nuclease-guided non-ltr retrotransposons and uses thereof |
| WO2021102042A1 (en) | | Retrotransposons and use thereof |
| WO2020236967A1 (en) | | Random crispr-cas deletion mutant |
| WO2023114872A2 (en) | | Reprogrammable fanzor polynucleotides and uses thereof |
| WO2021097118A1 (en) | | Small type ii cas proteins and methods of use thereof |
| EP4437094A1 (en) | | Reprogrammable iscb nucleases and uses thereof |
| WO2022147321A1 (en) | | Type i-b crispr-associated transposase systems |
| AU2020373064A1 (en) | | Type I-B CRISPR-associated transposase systems |
| WO2023230483A2 (en) | | Engineered chimeric iscb polypeptides and uses thereof |
| WO2021041922A1 (en) | | Crispr-associated mu transposase systems |
| WO2022150651A1 (en) | | Dna nuclease guided transposase compositions and methods of use thereof |
| WO2023170535A2 (en) | | Novel nucleic acid-guided nucleases and use thereof |
| CN116583599A (en) | | Reprogrammable IscB nucleases and uses thereof |
| WO2022087451A1 (en) | | Nucleic acid-guided nucleases and use thereof |
| CN117616126A (en) | | Reprogrammable TNPB polypeptides and their uses |
| WO2022076830A1 (en) | | Type i crispr-associated transposase systems |
| WO2024081728A2 (en) | | Reprogrammable tnpb polypeptides with maze domains and uses thereof |
| WO2024238835A2 (en) | | Novel crispr enzymes and systems |
| EP4436592A1 (en) | | Reprogrammable isrb nucleases and uses thereof |
| WO2024015920A1 (en) | | Hybrid crispr-cas systems and methods of use thereof |
| WO2024081711A2 (en) | | Reprogramable tnpb polypeptides and use thereof |
| WO2024197008A2 (en) | | Nuclease-guided non-ltr retrotransposons and uses thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |