WO2025199099A1 - Ligand dependent post-translational control of engineered rna polymerase (rnap) mutants from bacteriophage t7 - Google Patents
Ligand dependent post-translational control of engineered RNA polymerase (RNAP) mutants from bacteriophage T7
- Publication number
- WO2025199099A1 (PCT/US2025/020365)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- indole
- rnap
- engineered
- cell
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1247—DNA-directed RNA polymerase (2.7.7.6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
Definitions
- the present application is directed to engineered proteins, and specifically engineered RNA polymerases from bacteriophages subject to small-molecule post-translational control.
- T7 RNA polymerase is a monomeric bacteriophage-encoded DNA directed RNA polymerase that catalyzes the formation of RNA in the 5’ to 3’ direction.
- T7 RNA polymerase recognizes a specific promoter sequence (i.e., the T7 promoter).
- the conformation of the N-terminal domain changes between the initiation and elongation phases of the functioning enzyme.
- Many biotechnological processes rely on high-fidelity transcription of RNA transcripts from DNA templates.
- T7 RNAP has found widespread utility throughout microbiology and biological engineering workflows. This is in part due to its simplicity as a single gene which encodes a large 883 AA protein that has predictable control as it transcribes from a well characterized promoter sequence that is orthogonal to native promoter sequences throughout the tree of life. Cloning and expression of the gene encoding T7 RNA polymerase are well known in the art (See e.g., U.S. Pat. No. 4,952,496). Due to its promoter specificity and high RNA polymerase activity, T7 has been used for various applications. It is also useful for the high-level expression of recombinant genes in E. coli.
- T7 is also used in various nucleic acid amplification methods, including those used in diagnostic methods. Stability and thermostability are often important considerations in the development of components of diagnostic methods (See U.S. Pat. Nos. 9,193,959, 8,551,752, and 7,507,567, etc.).
- in vitro transcription uses bacteriophage DNA-dependent ribonucleic acid (RNA) polymerases to synthesize template-directed mRNA transcripts.
- Problems in the IVT reaction can result in complete failure (e.g., no transcript generated) or in transcripts that are the incorrect size (e.g., shorter or longer than expected).
- Specific problems associated with IVT reactions include, for example, abortive (truncated) transcripts, run-on transcripts, polyA tail variants/3’ heterogeneity, mutated transcripts, and/or double-stranded contaminants produced during the reactions.
- T7 RNAP has further been engineered to be thermally stable, to recognize different promoters in an orthogonal manner, to function as a split enzyme, and to control various genetic systems. This enables rapid development of T7 RNAP for non-native functions, such as de novo allostery. As a result, T7 RNAP is important to the downstream transcriptional control of a number of in vitro and in vivo processes. However, control of T7 RNAP itself is limited to traditional genetic-based expression control elements, such as promoters and inducers, which cannot be applied to the downstream control of the enzyme. As such, there exists a long-felt need for a post-translational control system for T7 RNAP for both in vitro and in vivo applications of the same.
- the present application relates to engineered T7 RNA polymerase (RNAP) enzymes from bacteriophage that are post-translationally controlled by one or more heterocyclic ligands, such as indole and indole-containing compounds.
- RNAP RNA polymerases
- a preferred mutant includes a T7 RNAP with a mutation at position 727, and more preferably a T7 RNAP with a tryptophan to glycine, alanine, or valine substitution at position 727 of the amino acid sequence.
- the application relates to an engineered RNAP comprising a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, having one or more mutations that causes the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 430, 433, 633, 727, 849, and 880, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from S430, N433, S633, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from S430P, N433T, S633P, W727G, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 781, 785, 786, 845, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, N781, S785, Q786, F845, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations causing the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 778, 785, 786, 849, 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, I778, S785, Q786, F849, F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 781, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, N781, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from Q737L, N781S, S785A, Q786H, F845L, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 778, 781, 782, 785, 786, 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase comprises a substitution at
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 781, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, N781, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from Q737L, N781S, S785A, Q786H, F845L, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 778, 781, 782, 785, 786, 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 781, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, N781, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from Q737L, N781S, S785A, Q786H, F845L, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 737, 778, 781, 782, 785, 786, 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase further comprises a MIG substitution. In another preferred aspect, the engineered RNA polymerase comprises an N-terminal degron sequence. Additional aspects of the disclosure will become apparent from the figures, claims, and specification provided below.
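The substitution notation used throughout the aspects above (for example W727G: wild-type residue, 1-indexed position, replacement residue) can be illustrated with a minimal Python sketch. The helper names `parse_substitution` and `apply_substitutions` and the five-residue toy reference are assumptions made for illustration only; SEQ ID NO. 1 and SEQ ID NO. 3 are not reproduced here, and this is not the applicants' software.

```python
import re

# Single-substitution code such as "W727G":
# wild-type residue, 1-indexed position, replacement residue.
_SUB = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code like 'W727G' into ('W', 727, 'G')."""
    match = _SUB.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference, codes):
    """Apply substitutions to a reference sequence, numbering residues from 1."""
    residues = list(reference)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the reference sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: reference has {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy reference, NOT SEQ ID NO. 1 or 3.
toy_reference = "MKWSA"
print(apply_substitutions(toy_reference, ["W3G", "S4P"]))  # MKGPA
```

Checking the wild-type residue before substituting catches numbering mismatches between the code and the chosen reference sequence, which matters because every aspect above is numbered with reference to a specific SEQ ID NO.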
- Figures 1A-H Development of a ligand-activatable RNA polymerase responsive to indole.
- A Chemical recovery of function approach for T7 RNAP. Mutating a buried tryptophan in T7 RNAP disrupts the ability of the RNAP to transcribe DNA to RNA. This activity is recovered in the presence of indole.
- B In vitro transcriptional assays using Peppers aptamer in the absence or presence of 1 mM indole for the indicated variants.
- TS W727G is also known as LARPv1.
- C Schematic of the bacterial-1-hybrid selection system developed for identification of LARPs. The selection enables positive and negative selection.
- the Pareto front of high score in the presence of indole and low score for the solvent contains variants (large green circle for LARP-I, black circles for other tested variants) predicted to have improved or maintained indole-responsive growth with significantly less constitutive activity compared to the starting construct (LARPv1, larger orange circle).
- the sequence profile above the plot contains the LARP-I and LARPv1 specific mutations relative to the wild-type T7 RNAP.
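The Pareto front described for Figure 1 (high score with indole, low score with solvent) can be sketched as a non-dominated filter. The function name `pareto_front`, the variant names, and the scores below are hypothetical and are not taken from the application; the sketch only shows what selecting such a front means.

```python
def pareto_front(variants):
    """Return names of non-dominated variants.

    variants: list of (name, indole_score, solvent_score).
    A variant is dominated when another one is at least as good on both
    objectives (higher indole score, lower solvent score) and strictly
    better on at least one of them.
    """
    front = []
    for name, indole, solvent in variants:
        dominated = any(
            other_indole >= indole
            and other_solvent <= solvent
            and (other_indole > indole or other_solvent < solvent)
            for _, other_indole, other_solvent in variants
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores for illustration only.
example = [
    ("v1", 2.0, 1.5),
    ("v2", 1.8, -0.5),
    ("v3", 1.0, -1.0),
    ("v4", 0.9, 0.2),
]
print(pareto_front(example))  # ['v1', 'v2', 'v3']
```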
- FIGS. 2A-H LARP-I allows indole control of gene expression exogenously, endogenously, and intercellularly.
- A Cartoon of predicted results of experiment with external addition of indole. Exogenous indole added to E. coli containing LARP-I and having the tryptophanase gene knocked out enables ligand-dependent gene expression, measured by expression of a fluorescent reporter.
- B GFP RFU normalized by cell density as a function of supplemented indole concentration. Positive control represents an expression strain using T7 RNAP R632S, and negative control expresses the catalytic knockout T7 RNAP R632S/Y639A. Error bars represent 1 s.e.m., n > 3.
- C Cartoon of predicted results using an E. coli strain capable of producing indole.
- E. coli naturally expresses tryptophanase, and produces indole through tryptophan metabolism which accumulates.
- D GFP RFU normalized by cell density for indicated strains in the presence and absence of 125 µM indole. Positive control (T7 RNAP R632S) and negative control (T7 RNAP R632S/Y639A) are the same as in panel B.
- Endogenous activation of LARP-I is similar to exogenous activation at 125 µM indole addition in direct comparison of strains with and without the tnaA knockout.
- Error bars represent 1 s.d., n > 3.
- E Predicted effects of the expression of a single gene lysin (Sgl KU1 ) under a T7 promoter with co-expression of LARP-I. Indole activates LARP-I, leading to cell lysis.
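Panels B and D report GFP RFU normalized by cell density, with error bars of 1 s.e.m. and 1 s.d. respectively. A minimal sketch of that arithmetic follows; the function name `normalized_summary` and the replicate readings are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the legend does not state which estimator was used.

```python
import math
import statistics


def normalized_summary(rfu, density):
    """Return (mean, s.d., s.e.m.) of RFU divided by cell density per replicate."""
    if len(rfu) != len(density) or len(rfu) < 2:
        raise ValueError("need matched replicate lists with at least two entries")
    normalized = [r / d for r, d in zip(rfu, density)]
    mean = statistics.mean(normalized)
    sd = statistics.stdev(normalized)  # sample standard deviation (assumed)
    sem = sd / math.sqrt(len(normalized))
    return mean, sd, sem


# Hypothetical replicate readings, not data from the application.
mean, sd, sem = normalized_summary([1000.0, 600.0, 2200.0], [0.5, 0.25, 1.0])
print(mean, sd, sem)
```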
- FIGS 3A-C LARP-I controls T7 bacteriophage viability and propagation in trans.
- A Cartoon of phage experiment with the phage RNAP supplied in trans. Bacteriophage T7 Δgp1 (ΔT7 RNAP) requires active T7 RNAP to propagate. Bacteria containing LARP-I do not propagate phage in the absence of indole, but yield robust phage infection in the presence of indole.
- B Phage plaque formation, and quantification as plaque forming units (PFU), for different T7 phage, T7 RNAPs, and indole concentrations.
- PFU plaque forming units
- FIGS 4A-G The versatility and orthogonality of LARPs are demonstrated for different bacteria, different promoter sequence specificities, and for different controlling ligands.
- A Cartoon of experimental construction of P. putida LARP constructs.
- C Hybrid LARPs with engineered polymerase specificity loops.
- E PSERM scores of library variants from selection on indole-5-carboxyaldehyde (I-5-CHO) vs indole. Larger symbols represent indicated variants. Sequence differences between variants are shown.
- F Specific growth rate on media lacking histidine and supplemented with 1 mM 3-AT. Indole and I-5-CHO are included at 50 µM. Statistics and p-values: **: p-value < 0.01, ****: p-value < 0.0001.
- G E. coli USO ΔtnaA with an sfGFP reporter plasmid driven by pT7 and different T7 RNAPs (positive control: T7 RNAP R632S; negative control: T7 RNAP R632S/Y639A).
- FIGS 5A-B Overview of in vitro transcriptional activity assay.
- a transcriptional assay is assembled using a dsDNA sequence encoding an aptamer (i.e., Peppers or Spinach) under the pT7 canonical promoter sequence TAATACGACTCACTATA (SEQ ID NO. 17) in solution with reaction buffer, rNTPs, and the non-fluorescent aptamer ligand (i.e., HBC530 or DFHBI).
- the reaction is initiated with the addition of T7 RNA Polymerase, at various ligand concentrations.
- the RNA polymerase transcribes the mRNA aptamer sequence, which binds to and constrains a chromophore, resulting in fluorescence.
- FIGS 6A-C Initial glycine scan of 15 rationally selected tryptophan residues, expression and purification results.
- A Soluble fraction and B eluate from mutants of T7 RNAP harboring the given mutations.
- WT N-terminal His6 T7 RNAP R378K.
- C Fold-change in activity of initially purified variants in the presence of 2 mM indole relative to the solvent control (ethanol) revealed no variants with positive indole modulation.
- Figures 7A-B Computationally predicted stability of glycine scan mutants.
- FIGS 8A-C Identification of initial LARPs using a thermally stable in vitro glycine scan.
- FIGS 9A-C Activity of LARPv1 (TS-W727G).
- A SDS-PAGE protein gel of purified wild-type (WT), thermally stable (TS), TS-W727G (LARPv1), and LARP-I at 20 µg, 5 µg, and 2 µg.
- B Dose-response curve for TS-W727G and TS. Activity is reported normalized to 1 for each variant in the absence of indole.
- FIGS 10A-B Generation of E. coli USO pyrF-/hisB- ΔtnaA by CRISPR-Cas9.
- a CRISPR-Cas9 dsDNA break and homology directed recombination protocol utilizing temperature-sensitive plasmids was used to generate a knock-out of the gene encoding the tryptophan degrading enzyme tryptophanase (tnaA).
- tnaA tryptophan degrading enzyme tryptophanase
- a gRNA targeting the tnaA coding sequence and a linear fragment of dsDNA with homology arms surrounding the tnaA coding sequence were used to remove a large section of the E. coli genome in the selection strain USO pyrF-/hisB-.
- FIGS 12A-L PSERM Scoring. (A-C) Position Specific Amino Acid Enrichment Ratio Score Heat Map, (D-F) PSERM score distribution for library variants, (G-I) relevant PSERM scores plotted against each other for (A) Solvent control, (B) Indole, and (C) Indole-5-carboxyaldehyde Selection Results.
- PSERM scores for each variant result from the additive score of constituent mutations pictured for Solvent (A), Indole (B) and Indole-5-carboxyaldehyde (C). WT residues are flagged with an asterisk (*).
- the top 2% of library variants in the ligand selections (E, F) have a positive PSERM score.
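The additive scoring described for Figure 12 (a variant's PSERM score is the sum of the scores of its constituent mutations) can be sketched as a lookup and sum. The matrix values, the positions chosen, and the function name `variant_score` are hypothetical; how the per-position enrichment scores themselves were computed is not shown here.

```python
def variant_score(residues, pserm):
    """Sum per-position scores for a variant.

    residues: list of (position, amino_acid) pairs describing the variant
              at the library positions.
    pserm:    {position: {amino_acid: score}} lookup of per-position scores.
    """
    return sum(pserm[position][amino_acid] for position, amino_acid in residues)


# Hypothetical per-position scores; wild-type residues are given a score of 0.0
# purely for illustration.
toy_pserm = {
    737: {"Q": 0.0, "L": 1.5},
    781: {"N": 0.0, "S": 0.5},
}
print(variant_score([(737, "L"), (781, "S")], toy_pserm))  # 2.0
```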
- Figure 15 Indole-dependent SglKU1 expression and lysis in LARP-I containing E. coli.
- FIGS 16A-C Development of LARP-I expressing P. putida strains.
- Figures 18A-B Fluorescence anisotropy of purified TS and LARP-I at 1 nM DNA. LARP-I and TS control, with 500 µM indole or solvent control, were titrated against 1 nM of fluorescently labeled pT7 double-stranded DNA incubated at 25°C (A) and 37°C (B). Values are the means of three technical replicates fit to a 1:1 binding isotherm. Oligonucleotides from IDT: 6FAM fluorescein fluorophore conjugated to the coding (forward) strand, pT7: -17:-1, TAATACGACTCACTATA (SEQ ID NO. 17), annealed to the reverse complement sequence to form double-stranded DNA.
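Two pieces of the Figure 18 legend lend themselves to a short sketch: the reverse complement of the pT7 probe strand, and a 1:1 binding isotherm. The legend does not give the exact fitted equation, so the simple hyperbolic form below is an assumption (it treats free protein as approximately equal to total protein); the function names and the example parameter values are hypothetical.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def one_to_one_isotherm(protein_nM, kd_nM, r_free, r_bound):
    """Anisotropy predicted by a simple 1:1 binding model.

    Assumes free protein is approximately total protein; r_free and r_bound
    are the anisotropies of unbound and fully bound labeled DNA.
    """
    fraction_bound = protein_nM / (kd_nM + protein_nM)
    return r_free + (r_bound - r_free) * fraction_bound


PT7 = "TAATACGACTCACTATA"  # SEQ ID NO. 17 as quoted in the legend
print(reverse_complement(PT7))  # TATAGTGAGTCGTATTA
print(one_to_one_isotherm(50.0, 50.0, 0.05, 0.15))  # midpoint when [protein] equals Kd
```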
- FIGS 19A-C Circular Dichroism Spectroscopy of LARP-I.
- A Full spectra of LARP-I with and without 500 µM indole show no difference relative to the TS background; traces are manually offset (+ for TS, - for LARP-I without indole) for comparison.
- LARP-I shows a decreased Tm,app around 39°C as measured by the maximum of the derivative of the change in ellipticity (inset).
- LARP-I shows two distinct peaks in the derivative of the ellipticity in the absence of indole (29°C, 39°C), which is less pronounced in the presence of 500 µM indole, suggesting differential stabilization of other secondary structure elements.
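The apparent melting temperature in Figure 19 is read from the maximum of the derivative of the ellipticity with respect to temperature. A minimal sketch of that readout is given below, using central finite differences on a synthetic melting curve. The function name `apparent_tm`, the use of the absolute slope, and the synthetic sigmoid are assumptions for illustration; no smoothing is applied, and none of the numbers are measured data.

```python
import math


def apparent_tm(temps, ellipticity):
    """Temperature at which the central-difference slope is largest in magnitude."""
    if len(temps) != len(ellipticity) or len(temps) < 3:
        raise ValueError("need at least three matched points")
    best_temp, best_slope = None, -1.0
    for i in range(1, len(temps) - 1):
        slope = (ellipticity[i + 1] - ellipticity[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > best_slope:
            best_temp, best_slope = temps[i], abs(slope)
    return best_temp


# Synthetic two-state curve centered at 39 degrees C (illustrative only).
temps = [20 + i for i in range(41)]  # 20..60 in 1-degree steps
signal = [-20 + 10 / (1 + math.exp(-(t - 39) / 2.0)) for t in temps]
print(apparent_tm(temps, signal))  # 39
```

A curve with two transitions, as described for LARP-I without indole, would show two local maxima in the slope; this sketch reports only the largest one.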
- Figure 20 Sequences and properties of tested LARP constructs.
- Growth rate data indicate that the constructs are inducible under plate conditions and were tested in planktonic culture, shown in Fig 13. Blank indicates whether the variant grew on selective solid media in the absence of ligand (in all cases, +: yes, -: no).
- Negative selection indicates whether the variant grew on counter selective solid media in the presence of 1 mM 5-FOA. Positive selection indicates whether the variant grew under positive selection conditions on selective media in the presence of indoles. The tested variants all had Positive: +; Negative: +; Blank: - (gray shading).
- the present application relates to mutated RNA polymerases from bacteriophage whose activity is responsive to indole, or an indole derivative as shown below:
- [Chemical structures: Indole; Indole-5-carboxyaldehyde]
- the present application relates to mutated RNA polymerases from bacteriophage whose activity is responsive to indole, or an indole-similar compound such as indoline, quinoline, and isoquinoline.
- T7 is a bacteriophage capable of infecting E. coli cells, and other bacteria species.
- the engineered T7 RNAP of the disclosure includes a mutation at position 727, and more preferably is a T7 RNAP with a tryptophan to glycine, alanine, or valine substitution at position 727 of the amino acid sequence.
- the application relates to an engineered RNAP comprising a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, having one or more mutations that causes the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 430, 433, 633, 727, 849, and 880, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from S430, N433, S633, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from S430P, N433T, S633P, W727G, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 781, 785, 786, 845, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, N781, S785, Q786, F845, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations causing the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 778, 785, 786, 849, 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, I778, S785, Q786, F849, F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
- the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof.
- the engineered RNAP includes a substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from 737, 781, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof.
- the engineered RNAP includes a substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution at a position selected from Q737, N781, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution at a position selected from Q737L, N781S, S785A, Q786H, F845L, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from 737, 778, 781, 782, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes a substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes a substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution at a position selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
- the engineered RNAP includes a substitution at position 727 selected from W727G, W727A, or W727V, and an additional mutation set selected from the groups:
- any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, or an indole derivative, such as indole-5-carboxyaldehyde (I-5CHO), or an indole-similar compound, such as indoline, quinoline, and isoquinoline, as compared to an RNA polymerase lacking said mutation.
- any of the above substitutions can include a conservative amino acid substitution.
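As a rough illustration of the position 727 substitutions listed above, the following Python sketch applies a substitution to a sequence using 1-based numbering and checks whether a variant carries G, A, or V at position 727. The function names, the `ALLOWED_727` set name, and the placeholder sequence are assumptions made for illustration only; the placeholder is not SEQ ID NO. 1 or 3, and a real check would first align the variant to the reference.

```python
# Illustrative sketch only; all names here are hypothetical.
ALLOWED_727 = {"G", "A", "V"}  # residues from W727G, W727A, W727V


def apply_substitution(seq: str, wild_type: str, position: int, new: str) -> str:
    """Return seq with the residue at 1-based `position` replaced by `new`."""
    index = position - 1
    if seq[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[index]}"
        )
    return seq[:index] + new + seq[index + 1:]


def has_allowed_727(seq: str) -> bool:
    """True if the residue at position 727 is one of G, A, or V."""
    return len(seq) >= 727 and seq[726] in ALLOWED_727


# Placeholder sequence with W at position 727 (NOT a real T7 RNAP sequence).
toy_reference = "M" * 726 + "W" + "M" * 156
toy_variant = apply_substitution(toy_reference, "W", 727, "G")
```

The wild-type check in `apply_substitution` is a design choice that guards against off-by-one numbering errors when the sequence is not numbered like the reference.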
- the disclosure provides for an engineered RNA polymerase (RNAP) polypeptide, or functional fragment thereof, selected from SEQ ID NOs. 19-21, or a functional fragment thereof, wherein the engineered RNAP is responsive to indole or indole-5-carboxyaldehyde (I-5CHO).
- the disclosure provides for a nucleotide sequence encoding an engineered RNA polymerase (RNAP) selected from SEQ ID NOs. 19-21, or functional fragment thereof, wherein the engineered RNAP is responsive to indole or indole-5-carboxyaldehyde (I-5CHO).
- the disclosure provides for an expression vector comprising the nucleic acid molecule according to SEQ ID NOs. 19-21, or functional fragment thereof, operably linked to an expression control sequence.
- the disclosure provides for a prokaryotic or eukaryotic cell transformed by the expression vector encoding a nucleic acid molecule according to SEQ ID NOs. 19-21, or functional fragment thereof, and capable of expressing the engineered RNA polymerase.
- the cell is further modified to disrupt or knock out one or more genes directed to the production of indole or one of its derivatives or an indole-similar compound.
- the application describes a nucleic acid molecule encoding the RNAP, or a functional fragment thereof, having one or more mutations that cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation, and its use in an in vitro or in vivo system, such as an engineered cell, in vitro transcription and translation systems, a diagnostic device, or a pharmaceutical composition.
- the post-translational control and activation of the RNAP via indole, an indole derivative, or an indole-similar compound provides additional control of transcription and downstream gene interactions, for example in complex in vivo or in vitro systems.
- the wild type activity of T7 RNAP can be toxic to cells, such that the reduced activity of the RNAP described herein can ameliorate the overall toxicity to cells while providing additional post-translational ligand control over its use.
- the nucleic acid molecule encoding the RNAP, or a functional fragment thereof, can be operably linked to an expression control sequence, forming an expression vector that can further be used to transform a cell that is capable of expressing the engineered RNA polymerase.
- the cell which can preferably include a bacterium, and more preferably an E. coli bacterium, can be further modified to disrupt or knock out one or more genes directed to the production of indole, an indole derivative, or an indole-similar compound.
- the gene that has been knocked out or disrupted includes tryptophanase (TnaA).
- Methods of disrupting the expression of a gene, or generating a gene knock-out are known in the art, and can include various homologous recombination methods, RNAi constructs, as well as endonuclease-based systems such as CRISPR or other such systems.
- the application describes a method of amplifying a target nucleic acid in a sample in an isothermal transcription based nucleic acid amplification reaction, comprising contacting the sample with a primer pair comprising a first promoter-oligonucleotide and a second oligonucleotide for amplification of the target nucleic acid; an effective amount of indole, an indole derivative, or an indole-similar compound, and the T7 RNA polymerase according to any of claims 1-29, under conditions whereby the isothermal transcription based nucleic acid amplification reaction can occur to amplify the target nucleic acid.
- the application describes an enzyme mixture for use in an isothermal transcription based nucleic acid amplification reaction comprising an engineered T7 RNA polymerase according to the description herein; an enzyme having reverse transcriptase activity and optional RNase H activity; and an effective amount of indole, an indole derivative, or an indole-similar compound.
- the application describes a method of inducing expression in a cell, comprising the steps of transforming a cell with the expression vector as described herein encoding an engineered RNAP, wherein the cell is genetically modified such that it does not endogenously produce indole, an indole derivative, or an indole-similar compound, and introducing an effective amount of exogenous indole, an indole derivative, or an indole-similar compound, wherein the exogenous indole, indole derivative, or indole-similar compound activates the RNAP.
- the application describes a method of inducing expression in a cell, comprising the steps of transforming a cell with the expression vector encoding an engineered RNAP as described herein, wherein the cell produces endogenous indole, an indole derivative, or an indole-similar compound that activates the RNAP.
- the application describes a method of inducing expression in a cell, comprising the steps of: transforming a first cell with the expression vector encoding an engineered RNAP as described herein, wherein the cell is genetically modified such that it does not endogenously produce indole, an indole derivative, or an indole-similar compound, and introducing a second cell that produces endogenous indole, an indole derivative, or an indole-similar compound that activates the RNAP of the first cell.
- nucleic acids are written left to right in 5’ to 3’ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
- indole describes a class of compounds containing a benzene ring fused to a five-membered, nitrogen-containing pyrrole ring.
- indole derivatives, such as indole-5-carboxyaldehyde (I-5CHO), are well known in the art and would be recognized by those of ordinary skill.
- an indole derivative is capable of activating one or more of the LARPs described herein.
- indole-similar compounds encompass compounds such as indoline, quinoline, and isoquinoline that are structurally similar to indole, or that contain an indole group that is further fused with a separate ring structure, such as a quinoline or isoquinolinone.
- an indole-similar compound is capable of activating one or more of the LARPs described herein.
- T7 RNA polymerase refers to a monomeric T7 bacteriophage-encoded DNA directed RNA polymerase that catalyzes the formation of RNA in the 5’ to 3’ direction.
- polynucleotide and nucleic acid refer to two or more nucleosides that are covalently linked together.
- the polynucleotide may be wholly comprised of ribonucleotides (i.e., RNA), wholly comprised of 2’ deoxyribonucleotides (i.e., DNA), or comprised of mixtures of ribo- and 2’ deoxyribonucleotides.
- the polynucleotides may include one or more non-standard linkages.
- the polynucleotide may be single-stranded or double-stranded, or may include both single-stranded regions and double-stranded regions.
- while a polynucleotide will typically be composed of the naturally occurring encoding nucleobases (i.e., adenine, guanine, uracil, thymine and cytosine), it may include one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, etc.
- such modified or synthetic nucleobases are nucleobases encoding amino acid sequences.
- a “protein,” “polypeptide,” and “peptide” are used interchangeably herein to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation).
- amino acids are referred to herein by either their commonly known three-letter symbols or by the one-letter symbols recommended by IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single letter codes.
- alanine (Ala or A), arginine (Arg or R), asparagine (Asn or N), aspartate (Asp or D), cysteine (Cys or C), glutamate (Glu or E), glutamine (Gln or Q), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y), and valine (Val or V).
- the amino acid may be in either the L- or D-configuration about the α-carbon (Cα).
- “Ala” designates alanine without specifying the configuration about the α-carbon.
- “D-Ala” and “L-Ala” designate D-alanine and L-alanine, respectively.
- upper case letters designate amino acids in the L-configuration about the α-carbon.
- lower case letters designate amino acids in the D-configuration about the α-carbon.
- “A” designates L-alanine and “a” designates D-alanine.
- when polypeptide sequences are presented as a string of one-letter or three-letter abbreviations (or mixtures thereof), the sequences are presented in the amino (N) to carboxy (C) direction in accordance with common convention.
- the abbreviations used for the genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U).
- the abbreviated nucleosides may be either ribonucleosides or 2’-deoxyribonucleosides.
- the nucleosides may be specified as being either ribonucleosides or 2’-deoxyribonucleosides on an individual basis or on an aggregate basis.
- “recombinant,” when used with reference to a polynucleotide or a polypeptide, refers to a material, or a material corresponding to the natural or native form of the material, that has been modified in a manner that would not otherwise exist in nature, or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques.
- “wild-type” and “naturally-occurring” refer to the form found in nature.
- a wild-type polypeptide or polynucleotide sequence is a sequence present in an organism that can be isolated from a source in nature, and which has not been intentionally modified by human manipulation.
- a “coding sequence” or “sequence encoding” refers to that part of a nucleic acid (e.g., a gene) that encodes an amino acid sequence of a protein.
- percent (%) sequence identity is used herein to refer to comparisons among polynucleotides and polypeptides, and is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences.
- the percentage may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
- the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences, or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
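The two calculations described above can be sketched as follows in Python. This is a minimal illustration that assumes the two sequences have already been optimally aligned to equal length, with `-` marking gaps; the function name and the `count_gaps_as_matches` flag are hypothetical and are used only to switch between the first calculation (identical residues only) and the second (identical residues or residues aligned with a gap).

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gaps_as_matches: bool = False) -> float:
    """Percent identity over a comparison window of pre-aligned sequences.

    Matched positions are divided by the total number of positions in the
    window and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    matched = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            # Second calculation: a residue aligned with a gap counts as matched.
            if count_gaps_as_matches:
                matched += 1
        elif a == b:
            matched += 1
    return 100.0 * matched / len(aligned_a)
```

For example, a six-position window with five identical residues and one gap gives 5/6 under the first calculation and 6/6 under the second.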
- Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (Smith and Waterman, Adv. Appl.
- the word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
- the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
- the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (See, Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 [1989]).
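The three halting conditions for word-hit extension described above can be illustrated with a simplified, ungapped, one-direction extension for nucleotide sequences. This Python sketch is not the BLAST implementation; the function name, the default values for the reward M, penalty N, and drop-off X, and the `seed_score` parameter are all assumptions chosen for illustration.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 reward_m: int = 1, penalty_n: int = -3, x_drop: int = 5,
                 seed_score: int = 0) -> tuple[int, int]:
    """Extend a word hit to the right without gaps.

    Returns (best_length, best_score): how many positions were kept and the
    maximum cumulative score reached.
    """
    score = seed_score
    best_score = seed_score
    best_length = 0
    offset = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += reward_m if same else penalty_n
        offset += 1
        if score > best_score:
            best_score = score
            best_length = offset
        # Halting condition 1: score fell off by X from its maximum.
        # Halting condition 2: cumulative score went to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
    return best_length, best_score
```

With the defaults above, eight matches followed by mismatches stop after the second mismatch, because the score has then dropped by 6 from its maximum of 8, and the extension is trimmed back to the eight matching positions.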
- Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
- a “reference sequence” refers to a defined sequence used as a basis for a sequence comparison.
- a reference sequence may be a subset of a larger sequence, for example, a segment of a full-length gene or polypeptide sequence.
- a reference sequence is at least 20 nucleotide or amino acid residues in length, at least 25 residues in length, at least 50 residues in length, at least 100 residues in length or the full length of the nucleic acid or polypeptide.
- two polynucleotides or polypeptides may each (1) comprise a sequence (i.e., a portion of the complete sequence) that is similar between the two sequences, and (2) may further comprise a sequence that is divergent between the two sequences.
- sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing sequences of the two polynucleotides or polypeptides over a “comparison window” to identify and compare local regions of sequence similarity.
- a “reference sequence” can be based on a primary amino acid sequence, where the reference sequence is a sequence that can have one or more changes in the primary sequence.
- a “comparison window” refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues wherein a sequence may be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids and wherein the portion of the sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
- the comparison window can be longer than 20 contiguous residues, and includes, optionally 30, 40, 50, 100, or longer windows.
- “Corresponding to”, “reference to” or “relative to” when used in the context of the numbering of a given amino acid or polynucleotide sequence refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence.
- the residue number or residue position of a given polymer is designated with respect to the reference sequence rather than by the actual numerical position of the residue within the given amino acid or polynucleotide sequence.
- a given amino acid sequence such as that of an engineered T7 RNA polymerase, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although the gaps are present, the numbering of the residue in the given amino acid or polynucleotide sequence is made with respect to the reference sequence to which it has been aligned.
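Numbering a residue with respect to the reference sequence, rather than by its own position in the given sequence, can be sketched as below. The example assumes the two sequences are already aligned to equal length with `-` for gaps; the function name and the short toy sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import Optional


def residue_at_reference_position(aligned_ref: str, aligned_query: str,
                                  ref_position: int) -> Optional[str]:
    """Return the query residue aligned to 1-based `ref_position`.

    Gaps in the reference do not advance the reference numbering. Returns
    None if the query has a gap (a deletion) at that reference position.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char == "-":
            continue  # insertion in the query; reference numbering unchanged
        ref_count += 1
        if ref_count == ref_position:
            return None if query_char == "-" else query_char
    raise IndexError("reference position is beyond the aligned reference")
```

In the toy alignment of reference `MK-TWL` against query `MKATGL`, the query residue G is the fifth residue of the query but is reported at reference position 4, which is the behavior the numbering convention calls for.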
- The positions of amino acid differences generally are referred to herein as “Xn,” where n refers to the corresponding position in the reference sequence upon which the residue difference is based.
- a “residue difference at position X727 as compared to SEQ ID NO. 1” refers to a difference of the amino acid residue at the polypeptide position corresponding to position 727 of SEQ ID NO. 1.
- a “residue difference at position X727 as compared to SEQ ID NO. 1” refers to an amino acid substitution of any residue other than tryptophan at the position of the polypeptide corresponding to position 727 of SEQ ID NO. 1.
- the specific amino acid residue difference at a position is indicated as “XnY” where “Xn” specifies the corresponding position as described above, and “Y” is the single letter identifier of the amino acid found in the engineered polypeptide (i.e., the different residue than in the reference polypeptide).
- the present disclosure also provides specific amino acid differences denoted by the conventional notation “AnB”, where A is the single letter identifier of the residue in the reference sequence, “n” is the number of the residue position in the reference sequence, and B is the single letter identifier of the residue substitution in the sequence of the engineered polypeptide.
- a polypeptide of the present disclosure can include one or more amino acid residue differences relative to a reference sequence, which is indicated by a list of the specified positions where residue differences are present relative to the reference sequence.
- the various amino acid residues that can be used are separated by a “/” (e.g., X10H/X10P or X10H/P) or a comma (e.g., X10H, X10P or X10H, P).
- the enzyme variants comprise more than one substitution.
- substitutions are separated by a slash or a comma for ease in reading (e.g., C14A/K122A or C14A, K122A).
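The “AnB” and “XnY” notations described above lend themselves to simple parsing. The following Python sketch splits a slash- or comma-separated list into individual substitutions; the `Substitution` type and `parse_substitutions` function are hypothetical names, and the abbreviated form in which only the replacement residue is repeated (e.g., X10H/P) is deliberately not handled.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    reference: str    # residue in the reference sequence ("X" if unspecified)
    position: int     # position numbered against the reference sequence
    replacement: str  # residue found in the engineered polypeptide


_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitutions(text: str) -> List[Substitution]:
    """Parse strings such as 'W727G' or 'C14A/K122A' or 'C14A, K122A'."""
    parsed = []
    for token in re.split(r"[/,]", text):
        token = token.strip()
        match = _PATTERN.fullmatch(token)
        if match is None:
            raise ValueError(f"not in AnB form: {token!r}")
        parsed.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return parsed
```

Rejecting tokens that are not in full AnB form is a conservative choice: it avoids guessing which position a bare residue letter refers to.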
- the present application includes engineered polypeptide sequences comprising one or more amino acid differences that include either or both conservative and non-conservative amino acid substitutions.
- a “conservative amino acid substitution” refers to a substitution of a residue with a different residue having a similar side chain, and thus typically involves substitution of the amino acid in the polypeptide with amino acids within the same or similar defined class of amino acids.
- an amino acid with an aliphatic side chain may be substituted with another aliphatic amino acid (e.g., alanine, valine, leucine, and isoleucine); an amino acid with a hydroxyl side chain is substituted with another amino acid with a hydroxyl side chain (e.g., serine and threonine); an amino acid having an aromatic side chain is substituted with another amino acid having an aromatic side chain (e.g., phenylalanine, tyrosine, tryptophan, and histidine); an amino acid with a basic side chain is substituted with another amino acid with a basic side chain (e.g., lysine and arginine); an amino acid with an acidic side chain is substituted with another amino acid with an acidic side chain (e.g., aspartic acid or glutamic acid); and/or a hydrophobic or hydrophilic amino acid is replaced with another hydrophobic or hydrophilic amino acid, respectively.
- non-conservative substitution refers to substitution of an amino acid in the polypeptide with an amino acid with significantly differing side chain properties. Non-conservative substitutions may use amino acids between, rather than within, the defined groups and affect (a) the structure of the peptide backbone in the area of the substitution (e.g., proline for glycine), (b) the charge or hydrophobicity, or (c) the bulk of the side chain.
- an exemplary non-conservative substitution can be an acidic amino acid substituted with a basic or aliphatic amino acid; an aromatic amino acid substituted with a small amino acid; and a hydrophilic amino acid substituted with a hydrophobic amino acid.
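The side-chain classes listed above can be used to sort a substitution as conservative or not. The Python sketch below encodes only the five explicitly enumerated classes (aliphatic, hydroxyl, aromatic, basic, acidic); the broader hydrophobic/hydrophilic criterion is not modeled, so a result of False here means only that the pair does not share one of these five classes. The names `SIDE_CHAIN_CLASSES` and `is_conservative` are hypothetical.

```python
# Classes exactly as enumerated in the text, in one-letter codes.
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("AVLI"),  # alanine, valine, leucine, isoleucine
    "hydroxyl": set("ST"),     # serine, threonine
    "aromatic": set("FYWH"),   # phenylalanine, tyrosine, tryptophan, histidine
    "basic": set("KR"),        # lysine, arginine
    "acidic": set("DE"),       # aspartic acid, glutamic acid
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same enumerated side-chain class."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in members and replacement in members
        for members in SIDE_CHAIN_CLASSES.values()
    )
```

Under these classes, I778L and F845Y would count as conservative, while W727G (an aromatic residue replaced by a small residue) would not.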
- deletion refers to modification to the polypeptide by removal of one or more amino acids from the reference polypeptide.
- Deletions can comprise removal of 1 or more amino acids, 2 or more amino acids, 5 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total number of amino acids, or up to 20% of the total number of amino acids making up the reference enzyme while retaining enzymatic activity and/or retaining the improved properties of an engineered enzyme.
- Deletions can be directed to the internal portions and/or terminal portions of the polypeptide.
- the deletion can comprise a continuous segment or can be discontinuous.
- isolated polypeptide refers to a polypeptide which is substantially separated from other contaminants that naturally accompany it (e.g., protein, lipids, and polynucleotides).
- the term embraces polypeptides which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis).
- the recombinant T7 RNA polymerase polypeptides may be present within a cell, present in the cellular medium, or prepared in various forms, such as lysates or isolated preparations.
- the recombinant T7 RNA polymerase polypeptides can be an isolated polypeptide.
- an isolated polypeptide is a substantially pure polypeptide.
- substantially pure polypeptide refers to a composition in which the polypeptide species is the predominant species present (i.e., on a molar or weight basis it is more abundant than any other individual macromolecular species in the composition), and is generally a substantially purified composition when the object species comprises at least about 50 percent of the macromolecular species present by mole or % weight.
- a substantially pure T7 RNA polymerase composition comprises about 60% or more, about 70% or more, about 80% or more, about 90% or more, about 95% or more, and about 98% or more of all macromolecular species by mole or % weight present in the composition.
- the object species is purified to essential homogeneity (i.e., contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single macromolecular species. Solvent species, small molecules (<500 Daltons), and elemental ion species are not considered macromolecular species.
- the isolated recombinant T7 RNA polymerase polypeptides are substantially pure polypeptide compositions.
- hybridization stringency relates to hybridization conditions, such as washing conditions, in the hybridization of nucleic acids. Generally, hybridization reactions are performed under conditions of lower stringency, followed by washes of varying but higher stringency.
- moderately stringent hybridization refers to conditions that permit target-DNA to bind a complementary nucleic acid that has about 60% identity, preferably about 75% identity, about 85% identity to the target DNA, with greater than about 90% identity to target-polynucleotide.
- Exemplary moderately stringent conditions are conditions equivalent to hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.2× SSPE, 0.2% SDS, at 42° C.
- “High stringency hybridization” refers generally to conditions that are about 10° C. or less from the thermal melting temperature Tm as determined under the solution condition for a defined polynucleotide sequence.
- a high stringency condition refers to conditions that permit hybridization of only those nucleic acid sequences that form stable hybrids in 0.018M NaCl at 65° C.
- High stringency conditions can be provided, for example, by hybridization in conditions equivalent to 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.1× SSPE, and 0.1% SDS at 65° C.
- Another high stringency condition is hybridizing in conditions equivalent to hybridizing in 5× SSC containing 0.1% (w:v) SDS at 65° C. and washing in 0.1× SSC containing 0.1% SDS at 65° C.
- Other high stringency hybridization conditions, as well as moderately stringent conditions, are described in the references cited above.
- codon optimized refers to changes in the codons of the polynucleotide encoding a protein to those preferentially used in a particular organism such that the encoded protein is more efficiently expressed in the organism of interest.
- although the genetic code is degenerate in that most amino acids are represented by several codons, called “synonyms” or “synonymous” codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome.
- the polynucleotides encoding the T7 RNA polymerase enzymes may be codon optimized for optimal production from the host organism selected for expression.
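Codon optimization as described above amounts to choosing, for each amino acid, a synonymous codon favored by the expression host. The Python sketch below shows the simplest form of this idea: a one-codon-per-residue lookup. The `PREFERRED_CODONS` table is an illustrative assumption, not measured usage data and not taken from the disclosure; each entry is a valid standard-code codon for its amino acid, but the choice among synonyms would in practice come from a codon usage table for the selected host.

```python
# Hypothetical "preferred" codons (DNA, 5' to 3'); illustrative only.
PREFERRED_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Return a coding DNA sequence using one preferred codon per residue."""
    codons = []
    for residue in protein:
        if residue not in PREFERRED_CODONS:
            raise ValueError(f"no codon listed for residue {residue!r}")
        codons.append(PREFERRED_CODONS[residue])
    return "".join(codons)
```

Always picking a single codon per residue keeps the sketch short; practical tools typically also weigh factors such as repeats and secondary structure, which are outside the scope of this example.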
- control sequence is defined herein to include all components which are necessary or advantageous for the expression of a polynucleotide and/or polypeptide of the present application.
- Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide.
- control sequences include, but are not limited to, a leader, polyadenylation sequence, propeptide sequence, promoter sequence, signal peptide sequence, initiation sequence and transcription terminator.
- the control sequences include a promoter, and transcriptional and translational stop signals.
- the control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
- operably linked is defined herein as a configuration in which a control sequence is appropriately placed (i.e., in a functional relationship) at a position relative to a polynucleotide of interest such that the control sequence directs or regulates the expression of the polynucleotide and/or polypeptide of interest.
- promoter refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence. The promoter sequence contains transcriptional control sequences, which mediate the expression of a polynucleotide of interest.
- the promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
- a “substrate” in the context of an enzymatic conversion reaction process refers to the compound or molecule acted on by the T7 RNA polymerase polypeptide.
- a “product” in the context of an enzymatic conversion process refers to the compound or molecule resulting from the action of the T7 RNA polymerase polypeptide on a substrate.
- culturing refers to the growing of a population of microbial cells under any suitable conditions (e.g., using a liquid, gel or solid medium).
- Recombinant polypeptides can be produced using any suitable methods known in the art. Genes encoding the wild-type polypeptide of interest can be cloned in vectors, such as plasmids, and expressed in desired hosts, such as E. coli, S. cerevisiae, etc. Variants of recombinant polypeptides can be generated by various methods known in the art. Indeed, there is a wide variety of different mutagenesis techniques well known to those skilled in the art. In addition, mutagenesis kits are also available from many commercial molecular biology suppliers.
- After the variants are produced, they can be screened for any desired property (e.g., high or increased activity, or low or reduced activity, increased thermal activity, increased thermal stability, and/or acidic pH stability, etc.).
- “recombinant T7 RNA polymerase polypeptides” also referred to herein as “engineered T7 RNA polymerase polypeptides,” “variant T7 RNA polymerase enzymes,” and “T7 RNA polymerase variants” find use.
- an “expression vector” is a DNA construct for introducing a DNA sequence into a cell.
- the vector is an expression vector that is operably linked to a suitable control sequence capable of effecting the expression in a suitable host of the polypeptide encoded in the DNA sequence.
- an “expression vector” has a promoter sequence operably linked to the DNA sequence (e.g., transgene) to drive expression in a host cell, and in some embodiments, also comprises a transcription terminator sequence.
- the term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also encompasses secretion of the polypeptide from a cell.
- the term “produces” refers to the production of proteins and/or other compounds by cells. It is intended that the term encompass any step involved in the production of polypeptides including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also encompasses secretion of the polypeptide from a cell.
- an amino acid or nucleotide sequence e.g., a promoter sequence, signal peptide, terminator sequence, etc.
- a promoter sequence e.g., a promoter sequence, signal peptide, terminator sequence, etc.
- a heterologous sequence e.g., a promoter sequence, signal peptide, terminator sequence, etc.
- the terms “cell,” or “host cell” refer to suitable hosts for expression vectors comprising DNA provided herein (e.g., the polynucleotides encoding the T7 RNA polymerase variants).
- the host cells are prokaryotic or eukaryotic cells that have been transformed or transfected with vectors constructed using recombinant DNA techniques as known in the art.
- the prokaryotic cell comprises a bacterium, and preferably a bacterium capable of expressing a ligand activated RNA polymerase (LARP) as described herein.
- LARPs ligand activated RNA polymerases
- fragment refers to a portion of a peptide or nucleotide sequence of a RNAP that still retains the activity of the whole.
- disclosure of a DNA sequence also includes the corresponding RNA and amino acid sequence including all redundant codons and conservative amino acid substitutions
- disclosure of an RNA sequence also includes the corresponding DNA and amino acid sequence including all redundant codons and conservative amino acid substitutions
- disclosure of an amino acid sequence also includes the corresponding RNA and DNA sequence including all redundant codons and conservative amino acid substitutions and vice versa.
- isolated and purified are used to refer to a molecule (e.g., an isolated nucleic acid, polypeptide, etc.) or other component that is removed from at least one other component with which it is naturally associated.
- purified does not require absolute purity, rather it is intended as a relative definition.
- composition and “formulation” encompass products comprising at least one engineered T7 RNA polymerase of the present disclosure, intended for any suitable use (e.g., research, diagnostics, etc.).
- transcription is used to refer to the process whereby a portion of a DNA template is copied into RNA by the action of an RNA polymerase enzyme.
- DNA template is used to refer to a double or single-stranded DNA molecule including a promoter sequence and a sequence coding for the RNA product of transcription.
- promoter is used to refer to a DNA sequence that is recognized by RNA polymerase as the start site of transcription.
- the promoter recruits RNA polymerase, and in the case of T7 RNA polymerase, determines the start site of transcription.
- RNA polymerase is used to refer to a DNA-directed RNA polymerase, which copies a DNA template into an RNA polynucleotide, by incorporating nucleotide triphosphates stepwise into the growing RNA polymer.
- RNA molecules that code for a protein. This protein is decoded through the action of translation.
- the term “responsive to” refers to a property of the engineered T7 RNA polymerase, which can be represented by reconstitution to approximately wild-type specific activity (e.g., product produced/time/weight protein) or reconstitution to approximately wild-type in percent conversion of the substrate to the product (e.g., percent conversion of starting amount of substrate to product in a specified time period) when introduced to an effective amount of an inducer, such as indole, an indole derivative, or an indole-similar compound.
- an inducer such as indole, an indole derivative, or an indole-similar compound.
- the term “effective amount” refers to an amount sufficient to produce the desired result.
- One of general skill in the art may determine the effective amount by using routine experimentation.
- “Pharmaceutical compositions” are compositions that include an amount (for example, a unit dosage) of one or more of the disclosed compounds together with one or more non-toxic pharmaceutically acceptable additives, including carriers, diluents, and/or adjuvants, and optionally other biologically active ingredients.
- Such pharmaceutical compositions can be prepared by standard pharmaceutical formulation techniques such as those disclosed in Remington's Pharmaceutical Sciences , Mack Publishing Co., Easton, Pa. (19th Edition).
- the pharmaceutical acceptable carrier may comprise any conventional pharmaceutical carrier or excipient. The choice of carrier and/or excipient will to a large extent depend on factors such as the particular mode of administration, the effect of the carrier or excipient on solubility and stability, and the nature of the dosage form.
- compositions of the disclosure can include a probiotic bacterium engineered to express one or more engineered RNAP enzymes described herein.
- pharmaceutically acceptable refers to compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgement, suitable for use in contact with the tissues of a subject (e.g., human) without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
- a subject e.g., human
- Each carrier, excipient, etc. must also be “acceptable” in the sense of being compatible with the other ingredients of the formulation.
- Suitable carriers, diluents, excipients, etc. can be found in standard pharmaceutical texts. See, for example, “Handbook of Pharmaceutical Additives”, 2nd Edition (eds. M. Ash and I. Ash), 2001 (Synapse Information Resources, Inc., Endicott, N.Y., USA), “Remington's Pharmaceutical Sciences”, 20th edition, pub. Lippincott, Williams & Wilkins, 2000; and “Handbook of Pharmaceutical Excipients”, 2nd edition, 1994.
- the formulations may conveniently be presented in unit dosage form and may be prepared by any methods well known in the art of pharmacy. Such methods include the step of bringing into association the active compound with the carrier which constitutes one or more accessory ingredients. In general, the formulations are prepared by uniformly and intimately bringing into association the active compound with liquid carriers or finely divided solid carriers or both, and then if necessary, shaping the product.
- Formulations may be in the form of liquids, solutions, suspensions, emulsions, elixirs, syrups, tablets, lozenges, granules, powders, capsules, cachets, pills, ampoules, suppositories, pessaries, ointments, gels, pastes, creams, sprays, mists, foams, lotions, oils, boluses, electuaries, or aerosols.
- T7 RNAP and derivatives have been studied extensively, are used to synthesize mRNA for medical applications, and are the dominant transcriptional control mechanism employed in recombinant protein expression in bacteria.
- unregulated in vivo expression of T7 RNAP is usually toxic owing to its high basal activity.
- While chemically inducible split T7 RNAPs have been demonstrated with low basal activity and high inducibility, a full-length, single subunit, ligand activated T7 RNAP with minimal basal activity could also solve the toxicity problem inherent in many synthetic biology circuits utilizing T7 promoters.
- Applicants describe the use of rational design to engineer the full-length, single subunit T7 RNA polymerase to be controlled by physiologically relevant concentrations of indole.
- Applicants optimized their LARPs through directed evolution to yield LARP-I with minimal transcriptional activity in the absence of indole, and a 29-fold increase in activity with an EC50 of 344 µM.
- Applicants utilized LARP-I in several contexts to show that indole controls T7-dependent gene expression exogenously, endogenously, and intercellularly.
- Applicants also demonstrated indole-dependent bacteriophage viability and propagation in trans. Specificity of different indoles, T7 promoter specificities, and portability to different bacteria are shown.
- These novel LARPs represent a new chemically inducible platform immediately deployable for novel synthetic biology applications, including for modulation of synthetic cocultures.
- LARPs developed here represent a new chemically inducible system that functions in diverse bacteria with minimal modification. This system supplies an urgent need for the synthetic biology community of predictive gene expression control with an alternative controlling ligand. Further, LARPs can sense and respond to the native metabolite indole, which allows LARP receiver strains to be constructed for bottom-up manipulations of microbial communities. Applicants also demonstrated that LARPs can respond selectively to an indole derivative or indole similar compounds.
- LARPs may find applications in inducible or dynamic control of gene expression in bioreactors, for metabolite control of engineered phage therapies, or to perform user-defined operations in the gut microbiome or other mixed microbial communities producing indole, or indole derivatives, or indole similar compounds.
- LARPs can be complementary to existing approaches of engineering cellular control using transcription factors and/or riboswitches. While transcription factors are a form of pre-transcriptional control (their output being repression or activation of transcription from a target DNA sequence), riboswitches are regulated at the level of mRNA by ligand binding to control continued transcription of nascent mRNA, or translation of the full-length mRNA. Riboswitches often suffer from low dynamic range and leaky off-states, but these disadvantages can be addressed using cascading systems. The post-translational mechanism inherent in these LARPs may allow faster on- and off-switching than these alternatives, albeit with a much narrower range of potential ligands.
- Example 2 Design, Engineering, and Optimization of LARPs.
- potential LARPs are incubated with linear dsDNA containing a T7 promoter sequence driving expression of an RNA Spinach or Pepper aptamer.
- the amount of transcript is proportional to fluorescence, and the rate of fluorescence is determined in the presence and absence of 1 mM indole (Figure 5).
- To obtain highly inducible LARPs, Applicants optimized a previously described dual positive and negative bacterial selection (Figure 1C).
- a plasmid construct containing a T7 promoter upstream of HIS3 and URA3 genes is transformed into an E. coli strain (i) incapable of producing endogenous indole; and (ii) auxotrophic for histidine and uracil (E. coli US0 ΔhisB, ΔpyrF, ΔtnaA; Figure 10).
- This auxotrophy complementation selection was used to select low basal activity LARPs from a library of approximately 11 million members.
- a combinatorial library containing mutations at nine positions within 6 Å of position 727 was constructed by a cassette-based Golden Gate protocol using synthetic DNA containing degenerate codons. Transformed cells were passed through a round of negative selection using 1 mM 5-FOA. Surviving members of this library were then selected for activity in the presence and absence of 50 µM indole. Deep sequencing of the selected and input libraries, followed by analysis using position specific enrichment ratio matrices (PSERM), revealed a Pareto front of candidate indole-specific LARPs (Figures 1D and 12).
- PSERM position specific enrichment ratio matrices
- PSERM is a method for analyzing combinatorial libraries after selection useful for cooptimization.
- PSERM scores for a variant are the summation of the individual enrichment ratios of each mutation.
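The additive scoring described above can be sketched in a few lines. This is a minimal illustration, not Applicants' implementation: the source states only that a variant's score is the sum of the enrichment ratios of its mutations, so the definition of each enrichment ratio used here (log2 of selected frequency over input frequency, with a pseudocount) is an assumption, as are the function names and the toy library.

```python
import math
from collections import Counter

def enrichment_matrix(input_variants, selected_variants, pseudocount=1.0):
    """Build one {residue: log2 enrichment} dict per library position.

    Assumption: enrichment ratio = log2(selected frequency / input frequency),
    with a pseudocount so unobserved residues do not give log(0).
    Variants are equal-length strings, one character per library position.
    """
    n_positions = len(input_variants[0])
    matrix = []
    for pos in range(n_positions):
        inp = Counter(v[pos] for v in input_variants)
        sel = Counter(v[pos] for v in selected_variants)
        residues = set(inp) | set(sel)
        inp_total = len(input_variants) + pseudocount * len(residues)
        sel_total = len(selected_variants) + pseudocount * len(residues)
        matrix.append({
            aa: math.log2(((sel[aa] + pseudocount) / sel_total)
                          / ((inp[aa] + pseudocount) / inp_total))
            for aa in residues
        })
    return matrix

def pserm_score(variant, matrix):
    """Score a variant as the sum of its per-position enrichment ratios."""
    return sum(matrix[pos][aa] for pos, aa in enumerate(variant))

# Hypothetical two-position toy library (not data from the source).
toy_input = ["AG", "AV", "GG", "GV"]
toy_selected = ["AG", "AG", "AV", "GG"]
toy_matrix = enrichment_matrix(toy_input, toy_selected)
```

Scoring the same library against two selection conditions (for example, with and without indole) gives two scores per variant, which is the kind of two-objective comparison from which a Pareto front can be drawn.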
- 11 LARPs passed plate-based assays and showed statistically significant (one-way ANOVA, Tukey’s post hoc test, p-value < 0.03 for all samples) indole-specific growth rate increases in defined minimal media (Figure 13; Figure 20).
- LARP-I (K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, F880Y) is marked by minimal growth rate in the absence of indole, and near-maximal growth rate with 50 µM indole under selection conditions (Figure 1E).
- Applicants expressed and purified LARP-I (Figure 9) to evaluate its binding properties in vitro.
- the basal activity of LARP-I is less than 0.1% of the activity of parent enzyme T7 RNAP TS, the EC50 for indole is 344 µM (95% c.i.
- LARP-I is activated by each of these to various extents, both the magnitude and EC50 (none are saturable up to 2 mM) are much lower than for indole ( Figure 14).
- LARP-I identified from our selection shows low basal activity and post-translational, indole-inducible transcription with a high dynamic range in vitro.
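The reported dose-response figures can be put into a simple model to estimate relative activity at a given indole concentration. The sketch below uses a Hill equation with the EC50 (344 µM) and 29-fold maximal induction stated in the text; the Hill coefficient of 1, the normalization of basal activity to 1, and the function name are assumptions made for illustration and are not taken from the source.

```python
def relative_activity(indole_um, ec50_um=344.0, fold_max=29.0, hill_n=1.0):
    """Activity relative to the uninduced state under a Hill model.

    ec50_um and fold_max follow the values reported for LARP-I;
    hill_n = 1 (no cooperativity) is an assumption.
    """
    if indole_um < 0:
        raise ValueError("concentration must be non-negative")
    occupancy = indole_um ** hill_n / (ec50_um ** hill_n + indole_um ** hill_n)
    return 1.0 + (fold_max - 1.0) * occupancy

# Predicted relative activity at a few concentrations (µM).
for conc in (0, 50, 344, 1000):
    print(conc, round(relative_activity(conc), 2))
```

Under these assumptions the enzyme reaches half of its inducible range at the EC50, so a 50 µM dose sits well below saturation in vitro.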
- Example 3 LARPs Control Gene Expression Exogenously, Endogenously, and in Cocultures.
- a plasmid encoding an sfGFP expression marker downstream of a T7 promoter sequence, which was optimized for strong translation, was used to transform ΔtnaA E. coli (lacking tryptophanase) expressing LARP-I (Figure 2A). Tryptophanase produces indole, pyruvate, and ammonia through the hydrolysis of tryptophan.
- external addition of indole is sufficient to drive high levels of gene expression.
- Receiver strains could release a biological payload in response to indole either through cell lysis or secretion.
- indole could be used to induce lysis
- Applicants screened a panel of previously described single gene lysins and identified several that lysed E. coli under induction with an arabinose-inducible promoter, including Sgl KU1 .
- Indole induction of Sgl KU1 by LARP-I expressing E. coli MG1655 ΔtnaA resulted in cell lysis within 45 min (Figures 2E and 15). Therefore, LARP-I can be controlled by physiologically relevant concentrations of an endogenous metabolite. Additionally, LARP-I enables indole-mediated delivery of intracellular cargo to the extracellular environment through cell lysis.
- Applicants constructed a sender strain (E. coli US0 ΔhisB, ΔpyrF, RFP) that produces indole.
- Applicants also prepared a sender strain deficient in indole production (E. coli US0 ΔhisB, ΔpyrF, ΔtnaA, RFP).
- Applicants constructed a receiver strain (E. coli US0 ΔhisB, ΔpyrF, ΔtnaA) with the LARP-I/sfGFP reporting system (Figure 2G).
- Fluorescence microscopy of the co-cultures shows that most of the individual LARP-I receiver cells are activated in a coculture with the indole-producing sender strain (representative data shown in Figure 2F), while few are activated by the indole-deficient sender strain.
- LARP-I can be part of novel receiver circuits that allow intercellular communication between indole sender strains and LARP-I containing receiver strains.
- LARPs are Portable for Different Bacteria, DNA Promoter Specificities, and Ligands.
- LARP-I shows robust activation of a GFP reporter at increasing indole concentrations, as measured by flow cytometry ( Figures 4B and 16C).
- LARP-I5CHO (K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, F880Y).
- E. coli US0 ΔhisB, ΔpyrF, ΔtnaA expressing LARP-I5CHO grows minimally in the absence of I-5CHO and is selectively activated by I-5CHO over indole (Figure 4F).
- Expression of LARP-I5CHO in the same reporter strain shown in Figure 2A revealed I-5CHO dependent sfGFP expression, with the overall activation on the same order of magnitude as constitutive expression of T7 RNAP R632S (Figure 4G). Additionally, LARP-I5CHO shows specificity for I-5CHO over indole in phage infection assays (Figure 17). Thus, LARPs can be engineered to be specific to other indoles.
- Example 6 LARP-I Recognizes Promoter DNA in the Absence of Indole.
- the LARPs demonstrated here have minimal basal activity and large indole-dependent increases in activity.
- the LARP-specific mutations centered around W727 are located beneath the interface with the allosteric inhibitor T7 lysozyme.
- T7 lysozyme traps T7 RNAP in the initiation complex where, while it is able to bind DNA, it only produces abortive transcripts. It is possible that a similar, but opposite in effect, allosteric mechanism occurs for the LARPs. While T7 lysozyme inhibits T7 RNAP (antagonism), the mutational inhibition of LARPs is recovered by an allosteric effector small molecule (agonism).
- In the absence of indole, LARP-I also bound pT7 DNA, albeit with a reduced KD of approximately 30 nM at both 25 and 37°C (Figure 18). In the presence of 500 µM indole the affinity for pT7 DNA changed to approximately 17 nM at both 25 and 37°C (Figure 18). Thus, LARP-I recognizes promoter DNA in the presence and absence of indole, and with minimal temperature differences. Applicants also used circular dichroism to evaluate the secondary structure and the apparent melting temperature of LARP-I with and without indole. There was no significant difference in the secondary structure of LARP-I relative to the TS background (Figure 19A).
- Plasmid and plasmid library generation All sequences were ordered as synthetic dsDNA gene blocks (eBlocks/gBlocks, IDT) or were PCR amplified using gene-specific primers (IDT) and commercially available kits. Plasmids were constructed by restriction cloning, Golden Gate assembly 1, Gibson assembly 2 or DNA HiFi MasterMix using commercially available kits, by site-directed mutagenesis using commercially available kits (Agilent QuikChange II or NEB Q5®), or by combinatorial nicking mutagenesis 3,4. Most plasmids were sequence verified using Oxford Nanopore (Plasmidsaurus). All other plasmids had sequence verification of the inserted T7 RNAP using Sanger sequencing (Genewiz).
- The base T7 RNAP plasmid pZB017-T7-His6-WT encoding T7 RNAP K378R with an N-terminal 6x-His Tag was constructed from pT7-911Q-His6-WT5, a gift from Seelig Lab at the University of Minnesota. The His tag and K378R mutation appear to have no impact on the activity of the T7 RNAP in benchmark comparisons with commercially purchased T7 RNAP (data not shown). RNAP and LARP constructs containing a/ 5 .
- T7 RNAP mutational library was previously described 1. Applicants used the high resolution structures available for T7 RNAP (PDB codes: 1CEZ, 1QLN, 1H38, 1MSW, 3E2E) 7-10 to identify mutations with (i.) any heavy atom within 6 Å of the Cβ of position 727; and (ii.) Cα-Cβ vector pointing towards position 727. The following sets of mutations were encoded in the library - 725AGV; 727G; 735AGV; 737CDFGHILNQRSVY; 778ILV; 781HKNQRS;
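The two structural criteria above lend themselves to a short geometric filter. The sketch below is illustrative only: the source does not say how "pointing towards position 727" was evaluated, so the test used here (a positive dot product between the residue's Cα→Cβ vector and the vector from its Cα to the Cβ of position 727) is an assumption, and the residue records and coordinates are hypothetical rather than taken from any PDB entry.

```python
import math

def sub(a, b):
    """Component-wise difference of two xyz tuples."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dist(a, b):
    d = sub(a, b)
    return math.sqrt(dot(d, d))

def is_candidate(residue, target_cb, cutoff=6.0):
    """Apply both criteria to one residue.

    residue: dict with 'CA' and 'CB' xyz tuples and a 'heavy_atoms' list.
    (i)  any heavy atom within `cutoff` angstroms of the target C-beta;
    (ii) C-alpha -> C-beta vector has a positive projection onto the
         direction from C-alpha to the target C-beta (assumed criterion).
    """
    close = any(dist(atom, target_cb) <= cutoff for atom in residue["heavy_atoms"])
    side_chain = sub(residue["CB"], residue["CA"])
    to_target = sub(target_cb, residue["CA"])
    return close and dot(side_chain, to_target) > 0

# Hypothetical coordinates, in angstroms, with the target C-beta at the origin.
target = (0.0, 0.0, 0.0)
facing = {"CA": (5.0, 0.0, 0.0), "CB": (3.5, 0.0, 0.0),
          "heavy_atoms": [(5.0, 0.0, 0.0), (3.5, 0.0, 0.0)]}
facing_away = {"CA": (4.0, 0.0, 0.0), "CB": (5.5, 0.0, 0.0),
               "heavy_atoms": [(4.0, 0.0, 0.0), (5.5, 0.0, 0.0)]}
too_far = {"CA": (10.0, 0.0, 0.0), "CB": (8.5, 0.0, 0.0),
           "heavy_atoms": [(10.0, 0.0, 0.0), (8.5, 0.0, 0.0)]}
```

In practice the coordinates would come from a parsed structure file; a stricter angular cutoff could replace the positive-projection test if a narrower set of positions is wanted.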
- Two NGG-containing gRNA sequences in the middle of the tnaA gene (ACCATCACCAGTAACTCTGC (SEQ ID NO. 16); ctggctcaataacacgaatg (SEQ ID NO. 18)) were selected. Additionally, the DNA used for recombination (gBlock 21) was designed to remove 1032 bp, or more than 70% of the tnaA gene sequence. A genome PCR and Kovac’s reagent 14 indole test were conducted to verify successful knockout of the tnaA gene. Genome PCR was performed by picking a single colony into 1 mL of LB and growing to an OD600 of approximately 1.
- the culture was then pelleted at 17,000 × g and washed three times by resuspending in 1 mL nuclease free water and re-centrifuging. After the final wash and resuspension, cells were sonicated for 10 minutes. A 20 µL PCR was performed using primers at 1 µM [407, 408; 407, 409; 407, 410], 6 µL of the sonicated cells, and NEB Q5 MasterMix. The products from the genome PCR were run on a 2 wt% agarose-TAE gel with SYBR Safe, visualized with a Saferimage plate reader, and compared with the products from WT cells.
- Kovac’s reagent was prepared (2 g p-dimethylamino-benzaldehyde, 10 mL 37% HCl, 30 mL isoamyl alcohol) and mixed at 50 µL in 2 mL of a saturated bacterial culture and observed for the formation of pink coloring. No pink color was observed in the selected knockout strains.
- Successful curing of temperature sensitive plasmids was verified by replica plating on LB-agar with and without chloramphenicol or ampicillin.
- Proteins were purified by Ni-NTA affinity chromatography essentially as described by Rio et al., using BPER and Turbonuclease for lysis. Protein yields ranged from 5-50 mg/L of culture volume. Proteins were stored at 4°C in T7 storage buffer (50 mM Tris-HCl pH 7.9, 1 mM EDTA, 5 mM β-ME, 100 mM NaCl, 15% glycerol) until use.
- Transcriptional Assay: Transcriptional activity was tested using an adaptation of a previously described fluorescent microtiter plate assay 15. Reactions were assembled in a Black Costar Round-bottom 96-well plate. A reaction mixture (90 µL total) and the T7 RNAP (10 µL total) were mixed and incubated at 37°C and the increase in fluorescence measured.
- the 100 µL reaction mixture consisted of 20 µL of 5x Transcription Buffer (640 mM HEPES pH 7.5, 112 mM MgCl2, 200 mM DTT, 6.4 mM Spermidine), 20 µL of 25 mM rNTP Mix, 10 µL 1 M KCl, 50 ng pT7-Spinach Template DNA (111 bp) or pT7-8Peppers (555 bp), 1 µL 10 mg/mL BSA, 1 µL 0.1 U/mL iPPase, 2 µL of 100 µM 3,5-difluoro-4-hydroxybenzylidene imidazolinone (for Spinach aptamer, DFHBI, in ethanol) or 1 µL of 100 µM 4-Cyano-a-[[4-[(2-hydroxyethyl) methylamino] phenyl] methylene] benzeneacetonitrile (for Peppers Ap
- This assembled reaction mixture sans enzyme was added to a Black Costar Plate Well containing 10 µL of 20 µM protein and mixed by pipetting.
- the fluorescence readout (Spinach: Ex/Em 469/501; Peppers: Ex/Em 485/530) was performed every 60-72 sec for at least 60 minutes in a Biotek Synergy H1 plate reader set to 37°C.
- Transcriptional activity was determined as the slope of the increase in fluorescence (RFU/min).
- RFU/min was calculated by fitting a line to a moving window of 20 data points, and reporting the largest value. For low activity, a correction to the max rate is used where the slope is multiplied by the correlation coefficient of the fit, and the highest value is reported.
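The rate calculation described above (a line fitted to each moving window of 20 points, with the largest slope reported) can be sketched as follows. This is a minimal illustration under stated assumptions: the "correlation coefficient of the fit" is taken to be Pearson's r, the function names are hypothetical, and the example trace is synthetic rather than measured data.

```python
import math

def linear_fit(xs, ys):
    """Least-squares slope and Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("time points in a window must not all be equal")
    slope = sxy / sxx
    # A perfectly flat window has no defined correlation; treat it as r = 0.
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, r

def max_rate(times_min, rfu, window=20, low_activity=False):
    """Largest windowed slope (RFU/min) across the whole trace.

    With low_activity=True each slope is multiplied by r before taking the
    maximum, so noisy windows are down-weighted (r assumed to be Pearson's r).
    """
    if len(times_min) != len(rfu):
        raise ValueError("times and readings must have the same length")
    if len(times_min) < window:
        raise ValueError("trace is shorter than the fitting window")
    best = None
    for start in range(len(times_min) - window + 1):
        slope, r = linear_fit(times_min[start:start + window],
                              rfu[start:start + window])
        value = slope * r if low_activity else slope
        best = value if best is None else max(best, value)
    return best

# Synthetic trace: one reading per minute, rising at 5 RFU/min.
demo_times = [float(i) for i in range(60)]
demo_rfu = [5.0 * t for t in demo_times]
```

Multiplying by r leaves a clean linear window essentially unchanged and shrinks the slope of a window dominated by noise, which is the apparent purpose of the low-activity correction.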
- the T7 RNAP screen was developed from a previous selection used for screening Zinc finger nuclease target sequences.
- An auxotrophic strain, US0 pyrF-/hisB-, containing the homologous HIS3 and URA3 deletions was complemented with a plasmid containing the HIS3/URA3 genes under a pT7 promoter. Additionally, the gene for T7 RNAP was transformed under a secondary plasmid.
- Initial tests of growth screening on solid media suggested the pT7/T7 RNAP system was toxic, as previously documented. To reduce the toxicity of T7 RNAP, different pT7 promoter strengths and T7 RNAP expression strengths were tested.
- M9-Complete (M9 mineral salts with 6 g/L dextrose, 1.4 g/L Mix Amino Acids (-Trp, -Leu, -His, -Ura), 78 mg/L Trp, 22.4 mg Uracil, 380 mg Leucine, 0.5 g/L of Histidine and 0.2 g/L Yeast Extract) 1.8% (w/v) agar plates with Tetracycline (10 µg/mL), Chloramphenicol (30 µg/mL), and Kanamycin (50 µg/mL) and grown overnight at 37°C.
- M9-his (M9 mineral salts with 6 g/L dextrose, 1.4 g/L Mix Amino Acids (-Trp, -Leu, -His, -Ura), 78 mg/L Trp, 22.4 mg Uracil, 380 mg Leucine) liquid culture and grown overnight. Overnight cultures were diluted to an OD600 of 0.01-0.04 into 3.5 mL of M9-his with 1 mM of the HIS3 competitive inhibitor 3-aminotriazole (3-AT). Growth was measured by OD600 in Hungate tubes. The ultimate screen used in this work had a lacUV5 promoter driving expression of full length T7 RNAP with a GUG initiation codon and the UmuD N-degron tag.
- the negative selection was performed by plating approximately 5E9 cells (10 OD-mL) on agar containing M9-Complete media and 1 mM 5-fluoroorotic acid (5-FOA), and incubated at 37°C overnight. After negative selection, plates were scraped with M9-his, washed two times by pelleting (4000 × g for 10 mins) and resuspending in M9-his, and either directly used for positive selection or prepared as glycerol stocks and stored at -80°C.
- M9-his plates were prepared with 1 mM 3-amino-1,2,4-triazole and either 50 µM of indole or indole-5-carboxaldehyde or ethanol as the solvent control. Cells passed through counter selection were plated on the selective media plates and grown for 20 hours at 37°C. The resulting colonies for each selection condition were scraped with M9-his media and prepared as glycerol stocks. DNA from resulting libraries was plasmid extracted and prepared for next generation sequencing essentially following method B of Kowalsky et al. The libraries were sequenced on an Illumina MiSeq using 2 x 250 paired end reads by Rush University genomics core.
- Exposure time was 2 ms for Brightfield, 100 ms for GFP channel (Ex: 488 nm; Em: 520 nm), and 300 ms for RFP (Ex: 594 nm; Em: 610 nm) channel. Images were taken in the first few seconds following laser exposure and magnification adjustment to reduce bleaching. Images were processed in ImageJ 27 by combining the individual channels (BF, GFP, RFP) and adjusting brightness and contrast for the entire channels to reduce over-exposure (BF-gray minimum/maximum 2700/4000, RFP-red/GFP-cyan minimum/maximum 100/4000). Representative images are shown in Figure 2F.
- Phage Assays: Phage were amplified in 50 mL cultures of LB and titered following Kibby et al., and stored at 4°C following filtration through a 0.22 µm filter. For gp1 deficient phage strains, E. coli BL21*(DE3) was used to amplify the phage. Phage was diluted in the same LB media used for amplification. To perform phage plaque assays, MG1655 tnaA- was transformed with pZB513 (plasmid expressing WT T7 RNAP) and pZB578 (LARP-I). Single colonies were picked and grown overnight at 37°C in LB + 30 µg/mL chloramphenicol.
- LB overlay plates were prepared following Kibby et al. Briefly, 3 mL of melted 1.5% (w/v) LB-agar was added to 6 mL LB containing 30 µg/mL chloramphenicol, phage infection salts (10 mM MgCl2, 10 mM CaCl2 and 100 µM MnCl2), and indole or ethanol as a vehicle control. To this mixture 100 µL of saturated overnight culture was added, immediately mixed, and plated on top of pre-prepared LB plates. The overlay agar was allowed to dry for at least 15 minutes and not more than 1 hour at room temperature.
- Phage infection was carried out in 96-well plates with 200 µL total volume per well by adding 4 µL of phage dilutions at a multiplicity of infection (PFU/cell, assuming 5E8 E. coli per OD-mL) of 0.5 or 5.
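The multiplicity-of-infection arithmetic implied above can be worked through in a few lines. The conversion factor of 5E8 cells per OD-mL and the 4 µL phage addition come from the text; the culture OD600 and the split of the 200 µL well into 196 µL of culture plus 4 µL of phage are assumptions chosen for illustration, and the function names are hypothetical.

```python
CELLS_PER_OD_ML = 5e8  # conversion stated in the text

def cells_in_well(od600, culture_volume_ul):
    """Estimated number of E. coli cells in a given culture volume."""
    return od600 * CELLS_PER_OD_ML * culture_volume_ul / 1000.0

def phage_titer_needed(moi, od600, culture_volume_ul=196.0, phage_volume_ul=4.0):
    """Titer (PFU/mL) of the phage dilution that delivers the target MOI."""
    pfu_required = moi * cells_in_well(od600, culture_volume_ul)
    return pfu_required / (phage_volume_ul / 1000.0)

# Example with an assumed culture OD600 of 0.1.
for target_moi in (0.5, 5):
    print(target_moi, f"{phage_titer_needed(target_moi, od600=0.1):.2e} PFU/mL")
```

Because the required titer scales linearly with both MOI and cell density, the MOI 5 condition needs a tenfold more concentrated phage dilution than the MOI 0.5 condition at the same OD.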
- Circular dichroism: Circular dichroism was performed on an Applied Photophysics Chirascan V100. Purified LARP-I was buffer exchanged into 20 mM NaPO4, pH 7.8 with 50 mM NaCl using 7 kDa Zeba 0.5 mL spin columns per manufacturer's protocol and diluted to a concentration in the range of 0.1-0.4 mg/mL. 170 µL of sample was loaded into clean 0.5 mm cuvettes. Due to the presence of 50 mM NaCl, the usable spectra were limited to 195 nm. Spectral measurements were taken using 1 sec timepoints at 2 nm bandwidth and 0.5 nm step size for LARP-I in the presence and absence of 500 µM indole.
- the ellipticity spectrum of LARP-I from 195-260 nm was indistinguishable from TS at both 0 and 500 µM indole concentrations.
- the wavelength was maintained at either 208.5 nm or 222 nm and the ellipticity was measured for 24 secs between 0.5°C temperature steps as the sample was heated at 1°C/min.
- the canonical 17-nt T7 promoter TAATACGACTCACTATA (SEQ ID NO. 17) spanning from -17 to -1 position of the initiation start site was purchased from IDT with 5’ 6FAM-fluorescein conjugated to the 5’ end of the forward promoter sequence.
- the fluorescent promoter was annealed with the unconjugated reverse complement sequence at 20 µM concentration in buffer containing 10 mM Tris, pH 7.8, 50 mM NaCl, and 1 mM EDTA with the complementary strand in 20% excess (24 µM) by heating the mixture to 98°C for 1 minute and cooling to 4°C over 3 hours in a thermal cycler.
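For the annealing step above, the unlabeled strand is the reverse complement of the 17-nt promoter and is used at a 20% molar excess. A minimal sketch of both calculations follows; the promoter sequence and concentrations are from the text, while the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def with_excess(concentration_um, excess_fraction):
    """Concentration of the strand supplied in molar excess."""
    return concentration_um * (1.0 + excess_fraction)

T7_PROMOTER = "TAATACGACTCACTATA"  # SEQ ID NO. 17, positions -17 to -1

print(reverse_complement(T7_PROMOTER))  # unlabeled strand, 5' to 3'
print(with_excess(20.0, 0.20))          # complementary strand, in µM
```

Supplying the unlabeled strand in excess helps ensure that essentially all of the fluorescein-labeled strand ends up in duplex form, which is the species the polarization assay is meant to report on.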
- Annealed primer was diluted to 2 nM in a buffer containing 33 mM HEPES, pH 7.8, 50 mM NaCl, and 10 mM MgCl2. Protein was exchanged into 33 mM HEPES, pH 7.8, 50 mM NaCl, and 10 mM MgCl2 using 7 kDa Zeba 0.5 mL spin columns per manufacturer's protocol. Protein concentration was determined using A280 nanodrop as well as Bradford assay. Protein was serially diluted 22 times in 2x dilutions (1:1) or 1.5x dilutions (2:1) in the same buffer.
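The titration series described above can be tabulated before pipetting. The sketch below assumes that "diluted 22 times" means the starting concentration plus 22 successive dilutions (23 points in total) and that protein is then mixed 1:1 with the 2 nM DNA, which is consistent with the 1 nM final DNA concentration given in the text; the 2000 nM starting protein concentration is a made-up example value, and the function names are hypothetical.

```python
def dilution_series(start_conc, dilution_factor, n_dilutions=22):
    """Starting concentration followed by n successive serial dilutions."""
    if dilution_factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return [start_conc / dilution_factor ** i for i in range(n_dilutions + 1)]

def after_equal_volume_mix(concentrations):
    """Concentrations after mixing 1:1 with an equal volume of DNA solution."""
    return [c / 2.0 for c in concentrations]

# Hypothetical 2000 nM protein stock, 2x and 1.5x series.
two_fold = after_equal_volume_mix(dilution_series(2000.0, 2.0))
one_and_half_fold = after_equal_volume_mix(dilution_series(2000.0, 1.5))
print(f"2x series: {two_fold[0]:.0f} nM down to {two_fold[-1]:.2e} nM")
print(f"1.5x series: {one_and_half_fold[0]:.0f} nM down to {one_and_half_fold[-1]:.2e} nM")
```

The 2x series spans a much wider concentration range, while the 1.5x series places more points near a dissociation constant in the tens of nanomolar.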
- DNA final concentration 1 nM, final volume 20 /zL per well
- Fluorescence polarization was measured using a Tecan Spark plate reader with automated settings for FAM-fluorescein channels (excitation wavelength of 483 nm; excitation bandwidth of 20 nm; emission wavelength of 529 nm; emission bandwidth of 20 nm; automatic gain; mirror: automatic -dichroic 510; 30 flashes; integration time: 40 ⁇ s; settling time: 100 ms; z-position: 20000 /zm; G- factor calibrated from a buffer blank and reference sample of 1 nM fluorescent DNA in buffer).
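The protein titration described above can be sketched in a few lines. This is a minimal illustration, not the applicants' protocol: the top protein concentration is a hypothetical value, the choice of 22 points is one reading of "diluted 22 times", and equal-volume mixing of protein and 2 nM DNA is an assumption inferred from the stated 1 nM final DNA concentration.

```python
def dilution_series(top_conc_nm, n_points=22, factor=2.0):
    """Return concentrations of a serial dilution, highest first.

    factor=2.0 corresponds to the 1:1 (2x) scheme, factor=1.5 to the 2:1 (1.5x) scheme.
    """
    if factor <= 1.0:
        raise ValueError("dilution factor must be greater than 1")
    return [top_conc_nm / factor ** i for i in range(n_points)]


def well_concentrations(protein_series_nm, dna_stock_nm=2.0, final_volume_ul=20.0):
    """Final concentrations after mixing equal volumes of protein and DNA (assumed)."""
    half_volume_ul = final_volume_ul / 2.0
    wells = []
    for protein_nm in protein_series_nm:
        wells.append({
            "protein_nM": protein_nm * half_volume_ul / final_volume_ul,
            "dna_nM": dna_stock_nm * half_volume_ul / final_volume_ul,
        })
    return wells


if __name__ == "__main__":
    # Hypothetical 4000 nM top concentration, 2x dilutions.
    series = dilution_series(4000.0, n_points=22, factor=2.0)
    for well in well_concentrations(series)[:3]:
        print(well)
```

Under these assumptions every well contains 1 nM DNA, and the protein concentration in each well is half that of the corresponding dilution tube.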
Abstract
The present application relates to mutated RNA polymerase (RNAP) enzymes from bacteriophage T7 that are post-translationally controlled by one or more heterocyclic ligands, such as indole, an indole derivative, or an indole-similar compound. A preferred mutant includes a T7 RNAP with a mutation at position 727, and more preferably a thermostable T7 RNAP with a tryptophan to glycine substitution at position 727 of the amino acid sequence.
Description
LIGAND DEPENDENT POST-TRANSLATIONAL CONTROL OF ENGINEERED RNA POLYMERASE (RNAP) MUTANTS FROM BACTERIOPHAGE T7
CROSS REFERENCE TO RELATED APPLICATION
This International PCT application claims the benefit of and priority to U.S. Provisional Application No. 63/566,882, filed March 18, 2024, the specification, claims and drawings of which are incorporated herein by reference in their entirety.
STATEMENT OF GOVERNMENT INTEREST
This invention was made with government support under grant numbers 2030221 and 2218330 awarded by the National Science Foundation. The government has certain rights in the invention.
SEQUENCE LISTING
The contents of the electronic sequence listing (90245.01161- Sequence-Listing.xml; Size: 54,044 bytes; Date of Creation: March 17, 2025) are herein incorporated by reference in their entirety.
TECHNICAL FIELD
The present application is directed to engineered proteins, and specifically to engineered RNA polymerases from bacteriophages subject to small-molecule post-translational control.
BACKGROUND
T7 RNA polymerase (RNAP) is a monomeric bacteriophage-encoded DNA-directed RNA polymerase that catalyzes the formation of RNA in the 5’ to 3’ direction. In the process of transcription initiation, T7 RNA polymerase recognizes a specific promoter sequence (i.e., the T7 promoter). The conformation of the N-terminal domain changes between the initiation and elongation phases of the functioning enzyme. Many biotechnological processes rely on high-fidelity transcription of RNA transcripts from DNA templates.
T7 RNAP has found widespread utility throughout microbiology and biological engineering workflows. This is in part due to its simplicity: a single gene encodes a large 883-amino-acid protein with predictable control, as it transcribes from a well-characterized promoter sequence that is orthogonal to native promoter sequences throughout the tree of life. Cloning and expression of the gene encoding T7 RNA polymerase are well known in the art (See e.g., U.S. Pat. No. 4,952,496). Due to its promoter specificity and high RNA polymerase activity, T7 RNAP has been used for various applications. It is also useful for the high-level expression of recombinant genes in E. coli. T7 RNAP is also used in various nucleic acid amplification methods, including those used in diagnostic methods. Stability and thermostability are often important considerations in the development of components of diagnostic methods (See U.S. Pat. Nos. 9,193,959, 8,551,752, and 7,507,567, etc.).
In one specific example, in vitro transcription (IVT) uses bacteriophage DNA-dependent ribonucleic acid (RNA) polymerases to synthesize template-directed mRNA transcripts. Problems in the IVT reaction can result in complete failure (e.g., no transcript generated) or in transcripts that are the incorrect size (e.g., shorter or longer than expected). Specific problems associated with IVT reactions include, for example, abortive (truncated) transcripts, run-on transcripts, polyA tail variants/3’ heterogeneity, mutated transcripts, and/or double-stranded contaminants produced during the reactions.
T7 RNAP has further been engineered to be thermally stable, to recognize different promoters in an orthogonal manner, to function as a split enzyme, and to control various genetic systems. This enables rapid development of T7 RNAP for non-native functions, such as de novo allostery. As a result, T7 RNAP is important to the downstream transcriptional control of a number of in vitro and in vivo processes. However, control of T7 RNAP itself is limited to traditional genetic-based expression control elements, such as promoters and inducers, which cannot be applied to the downstream control of the enzyme. As such, there exists a long-felt need for a post-translational control system for T7 RNAP for both in vitro and in vivo applications of the same.
SUMMARY OF THE DISCLOSURE
The present application relates to engineered T7 RNA polymerase (RNAP) enzymes from bacteriophage that are post-translationally controlled by one or more heterocyclic ligands, such as indole and indole-containing compounds. A preferred mutant includes a T7 RNAP with a mutation at position 727, and more preferably a T7 RNAP with a tryptophan to glycine, alanine, or valine substitution at position 727 of the amino acid sequence.
In one preferred aspect, the application relates to an engineered RNAP comprising a polypeptide sequence having at least between 80% to 99%, or more, sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, having one or more mutations that cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In a preferred aspect, the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 430, 433, 633, 727, 849, and 880, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from S430, N433, S633, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from S430P, N433T, S633P, W727G, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 781, 785, 786, 845, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, N781, S785, Q786, F845, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 778, 785, 786, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from K378, S430, N433, S633, W727, Q737, I778, S785, Q786, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to an indole derivative, such as indole-5-carboxyaldehyde (I-5-CHO), or an indole-similar compound, such as indoline, quinoline, and isoquinoline, as compared to an RNA polymerase lacking said mutation.
In a preferred aspect, the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 737, 781, 785, 786, and 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from Q737, N781, S785, Q786, and F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from Q737L, N781S, S785A, Q786H, and F845L, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 737, 778, 781, 782, 785, 786, and 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, and F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, and F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In another preferred aspect, the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 737, 781, 785, 786, and 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from Q737, N781, S785, Q786, and F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from Q737L, N781S, S785A, Q786H, and F845L, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 737, 778, 781, 782, 785, 786, and 845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, and F845, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
In still further embodiments, the engineered RNA polymerase can include a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, and F845C, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In another aspect, the engineered RNA polymerase further comprises a MIG substitution. In another preferred aspect, the engineered RNA polymerase comprises an N-terminal degron sequence.
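The substitution notation used in the aspects above (wild-type residue, 1-based position in the reference sequence, replacement residue, as in W727G) can be illustrated with a short sketch. The function name and the five-residue toy reference are hypothetical and serve only to show the numbering convention; the actual reference sequences are those of SEQ ID NO’s. 1 and 3 in the sequence listing.

```python
import re

# Pattern for a single substitution such as "W727G".
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference, substitutions):
    """Apply substitutions written as <wild-type><position><new>, numbered from 1."""
    residues = list(reference)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference sequence")
        # Check against the unmodified reference so numbering errors are caught early.
        if reference[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: reference has {reference[position - 1]} at position {position}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    toy_reference = "MKWSF"  # hypothetical sequence, not a fragment of T7 RNAP
    print(apply_substitutions(toy_reference, ["W3G", "F5Y"]))
```

Checking the wild-type residue before substituting is a design choice that guards against applying a mutation list to a sequence with a different numbering offset.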
Additional aspects of the disclosure will become apparent from the figures, claims, and specification provided below.
BRIEF DESCRIPTION OF FIGURES
Figures 1A-H. Development of a ligand-activatable RNA polymerase responsive to indole. (A) Chemical recovery of function approach for T7 RNAP. Mutating a buried tryptophan in T7 RNAP disrupts the ability of the RNAP to transcribe DNA to RNA. This activity is recovered in the presence of indole. (B) In vitro transcriptional assays using Peppers aptamer in the absence or presence of 1 mM indole for the indicated variants. TS-W727G is also known as LARPV1. (C) Schematic of the bacterial-1-hybrid selection system developed for identification of LARPs. The selection enables positive and negative selection. (D) PSERM scores for library variants for indole vs selection on a solvent control. The Pareto front of high score in the presence of indole and low score for the solvent contains variants (large green circle for LARP-I, black circles for other tested variants) predicted to have improved or maintained indole-responsive growth with significantly less constitutive activity compared to the starting construct (LARPV1, orange larger circle). The sequence profile above the plot contains the LARP-I and LARPV1 specific mutations relative to the wild-type T7 RNAP. (E) Specific growth rates of LARP constructs in the presence and absence of 50 µM indole under growth in selective minimal media without histidine supplementation and with 1 mM 3-AT. (F,G) In vitro transcriptional assay using the Peppers aptamer with indicated indole concentrations for LARP-I, LARPV1, and TS. Panel F shows activity represented as RFU/min, and panel G shows a relative activity for each variant normalized to one in the absence of indole. (H) In vitro transcriptional activity time course. A delayed spike of 500 µM indole shows rapid activation of LARP-I transcription. Statistics and p-values: **: p-value < 0.01, ****: p-value < 0.0001; ns: not significant (p-value > 0.05). RFU/min given as mean and standard deviation (B,F), Welch ANOVA using Brown-Forsythe and corrected for multiple comparisons; specific growth rate given as mean and 95% C.I., Welch’s unpaired two-tailed t-test and Tukey’s post hoc test. Relative activity (G) reported as mean with SEM error bars; the RFU in the delayed indole trace (H) is given as a range.
Figures 2A-H. LARP-I allows indole control of gene expression exogenously, endogenously, and intercellularly. (A) Cartoon of predicted results of experiment with external addition of indole. Exogenous indole added to E. coli containing LARP-I and with the tryptophanase gene knocked out enables ligand-dependent gene expression measured by expression of a fluorescent reporter. (B) GFP RFU normalized by cell density as a function of supplemented indole concentration. Positive control represents an expression strain using T7 RNAP R632S, and negative control expresses the catalytic knockout T7 RNAP R632S/Y639A. Error bars represent 1 s.e.m., n > 3. (C) Cartoon of predicted results using an E. coli strain capable of producing indole. E. coli naturally expresses tryptophanase and produces indole through tryptophan metabolism, which accumulates. (D) GFP RFU normalized by cell density for indicated strains in the presence and absence of 125 µM indole. Positive control (T7 RNAP R632S) and negative control (T7 RNAP R632S/Y639A) are the same as in panel B. Endogenous activation of LARP-I is similar to exogenous activation at 125 µM indole addition in direct comparison of strains with and without the tnaA knockout. Error bars represent 1 s.d., n > 3. (E) Predicted effects of the expression of a single gene lysin (SglKU1) under a T7 promoter with co-expression of LARP-I. Indole activates LARP-I, leading to cell lysis. (F) OD600 vs time after induction of the culture with 500 µM indole. Ethanol was used as a vehicle control for the 0 µM control. Indole-dependent cell lysis of E. coli expressing SglKU1 occurs within 45 min. Error bars for 0 µM indole represent s.e.m., n = 3. The three biological replicates are plotted for the 500 µM indole experiment. (G) Cartoon of expected activation of receiver strain by the sender strain. (H) Confocal microscopy of representative cocultures; panels to the right show RFU/OD600 bulk population measurements for the cocultures shown by microscopy. Positive and negative T7 RNAP controls are the same as in panels B and D.
Figures 3A-C. LARP-I controls T7 bacteriophage viability and propagation in trans. (A) Cartoon of phage experiment with the phage RNAP supplied in trans. Bacteriophage T7 Δgp1 (ΔT7 RNAP) requires active T7 RNAP to propagate. Bacteria containing LARP-I do not propagate phage in the absence of indole, but yield robust phage infection in the presence of indole. (B) Phage plaque formation, and quantification as plaque forming units (PFU), for different T7 phage, T7 RNAPs, and indole concentrations. (C) OD600 vs time after infection for indicated combinations of phage and indole for LARP-I expressing E. coli MG1655 ΔtnaA. Two different multiplicities of infection (MOI) are shown. Phage infection kinetics are indole-dependent in kill-curve assays, approaching WT phage infection rates at 500 µM indole. Error bars represent 1 s.d. (n = 3).
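The multiplicity of infection used in these kill-curve experiments is defined in the methods as PFU per cell, assuming 5E8 E. coli per OD-mL, with 4 µL of phage dilution added per well. The following is a minimal sketch of that arithmetic. The OD600 at infection and the culture volume used in the example are assumptions chosen for illustration, and the function names are hypothetical.

```python
def pfu_for_target_moi(od600, culture_volume_ml, moi, cells_per_od_ml=5e8):
    """PFU required to reach a target MOI for a culture of given OD600 and volume."""
    cells = od600 * culture_volume_ml * cells_per_od_ml
    return moi * cells


def required_titer_pfu_per_ml(pfu, added_volume_ul=4.0):
    """Titer of the phage dilution so that added_volume_ul delivers the given PFU."""
    return pfu / (added_volume_ul / 1000.0)


if __name__ == "__main__":
    # Assumed example: OD600 of 0.1 in 0.2 mL, at the two MOIs used in the experiments.
    for moi in (0.5, 5.0):
        pfu = pfu_for_target_moi(od600=0.1, culture_volume_ml=0.2, moi=moi)
        print(moi, pfu, required_titer_pfu_per_ml(pfu))
```

For the assumed culture, an MOI of 0.5 corresponds to about 5E6 PFU per well, which in turn requires a phage dilution of about 1.25E9 PFU/mL when 4 µL is added.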
Figures 4A-G. The versatility and orthogonality of LARPs are demonstrated for different bacteria, different promoter sequence specificities, and for different controlling ligands. (A) Cartoon of experimental construction of P. putida LARP constructs. (B) GFP mean fluorescence intensity (MFI) as a function of indole for LARP-I, positive control (T7 RNAP R632S), and negative control (T7 RNAP R632S/Y639A). Error bars represent 1 s.e.m. of n = 3. (C) Hybrid LARPs with engineered polymerase specificity loops. Sequence variation at the specificity loops for LARP-I-pCGG, LARP-I-pCTGA, and LARP-I-pN4 is shown relative to the original LARP-I, which has activity on the canonical pT7 promoter sequence. (D) Selection of E. coli strains in the presence and absence of 100 µM indole. Strains contain the indicated hybrid polymerases and promoters driving URA3 and HIS3 expression. Positive selection conditions are growth on media lacking histidine and supplemented with 1 mM 3-AT, while negative selection uses complete growth media containing 1 mM 5-FOA. (E) PSERM scores of library variants from selection on indole-5-carboxyaldehyde (I-5-CHO) vs indole. Larger symbols represent indicated variants. Sequence differences between variants are shown. (F) Specific growth rate on media lacking histidine and supplemented with 1 mM 3-AT. Indole and I-5-CHO are included at 50 µM. Statistics and p-values: **: p-value < 0.01, ****: p-value < 0.0001. (G) E. coli USO ΔtnaA with an sfGFP reporter plasmid driven by pT7 and different T7 RNAPs (positive control: T7 RNAP R632S; negative control: T7 RNAP R632S/Y639A). GFP RFU normalized to OD600 as a function of I-5-CHO concentration for the indicated strains at 22 h post induction. Error bars represent 1 s.d. (n = 3).
Figures 5A-B. Overview of in vitro transcriptional activity assay. (A) A transcriptional assay is assembled using a dsDNA sequence encoding an aptamer (i.e. Peppers or Spinach) under the pT7 canonical promoter sequence TAATACGACTCACTATA (SEQ ID NO. 17) in solution with reaction buffer, rNTPs, and the non-fluorescent aptamer ligand (i.e. HBC530 or DFHBI). The reaction is initiated with the addition of T7 RNA polymerase, at various ligand concentrations. The RNA polymerase transcribes the RNA aptamer sequence, which binds to and constrains a chromophore, resulting in fluorescence. (B) The assay is performed in 100 µL reaction volume in 96-well plates and the output is read via a plate reader with the corresponding excitation wavelength and emission spectra (HBC530: 485 nm/530 nm; Spinach 24-2: 469 nm/501 nm). Fluorescence over time is used to define activity [RFU/min].
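Activity defined as fluorescence over time [RFU/min] can be estimated as the slope of a fluorescence time course. The sketch below uses an ordinary least-squares line, which is an assumption: the source does not state how the slope was computed or which time window was used. The function names and example readings are hypothetical.

```python
def activity_rfu_per_min(times_min, rfu):
    """Least-squares slope of fluorescence (RFU) against time (min)."""
    n = len(times_min)
    if n != len(rfu) or n < 2:
        raise ValueError("need at least two paired time and RFU readings")
    mean_t = sum(times_min) / n
    mean_f = sum(rfu) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_min, rfu))
    return sxy / sxx


def fold_change(activity_with_ligand, activity_solvent_control):
    """Activity with ligand relative to the solvent control."""
    return activity_with_ligand / activity_solvent_control


if __name__ == "__main__":
    # Hypothetical plate-reader traces, not data from the application.
    times = [0.0, 1.0, 2.0, 3.0]
    with_indole = activity_rfu_per_min(times, [10.0, 15.0, 20.0, 25.0])
    solvent = activity_rfu_per_min(times, [10.0, 11.0, 12.0, 13.0])
    print(with_indole, solvent, fold_change(with_indole, solvent))
```

The same ratio underlies the fold-change and normalized-activity values reported in the other figure descriptions, where each variant is compared to its own activity without ligand.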
Figures 6A-C. Initial glycine scan of 15 rationally selected tryptophan residues, expression and purification results. (A) Soluble Fraction and (B) Eluate from mutants of T7 RNAP harboring the given mutations. WT: N-terminal His6 T7 RNAP R378K. (C) Fold-change in activity of initially purified variants in the presence of 2 mM indole relative to the solvent control (ethanol) revealed no variants with positive indole modulation.
Figures 7A-B. Computationally predicted stability of glycine scan mutants. Two web-based servers, (A) SDM2 and (B) PoPMuSiC2.0, were used to ascertain the stability of tryptophan to glycine mutants in the soluble (n=6) and insoluble (n=9) groups from expression results. Five PDB structures were used as inputs, and the p-value for each PDB comparison is listed. For both SDM2 and PoPMuSiC2.0, the insoluble group had a higher destabilizing value than did the soluble group, which was statistically significant (p<0.05) for seven of the ten tests. P values were computed by multiple unpaired t tests assuming a single pooled variance and no correction for multiple comparisons.
Figures 8A-C. Identification of initial LARPs using a thermally stable in vitro glycine scan. A. Expression of thermally stabilized T7 RNAP TS (S430P, N433T, S633P, F849I, F880Y) and TS W287G. B. Fold-change in activity of tryptophan to glycine mutants in the thermally stabilized background in the presence of 1 mM indole relative to the solvent control (ethanol) revealed two variants - W287G & W727G (LARPV1) - which had ~>2x increase in activity at 1 mM indole. Individual data points indicate biological replicates. C. Dose-response curve for TSW287G and TS. Activity is reported normalized to 1 for each variant in the absence of indole.
Figures 9A-C. Activity of LARPv1 (TS-W727G). A. SDS-PAGE Protein Gel of purified wild-type (WT), thermally stable (TS), TS-W727G (LARPv1), and LARP-I at 20 µg, 5 µg, and 2 µg. B. Dose-response curve for TS-W287G and TS. Activity is reported normalized to 1 for each variant in the absence of indole.
Figures 10A-B. Generation of E. coli USO pyrF-/hisB- ΔtnaA by CRISPR-Cas9. A CRISPR-Cas9 dsDNA break and homology directed recombination protocol utilizing temperature-sensitive plasmids was used to generate a knock-out of the gene encoding the tryptophan degrading enzyme tryptophanase (tnaA). A gRNA targeting the tnaA coding sequence and a linear fragment of dsDNA with homology arms surrounding the tnaA coding sequence were used to remove a large section of the E. coli genome in the selection strain USO pyrF-/hisB-. (A) Successful knockout of the tnaA gene was confirmed through colony PCR of the tnaA gene. Three pairs of primers amplifying the genomic region of tnaA subject to the ~1032 bp genomic knockout were tested. Region A: 191 bp in WT, N/A in tnaA KO; Region B: 1452 bp in WT, 420 bp in tnaA KO; Region C: 1156 bp in WT, 134 bp in tnaA KO. (B) A phenotypic test using Kovac’s reagent on saturated cultures confirmed indole is not produced in the ΔtnaA strain. Kovac’s reagent reacts with indole, producing a brilliant purple color.
Figure 11. Optimized T7 positive and negative selection assay affords indole-dependent growth for LARPv1. The left panel represents growth in planktonic culture of E. coli USO pyrF-/hisB- ΔtnaA expressing LARPv1 and HIS3/URA3 driven by a strong T7 promoter. Open and closed symbols represent different biological replicates. + indole is a condition with 100 µM indole. The right panel represents growth on solid agar selective media with and without the indicated concentration of indole.
Figures 12A-I. PSERM Scoring. (A-C) Position Specific Amino Acid Enrichment Ratio Score Heat Map, (D-F) PSERM score distribution for library variants, (G-I) relevant PSERM scores plotted against each other for (A) Solvent control, (B) Indole, and (C) Indole-5-carboxyaldehyde Selection Results. PSERM scores for each variant result from the additive score of constituent mutations pictured for Solvent (A), Indole (B) and Indole-5-carboxyaldehyde (C). WT residues are flagged with an asterisk (*). The top 2% of library variants in the ligand selections (E, F) have a positive PSERM score. Plotting the scores against each other reveals Pareto fronts in the Solvent vs Ligand plots (G, H) and high correlation between Indole and Indole-5-carboxyaldehyde scores (I). Larger closed symbols represent hits tested and variants described in the main text.
Figure 13. Individual Growth Rates of LARP hits in selective media (histidine-deficient, 1 mM 3-AT) measured in planktonic growth assays in presence and absence of indole. p-values: ****: p-value<0.0001; ***: p-value<0.001; **: p-value<0.01; *: p-value<0.05. Amino acid sequences of tested LARPs are shown in Figure 20.
Figure 14. In vitro Ligand Specificity for LARP-I. Transcriptional activity of LARP-I in ligand titrations of indole, indoline, quinoline, and isoquinoline (n=1). Structures of the indole-similar heterocycles are provided at the right.
Figure 15. Indole-dependent SglKU1 expression and lysis in LARP-I containing E. coli. (A) Replicate 1 lysis curves (OD600 vs time post-induction in hours), mean and standard error of the mean of biological replicates (n=3), for negative controls (LARP-I, no SglKU1; catalytically dead T7 RNAP R632S;Y639A, SglKU1) and LARP-I, SglKU1 with 0 µM indole (empty symbols) or 500 µM indole (filled symbols). (B) Replicate 2 lysis curves (OD600 vs time post-induction in hours), mean and standard error of the mean of biological replicates (n=3), for negative controls (LARP-I, no SglKU1; catalytically dead T7 RNAP R632S;Y639A, SglKU1) and LARP-I, SglKU1. (C) Replicate 2 experimental group (LARP-I + SglKU1) with 0 µM indole (empty symbols, mean and SEM, n=3) or 500 µM indole (filled symbols, individual traces).
Figures 16A-C. Development of LARP-I expressing P. putida strains. A. Cultures of E. coli DH5α, P. putida mt-2, and the mt-2 derivatives KT2440 and AG4775 were grown overnight in LB medium before addition of Kovac’s reagent. The cherry red color of the Kovac’s reagent layer indicates the presence of indoles, as seen for the positive control strain E. coli DH5α. B. OD600 vs. indole concentration for three P. putida strains expressing different T7 RNAP constructs. Cultures were grown aerobically in LB overnight, then back-diluted to an OD600=0.0 into indole-containing media at the concentration indicated. OD600 was measured 4 hours after indole addition. Higher (500, 1000 µM) indole concentrations lead to impaired growth. Error bars represent 1 s.e.m. (n=3). C. Flow cytometry gating strategy and sample GFP expression histograms in the absence and presence of indole. Data shown are for the P. putida LARP-I expression strain at 0 and 500 µM indole.
Figure 17. LARP-I5CHO Phage Infection. Phage kill-curves performed with T7 phage (uninfected control, wt control and T7 Δgp1) infecting MG1655 ΔtnaA containing LARP-I5CHO with increasing indole-5-carboxyaldehyde (left) or indole (right) concentrations at multiplicity of infection of 0.5 (top) and 5.0 (bottom). Error bars represent 1 SEM (n=3).
Figures 18A-B. Fluorescence anisotropy of purified TS and LARP-I at 1 nM DNA. LARP-I and TS control, with 500 µM indole or solvent control, were titrated against 1 nM of fluorescently labeled pT7 double-stranded DNA incubated at 25°C (A) and 37°C (B). Values are the means of three technical replicates fit to a 1:1 binding isotherm. Oligonucleotides from IDT: 6FAM fluorescein fluorophore conjugated to the coding (forward) strand, pT7: -17:-1, TAATACGACTCACTATA (SEQ ID NO. 17), annealed to the reverse complement sequence to form double-stranded DNA.
Figures 19A-C. Circular Dichroism Spectroscopy of LARP-I. (A) Full spectra of LARP-I with and without 500 µM indole show no difference relative to the TS background; traces are manually offset around (+ for TS, - for LARP-I without indole) for comparison. Circular dichroism temperature ramp of purified LARP-I in the presence or absence of 500 µM indole measuring ellipticity at (B) 222 nm and (C) 208.5 nm. LARP-I shows a decreased Tm,app around 39°C as measured by the maximum of the derivative of the change in ellipticity (inset). (C) At 208.5 nm LARP-I shows two distinct peaks in the derivative of the ellipticity in the absence of indole (29°C, 39°C), which are less pronounced in the presence of 500 µM indole, suggesting differential stabilization of other secondary structure elements.
Figure 20. Sequences and properties of tested LARP constructs. Growth rate data indicate that the constructs are inducible under plate conditions and were tested in planktonic culture shown in Fig 13. Blank indicates whether the variant grew on selective solid media in the absence of ligand (in all cases +: yes, -: no). Negative selection indicates whether the variant grew on counter selective solid media in the presence of 1 mM 5-FOA. Positive selection indicates whether the variant grew under positive selection conditions on selective media in the presence of indoles. The tested variants all had Positive: +; Negative: +; Blank: - (gray shading).
Figure 21. All ligands tested in plate-based selection screening for new LARPs.
Figure 22. Fitting results for 1:1 binding isotherm for fluorescence anisotropy of TS and LARP-I.
DETAILED DESCRIPTION OF THE DISCLOSURE
The embodiments herein and the various features and details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted to avoid unnecessarily obscuring the embodiments herein. Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein can be practiced and to further enable those skilled in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
The present application relates to mutated RNA polymerases from bacteriophage whose activity is responsive to indole, or an indole derivative as shown below:
[Structures: indole; indole-5-carboxyaldehyde]
The present application also relates to mutated RNA polymerases from bacteriophage whose activity is responsive to indole, or to an indole-similar compound such as indoline, quinoline, and isoquinoline.
One example of a bacteriophage-encoded RNA polymerase is the T7 RNA polymerase. T7 is a bacteriophage capable of infecting E. coli cells and other bacterial species.
In one embodiment, the engineered T7 RNAP of the disclosure comprises a mutation at position 727, and more preferably is a T7 RNAP with a tryptophan to glycine, alanine, or valine substitution at position 727 of the amino acid sequence.
In one preferred aspect, the application relates to an engineered RNAP comprising a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, having one or more mutations that causes the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred aspect, the engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and even more preferably a substitution mutation W727G, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 430, 433, 633, 727, 849, and 880, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution set at a position selected from S430, N433, S633, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from S430P, N433T, S633P, W727G, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 781, 785, 786, 845, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution at a position selected from K378, S430, N433, S633, W727, Q737, N781, S785, Q786, F845, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In another preferred aspect, the engineered RNA polymerase can further include a substitution set at a position selected from 378, 430, 433, 633, 727, 737, 778, 785, 786, 849, and 880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution at a position selected from K378, S430, N433, S633, W727, Q737, I778, S785, Q786, F849, and F880, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. In still further embodiments, the engineered RNA polymerase can include a substitution selected from K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, and F880Y, and/or any combinations thereof, wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO’s. 3 or 1. All of the above mutations cause the activity of the engineered RNAP to be responsive to the indole derivative indole-5-carboxyaldehyde (I-5CHO) or an indole-similar compound, such as indoline, quinoline, and isoquinoline, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from 737, 781, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution at a position selected from Q737, N781, S785, Q786, F845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution selected from Q737L, N781S, S785A, Q786H, F845L, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from 737, 778, 781, 782, 785, 786, 845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution set at a position selected from Q737, I778, N781, F782, S785, Q786, F845, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, and/or any combinations thereof wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
In one embodiment, the engineered RNAP includes a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO’s. 1 or 3, or a functional fragment thereof, wherein said engineered RNA polymerase comprises a substitution at position 727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation. In a preferred embodiment, the substitution at position 727 includes a substitution mutation selected from W727G, W727A, or W727V, and/or any combinations thereof. In another preferred embodiment, the engineered RNAP includes substitution at position 727 selected from W727G, W727A, or W727V, and an additional mutation set selected from the following groups:
- Q737L, S785A, Q786H, F845Y;
- Q737C, I778L, S785A, Q786H, F845L;
- Q737C, S785A, Q786H, F845Y;
- Q737C, I778L, S785A, Q786H;
- Q737L, N781S, S785A, Q786H;
- Q737L, N781H, S785A, Q786H, F845L;
- Q737L, S785A, Q786H, F845L;
- Q737C, S785A, Q786H;
- Q737C, S785A, Q786H, F845C;
- Q737F, S785A, Q786H, F845L;
- Q737L, N781H, S785A, Q786H;
- Q737V, N781H, S785A, Q786H, F845L;
- Q737I, N781S, S785A, Q786H, F845L;
- Q737H, N781S, S785A, Q786H, F845L;
- N781S, S785A, Q786H, F845L;
- Q737L, F782Y, S785A, Q786H, F845L;
- Q737L, I778V, S785A, Q786H, F845L;
- Q737C, S785A, Q786H, F845L;
- S430P, N433T, S633P, W727G, F849I, F880Y;
- K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, F880Y; and
- K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, F880Y;
wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 3, and any of the above mutations cause the activity of the engineered RNAP to be responsive to indole, or an indole derivative, such as indole-5-carboxyaldehyde (I-5CHO), or an indole-similar compound, such as indoline, quinoline, and isoquinoline, as compared to an RNA polymerase lacking said mutation. In an alternative embodiment, any of the above substitutions can include a conservative amino acid substitution.
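The substitution notation used in the groups above (wild-type residue, 1-based position, replacement residue, e.g. W727G) can be illustrated with a minimal Python sketch. The function names and the placeholder reference are assumptions made for illustration only; the placeholder is not SEQ ID NO. 3, and the sketch says nothing about whether a given variant is ligand responsive.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement (e.g. "W727G").
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'W727G' into (wild_type, position, replacement)."""
    match = SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(reference, labels):
    """Return a copy of `reference` carrying every substitution in `labels`.

    Positions are numbered from 1 with respect to the reference sequence.
    """
    residues = list(reference)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        index = position - 1
        if residues[index] != wild_type:
            raise ValueError(
                f"{label}: reference has {residues[index]} at position {position}"
            )
        residues[index] = replacement
    return "".join(residues)


def placeholder_reference(labels, length=900, filler="A"):
    """Hypothetical stand-in for a reference sequence (NOT SEQ ID NO. 3).

    It only places the expected wild-type residue at each named position so
    that the example runs without the real sequence listing.
    """
    residues = [filler] * length
    for label in labels:
        wild_type, position, _ = parse_substitution(label)
        residues[position - 1] = wild_type
    return "".join(residues)


mutation_set = ["S430P", "N433T", "S633P", "W727G", "F849I", "F880Y"]
reference = placeholder_reference(mutation_set)
variant = apply_substitutions(reference, mutation_set)
print(reference[726], "->", variant[726])
```

In practice the placeholder would be replaced by the actual reference sequence, and the wild-type check would then catch any label whose numbering does not match that reference.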
In another embodiment, the disclosure provides for an engineered RNA polymerase (RNAP) polypeptide, or functional fragment thereof, selected from SEQ ID NO’s. 19-21, or a functional fragment thereof, wherein the engineered RNAP is responsive to indole or indole-5-carboxyaldehyde (I-5CHO).
In another embodiment, the disclosure provides for a nucleotide sequence encoding an engineered RNA polymerase (RNAP) selected from SEQ ID NO’s. 19-21, or functional fragment thereof, wherein the engineered RNAP is responsive to indole or indole-5-carboxyaldehyde (I-5CHO).
In another embodiment, the disclosure provides for an expression vector comprising the nucleic acid molecule according to SEQ ID NO’s. 19-21, or functional fragment thereof, operably linked to an expression control sequence.
In another embodiment, the disclosure provides for a prokaryotic or eukaryotic cell transformed by the expression vector encoding a nucleic acid molecule according to SEQ ID NO’s. 19-21, or functional fragment thereof, and capable of expressing the engineered RNA polymerase. In some embodiments, the cell is further modified to disrupt or knock out one or more genes directed to the production of indole or one of its derivatives or an indole-similar compound.
In another embodiment, the application describes a nucleic acid molecule encoding the RNAP, or a functional fragment thereof, having one or more mutations that causes the activity of the engineered RNAP to be responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation, and its use in an in vitro or in vivo system, such as an engineered cell, in vitro transcription and translation systems, a diagnostic device, or a pharmaceutical composition. The post-translational control and activation of the RNAP via indole, an indole derivative, or an indole-similar compound provides additional control of transcription and downstream gene interactions, for example in complex in vivo or in vitro systems. Moreover, the wild type activity of T7 RNAP can be toxic to cells such that the reduced activity of the RNAP described herein can ameliorate the overall toxicity to cells while providing additional post-translational ligand control over its use.
In one embodiment, the nucleic acid molecule encoding the RNAP, or a functional fragment thereof, can be operably linked to an expression control sequence forming an expression vector that can further be used to transform a cell capable of expressing the engineered RNA polymerase. In a still further embodiment, the cell, which can preferably include a bacterium, and more preferably an E. coli bacterium, can be further modified to disrupt or knock out one or more genes directed to the production of indole, an indole derivative, or an indole-similar compound. In a preferred embodiment, the gene that has been knocked out or disrupted includes tryptophanase (TnaA). Methods of disrupting the expression of a gene, or generating a gene knock-out, are known in the art, and can include various homologous recombination methods, RNAi constructs, as well as endonuclease-based systems such as CRISPR or other such systems.
In another embodiment, the application describes a method of amplifying a target nucleic acid in a sample in an isothermal transcription based nucleic acid amplification reaction, comprising contacting the sample with a primer pair comprising a first promoter-oligonucleotide and a second oligonucleotide for amplification of the target nucleic acid; an effective amount of indole, an indole derivative, or an indole-similar compound, and the T7 RNA polymerase according to any of claims 1-29, under conditions whereby the isothermal transcription based nucleic acid amplification reaction can occur to amplify the target nucleic acid.
In another embodiment, the application describes an enzyme mixture for use in an isothermal transcription based nucleic acid amplification reaction comprising an engineered T7 RNA polymerase according to the description herein; an enzyme having reverse transcriptase activity and optional RNase H activity; and an effective amount of indole, an indole derivative, or an indole-similar compound.
In another embodiment, the application describes a method of inducing expression in a cell, comprising the steps of transforming a cell with the expression vector as described herein encoding an engineered RNAP, wherein the cell is genetically modified such that it does not endogenously produce indole, an indole derivative, or an indole-similar compound, and introducing an effective amount of exogenous indole, an indole derivative, or an indole-similar compound, wherein the exogenous indole, indole derivative, or indole-similar compound activates the RNAP.
In another embodiment, the application describes a method of inducing expression in a cell, comprising the step of transforming a cell with the expression vector encoding an engineered RNAP as described herein, wherein the cell produces endogenous indole, an indole derivative, or an indole-similar compound that activates the RNAP.
In another embodiment, the application describes a method of inducing expression in a cell, comprising the steps of: transforming a first cell with the expression vector encoding an engineered RNAP as described herein, wherein the cell is genetically modified such that it does not endogenously produce indole, an indole derivative, or an indole-similar compound, and introducing a second cell that produces endogenous indole, an indole derivative, or an indole-similar compound that activates the RNAP of the first cell.
Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Generally, the nomenclature used herein, and the laboratory procedures of cell culture, molecular genetics, microbiology, biochemistry, organic chemistry, analytical chemistry and nucleic acid chemistry described below are those well-known and commonly employed in the art. Such techniques are well-known and described in numerous texts and reference works well known to those of skill in the art. Standard techniques, or modifications thereof, are used for chemical syntheses and chemical analyses. All patents, patent applications, articles and publications mentioned herein, both supra and infra, are hereby expressly incorporated herein by reference.
Although any suitable methods and materials similar or equivalent to those described herein find use in the practice of the present disclosure, some methods and materials are described herein. It is to be understood that this disclosure is not limited to the particular methodology, protocols, and reagents described, as these may vary depending upon the context in which they are used by those of skill in the art. Accordingly, the terms defined immediately below are more fully described by reference to the application as a whole.
Also, as used herein, the singular “a”, “an,” and “the” include the plural references, unless the context clearly indicates otherwise.
Numeric ranges are inclusive of the numbers defining the range. Thus, every numerical range disclosed herein is intended to encompass every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein. It is also intended that every maximum (or minimum) numerical limitation disclosed herein includes every lower (or higher) numerical limitation, as if such lower (or higher) numerical limitations were expressly written herein.
The term “about” or “approximately,” means an acceptable error for a particular value. In some instances, “about” means within 0.05%, 0.5%, 1.0%, or 2.0%, of a given value range. In some instances, “about” means within 1, 2, 3, or 4 standard deviations of a given value.
Furthermore, the headings provided herein are not limitations of the various aspects or embodiments of the disclosure which can be had by reference to the application as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the application as a whole. Nonetheless, in order to facilitate understanding of the disclosure, a number of terms are defined below.
Unless otherwise indicated, nucleic acids are written left to right in 5’ to 3’ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
As used herein, the term “comprising” and its cognates are used in their inclusive sense (i.e., equivalent to the term “including” and its corresponding cognates).
As used herein, “indole” describes a class of compounds containing a benzene ring fused to a five-membered nitrogen-containing pyrrole ring. “Indole derivatives,” such as indole-5-carboxyaldehyde (I-5CHO), are well known in the art and would be recognized by those of ordinary skill. As used herein, an indole derivative is capable of activating one or more of the LARPs described herein.
As used herein, “indole-similar compounds” encompass compounds such as indoline, quinoline, and isoquinoline that are structurally similar to indole, or that contain an indole group that is further fused with a separate ring structure, such as a quinoline or isoquinolinone. As used herein, an indole-similar compound is capable of activating one or more of the LARPs described herein.
As used herein, “T7 RNA polymerase” refers to a monomeric T7 bacteriophage-encoded DNA directed RNA polymerase that catalyzes the formation of RNA in the 5’ to 3’ direction.
As used herein, “polynucleotide” and “nucleic acid” refer to two or more nucleosides that are covalently linked together. The polynucleotide may be wholly comprised of ribonucleotides (i.e., RNA), wholly comprised of 2’ deoxyribonucleotides (i.e., DNA), or comprised of mixtures of ribo- and 2’ deoxyribonucleotides. While the nucleosides will typically be linked together via standard phosphodiester linkages, the polynucleotides may include one or more non-standard linkages. The polynucleotide may be single-stranded or double-stranded, or may include both single-stranded regions and double-stranded regions. Moreover, while a polynucleotide will typically be composed of the naturally occurring encoding nucleobases (i.e., adenine, guanine, uracil, thymine and cytosine), it may include one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, etc. In some embodiments, such modified or synthetic nucleobases are nucleobases encoding amino acid sequences.
A “protein,” “polypeptide,” and “peptide” are used interchangeably herein to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation).
As used herein, “amino acids” are referred to by either their commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single letter codes.
The abbreviations used for the genetically encoded amino acids are conventional and are as follows: alanine (Ala or A), arginine (Arg or R), asparagine (Asn or N), aspartate (Asp or D), cysteine (Cys or C), glutamate (Glu or E), glutamine (Gln or Q), glycine (Gly or G), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y), and valine (Val or V).
When the three-letter abbreviations are used, unless specifically preceded by an “L” or a “D” or clear from the context in which the abbreviation is used, the amino acid may be in either the L- or D-configuration about the α-carbon (Cα). For example, whereas “Ala” designates alanine without specifying the configuration about the α-carbon, “D-Ala” and “L-Ala” designate D-alanine and L-alanine, respectively. When the one-letter abbreviations are used, upper case letters designate amino acids in the L-configuration about the α-carbon and lower case letters designate amino acids in the D-configuration about the α-carbon. For example, “A” designates L-alanine and “a” designates D-alanine. When polypeptide sequences are presented as a string of one-letter or three-letter abbreviations (or mixtures thereof), the sequences are presented in the amino (N) to carboxy (C) direction in accordance with common convention.
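As an illustration of the one-letter and three-letter conventions just described, the following Python sketch expands a one-letter string into configuration-tagged three-letter codes. The function names are hypothetical, and the sketch assumes only the twenty genetically encoded amino acids.

```python
# Three-letter to one-letter codes for the twenty genetically encoded amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def describe_residue(code):
    """Upper case one-letter codes are read as L, lower case as D.

    Glycine is achiral, so the case distinction carries no stereochemical
    meaning for "G"/"g"; the sketch labels it like any other residue.
    """
    three_letter = ONE_TO_THREE[code.upper()]
    configuration = "L" if code.isupper() else "D"
    return f"{configuration}-{three_letter}"


def describe_sequence(sequence):
    """Expand a one-letter string, read in the N-to-C direction, residue by residue."""
    return [describe_residue(code) for code in sequence]


print(describe_sequence("Aa"))
```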
The abbreviations used for the genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U). Unless specifically delineated, the abbreviated nucleosides may be either ribonucleosides or 2’-deoxyribonucleosides. The nucleosides may be specified as being either ribonucleosides or 2’-deoxyribonucleosides on an individual basis or on an aggregate basis. When nucleic acid sequences are presented as a string of one-letter abbreviations, the sequences are presented in the 5’ to 3’ direction in accordance with common convention, and the phosphates are not indicated.
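Because sequences are written 5’ to 3’, the strand that anneals to a given oligonucleotide is its reverse complement, also written 5’ to 3’. A minimal Python sketch follows, using the pT7 oligonucleotide given in the description of Figures 18A-B as input; the function name is an assumption for illustration, and only the unmodified DNA bases A, C, G, and T are handled.

```python
# Watson-Crick pairing for unmodified DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


pt7_forward = "TAATACGACTCACTATA"  # pT7 -17:-1, as written in the Figure 18 description
print(reverse_complement(pt7_forward))  # TATAGTGAGTCGTATTA
```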
The terms “engineered”, “recombinant”, “non-naturally occurring”, and “variant,” when used with reference to a cell, a polynucleotide or a polypeptide, refer to a material or a material corresponding to the natural or native form of the material that has been modified in a manner that would not otherwise exist in nature or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques.
As used herein, “wild-type” and “naturally-occurring” refer to the form found in nature. For example, a wild-type polypeptide or polynucleotide sequence is a sequence present in an organism that can be isolated from a source in nature, and which has not been intentionally modified by human manipulation.
A “coding sequence” or “sequence encoding” refers to that part of a nucleic acid (e.g., a gene) that encodes an amino acid sequence of a protein.
The term “percent (%) sequence identity” is used herein to refer to comparisons among polynucleotides and polypeptides, and is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences. The percentage may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences, or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Those of skill in the art appreciate that there are many established algorithms available to align two sequences. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (Smith and Waterman, Adv. Appl. Math., 2:482 [1981]), by the homology alignment algorithm of Needleman and Wunsch (Needleman and Wunsch, J. Mol. Biol., 48:443 [1970]), by the search for similarity method of Pearson and Lipman (Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 [1988]), by computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection, as known in the art. Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity include, but are not limited to, the BLAST and BLAST 2.0 algorithms, which are described by Altschul et al. (See, Altschul et al., J. Mol. Biol., 215:403-410 [1990]; and Altschul et al., Nucleic Acids Res., 25:3389-3402 [1997], respectively). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (See, Altschul et al., supra).
These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (See, Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 [1992]). Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
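By way of illustration and not limitation, the two percent identity calculations described above can be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned by one of the algorithms named above, with gaps written as “-”, and treats the full aligned length as the comparison window. The function name, the flag name, and the example sequences are hypothetical and do not appear in the present disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     gaps_count_as_matches: bool = False) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Gaps are written as '-'. The window is the full aligned length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    matched = 0
    for x, y in zip(aligned_a, aligned_b):
        exactly_one_gap = (x == "-") != (y == "-")
        if x == y and x != "-":
            # Identical base or residue occurs in both sequences.
            matched += 1
        elif gaps_count_as_matches and exactly_one_gap:
            # Alternative calculation: a residue aligned with a gap also counts.
            matched += 1
    # Matched positions divided by total window positions, times 100.
    return 100.0 * matched / len(aligned_a)


# Hypothetical aligned pair: 9 columns, 7 identical, 1 gap, 1 mismatch.
REF = "MKTWAYLAK"
VAR = "MKT-AYIAK"

if __name__ == "__main__":
    print(round(percent_identity(REF, VAR), 2))
    print(round(percent_identity(REF, VAR, gaps_count_as_matches=True), 2))
```

The first call implements the first calculation (identical positions only); setting `gaps_count_as_matches=True` implements the alternative calculation in which a residue aligned with a gap is also counted as a matched position.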
A “reference sequence” refers to a defined sequence used as a basis for a sequence comparison. A reference sequence may be a subset of a larger sequence, for example, a segment of a full-length gene or polypeptide sequence. Generally, a reference sequence is at least 20 nucleotide or amino acid residues in length, at least 25 residues in length, at least 50 residues in length, at least 100 residues in length or the full length of the nucleic acid or polypeptide. Since two polynucleotides or polypeptides may each (1) comprise a sequence (i.e., a portion of the complete sequence) that is similar between the two sequences, and (2) may further comprise a sequence that is divergent between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing sequences of the two polynucleotides or polypeptides over a “comparison window” to identify and compare local regions of sequence similarity. In some embodiments, a “reference sequence” can be based on a primary amino acid sequence, where the reference sequence is a sequence that can have one or more changes in the primary sequence.
A “comparison window” refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues wherein a sequence may be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids and wherein the portion of the sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The comparison window can be longer than 20 contiguous residues, and includes, optionally, 30, 40, 50, 100, or longer windows.
“Corresponding to”, “reference to” or “relative to” when used in the context of the numbering of a given amino acid or polynucleotide sequence refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence. In other words, the residue number or residue position of a given polymer is designated with respect to the reference sequence rather than by the actual
numerical position of the residue within the given amino acid or polynucleotide sequence. For example, a given amino acid sequence, such as that of an engineered T7 RNA polymerase, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although the gaps are present, the numbering of the residue in the given amino acid or polynucleotide sequence is made with respect to the reference sequence to which it has been aligned.
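As a non-limiting sketch of this numbering convention, the following Python function walks a pairwise alignment and reports, for each residue of the given sequence, the position number of the reference residue to which it is aligned. The alignment itself is assumed to have been produced elsewhere, with gaps written as “-”. The function name and the short example sequences are hypothetical.

```python
from typing import Dict, Optional


def reference_numbering(aligned_ref: str, aligned_query: str) -> Dict[int, Optional[int]]:
    """Map each query residue (1-based, own numbering) to its reference position.

    Query residues aligned against a gap in the reference (insertions) have no
    corresponding reference position and map to None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")

    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    query_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_pos += 1          # advance along the reference numbering
        if q != "-":
            query_pos += 1        # advance along the query's own numbering
            mapping[query_pos] = ref_pos if r != "-" else None
    return mapping


if __name__ == "__main__":
    # Hypothetical query lacking the residue at reference position 2.
    print(reference_numbering("MKTAYIAK", "M-TAYIAK"))
```

In the example, the second residue of the query is designated position 3 because it aligns with the third residue of the reference, regardless of its actual numerical position within the query.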
The positions of amino acid differences generally are referred to herein as “Xn,” where n refers to the corresponding position in the reference sequence upon which the residue difference is based. For example, a “residue difference at position X727 as compared to SEQ ID NO. 1” refers to a difference of the amino acid residue at the polypeptide position corresponding to position 727 of SEQ ID NO. 1. Thus, if the reference polypeptide of SEQ ID NO. 1 has a tryptophan at position 727, then a “residue difference at position X727 as compared to SEQ ID NO. 1” refers to an amino acid substitution of any residue other than tryptophan at the position of the polypeptide corresponding to position 727 of SEQ ID NO. 1. In most instances herein, the specific amino acid residue difference at a position is indicated as “XnY” where “Xn” specifies the corresponding position as described above, and “Y” is the single letter identifier of the amino acid found in the engineered polypeptide (i.e., the different residue than in the reference polypeptide). In some instances (e.g., in the Tables provided in the Examples herein), the present disclosure also provides specific amino acid differences denoted by the conventional notation “AnB”, where A is the single letter identifier of the residue in the reference sequence, “n” is the number of the residue position in the reference sequence, and B is the single letter identifier of the residue substitution in the sequence of the engineered polypeptide. In some instances, a polypeptide of the present disclosure can include one or more amino acid residue differences relative to a reference sequence, which is indicated by a list of the specified positions where residue differences are present relative to the reference sequence.
In some embodiments, where more than one amino acid can be used in a specific residue position of a polypeptide, the various amino acid residues that can be used are separated by a “/” (e.g., X10H/X10P or X10H/P) or a comma (e.g., X10H, X10P or X10H, P). In some embodiments, the enzyme variants comprise more than one substitution. These substitutions are separated by a slash or a comma for ease in reading (e.g., C14A/K122A or C14A, K122A). The present application includes engineered polypeptide sequences comprising one or more amino acid differences that include either or both conservative and non-conservative amino acid substitutions.
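For purposes of illustration only, the “XnY” and “AnB” notations and their slash- or comma-separated lists can be read programmatically as sketched below. The parser is a hypothetical helper, not part of the present disclosure. It returns one (reference letter, position, new residue) tuple per listed residue and does not attempt to decide whether a separator denotes alternatives at one position or a combination of substitutions, since the notation uses the same separators for both.

```python
import re
from typing import List, Tuple

# One full term: reference letter (or X), position number, new residue.
_FULL_TERM = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(text: str) -> List[Tuple[str, int, str]]:
    """Parse strings such as 'C14A/K122A', 'C14A, K122A', or 'X10H/P'."""
    results: List[Tuple[str, int, str]] = []
    last_site = None  # (reference letter, position) of the most recent full term
    for token in re.split(r"[/,]", text):
        token = token.strip()
        if not token:
            continue
        match = _FULL_TERM.match(token)
        if match:
            last_site = (match.group(1), int(match.group(2)))
            results.append((last_site[0], last_site[1], match.group(3)))
        elif len(token) == 1 and token.isupper() and last_site is not None:
            # Shorthand such as the 'P' in 'X10H/P': same site, another residue.
            results.append((last_site[0], last_site[1], token))
        else:
            raise ValueError(f"unrecognized substitution term: {token!r}")
    return results


if __name__ == "__main__":
    print(parse_substitutions("C14A/K122A"))
    print(parse_substitutions("X10H, P"))
```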
A “conservative amino acid substitution” refers to a substitution of a residue with a different residue having a similar side chain, and thus typically involves substitution of the amino acid in the polypeptide with amino acids within the same or similar defined class of amino acids. By way of example and not limitation, an amino acid with an aliphatic side chain may be substituted with another aliphatic amino acid (e.g., alanine, valine, leucine, and isoleucine); an amino acid with a hydroxyl side chain is substituted with another amino acid with a hydroxyl side chain (e.g., serine and threonine); an amino acid having an aromatic side chain is substituted with another amino acid having an aromatic side chain (e.g., phenylalanine, tyrosine, tryptophan, and histidine); an amino acid with a basic side chain is substituted with another amino acid with a basic side chain (e.g., lysine and arginine); an amino acid with an acidic side chain is substituted with another amino acid with an acidic side chain (e.g., aspartic acid or glutamic acid); and/or a hydrophobic or hydrophilic amino acid is replaced with another hydrophobic or hydrophilic amino acid, respectively.
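By way of example and not limitation, the side-chain classes enumerated above can be encoded as a simple lookup, as in the following sketch. Only the five classes with explicitly listed members are modeled; the hydrophobic/hydrophilic criterion is omitted because no membership list is given for it. The function and variable names are hypothetical.

```python
# Side-chain classes taken from the examples listed in the definition above.
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("AVLI"),   # alanine, valine, leucine, isoleucine
    "hydroxyl": set("ST"),      # serine, threonine
    "aromatic": set("FYWH"),    # phenylalanine, tyrosine, tryptophan, histidine
    "basic": set("KR"),         # lysine, arginine
    "acidic": set("DE"),        # aspartic acid, glutamic acid
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share one of the listed classes."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in members and replacement in members
               for members in SIDE_CHAIN_CLASSES.values())


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both aliphatic
    print(is_conservative("D", "K"))  # acidic replaced by basic
```

A substitution for which this check returns False is not necessarily non-conservative under the broader definition; it only fails to match one of the five explicitly listed classes.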
A “non-conservative substitution” refers to substitution of an amino acid in the polypeptide with an amino acid with significantly differing side chain properties. Non-conservative substitutions may use amino acids between, rather than within, the defined groups and affect (a) the structure of the peptide backbone in the area of the substitution (e.g., proline for glycine), (b) the charge or hydrophobicity, or (c) the bulk of the side chain. By way of example and not limitation, an exemplary non-conservative substitution can be an acidic amino acid substituted with a basic or aliphatic amino acid; an aromatic amino acid substituted with a small amino acid; and a hydrophilic amino acid substituted with a hydrophobic amino acid.
As used herein, “deletion” refers to modification to the polypeptide by removal of one or more amino acids from the reference polypeptide. Deletions can comprise removal of 1 or more amino acids, 2 or more amino acids, 5 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total number of amino acids, or up to 20% of the total number of amino acids making up the reference enzyme while retaining enzymatic activity and/or retaining the improved properties of an engineered enzyme. Deletions can be directed to the internal portions and/or terminal portions of the polypeptide. In various embodiments, the deletion can comprise a continuous segment or can be discontinuous.
As used herein, “isolated polypeptide” refers to a polypeptide which is substantially separated from other contaminants that naturally accompany it (e.g., protein, lipids, and
polynucleotides). The term embraces polypeptides which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis). The recombinant T7 RNA polymerase polypeptides may be present within a cell, present in the cellular medium, or prepared in various forms, such as lysates or isolated preparations. As such, in some embodiments, the recombinant T7 RNA polymerase polypeptides can be an isolated polypeptide. In some embodiments, an isolated polypeptide is substantially pure. As used herein, “substantially pure polypeptide” refers to a composition in which the polypeptide species is the predominant species present (i.e., on a molar or weight basis it is more abundant than any other individual macromolecular species in the composition), and is generally a substantially purified composition when the object species comprises at least about 50 percent of the macromolecular species present by mole or % weight. Generally, a substantially pure T7 RNA polymerase composition comprises about 60% or more, about 70% or more, about 80% or more, about 90% or more, about 95% or more, or about 98% or more of all macromolecular species by mole or % weight present in the composition. In some embodiments, the object species is purified to essential homogeneity (i.e., contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single macromolecular species. Solvent species, small molecules (<500 Daltons), and elemental ion species are not considered macromolecular species. In some embodiments, the isolated recombinant T7 RNA polymerase polypeptides are substantially pure polypeptide compositions.
As used herein, “hybridization stringency” relates to hybridization conditions, such as washing conditions, in the hybridization of nucleic acids. Generally, hybridization reactions are performed under conditions of lower stringency, followed by washes of varying but higher stringency. The term “moderately stringent hybridization” refers to conditions that permit target DNA to bind a complementary nucleic acid that has about 60% identity, preferably about 75% identity, about 85% identity to the target DNA, with greater than about 90% identity to target polynucleotide. Exemplary moderately stringent conditions are conditions equivalent to hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.2× SSPE, 0.2% SDS, at 42° C. “High stringency hybridization” refers generally to conditions that are about 10° C. or less from the thermal melting temperature Tm as determined under the solution condition for a defined polynucleotide sequence. In some embodiments, a high stringency condition refers to conditions that permit hybridization of only those nucleic acid sequences that form stable hybrids in 0.018M NaCl at 65° C. (i.e., if a hybrid is not stable in 0.018M NaCl at 65° C., it will not be stable under high stringency conditions, as contemplated herein). High stringency conditions can be provided, for example, by hybridization in conditions equivalent to 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.1× SSPE, and 0.1% SDS at 65° C. Another high stringency condition is hybridizing in conditions equivalent to hybridizing in 5× SSC containing 0.1% (w:v) SDS at 65° C. and washing in 0.1× SSC containing 0.1% SDS at 65° C. Other high stringency hybridization conditions, as well as moderately stringent conditions, are described in the references cited above.
As used herein, “codon optimized” refers to changes in the codons of the polynucleotide encoding a protein to those preferentially used in a particular organism such that the encoded protein is more efficiently expressed in the organism of interest. Although the genetic code is degenerate in that most amino acids are represented by several codons, called “synonyms” or “synonymous” codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome. In some embodiments, the polynucleotides encoding the T7 RNA polymerase enzymes may be codon optimized for optimal production from the host organism selected for expression.
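A minimal sketch of codon optimization, provided for illustration only, is shown below. Each codon of a coding sequence is replaced by the synonymous codon with the highest value in a user-supplied usage table for the host organism, so that the encoded amino acid sequence is unchanged. The standard genetic code is assumed. The usage values in the example are hypothetical placeholders rather than measured frequencies for any organism, and the function names are not taken from the present disclosure.

```python
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) codon by codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, usage: Dict[str, float]) -> str:
    """Swap each codon for its most-used synonym according to `usage`.

    Codons absent from `usage` score 0; on ties the original codon is kept.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Original codon first so that max() keeps it when scores are tied.
        candidates = [codon] + [c for c, aa in CODON_TABLE.items()
                                if aa == amino_acid and c != codon]
        optimized.append(max(candidates, key=lambda c: usage.get(c, 0.0)))
    return "".join(optimized)


# Hypothetical usage values for illustration only.
EXAMPLE_USAGE = {"CTG": 0.50, "CTA": 0.04, "AAA": 0.70, "AAG": 0.30}

if __name__ == "__main__":
    original = "ATGCTAAAGTAA"
    recoded = codon_optimize(original, EXAMPLE_USAGE)
    print(recoded, translate(original) == translate(recoded))
```

Practical codon optimization typically weighs additional factors beyond per-codon preference; this sketch shows only the synonymous replacement step described in the definition.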
As used herein, “control sequence” includes all components that are necessary or advantageous for the expression of a polynucleotide and/or polypeptide of the present application. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences include, but are not limited to, a leader, polyadenylation sequence, propeptide sequence, promoter sequence, signal peptide sequence, initiation sequence and transcription terminator. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
As used herein, “operably linked” is defined herein as a configuration in which a control sequence is appropriately placed (i.e., in a functional relationship) at a position relative to a polynucleotide of interest such that the control sequence directs or regulates the expression of the polynucleotide and/or polypeptide of interest.
As used herein, “promoter” refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence. The promoter sequence contains transcriptional control sequences, which mediate the expression of a polynucleotide of interest. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
As used herein, a “substrate” in the context of an enzymatic conversion reaction process refers to the compound or molecule acted on by the T7 RNA polymerase polypeptide.
As used herein, a “product” in the context of an enzymatic conversion process refers to the compound or molecule resulting from the action of the T7 RNA polymerase polypeptide on a substrate.
As used herein the term “culturing” refers to the growing of a population of microbial cells under any suitable conditions (e.g., using a liquid, gel or solid medium).
Recombinant polypeptides can be produced using any suitable methods known in the art. Genes encoding the wild-type polypeptide of interest can be cloned in vectors, such as plasmids, and expressed in desired hosts, such as E. coli, S. cerevisiae, etc. Variants of recombinant polypeptides can be generated by various methods known in the art. Indeed, there is a wide variety of different mutagenesis techniques well known to those skilled in the art. In addition, mutagenesis kits are also available from many commercial molecular biology suppliers. Methods are available to make specific substitutions at defined amino acids (site-directed), specific or random mutations in a localized region of the gene (regio-specific), or random mutagenesis over the entire gene (e.g., saturation mutagenesis). Numerous suitable methods are known to those in the art to generate enzyme variants, including but not limited to site-directed mutagenesis of single-stranded DNA or double-stranded DNA using PCR, cassette mutagenesis, gene synthesis, error-prone PCR, shuffling, and chemical saturation mutagenesis, or any other suitable method known in the art. Non-limiting examples of methods used for DNA and protein engineering are provided in the following patents: U.S. Pat. Nos. 6,117,679; 6,420,175; 6,376,246; 6,586,182; 7,747,391; 7,747,393; 7,783,428; and 8,383,346. After the variants are produced, they can be screened for any desired property (e.g., high or increased activity, or low or reduced activity, increased thermal activity, increased thermal stability, and/or acidic pH stability, etc.).
In some embodiments, “recombinant T7 RNA polymerase polypeptides” (also referred to herein as “engineered T7 RNA polymerase polypeptides,” “variant T7 RNA polymerase enzymes,” and “T7 RNA polymerase variants”) find use.
As used herein, an “expression vector” is a DNA construct for introducing a DNA sequence into a cell. In some embodiments, the vector is an expression vector that is operably linked to a suitable control sequence capable of effecting the expression in a suitable host of the polypeptide encoded in the DNA sequence. In some embodiments, an “expression vector” has a promoter sequence operably linked to the DNA sequence (e.g., transgene) to drive expression in a host cell, and in some embodiments, also comprises a transcription terminator sequence.
As used herein, the term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also encompasses secretion of the polypeptide from a cell.
As used herein, the term “produces” refers to the production of proteins and/or other compounds by cells. It is intended that the term encompass any step involved in the production of polypeptides including, but not limited to, transcription, post-transcriptional modification, translation, and post-translational modification. In some embodiments, the term also encompasses secretion of the polypeptide from a cell.
As used herein, an amino acid or nucleotide sequence (e.g., a promoter sequence, signal peptide, terminator sequence, etc.) is “heterologous” to another sequence with which it is operably linked if the two sequences are not associated in nature.
As used herein, the terms “cell” and “host cell” refer to suitable hosts for expression vectors comprising DNA provided herein (e.g., the polynucleotides encoding the T7 RNA polymerase variants). In some embodiments, the host cells are prokaryotic or eukaryotic cells that have been transformed or transfected with vectors constructed using recombinant DNA techniques as known in the art. In a preferred embodiment, the prokaryotic cell comprises a bacterium, and preferably a bacterium capable of expressing a ligand activated RNA polymerase (LARP) as described herein. Notably, one of ordinary skill in the art would appreciate that the LARPs of the disclosure can be expressed in a variety of bacterial genera beyond model organisms, such as E. coli, which are known in the field.
As used herein, “fragment” refers to a portion of a peptide or nucleotide sequence of an RNAP that still retains the activity of the whole. Unless otherwise stated, disclosure of a DNA sequence also includes the corresponding RNA and amino acid sequence including all redundant codons and conservative amino acid substitutions, disclosure of an RNA sequence also includes the corresponding DNA and amino acid sequence including all redundant codons and conservative amino acid substitutions, and finally disclosure of an amino acid sequence also includes the corresponding RNA and DNA sequence including all redundant codons and conservative amino acid substitutions and vice versa.
The terms “isolated” and “purified” are used to refer to a molecule (e.g., an isolated nucleic acid, polypeptide, etc.) or other component that is removed from at least one other component with which it is naturally associated. The term “purified” does not require absolute purity, rather it is intended as a relative definition.
As used herein, “composition” and “formulation” encompass products comprising at least one engineered T7 RNA polymerase of the present disclosure, intended for any suitable use (e.g., research, diagnostics, etc.).
The term “transcription” is used to refer to the process whereby a portion of a DNA template is copied into RNA by the action of an RNA polymerase enzyme.
The term “DNA template” is used to refer to a double or single-stranded DNA molecule including a promoter sequence and a sequence coding for the RNA product of transcription.
The term “promoter” is used to refer to a DNA sequence that is recognized by RNA polymerase as the start site of transcription. The promoter recruits RNA polymerase, and in the case of T7 RNA polymerase, determines the start site of transcription.
The term “RNA polymerase” is used to refer to a DNA-directed RNA polymerase, which copies a DNA template into an RNA polynucleotide, by incorporating nucleotide triphosphates stepwise into the growing RNA polymer.
The terms “messenger RNA” and “mRNA” are used to refer to RNA molecules that code for a protein. This protein is decoded through the action of translation.
As used herein, the term “responsive to” refers to a property of the engineered T7 RNA polymerase, which can be represented by reconstitution to approximately wild-type specific activity (e.g., product produced/time/weight protein) or reconstitution to approximately wild-type in percent conversion of the substrate to the product (e.g., percent conversion of starting amount of substrate to product in a specified time period) when introduced to an effective amount of an inducer, such as indole, an indole derivative, or an indole-similar compound.
As used herein, the term “effective amount” refers to an amount sufficient to produce the desired result. One of ordinary skill in the art may determine the effective amount by using routine experimentation.
“Pharmaceutical compositions” are compositions that include an amount (for example, a unit dosage) of one or more of the disclosed compounds together with one or more non-toxic pharmaceutically acceptable additives, including carriers, diluents, and/or adjuvants, and optionally other biologically active ingredients. Such pharmaceutical compositions can be prepared by standard pharmaceutical formulation techniques such as those disclosed in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa. (19th Edition). The pharmaceutically acceptable carrier may comprise any conventional pharmaceutical carrier or excipient. The choice of carrier and/or excipient will to a large extent depend on factors such as the particular mode of administration, the effect of the carrier or excipient on solubility and stability, and the nature of the dosage form.
In one embodiment, a pharmaceutical composition of the disclosure can include a probiotic bacterium engineered to express one or more engineered RNAP enzymes described herein.
The term “pharmaceutically acceptable” as used herein pertains to compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgement, suitable for use in contact with the tissues of a subject (e.g., human) without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio. Each carrier, excipient, etc. must also be “acceptable” in the sense of being compatible with the other ingredients of the formulation.
Suitable carriers, diluents, excipients, etc. can be found in standard pharmaceutical texts. See, for example, “Handbook of Pharmaceutical Additives”, 2nd Edition (eds. M. Ash and I. Ash), 2001 (Synapse Information Resources, Inc., Endicott, N.Y., USA), “Remington's Pharmaceutical Sciences”, 20th edition, pub. Lippincott, Williams & Wilkins, 2000; and “Handbook of Pharmaceutical Excipients”, 2nd edition, 1994.
The formulations may conveniently be presented in unit dosage form and may be prepared by any methods well known in the art of pharmacy. Such methods include the step of bringing into
association the active compound with the carrier which constitutes one or more accessory ingredients. In general, the formulations are prepared by uniformly and intimately bringing into association the active compound with liquid carriers or finely divided solid carriers or both, and then if necessary, shaping the product.
Formulations may be in the form of liquids, solutions, suspensions, emulsions, elixirs, syrups, tablets, lozenges, granules, powders, capsules, cachets, pills, ampoules, suppositories, pessaries, ointments, gels, pastes, creams, sprays, mists, foams, lotions, oils, boluses, electuaries, or aerosols.
The disclosure now being generally described will be more readily understood by reference to the following examples, which are included merely for the purposes of illustration of certain aspects of the embodiments of the present disclosure. The examples are not intended to limit the disclosure, as one of skill in the art would recognize from the above teachings and the following examples that other techniques and methods can satisfy the claims and can be employed without departing from the scope of the claimed disclosure. Indeed, while this disclosure has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure encompassed by the appended claims.
EXAMPLES
Example 1: Experimental Overview and Rationale.
As described herein, Applicants designed dynamic metabolite control over the general transfer of information on DNA to RNA13 using T7 RNAP. T7 RNAP and derivatives have been studied extensively, are used to synthesize mRNA for medical applications, and are the dominant transcriptional control mechanism employed in recombinant protein expression in bacteria. However, unregulated in vivo expression of T7 RNAP is usually toxic owing to its high basal activity. Although chemically inducible split T7 RNAPs have been demonstrated with low basal activity and high inducibility, a full length single subunit ligand activated T7 RNAP with minimal basal activity could also solve the toxicity problem inherent in many synthetic biology circuits utilizing T7 promoters. An additional advantage of expressing a single protein is its simplicity: complicated expression tuning of individual split pieces is minimized. Applicants chose indoles as a metabolite class because they are inexpensive, used for interspecies communication for biofilm formation and in the gut microbiome, and are associated with various disease states.
Ligand activated RNA polymerases (LARPs) with minimal activity in the absence of indoles would allow user defined transcriptional processes in diverse bacteria.
Herein, Applicants describe the use of rational design to engineer the full-length, single subunit T7 RNA polymerase to be controlled by physiologically relevant concentrations of indole. Applicants optimized their LARPs through directed evolution to yield LARP-I with minimal transcriptional activity in the absence of indole, and a 29-fold increase in activity with an EC50 of 344 μM. Applicants utilized LARP-I in several contexts to show that indole controls T7-dependent gene expression exogenously, endogenously, and intercellularly. Applicants also demonstrated indole-dependent bacteriophage viability and propagation in trans. Specificity of different indoles, T7 promoter specificities, and portability to different bacteria are shown. These novel LARPs represent a new chemically inducible platform immediately deployable for novel synthetic biology applications, including for modulation of synthetic cocultures.
More broadly, the LARPs developed here represent a new chemically inducible system that functions in diverse bacteria with minimal modification. This system supplies an urgent need for the synthetic biology community of predictive gene expression control with an alternative controlling ligand. Further, LARPs can sense and respond to the native metabolite indole, which allows LARP receiver strains to be constructed for bottom-up manipulations of microbial communities. Applicants also demonstrated that LARPs can respond selectively to an indole derivative or indole similar compounds. LARPs may find applications in inducible or dynamic control of gene expression in bioreactors, for metabolite control of engineered phage therapies, or to perform user-defined operations in the gut microbiome or other mixed microbial communities producing indole, or indole derivatives, or indole similar compounds.
These LARPs can be complementary to existing approaches of engineering cellular control using transcription factors and/or riboswitches. Transcription factors are a form of pre-transcriptional control, their output being repression or activation of transcription from a target DNA sequence, whereas riboswitches are regulated at the level of mRNA by ligand binding to control continued transcription of nascent mRNA, or translation of the full length mRNA. Riboswitches often suffer from low dynamic range and leaky off-states, but these disadvantages can be addressed using cascading systems. The post-translational mechanism inherent in these LARPs may allow faster on- and off-switching than these alternatives, albeit with a much narrower range of potential ligands.
Example 2: Design, Engineering, and Optimization of LARPs.
Applicants identified initial LARPs using a chemical recovery of structure approach, which is predicated on the destabilizing effect of large-to-small mutations in the core of the protein and the ability for small molecules to complement the pocket (Figure 1A). Applicants performed glycine scanning mutagenesis of 15 tryptophan residues buried in the transcription initiation complex of T7 RNAP, expressed proteins as N-terminal His6 tag (SEQ ID NO. 5 or 15) fusions, purified proteins over nickel columns, and assessed these proteins for indole-dependent activity using a modified in vitro transcriptional assay. In this assay, potential LARPs are incubated with linear dsDNA containing a T7 promoter sequence driving expression of an RNA Spinach or Pepper aptamer. The amount of transcript is proportional to fluorescence, and the rate of fluorescence is determined in the presence and absence of 1 mM indole (Figure 5). Six of the 15 designs expressed in the soluble fraction of Escherichia coli lysates (Figure 6), and none showed indole responsiveness relative to the solvent control.
Applicants hypothesized that glycine scanning in the protein core was too destabilizing for potential ligand-responsive RNAPs. In support of this, two different computational methods predicted that the glycine mutations for the insoluble designs were more highly destabilizing (Figure 7). Applicants hypothesized that expression of the insoluble variants could be rescued in a thermally stabilized (TS) background. Upon transfer to the TS background (S430P, N433T, S633P, F849I, F880Y), two variants were rescued and responsive at 1 mM indole (Figure 8). The best design, LARPV1, which had a W727G substitution in addition to the TS background (T7 RNAP, S430P, N433T, S633P, W727G, F849I, F880Y), showed a 3.2-fold increase in transcription rate (s.d. = 0.9 [2.3, 4.1]; n = 6) (Figure 1B) with an approximate EC50 of 152 μM indole (95% c.i. 117-200 μM; n = 6) (Figure 9).
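As an illustration of how a fold-activation and an EC50 relate to a dose-response curve, the following minimal sketch assumes a simple Hill model. The source does not state which model was fitted, and the function name `hill_response`, the `max_fold` parameter, and the Hill coefficient default are hypothetical. Note also that the reported 3.2-fold value was measured at 1 mM indole, which need not equal the saturating value used here.

```python
def hill_response(conc_uM, basal, max_fold, ec50_uM, hill_n=1.0):
    """Activity predicted by a Hill curve (assumed model, not from the source).

    conc_uM  : ligand concentration in micromolar (>= 0)
    basal    : activity with no ligand
    max_fold : fold increase over basal at saturating ligand (hypothetical)
    ec50_uM  : concentration giving the half-maximal increase
    hill_n   : Hill coefficient (1.0 = non-cooperative; an assumption)
    """
    if conc_uM < 0 or ec50_uM <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    top = basal * max_fold
    # Fraction of the maximal increase reached at this concentration.
    fraction = conc_uM ** hill_n / (ec50_uM ** hill_n + conc_uM ** hill_n)
    return basal + (top - basal) * fraction


# Illustrative numbers taken from the text: EC50 of 152 uM, 3.2-fold increase.
no_ligand = hill_response(0.0, basal=1.0, max_fold=3.2, ec50_uM=152.0)
at_ec50 = hill_response(152.0, basal=1.0, max_fold=3.2, ec50_uM=152.0)
```

By construction, the response at the EC50 lies halfway between the basal and the saturating activity, which is the property that defines the EC50 under this model.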
In the absence of indole, LARPV1 has 13.1% (s.d. = 3.1%; n = 6) activity relative to native T7 RNAP (Figure 1B). Near zero basal activity is required for most relevant applications. To identify low background, highly inducible LARPs, Applicants optimized a previously described dual positive and negative bacterial selection (Figure 1C). In this system, a plasmid construct containing a T7 promoter upstream of HIS3 and URA3 genes is transformed into an E. coli strain (i) incapable of producing endogenous indole; and (ii) auxotrophic for histidine and uracil (E. coli USO ΔhisB, ΔpyrF, ΔtnaA; Figure 10). Growth on minimal salts media deficient in histidine requires T7-dependent transcription of HIS3. Constitutive LARPs can be counter selected against
by growth on media supplemented with 5-fluoro-orotic acid (5-FOA), as the expressed URA3 gene product will convert 5-FOA to the cytotoxic 5-fluorouracil. Preliminary selection suggested that high expression levels of T7 RNAP were leading to toxicity, necessitating extensive expression engineering of LARPV1 to move the functional response into the appropriate “Goldilocks” zone for selection. High LARPV1 expression resulted in high toxicity in the absence of selection conditions, while too low a basal expression of LARPV1 did not recover differential growth in the presence of indole under selection conditions. Applicants found that expressing the LARPV1 construct from a low copy number plasmid (ORI p15A), replacing the ATG start codon with GTG, and adding an N-terminal degron tag helped tune the appropriate expression window (Figure 11).
This auxotrophy complementation selection was used to select low basal activity LARPs from a library of approximately 11 million members. A combinatorial library containing mutations at nine positions within 6 Å of position 727 was constructed by a cassette-based Golden Gate protocol using synthetic DNA containing degenerate codons. Transformed cells were passed through a round of negative selection using 1 mM 5-FOA. Surviving members of this library were then selected for activity in the presence and absence of 50 μM indole. Deep sequencing of the selected and input libraries, followed by analysis using position specific enrichment ratio matrices (PSERM), revealed a Pareto front of candidate indole-specific LARPs (Figures 1D and 12). PSERM is a method for analyzing combinatorial libraries after selection that is useful for co-optimization. The PSERM score for a variant is the sum of the individual enrichment ratios of each of its mutations. Of the 20 variants on the Pareto front that were cloned and tested, 11 LARPs passed plate-based assays and showed statistically significant (one-way ANOVA, Tukey’s post hoc test, p-value < 0.03 for all samples) indole-specific growth rate increases in defined minimal media (Figure 13; Figure 20). LARP-I (K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, F880Y) is marked by minimal growth rate in the absence of indole, and near-maximal growth rate with 50 μM indole under selection conditions (Figure 1E). Applicants expressed and purified LARP-I (Figure 9) to evaluate its binding properties in vitro. The basal activity of LARP-I is less than 0.1% of the activity of parent enzyme T7 RNAP TS, the EC50 for indole is 344 μM (95% c.i. 193-667 μM; n = 6), and the dynamic range for indole responsiveness is 28.9 at 1 mM (s.d. = 12.2; n = 6) (Figure 1F,G). To evaluate the time scales of indole activation, Applicants assayed the time-dependent fluorescence; a late-addition spike of indole also shows
post-translational temporal control over transcription within minutes (Figure 1H). To test specificity, Applicants also assayed LARP-I with different closely related nitrogen-containing heterocycles similar to indole. While LARP-I is activated by each of these to various extents, both the magnitude of activation and the apparent potency (none are saturable up to 2 mM) are much lower than for indole (Figure 14). Thus, LARP-I identified from the selection shows low basal activity and post-translational, indole-inducible transcription with a high dynamic range in vitro.
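The PSERM scoring described in this Example (a variant's score is the sum of the enrichment ratios of its individual mutations) can be sketched as follows. This is a minimal illustration under stated assumptions, not Applicants' implementation: the use of log2 ratios, the pseudocount, and all function names are hypothetical, and variants are represented simply as strings with one residue per library position.

```python
from collections import defaultdict
from math import log2


def position_frequencies(variant_freqs):
    """Collapse variant frequencies into per-position residue frequencies."""
    table = defaultdict(float)
    for variant, freq in variant_freqs.items():
        for pos, residue in enumerate(variant):
            table[(pos, residue)] += freq
    return table


def pserm(input_freqs, selected_freqs, pseudo=1e-6):
    """Enrichment ratio for every (position, residue) seen in the input library.

    log2 scaling and the pseudocount are assumptions made for this sketch.
    """
    f_in = position_frequencies(input_freqs)
    f_sel = position_frequencies(selected_freqs)
    return {
        key: log2((f_sel.get(key, 0.0) + pseudo) / (f_in[key] + pseudo))
        for key in f_in
    }


def pserm_score(variant, matrix):
    """Score a variant as the sum of its per-position enrichment ratios."""
    return sum(matrix[(pos, res)] for pos, res in enumerate(variant))


# Toy two-position library: 'G' at position 0 is enriched by selection.
toy_input = {"GA": 0.5, "VA": 0.5}
toy_selected = {"GA": 0.75, "VA": 0.25}
toy_matrix = pserm(toy_input, toy_selected, pseudo=0.0)
```

Computing one score matrix from the selection with indole and another from the selection without indole gives two scores per variant, which is the kind of pair from which a Pareto front could be drawn.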
Example 3: LARPs Control Gene Expression Exogenously, Endogenously, and in Cocultures.
Applicants evaluated the ability of LARP-I to control indole-dependent gene expression in a variety of contexts. A plasmid encoding an sfGFP expression marker downstream of a T7 promoter sequence, optimized for strong translation, was used to transform ΔtnaA E. coli (lacking tryptophanase) expressing LARP-I (Figure 2A). Tryptophanase produces indole, pyruvate, and ammonia through the hydrolysis of tryptophan. LARP-I shows low constitutive activity resulting in minor expression of sfGFP in the absence of indole (<1.5-fold RFU/OD600 above negative control), and a 17.1-fold increase in sfGFP reporter expression at 500 μM indole (s.d. = 3.7, n = 3) (Figure 2B). The maximum response under the conditions of the assay was 40.8% (s.d. = 9.0%, n = 3) of the response of positive control T7 RNAP R632S, a variant which has previously been shown to have lower toxicity for the cell, under the same promoter, plasmid, and condition (Figure 2B). Thus, external addition of indole is sufficient to drive high levels of gene expression.
To determine whether LARP-I can be activated by endogenous metabolites, Applicants transformed plasmids containing the reporter sfGFP and LARP-I into E. coli expressing tryptophanase encoded by the tnaA gene. In planktonic cultures and near stationary phase, E. coli with active tnaA produce mM concentrations of indole (Figure 2C). E. coli tnaA+ activates similar levels of gene expression as E. coli ΔtnaA with 125 μM indole (Figure 2D). No statistically significant difference in gene activation occurs in the presence or absence of 125 μM indole in E. coli tnaA+ (Figure 2D; p-value = 0.7602, Welch’s two-tailed t-test). E. coli tnaA+ expressing LARP-I shows a 12.4-fold increase in gene expression over the ΔtnaA strain in the absence of exogenous indole (s.d. = 2.6, n = 3) and 62.2% gene expression compared to T7 RNAP R632S in direct comparison (s.d. = 21.3%, n = 3).
Receiver strains could release a biological payload in response to indole either through cell lysis or secretion. To demonstrate that indole could be used to induce lysis, Applicants screened a
panel of previously described single gene lysins and identified several that lysed E. coli under induction with an arabinose-inducible promoter, including SglKU1. Indole induction of SglKU1 by LARP-I expressing E. coli MG1655 ΔtnaA resulted in cell lysis within 45 min (Figures 2E and 15). Therefore, LARP-I can be controlled by physiologically relevant concentrations of an endogenous metabolite. Additionally, LARP-I enables indole-mediated delivery of intracellular cargo to the extracellular environment through cell lysis.
Microbiome engineering would benefit from additional bottom-up quorum signaling circuits. To determine whether LARP-I can be used for indole-dependent intercellular signaling, Applicants constructed a sender strain (E. coli USO ΔhisB, ΔpyrF, RFP) that produces indole. As a control, Applicants also prepared a sender strain deficient in indole production (E. coli USO ΔhisB, ΔpyrF, ΔtnaA, RFP). For the receiver strain, Applicants constructed E. coli USO ΔhisB, ΔpyrF, ΔtnaA with the LARP-I/sfGFP reporting system (Figure 2G). Additional receiver strains encoding T7 RNAP R632S and a catalytically inactive T7 RNAP R632S/Y639A were included as positive and negative controls, respectively. Population measurements of fluorescence show a 12.9-fold difference in sfGFP expression (n = 2; [7.1, 19.3]) for the LARP-I receiver strain when cocultured with the sender strains able or unable to produce indole. No differences in sfGFP expression were observed for the negative control receiver strains regardless of the sender strain (Figure 2F). Fluorescence microscopy of the co-cultures shows that most of the individual LARP-I receiver cells are activated in a coculture with the indole-producing sender strain (representative data shown in Figure 2F), while few are activated by the indole-deficient sender strain. Thus, LARP-I can be part of novel receiver circuits that allow intercellular communication between indole sender strains and LARP-I containing receiver strains.
Example 4: LARPs Enable Ligand-Dependent Bacteriophage Viability In Trans.
To test whether LARP-I enables indole dependent phage propagation in trans, Applicants infected E. coli expressing LARP-I with T7 Δgp1 bacteriophage, which does not contain the vital gp1 gene (T7 RNAP) (Figure 3A). In LARP-I expressing strains, an approximately 10^4-fold increase in countable plaques was observed between 0 and 500 μM indole (Figure 3B), and the number of plaques at 500 μM was indistinguishable between strains expressing WT T7 RNAP and LARP-I (Figure 3B). To evaluate whether phage infection efficiency is ligand-dependent, Applicants tested LARP-I expressing strains using a phage infection kill-curve assay. At increasing indole concentrations, a more robust phage infection (time to clearance) was observed (Figure 3C). Thus, indole dependent infection and propagation enabled by LARP-I allows inter-kingdom communication between phage and bacteria.
Example 5: LARPs are Portable for Different Bacteria, DNA Promoter Specificities, and Ligands.
New synthetic biology applications would be enabled if LARPs could be demonstrated to function in different organisms, with different promoter sequence specificities, and with different controlling ligands. To determine the portability of LARP-I to other organisms, Applicants integrated an sfGFP reporter driven by a canonical T7 promoter and constitutively expressed LARP-I in Pseudomonas putida AG4775 (Figure 4A). P. putida AG4775 does not endogenously produce indole (Figure 16A) and growth is diminished somewhat by sub-mM concentrations of indole (Figure 16B), a phenotype known at higher indole concentrations. Nevertheless, LARP-I shows robust activation of a GFP reporter at increasing indole concentrations, as measured by flow cytometry (Figures 4B and 16C).
Previous studies engineered T7 RNAPs that respond selectively to different promoter sequences. These engineered T7 RNAPs all contained mutations in their specificity loops (positions 739-766) adjacent to the LARP-I mutations. To determine the portability of the LARP-I mutations, Applicants engineered hybrid polymerases containing both the LARP-I and the specificity loop mutations (Figure 4C). At the same time, Applicants also constructed plasmids containing alternative promoters (pCGG, pCTGA, pN4) driving URA3 and HIS3. If the hybrid polymerases were functional, positive and negative selection using the auxotrophic strain E. coli USO ΔhisB, ΔpyrF, ΔtnaA would result in the same growth phenotype as with pT7 and LARP-I. If the hybrid polymerases were specific, then for noncognate pairs positive selection would result in no growth while negative selection would result in growth. To test this hypothesis, Applicants transformed E. coli with the combinatorial set of hybrid polymerases and promoters. For the hybrid polymerases tested, indole-dependent growth was observed only for cognate polymerase-promoter pairs (Figure 4D). Consistent with this, ligand-dependent toxicity was observed only for those same cognate polymerase-promoter pairs (Figure 4D). Thus, the LARP-I mutations are transferable to engineered T7 RNAPs with altered promoter specificity.
To determine whether LARPs could be engineered for other controlling ligands, Applicants repeated the selection and analysis shown in Figure 1C for 21 other ligands (Figure 21) chosen for their relatively small size (<300 Da) and physicochemical similarities to indole and indole derivatives. While there were no discernible differences in the growth of libraries on most of the
ligands (data not shown), the indole derivative indole-5-carboxyaldehyde (I-5CHO) showed significant growth with enriched mutations following selection analysis (Figure 12). Following PSERM analysis comparing I-5CHO vs indole scores (Figure 4E), Applicants identified LARP-I5CHO (K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, F880Y). As determined by growth on media without histidine, E. coli USO ΔhisB, ΔpyrF, ΔtnaA expressing LARP-I5CHO grows minimally in the absence of I-5CHO and is selectively activated by I-5CHO over indole (Figure 4F). Expression of LARP-I5CHO in the same reporter strain shown in Figure 2A revealed I-5CHO dependent sfGFP expression, with the overall activation on the same order of magnitude as constitutive expression of T7 RNAP R632S (Figure 4G). Additionally, LARP-I5CHO shows specificity for I-5CHO over indole in phage infection assays (Figure 17). Thus, LARPs can be engineered to be specific to other indoles.
Example 6: LARP-I Recognizes Promoter DNA in the Absence of Indole.
The LARPs demonstrated here have minimal basal activity and large indole-dependent increases in activity. The LARP-specific mutations centered around W727 are located beneath the interface with the allosteric inhibitor T7 lysozyme. T7 lysozyme traps T7 RNAP in the initiation complex where, while it is able to bind DNA, it only produces abortive transcripts. It is possible that a similar, but opposite in effect, allosteric mechanism occurs for the LARPs. While T7 lysozyme inhibits T7 RNAP (antagonism), the mutational inhibition of LARPs is recovered by an allosteric effector small molecule (agonism). To interrogate the ability of LARP-I to bind the promoter, Applicants performed fluorescence anisotropy experiments in replicate (values given as mean of n = 2), fluorescently labeling one strand of the DNA duplex at constant concentration and titrating T7 RNAP at two temperatures to test for temperature effects. In the absence of indole, TS recognized pT7 with a KD of approximately 1 nM at both 25 and 37°C, consistent with previous literature values for WT (Figure 18). At 500 μM indole, the measured KD of TS for pT7 was 1.1 nM at both 25 and 37°C. Thus, the binding was not significantly impacted by temperature or indole in the ranges tested for TS. In the absence of indole, LARP-I also bound pT7 DNA, albeit with reduced affinity (KD of approximately 30 nM at both 25 and 37°C) (Figure 18). In the presence of 500 μM indole, the KD for pT7 DNA changed to approximately 17 nM at both 25 and 37°C (Figure 18). Thus, LARP-I recognizes promoter DNA in the presence and absence of indole, and with minimal temperature differences.
Applicants also used circular dichroism to evaluate the secondary structure and the apparent melting temperature of LARP-I with and without indole. There was no significant difference in the secondary structure of LARP-I relative to the TS background (Figure 19A). While T7 RNAP is largely an alpha helical protein, the locations of the LARP mutations are close to beta strands from positions 720-770. There were no differences in ellipticity as a function of temperature at 222 nm (measuring alpha helical content) in the presence or absence of indole (Figure 19B). On the other hand, the presence of indole stabilized other secondary structures (measured by ellipticity at 208.5 nm) for LARP-I (Figure 19C), consistent with the expected binding location. This stabilization is also congruent with modest differences for the affinity of LARP-I with pT7 DNA observed in the presence of indole. In the absence of indole, LARP-I is still able to recognize pT7 DNA. Together, these results suggest a more nuanced mechanism of allostery, potentially similar to the T7 lysozyme binding inhibition.
Example 7: Materials and Methods.
Plasmid and plasmid library generation. All sequences were ordered as synthetic dsDNA gene blocks (eBlocks/gBlocks, IDT) or were PCR amplified using gene-specific primers (IDT) and commercially available kits. Plasmids were constructed by restriction cloning, Golden Gate assembly 1, Gibson assembly 2 or DNA HiFi MasterMix using commercially available kits, by site-directed mutagenesis using commercially available kits (Agilent QuikChange II or NEB Q5®), or by combinatorial nicking mutagenesis 3,4. Most plasmids were sequence verified using Oxford Nanopore (Plasmidsaurus). All other plasmids had sequence verification of the inserted T7 RNAP using Sanger sequencing (Genewiz). The base T7 RNAP plasmid pZB017-T7-His6-WT encoding T7 RNAP K378R with an N-terminal 6x-His Tag was constructed from pT7-911Q-His6-WT5, a gift from the Seelig Lab at the University of Minnesota. The His tag and K378R mutation appear to have no impact on the activity of the T7 RNAP in benchmark comparisons with commercially purchased T7 RNAP (data not shown). RNAP and LARP constructs containing a P. putida ribosome binding site and ATG start codons were restriction cloned into pJH0204, and the T7-sfGFP reporter was cloned into pJH0228, essentially as described 6. Construction of the T7 RNAP mutational library was previously described 1. Applicants used the high resolution structures available for T7 RNAP (PDB codes: 1CEZ, 1QLN, 1H38, 1MSW, 3E2E) 7-10 to identify mutations with (i) any heavy atom within 6 Å of the Cβ of position 727; and (ii) the Cα-Cβ vector pointing towards position 727. The following sets of mutations were encoded in the library:
725AGV; 727G; 735AGV; 737CDFGHILNQRSVY; 778ILV; 781HKNQRS; 782CDFGHILNRSVY; 785AGST; 786DEGHKNQRS; and 845CDFGHILNRSVY.
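The theoretical size of this library follows directly from the listed residue sets, since each position is varied independently. The short sketch below multiplies the number of allowed residues at each position; the dictionary and function names are illustrative only, and the residue sets are transcribed from the list above.

```python
from math import prod

# Allowed residues at each library position, transcribed from the text above.
LIBRARY = {
    725: "AGV",
    727: "G",  # fixed glycine (the W727G substitution)
    735: "AGV",
    737: "CDFGHILNQRSVY",
    778: "ILV",
    781: "HKNQRS",
    782: "CDFGHILNRSVY",
    785: "AGST",
    786: "DEGHKNQRS",
    845: "CDFGHILNRSVY",
}


def library_size(library):
    """Number of distinct protein variants: product of choices per position."""
    return prod(len(set(residues)) for residues in library.values())


def variable_positions(library):
    """Positions at which more than one residue is allowed."""
    return sorted(pos for pos, res in library.items() if len(set(res)) > 1)


size = library_size(LIBRARY)
n_variable = len(variable_positions(LIBRARY))
```

The product is 10,917,504 protein variants across nine variable positions (position 727 is fixed as glycine), which agrees with the "approximately 11 million members" and "nine positions" stated in Example 2.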
Strain development. Constructs were integrated into engineered genomic sites (serine integrase attP arrays) in P. putida AG4775 essentially as described 11 by co-transforming the Bxb1 or the TG1 recombinase with the construct in pJH0204 or pJH0228, respectively. Transformations were performed by electroporation. The non-essential tryptophanase gene tnaA was knocked out of the E. coli B1H USO strain yielding USO pyrF-/hisB-/tnaA-, as well as the E. coli phage infection strain MG1655 yielding MG1655 tnaA-. To knock out the indole-producing tryptophanase gene from both strains, a Cas9 recombineering protocol was used as described by Choudhury et al. and Morgenthaler et al. The genomes of E. coli BL21, K-12, and W were screened for appropriate gRNA sequences around the tnaA locus in the tnaCAB operon (Strain, Genomic Location, Accession Code; BL21, 3746691-3748106, CP010816.1; K-12, 3886753-3888168, U00096.2; W, 4111028-4112443, CP002185). Two NGG containing gRNA sequences in the middle of the tnaA gene (ACCATCACCAGTAACTCTGC (SEQ ID NO. 16); ctggctcaataacacgaatg (SEQ ID NO. 18)) were selected. Additionally, the DNA used for recombination (gBlock 21) was designed to remove 1032 bp, or more than 70% of the tnaA gene sequence. A genome PCR and Kovac’s reagent 14 indole test were conducted to verify successful knockout of the tnaA gene. Genome PCR was performed by picking a single colony into 1 mL of LB and growing to an OD600 of ~1. The culture was then pelleted at 17,000 x g and washed three times by resuspending in 1 mL nuclease free water and re-centrifuging. After the final wash and resuspension, cells were sonicated for 10 minutes. A 20 μL PCR was performed using primers at 1 μM [407, 408; 407, 409; 407, 410], 6 μL of the sonicated cells, and NEB Q5 MasterMix. The products from the genome PCR were run on a 2 wt% agarose-TAE gel with SYBR Safe, visualized with a Saferimage plate reader, and compared with the products from WT cells.
Kovac’s reagent was prepared (2 g p-dimethylamino-benzaldehyde, 10 mL 37% HCl, 30 mL isoamyl alcohol), mixed at 50 μL in 2 mL of a saturated bacterial culture, and observed for the formation of pink coloring. No pink color was observed in the selected knockout strains. Successful curing of temperature sensitive plasmids was verified by replica plating on LB-agar with and without chloramphenicol or ampicillin.
Protein Expression and Purification. Plasmids containing T7 RNAP proteins driven by a lac-inducible promoter were transformed into chemically competent E. coli BL21, grown at 37°C
in LB supplemented with 1 (w/v)% dextrose and appropriate antibiotic. At an OD600 of 0.6-1.0, protein expression was induced with 0.25 mM IPTG and allowed to express for 4 hours in a shaker for 6 hours at 30°C (variants for data reported in Supp Fig 2, Supp Fig 4), or for 16 hours at 18°C (all other data). After induction, cultures were immediately spun down at 4000 x g and stored at -80°C until purification. Proteins were purified by Ni-NTA affinity chromatography essentially as described by Rio et al., using BPER and Turbonuclease for lysis. Protein yields ranged from 5-50 mg/L of culture volume. Proteins were stored at 4°C in T7 storage buffer (50 mM Tris-HCl pH 7.9, 1 mM EDTA, 5 mM βME, 100 mM NaCl, 15% glycerol) until use.
Transcriptional Assay. Transcriptional activity was tested using an adaptation of a previously described fluorescent microtiter plate assay 15. Reactions were assembled in a Black Costar Round-bottom 96-well plate. A reaction mixture (90 μL total) and the T7 RNAP (10 μL total) were mixed and incubated at 37°C and the increase in fluorescence measured. The 100 μL reaction mixture consisted of 20 μL of 5x Transcription Buffer (640 mM HEPES pH 7.5, 112 mM MgCl2, 200 mM DTT, 6.4 mM Spermidine), 20 μL of 25 mM rNTP Mix, 10 μL 1 M KCl, 50 ng pT7-Spinach Template DNA (111 bp) or pT7-8Peppers (555 bp), 1 μL 10 mg/mL BSA, 1 μL 0.1 U/mL iPPase, 2 μL of 100 μM 3,5-difluoro-4-hydroxybenzylidene imidazolinone (for Spinach aptamer, DFHBI, in ethanol) or 1 μL of 100 μM 4-cyano-α-[[4-[(2-hydroxyethyl)methylamino]phenyl]methylene]benzeneacetonitrile (for Peppers aptamer, HBC530, in DMSO), 2 μL 1 (w/v)% Triton X-100, and a balance of Nuclease Free Water to 90 μL. This assembled reaction mixture sans enzyme was added to a Black Costar Plate Well containing 10 μL of 20 μM Protein and mixed by pipetting. The fluorescence readout (Spinach: Ex/Em 469/501; Peppers: Ex/Em 485/530) was performed every 60-72 sec for at least 60 minutes in a Biotek Synergy H1 plate reader set to 37°C. Transcriptional activity was determined as the slope of the increase in fluorescence (RFU/min). RFU/min was calculated by fitting a line to a moving window of 20 data points and reporting the largest value. For low activity, a correction to the max rate is used where the slope is multiplied by the correlation coefficient of the fit, and the highest value is reported.
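The rate calculation described above (a line fitted to a moving window of 20 data points, with the largest slope reported, optionally weighted by the correlation coefficient for low-activity samples) can be sketched as follows. This is a minimal illustration, not Applicants' analysis code: the function names are hypothetical, and interpreting "correlation coefficient" as the Pearson r of the windowed fit is an assumption.

```python
def _fit_line(t, y):
    """Ordinary least-squares slope and Pearson r for one window."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_t) ** 2 for a in t)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
    slope = sxy / sxx
    # A perfectly flat window has no defined correlation; treat it as r = 0.
    r = sxy / (sxx * syy) ** 0.5 if syy > 0 else 0.0
    return slope, r


def max_rate(time_min, rfu, window=20, weight_by_r=False):
    """Largest windowed slope (RFU/min) across the time course.

    weight_by_r=True applies the low-activity correction: each slope is
    multiplied by the correlation coefficient of its fit (assumed Pearson r).
    """
    if len(time_min) != len(rfu) or len(rfu) < window:
        raise ValueError("need equal-length series with at least `window` points")
    best = float("-inf")
    for start in range(len(rfu) - window + 1):
        slope, r = _fit_line(time_min[start:start + window],
                             rfu[start:start + window])
        best = max(best, slope * r if weight_by_r else slope)
    return best


# Toy trace: 20 min lag, then a linear rise of 5 RFU/min, read once per minute.
toy_time = [float(i) for i in range(60)]
toy_rfu = [0.0 if t < 20 else 5.0 * (t - 20) for t in toy_time]
toy_rate = max_rate(toy_time, toy_rfu)
```

Weighting by the correlation coefficient penalizes windows in which the fitted slope is dominated by noise, which is presumably why the correction is applied only to low-activity samples.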
In vivo selection for identification of LARPs. The T7 RNAP screen was developed from a previous selection used for screening Zinc finger nuclease target sequences. An auxotrophic strain, US0 pyrF-/hisB-, containing the homologous HIS3 and URA3 deletions was complemented with a plasmid containing the HIS3/URA3 genes under a pT7 promoter. Additionally, the gene for T7 RNAP was transformed under a secondary plasmid. Initial tests of growth screening on solid
media suggested the pT7/T7 RNAP system was toxic, as previously documented. To reduce the toxicity of T7 RNAP, different pT7 promoter strengths and T7 RNAP expression strengths were tested. Two plasmids were co-transformed into USO pyrF-/hisB- and plated on M9-Complete (M9 mineral salts with 6 g/L dextrose, 1.4 g/L Mix Amino Acids (-Trp, -Leu, -His, -Ura), 78 mg/L Trp, 22.4 mg Uracil, 380 mg Leucine, 0.5 g/L of Histidine and 0.2 g/L Yeast Extract) 1.8 (w/v)% agar plates with Tetracycline (10 μg/mL), Chloramphenicol (30 μg/mL), and Kanamycin (50 μg/mL) and grown overnight at 37°C. Fresh colonies were selected the next day into M9-his (M9 mineral salts with 6 g/L dextrose, 1.4 g/L Mix Amino Acids (-Trp, -Leu, -His, -Ura), 78 mg/L Trp, 22.4 mg Uracil, 380 mg Leucine) liquid culture and grown overnight. Overnight cultures were diluted to an OD600 of 0.01-0.04 into 3.5 mL of M9-his with 1 mM of the HIS3 competitive inhibitor 3-aminotriazole (3-AT). Growth was measured by OD600 in Hungate tubes. The ultimate screen used in this work had a lacUV5 promoter driving expression of full length T7 RNAP with a GUG initiation codon and the UmuD N-degron tag.
For library selections, 1800 ng of the previously developed library 1 was transformed into prepared electrocompetent cells of the selection strain (USO pyrF-/hisB-/tnaA-) already containing the pT7H3U3 plasmid (pZB260), resulting in 20E7 estimated transformants (range 10E7-31E7). The transformation was directly plated on large Corning bioassay (245x245 mm) plates with M9-Complete media. The resulting libraries were scraped with M9-Complete liquid media and prepared as glycerol stock (15 (w/v)% glycerol). The negative selection was performed by plating approximately 5E9 cells (10 OD-mL) on agar containing M9-Complete media and 1 mM 5-fluoroorotic acid (5-FOA), and incubated at 37°C overnight. After negative selection, plates were scraped with M9-his, washed two times by pelleting (4000 x g for 10 mins) and resuspending in M9-his, and either directly used for positive selection or prepared as glycerol stocks and stored at -80°C.
For positive selection, M9-his plates were prepared with 1 mM 3-amino-1,2,4-triazole and either 50 μM of indole or indole-5-carboxyaldehyde or ethanol as the solvent control. Cells passed through counter selection were plated on the selective media plates and grown for 20 hours at 37°C. The resulting colonies for each selection condition were scraped with M9-his media and prepared as glycerol stocks. DNA from resulting libraries was plasmid extracted and prepared for next generation sequencing essentially following method B of Kowalsky et al. The libraries were
sequenced on an Illumina MiSeq using 2 x 250 paired end reads by Rush University genomics core.
Next-generation sequencing. Raw reads from the sequencing were analyzed essentially as described in Smith et al. Briefly, fastq files were merged using FLASh with standard inputs. Merged sequences were then converted to fasta files. Next, sequences were sorted by correct length (416 bp) and absence of any ambiguous base-calls (e.g. “N” in sequence). Sequences that passed filtering were translated to an amino acid string using Biopython and screened for a proper initial sequence of three amino acids. Last, an identifying mutational string was generated for each read constrained to only encoded library positions and the prevalence of each mutational string was counted and divided by all observed sequences to obtain frequencies. These frequencies were then used in the subsequent PSERM analyses.
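The filtering and counting steps described above can be sketched in a few lines. This is a simplified stand-in, not the pipeline of Smith et al.: read merging with FLASh is assumed to have happened already, a small built-in codon table replaces Biopython so the example is self-contained, and the reading frame, the `prefix` check, and the `positions` argument are hypothetical parameters chosen for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order ('*' marks stop codons).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna, frame=0):
    """Translate DNA in the given frame, ignoring any trailing partial codon."""
    dna = dna.upper()[frame:]
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def variant_frequencies(reads, expected_len=416, prefix="", positions=(), frame=0):
    """Filter merged reads and return the frequency of each mutational string.

    reads        : iterable of merged read sequences
    expected_len : required read length in bp (416 in the text)
    prefix       : amino acids the translation must start with (hypothetical)
    positions    : 0-based protein indices of the library positions (hypothetical)
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Drop reads of the wrong length or with ambiguous base calls.
        if len(read) != expected_len or not set(read) <= set("ACGT"):
            continue
        protein = translate(read, frame)
        if not protein.startswith(prefix):
            continue
        counts["".join(protein[i] for i in positions)] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}


# Toy example with 9-bp "reads"; the library position is protein index 1.
toy_reads = ["ATGGGTGCT", "ATGGGTGCT", "ATGGTTGCT", "ATGGGTGCN", "ATGGGT"]
toy_freqs = variant_frequencies(toy_reads, expected_len=9, prefix="M", positions=(1,))
```

Here frequencies are normalized to the reads that pass all filters, which is one reading of "divided by all observed sequences"; the source does not specify the denominator more precisely.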
Planktonic Clonal Growth Assays. For clonal growth assays, LARP plasmids were freshly transformed into the selection strain containing the pT7-H3U3 plasmid (USO pyrF-/hisB-/tnaA-; pZB260) and dilution plated on M9C (+ 1 mM 5-FOA). Individual colonies were selected from M9-Complete (+ 1 mM 5-FOA) and grown overnight in M9-his media. The following morning, cultures were diluted 50 μL into 3.5 mL of M9-his (+3-AT) containing 50 μM ligand or ethanol as the solvent control in 14 mm Hungate tubes, resulting in a starting OD600 of 0.01-0.04. Hungate tubes were incubated aerobically at 37°C with shaking, and the OD600 was measured every 30-90 minutes for up to 8 hours or until cultures reached saturation (OD600 = 0.6).
GFP Reporter Assays. Swapping an sfGFP gene directly into the HIS3/URA3 genes of pZB260 did not yield sfGFP expression (data not shown), so an optimized pT7-sfGFP reporter plasmid (pZB604) was constructed with improved predicted translation rates using the RBS Calculator 26. For exogenous activation experiments, colonies were incubated in 400 μL of SOB containing the appropriate antibiotic and grown overnight for ~16 hours. Saturated cultures were then diluted 40 μL into 360 μL of fresh SOB containing the appropriate antibiotics and indole concentrations (or ethanol as a vehicle control) and grown for an additional 22 hours prior to measuring GFP. For endogenous activation experiments, colonies of each variant were picked into 400 μL of SOB containing the appropriate antibiotic and grown overnight. Saturated cultures were then diluted 40 μL into 360 μL of fresh SOB containing the appropriate antibiotics and indole concentrations (or solvent control) and grown for an additional 22 hours prior to measuring GFP. For both experiments, GFP (ex: 485 nm; em: 515 nm) was measured with 200 μL of undiluted
cultures in 96 well black costar round bottom plates (REF# 3792) and the pathlength corrected ODeoo was measured at 200 L total volume in 96 well clear flat bottom plates (fisherbrand; Cat. No. 12565501) following a 5x dilution into SOB. All measurements were performed using a Biotek Synergy Hl plate reader at room temperature and automatic gain. For co culturing experiments, colonies of each variant (all senders expressing mRFP; all receivers containing pT7- sfGFP as pZB604; sender: USO tnaA+; sender control: USO tna-; LARP-I expressing receiver: USO tnaA-; T7 KNAP116328 expressing receiver positive control: USO tna+/-; T7 RNAPR632SA 639A expressing receiver negative control: USO tna+/-) were picked into 400 L of SOB containing the appropriate antibiotic and grown overnight for ~16 hours. Saturated cultures were then measured for ODeoo, spun down, resuspended in fresh SOB containing the appropriate antibiotics, and mixed at 1 : 1 ratio of OD-mL to obtain approximately equivalent cell populations. Samples were then grown for 22 hours prior to measuring RFP/GFP fluorescence and the 5x dilution of the ODeoo at 200 L volume. The non-diluted samples were used for imaging (RFP Ex: 583 nm; Em: 609 nm).
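The 1:1 OD-mL mixing step for the co-cultures amounts to simple arithmetic, sketched below. The function names and the target OD-mL value are illustrative assumptions; the text specifies only the 1:1 OD-mL ratio and the 5x dilution used for OD600 readings.

```python
def undiluted_od(measured_od, dilution_factor=5.0):
    """Back-calculate the culture OD600 from a reading taken after dilution.

    Assumes the reading is in the linear range of the plate reader.
    """
    return measured_od * dilution_factor


def equal_od_ml_volumes(od_sender, od_receiver, od_ml_each):
    """Volumes (mL) of sender and receiver that each contribute od_ml_each.

    One OD-mL is 1 mL of culture at OD600 = 1, so the volume that carries a
    given OD-mL is od_ml_each / OD600. Mixing equal OD-mL of both cultures
    gives approximately equal cell numbers if the strains have a similar
    cells-per-OD relationship (an assumption).
    """
    if od_sender <= 0 or od_receiver <= 0:
        raise ValueError("OD600 values must be positive")
    return od_ml_each / od_sender, od_ml_each / od_receiver
```

For example, a sender at OD600 = 2.0 and a receiver at OD600 = 4.0 would be combined as 0.2 mL and 0.1 mL, respectively, to contribute 0.4 OD-mL each.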
Fluorescence imaging. Fluorescent images of the co-culture experiments reported in Fig. 2 used 5 µL of diluted culture (OD600 = 0.1) plated on a 2 cm x 1 cm section of minimal media plates (M9) with 1.5% (w/v) agarose. After cells dried, the agarose was inverted onto an ibidi µ-Slide well plate (Cat. No. 80427) and imaged on an Olympus microscope. Brightfield was used to find cells that were of suitable imaging density. Exposure time and fluorescent channel were manually set, and then the laser was opened (20% laser power). Exposure time was 2 ms for the brightfield channel, 100 ms for the GFP channel (Ex: 488 nm; Em: 520 nm), and 300 ms for the RFP channel (Ex: 594 nm; Em: 610 nm). Images were taken in the first few seconds following laser exposure and magnification adjustment to reduce bleaching. Images were processed in ImageJ (ref. 27) by combining the individual channels (BF, GFP, RFP) and adjusting brightness and contrast for entire channels to reduce over-exposure (BF-gray minimum/maximum 2700/4000; RFP-red/GFP-cyan minimum/maximum 100/4000). Representative images are shown in Figure 2F.
Phage Assays. Phage were amplified in 50 mL cultures of LB and titered following Kibby et al., and stored at 4°C following filtration through a 0.22 µm filter. For gp1-deficient phage strains, E. coli BL21*(DE3) was used to amplify the phage. Phage was diluted in the same LB media used for amplification. To perform phage plaque assays, MG1655 tnaA- was transformed with pZB513 (plasmid expressing WT T7 RNAP) and pZB578 (LARP-I). Single colonies were picked and grown overnight at 37°C in LB + 30 µg/mL chloramphenicol. The next day, LB overlay plates were prepared following Kibby et al. Briefly, 3 mL of melted 1.5% (w/v) LB-agar was added to 6 mL LB containing 30 µg/mL chloramphenicol, phage infection salts (10 mM MgCl2, 10 mM CaCl2, and 100 µM MnCl2), and indole or ethanol as a vehicle control. To this mixture, 100 µL of saturated overnight culture was added, immediately mixed, and plated on top of pre-prepared LB plates. The overlay agar was allowed to dry for at least 15 minutes and not more than 1 hour at room temperature. After plates were dried, 4 µL of phage serial dilutions were plated onto the overlay and allowed to dry. Plates were then covered and placed in a static incubator at 30°C for 14 hours. To perform phage clearance in planktonic cultures, single colonies of MG1655 tnaA- containing pZB513 (WT T7 RNAP) or pZB578 (LARP-I) were picked and grown for 6-8 hours in LB containing chloramphenicol and phage infection salts. Cultures were back-diluted to an OD600 of approx. 0.2 in LB containing 30 µg/mL chloramphenicol, phage infection salts (10 mM MgCl2, 10 mM CaCl2, and 100 µM MnCl2), and indole or ethanol as a vehicle control. Phage infection was carried out in 96-well plates with 200 µL total volume per well by adding 4 µL of phage dilutions at a multiplicity of infection (PFU/cell, assuming 5E8 E. coli per OD-mL) of 0.5 or 5.
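The multiplicity-of-infection arithmetic implied by the planktonic infection setup can be sketched as follows. The conversion factor of 5E8 cells per OD-mL, the OD600 of about 0.2, the 200 µL well volume, and the 4 µL phage addition come from the text; the function name and the simplification that the full 0.2 mL is treated as cell culture are assumptions.

```python
CELLS_PER_OD_ML = 5e8  # E. coli per OD-mL, as assumed in the text


def phage_titer_for_moi(moi, od600, culture_volume_ml, phage_volume_ml,
                        cells_per_od_ml=CELLS_PER_OD_ML):
    """Phage stock titer (PFU/mL) needed to reach a target MOI.

    cells      = cells_per_od_ml * OD600 * culture volume (mL)
    PFU needed = MOI * cells
    titer      = PFU needed / volume of phage added (mL)
    """
    cells = cells_per_od_ml * od600 * culture_volume_ml
    pfu_needed = moi * cells
    return pfu_needed / phage_volume_ml
```

Under these assumptions, a well at OD600 = 0.2 in 0.2 mL holds about 2E7 cells, so an MOI of 0.5 requires about 1E7 PFU, which corresponds to a phage dilution of roughly 2.5E9 PFU/mL when 4 µL is added; an MOI of 5 requires a tenfold higher titer.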
P. putida growth experiments. P. putida colonies with stable integrations of the T7 reporter and T7 RNAP or LARP constructs were grown aerobically overnight at 30°C, shaking at 250 rpm, in LB with kanamycin (50 µg/mL) and streptomycin (50 µg/mL). Cultures were back-diluted to an OD600 of 0.05 in fresh media, then indole (or ethanol as a vehicle control) was added at the concentrations indicated. Cultures were grown at 30°C, shaking at 250 rpm, prior to measurement of OD600 and GFP by flow cytometry. Induction of sfGFP was measured at 24 h of growth in indoles by flow cytometry. Per-cell GFP was quantified using the sorting gates shown in Figure 15.
Promoter Specificity Testing. Orthogonal polymerase-promoter specificities were tested by cloning the different promoter sequences in frame of the canonical pT7 sequence in pZB260, upstream of the HIS3/URA3 genes, resulting in pH3U3 promoter plasmids (pZB612: pCGG; pZB613: pCTGA; pZB616: pN4). Additionally, corresponding specificity loop mutations for T7 RNAP were incorporated into the pB1H1-LARP-I plasmid (pZB578), resulting in plasmids pZB643-LARP-I-pCGG, pZB644-LARP-I-pCTGA, and pZB647-LARP-I-pN4. All promoter and polymerase combinations (16) were transformed into the selection strain and plated on non-selective media. Fresh colonies of each combination were picked into 400 µL of M9-Complete media and grown for 12 hours to saturation (OD600 ~1.0). Cultures were diluted 100x into M9-his (+ 1 mM 3-AT), and 5 µL was plated on solid selective media M9-his (+1 mM 3-AT; +/- 100 µM indole) and counter-selective M9-Complete media (+1 mM 5-FOA; +/- 100 µM indole). Plates were incubated at 37°C for 18 hours before imaging.
Circular dichroism. Circular dichroism was performed on an Applied Photophysics Chirascan V100. Purified LARP-I was buffer exchanged into 20 mM NaPO4, pH 7.8, with 50 mM NaCl using 7 kDa Zeba 0.5 mL spin columns per the manufacturer's protocol and diluted to a concentration in the range of 0.1-0.4 mg/mL. 170 µL of sample was loaded into clean 0.5 mm cuvettes. Because of the 50 mM NaCl, usable spectra were limited to wavelengths of 195 nm and above. Spectral measurements were taken using 1 s time points at 2 nm bandwidth and 0.5 nm step size for LARP-I in the presence and absence of 500 µM indole. The ellipticity spectra of LARP-I from 195-260 nm were indistinguishable from TS at both 0 and 500 µM indole concentrations. For CD melt curves, the wavelength was maintained at either 208.5 nm or 222 nm, and the ellipticity was measured for 24 s between 0.5°C temperature steps as the sample was heated at 1°C/min.
Fluorescence Anisotropy. The canonical 17-nt T7 promoter TAATACGACTCACTATA (SEQ ID NO. 17), spanning positions -17 to -1 relative to the initiation start site, was purchased from IDT with 6-FAM fluorescein conjugated to the 5’ end of the forward promoter sequence. The fluorescent promoter was annealed with the unconjugated reverse complement sequence at 20 µM concentration in buffer containing 10 mM Tris, pH 7.8, 50 mM NaCl, and 1 mM EDTA, with the complementary strand in 20% excess (24 µM), by heating the mixture to 98°C for 1 minute and cooling to 4°C over 3 hours in a thermal cycler. Annealed primer was diluted to 2 nM in a buffer containing 33 mM HEPES, pH 7.8, 50 mM NaCl, and 10 mM MgCl2. Protein was exchanged into 33 mM HEPES, pH 7.8, 50 mM NaCl, and 10 mM MgCl2 using 7 kDa Zeba 0.5 mL spin columns per the manufacturer's protocol. Protein concentration was determined using A280 on a NanoDrop as well as by Bradford assay. Protein was serially diluted 22 times in 2x dilutions (1:1) or 1.5x dilutions (2:1) in the same buffer. Equivolume mixtures of DNA solution and protein dilutions (DNA final concentration: 1 nM; final volume: 20 µL per well) were prepared in 384-well plates in triplicate. Samples were allowed to incubate at 25°C or 37°C for at least 1 hour prior to measurement. Fluorescence polarization was measured using a Tecan Spark plate reader with automated settings for FAM-fluorescein channels (excitation wavelength of 483 nm; excitation bandwidth of 20 nm; emission wavelength of 529 nm; emission bandwidth of 20 nm; automatic gain; mirror: automatic, dichroic 510; 30 flashes; integration time: 40 µs; settling time: 100 ms; z-position: 20000 µm; G-factor calibrated from a buffer blank and reference sample of 1 nM fluorescent DNA in buffer).
Analysis of the fluorescence anisotropy data was performed using a Python package (github.com/nicklammer/AnisotropyBindingFit) to fit data to a 1:1 binding isotherm.
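For orientation, a 1:1 binding isotherm fit of anisotropy data can be sketched as below. This is not the cited package and does not reproduce its interface; the model form (quadratic solution that accounts for depletion of free protein at 1 nM labeled DNA), the function names, and the grid-search fitting strategy are all assumptions chosen to keep the example dependency-free.

```python
import math


def fraction_bound(protein_nM, dna_nM, kd_nM):
    """Fraction of labeled DNA bound for 1:1 binding (quadratic solution)."""
    s = protein_nM + dna_nM + kd_nM
    return (s - math.sqrt(s * s - 4.0 * protein_nM * dna_nM)) / (2.0 * dna_nM)


def fit_kd(protein_nM, anisotropy, dna_nM=1.0, kd_grid=None):
    """Fit Kd, free anisotropy, and bound anisotropy by grid search.

    Model: r = r_free + (r_bound - r_free) * fraction_bound. For a fixed Kd
    the model is linear in fraction_bound, so r_free and r_bound follow from
    ordinary least squares; the Kd with the lowest squared error is kept.
    """
    if kd_grid is None:
        # Log-spaced candidates from 0.01 nM to 100 µM (hypothetical range).
        kd_grid = [10 ** (e / 50.0) for e in range(-100, 251)]
    n = len(protein_nM)
    mean_r = sum(anisotropy) / n
    best = None
    for kd in kd_grid:
        fb = [fraction_bound(p, dna_nM, kd) for p in protein_nM]
        mean_fb = sum(fb) / n
        sxx = sum((f - mean_fb) ** 2 for f in fb)
        if sxx == 0.0:
            continue  # no variation in predicted binding at this Kd
        sxy = sum((f - mean_fb) * (r - mean_r) for f, r in zip(fb, anisotropy))
        slope = sxy / sxx
        intercept = mean_r - slope * mean_fb
        sse = sum((intercept + slope * f - r) ** 2
                  for f, r in zip(fb, anisotropy))
        if best is None or sse < best[0]:
            best = (sse, kd, intercept, intercept + slope)
    _, kd, r_free, r_bound = best
    return kd, r_free, r_bound
```

A call such as `fit_kd(concentrations, mean_anisotropy, dna_nM=1.0)` returns the best-fit Kd in nM along with the fitted free and bound anisotropy values. A gradient-based least-squares routine would give finer resolution than the grid used here.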
Claims
1. An engineered RNA polymerase (RNAP) comprising a polypeptide sequence having at least between 80% to 99%, or more sequence identity to the reference sequence of SEQ ID NO. 1, or a functional fragment thereof, wherein the engineered RNA polymerase comprises a substitution at position W727 in said polypeptide sequence, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to the SEQ ID NO. 1 or 3, wherein as a result of said mutation the RNAP has a phenotype of being responsive to indole, an indole derivative, or an indole-similar compound, as compared to an RNA polymerase lacking said mutation.
2. The engineered RNAP of any of claims 1, wherein said substitution at position 727 comprises a substitution mutation selected from W727G, W727A, or W727V.
3. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions S430, N433, S633, W727, F849, and F880.
4. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions S430P, N433T, S633P, W727G, F849I, and F880Y.
5. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions K378, S430, N433, S633, W727, Q737, N781, S785, Q786, F845, F849, and F880.
6. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions K378R, S430P, N433T, S633P, W727G, Q737L, N781S, S785A, Q786H, F845L, F849I, and F880Y.
7. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions K378, S430, N433, S633, W727, Q737, I778, S785, Q786, F849, and F880.
8. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution set at positions K378R, S430P, N433T, S633P, W727G, Q737C, I778L, S785A, Q786H, F849I, and F880Y.
9. The engineered RNAP of any of claims 1-8, wherein the indole-similar compound is selected from indoline, quinoline, and isoquinoline.
10. The engineered RNAP of any of claims 1-9, wherein the indole derivative is indole-5-carboxyaldehyde (I-5CHO).
11. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution selected from Q737L, N781S, S785A, Q786H, F845L, or any combinations thereof.
12. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises a substitution selected from V725X, Q737L, Q737C, Q737F, Q737V, Q737I, Q737H, I778L, I778V, N781S, N781H, F782Y, S785A, Q786H, F845Y, F845L, F845C, or any combinations thereof.
13. The RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, N781S, S785A, Q786H, F845L.
14. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, S785A, Q786H, and F845Y.
15. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, I778L, S785A, Q786H, and F845L.
16. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, S785A, Q786H, and F845Y.
17. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, I778L, S785A, and Q786H.
18. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, N781S, S785A, and Q786H.
19. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, N781H, S785A, Q786H, and F845L.
20. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, S785A, Q786H, and F845L.
21. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, S785A, and Q786H.
22. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, S785A, Q786H, and F845C.
23. The RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737F, S785A, Q786H, and F845L.
24. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, N781H, S785A, and Q786H.
25. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737V, N781H, S785A, Q786H, and F845L.
26. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737I, N781S, S785A, Q786H, and F845L.
27. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737H, N781S, S785A, Q786H, and F845L.
28. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations N781S, S785A, Q786H, and F845L.
29. The RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, F782Y, S785A, Q786H, and F845L.
30. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737L, I778V, S785A, Q786H, and F845L.
31. The engineered RNAP of any of claims 1-2, wherein said polypeptide further comprises substitution mutations Q737C, S785A, Q786H, and F845L.
32. The engineered RNAP of any of claims 1-31 wherein said polypeptide further comprises a MIG substitution.
33. The engineered RNAP of any of claims 1-31 wherein said polypeptide further comprises an N-terminal degron sequence.
34. A nucleic acid molecule encoding the engineered RNAP, or a functional fragment thereof, according to any of claims 1-33.
35. An expression vector comprising the nucleic acid molecule according to claim 34, operably linked to an expression control sequence.
36. An isolated cell transformed with the expression vector according to claim 35, and capable of expressing the engineered RNA polymerase.
37. The cell of claim 36, wherein said cell is further modified to disrupt or knock out one or more genes directed to the production of indole, one of its derivatives, or an indole-similar compound.
38. The cell of claim 37, wherein said cell is modified to disrupt or knock out expression of tryptophanase (TnaA).
39. A composition comprising an isolated engineered RNAP according to any of claims 1-31.
40. A method of amplifying a target nucleic acid in a sample in an isothermal transcription based nucleic acid amplification reaction, comprising contacting the sample with:
- a primer pair comprising a first promoter-oligonucleotide and a second oligonucleotide for amplification of the target nucleic acid;
- an effective amount of indole, an indole derivative, or an indole-similar compound; and
- the engineered RNA polymerase according to any of claims 1-31, under conditions whereby the isothermal transcription based nucleic acid amplification reaction can occur to amplify the target nucleic acid.
41. An enzyme mixture for use in an isothermal transcription based nucleic acid amplification reaction comprising:
- the engineered RNA polymerase according to any of claims 1-31;
- an enzyme having reverse transcriptase activity and optional RNase H activity; and
- an effective amount of indole, an indole derivative, or an indole-similar compound.
42. A method of inducing expression in a cell, comprising the steps:
- transforming a cell with the expression vector encoding an engineered RNAP according to any of claims 1-31, wherein the cell is genetically modified such that it does not endogenously produce indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline; and
- introducing an effective amount of exogenous indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline, which activates the engineered RNAP.
43. A method of inducing expression in a cell, comprising the steps:
- transforming a cell with the expression vector encoding an engineered RNAP according to any of claims 1-31, wherein the cell produces endogenous indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline that activates the engineered RNAP.
44. A method of inducing expression in a cell, comprising the steps:
- transforming a first cell with the expression vector encoding an engineered RNAP according to any of claims 1-31, wherein the cell is genetically modified such that it does not endogenously produce indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline; and
- introducing a second cell that produces endogenous indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline that activates the engineered RNAP of the first cell.
45. A method of any of claims 42-44, wherein expression of endogenous indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline is operably linked to an inducible promoter.
46. A method of claim 44, wherein the steps of introducing comprises the step of co-culturing.
47. The method of any of claims 42-44, wherein the cell is further transformed by an expression vector having a target gene, operably linked to a promoter, configured to be transcribed by the engineered RNAP.
48. The method of claim 47, wherein the promoter comprises a T7 RNAP promoter.
49. A method of modulating the gut microbiome in a subject in need thereof, comprising the steps of administering a pharmaceutical composition containing an effective amount of engineered RNA polymerase according to any of claims 1-31, and a pharmaceutically acceptable carrier.
50. A method of claim 49, wherein said effective amount comprises a cell expressing an effective amount of engineered RNA polymerase according to any of claims 1-31.
51. A method of claim 50, wherein said cell comprises a bacterium that is further genetically modified such that it does not endogenously produce indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline.
52. A method of claim 51, wherein the bacterium is further transformed by an expression vector having a target gene, operably linked to a promoter, configured to be transcribed by the engineered RNAP when activated by exogenous indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline.
53. A method of claim 52, wherein said bacterium comprises a probiotic.
54. An engineered RNA polymerase (RNAP) polypeptide, or functional fragment thereof, selected from SEQ ID NO’s. 19-21, wherein the engineered RNAP is responsive to indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline.
55. A nucleotide sequence encoding an engineered RNA polymerase (RNAP) selected from SEQ ID NO’s. 19-21, or functional fragment thereof, wherein the engineered RNAP is responsive to indole, indole-5-carboxyaldehyde (I-5CHO), indoline, quinoline, or isoquinoline.
56. An expression vector comprising the nucleic acid molecule according to claim 55, operably linked to an expression control sequence.
57. A cell transformed by the expression vector of claim 56, and capable of expressing the engineered RNA polymerase.
58. The cell of claim 57, wherein said cell is further modified to disrupt or knock out one or more genes directed to the production of indole, one of its derivatives, or an indole similar compound.
59. The cell of claim 58, wherein said cell is modified to disrupt or knock out expression of tryptophanase (TnaA).
60. The engineered RNAP of any of the preceding claims, wherein one or more substitutions comprises a conservative amino acid substitution.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463566882P | 2024-03-18 | 2024-03-18 | |
| US63/566,882 | 2024-03-18 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025199099A1 (en) | 2025-09-25 |
Family
ID=97140263
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2025/020365 Pending WO2025199099A1 (en) | 2024-03-18 | 2025-03-18 | Ligand dependent post-translational control of engineered rna polymerase (rnap) mutants from bacteriophage t7 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025199099A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120064577A1 (en) * | 2010-09-13 | 2012-03-15 | Enzo Biochem, Inc. | Mutant T7 polymerases |
| US20190002851A1 (en) * | 2017-06-30 | 2019-01-03 | Codexis, Inc. | T7 rna polymerase variants |
| US20230174957A1 (en) * | 2020-05-11 | 2023-06-08 | Thermo Fisher Scientific Baltics Uab | Mutant polymerases and methods of using the same |
| US20240043823A1 (en) * | 2017-07-11 | 2024-02-08 | Synthorx, Inc. | Incorporation of unnatural nucleotides and methods thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25774468; Country of ref document: EP; Kind code of ref document: A1 |