WO2025128629A1 - Unnatural amino acids, bioreactive proteins, and uses thereof - Google Patents
Unnatural amino acids, bioreactive proteins, and uses thereof
- Publication number
- WO2025128629A1 (application PCT/US2024/059463)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- substituted
- unsubstituted
- protein
- amino acid
- receptor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07F—ACYCLIC, CARBOCYCLIC OR HETEROCYCLIC COMPOUNDS CONTAINING ELEMENTS OTHER THAN CARBON, HYDROGEN, HALOGEN, OXYGEN, NITROGEN, SULFUR, SELENIUM OR TELLURIUM
- C07F9/00—Compounds containing elements of Groups 5 or 15 of the Periodic Table
- C07F9/02—Phosphorus compounds
- C07F9/28—Phosphorus compounds with one or more P—C bonds
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07F—ACYCLIC, CARBOCYCLIC OR HETEROCYCLIC COMPOUNDS CONTAINING ELEMENTS OTHER THAN CARBON, HYDROGEN, HALOGEN, OXYGEN, NITROGEN, SULFUR, SELENIUM OR TELLURIUM
- C07F9/00—Compounds containing elements of Groups 5 or 15 of the Periodic Table
- C07F9/02—Phosphorus compounds
- C07F9/547—Heterocyclic compounds, e.g. containing phosphorus as a ring hetero atom
- C07F9/6564—Heterocyclic compounds, e.g. containing phosphorus as a ring hetero atom having phosphorus atoms, with or without nitrogen, oxygen, sulfur, selenium or tellurium atoms, as ring hetero atoms
- C07F9/6581—Heterocyclic compounds, e.g. containing phosphorus as a ring hetero atom having phosphorus atoms, with or without nitrogen, oxygen, sulfur, selenium or tellurium atoms, as ring hetero atoms having phosphorus and nitrogen atoms with or without oxygen or sulfur atoms, as ring hetero atoms
- C07F9/6584—Heterocyclic compounds, e.g. containing phosphorus as a ring hetero atom having phosphorus atoms, with or without nitrogen, oxygen, sulfur, selenium or tellurium atoms, as ring hetero atoms having phosphorus and nitrogen atoms with or without oxygen or sulfur atoms, as ring hetero atoms having one phosphorus atom as ring hetero atom
- C07F9/65842—Cyclic amide derivatives of acids of phosphorus, in which one nitrogen atom belongs to the ring
- C07F9/65846—Cyclic amide derivatives of acids of phosphorus, in which one nitrogen atom belongs to the ring the phosphorus atom being part of a six-membered ring which may be condensed with another ring system
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/67—General methods for enhancing the expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/93—Ligases (6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y601/00—Ligases forming carbon-oxygen bonds (6.1)
- C12Y601/01—Ligases forming aminoacyl-tRNA and related compounds (6.1.1)
- C12Y601/01026—Pyrrolysine-tRNAPyl ligase (6.1.1.26)
Definitions
- SuFEx click chemistry via the latent aryl fluorosulfate group has demonstrated value in aiding modular organic synthesis, chemical biology, and drug development.
- the inventors incorporated fluorosulfate-L-tyrosine (FSY) into proteins for protein crosslinking and generating covalent protein drugs.
- FSY fluorosulfate-L-tyrosine
- SUMMARY [0005] Provided herein are compounds having the following structures or stereoisomers thereof: nd wherein Z is sulfur o -, or –O-; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- Z is sulfur.
- Z is phosphorus.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain having the following structures: , integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- Z is sulfur.
- Z is phosphorus.
- protein conjugates having the following structures: ; is a bond, -(CH2)1-5-, -O-(CH2)1-5-, or –O-; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group; R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl; and L 2 and L 3 are as defined herein.
- Z is sulfur.
- Z is phosphorus.
- L 2 is a bond and L 3 -R 5 .
- FIGS.1A-1H show the design, synthesis, and genetic incorporation of PFY into proteins in E. coli and mammalian cells.
- FIG.1A Structure of PFY.
- FIG.1B PFY to react with nucleophilic residues in close proximity via proximity-enabled PFEx reactivity.
- FIG.1C Chemical synthesis of PFY.
- FIG.1D SDS-PAGE and Western blot analysis of PFY incorporation into Afb(36TAG) by the tRNA Pyl /PFYRS in E. coli.
- FIG.1E Bright field and fluorescence microscopic images of HEK-293T cells.
- FIG.1F Western blot analysis of HEK-293T cells in (J).
- FIG.1G Flow cytometric analysis of HEK-293T cells, which were transfected with the tRNA Pyl /PFYRS and the EGFP(182TAG) gene and grown in the presence of various concentrations of PFY for 24 h and 48 h.
- FIG.1H Flow cytometric quantification of fluorescence intensity of the HEK-293T cells in (I) incubated with PFY for 48 h.
- FIGS.2A-2E show that PFY reacts with proximal His, Tyr, Lys, and Cys in proteins through PFEx.
- the reaction occurs at an acidic pH.
- the acidic pH is from 5 to 6.9.
- the acidic pH is from 5.5 to 6.5.
- FIG.2A Structure of the Afb-Z complex (PDB code: 1LP1) showing D36 in Afb for PFY incorporation and N6 in Z protein for mutation to different residues.
- FIG.2B Western blot analysis of Afb(36PFY) protein incubation with Z(6X) protein. X represents mutated residues.
- FIG.2C SDS-PAGE analysis of Afb(36PFY) protein incubation with Z(6X) protein.
- FIG.2D Structure of ecGST (PDB code: 1A0F) showing residue T103 in one monomer for PFY incorporation and the proximal residue H106 in the other monomer for mutations.
- FIG.2E Western blot analysis of ecGST dimeric cross-linking in HEK-293T cells. ecGST(103PFY/106X) was expressed in HEK-293T cells and the cell lysate was probed with anti-Hisx6 antibody to detect the Hisx6 tag appended at the C-terminus of ecGST.
- FIGS.3A-3D show that basic pH increases PFY-Tyr cross-linking but decreases PFY-His cross-linking.
- FIG.3A SDS-PAGE analysis of Afb(36PFY) cross-linking with MBP-Z(6X) at pH 7.4 and pH 8.8.
- FIG.3B Cross-linking efficiency between Afb(36PFY) and MBP-Z(6His) or MBP-Z(6Tyr) at pH 7.4 and pH 8.8.
- FIG.3C SDS-PAGE analysis of Afb(36PFY) or Afb(36FSY) cross-linking with MBP-Z(6His) at various pH values.
- FIG.3D changes in cross-linking efficiency with varying pH for reaction of Afb(36PFY) or Afb(36FSY) with MBP-Z(6His).
- FIGS.4A-4F show that temperature affects PFY reaction with Tyr and His differently.
- FIG.4A Scheme showing how Afb(36PFY) was incubated with MBP-Z(6X) and then treated at different temperature.
- FIG.4B SDS-PAGE analysis of Afb(36PFY) cross-linking with MBP-Z(6X) under different incubation and treatment temperatures.
- FIG.4C Cross-linking efficiency of Afb(36PFY) with MBP-Z(6Tyr) at different temperatures measured from (B) using densitometry.
- FIG.4D Cross-linking efficiency of Afb(36PFY) with MBP-Z(6His) at different temperatures measured from (B) using densitometry.
- FIG.4E thermal sensitivity of the P(V)-N linkage resulting from PFY reaction with His.
- Top SDS-PAGE analysis of the Afb(36PFY)/MBP-Z(6His) cross-linking product after it was subjected to various temperatures for durations of 5 and 10 minutes.
- Bottom the cross-linking efficiency of Afb(36PFY) with MBP-Z(6His) was evaluated after their cross-linked product was exposed to various temperatures for 5 and 10 min.
- FIG.4F PFY exhibited enhanced durability compared with FSY in proteins.
- the figure shows the remaining activity of Afb(36PFY) and Afb(36FSY) after incubation at 37°C for the indicated number of days.
- the cross-linking efficiency of Afb(36PFY) and Afb(36FSY) with MBP-Z(6His) was measured to quantify their remaining reactivity.
- FIGS.5A-5G show that Na2SiO3 increases PFY reaction with Tyr and Cys but decreases PFY reaction with His.
- FIG.5A Structure of the Afb-Z complex (PDB code: 1LP1) showing E24 in Z protein for PFY incorporation and K7 in Afb for mutation to Cys, Tyr, or His.
- FIGS.5B-5D SDS-PAGE analysis (top panel) and quantification of MBP-Z(24PFY) cross-linking with Afb(7X) (bottom panel) in the presence of different concentrations of Na2SiO3.
- B Afb(7Cys);
- C Afb(7Tyr);
- D Afb(7His).
- FIG.5E Structure of the Afb-Z complex (PDB code: 1LP1) showing D36 in Afb for PFY incorporation and N6 in Z protein for mutation to Tyr, or His.
- FIGS.5F-5G SDS-PAGE analysis (top panel) and quantification of Afb(36PFY) cross-linking with MBP-Z(6X) (bottom panel) in the presence of different concentrations of Na2SiO3.
- F MBP-Z(6Tyr);
- G MBP-Z(6His).
- FIGS.6A-6I show that genetic incorporation of PFK expands protein cross-linking unreachable by PFY in vitro and in cells.
- FIG.6A Structure of PFK.
- FIG.6B Western blot analysis of PFK incorporation in mNb6(54TAG) in E. coli cells.
- FIG.6C Western blot analysis of PFK incorporation in EGFP(182TAG) in HEK-293T cells.
- FIG.6D Structure of nanobody mNb6 binding with the Spike protein of SARS-CoV-2 (PDB code 7KKL). Sites 50-59 for PFK incorporation are colored in purple. R54 in mNb6 and the target residue Y351 in the Spike are shown in stick.
- FIG.6E Western blot analysis of cross-linking between the Spike protein’s receptor binding domain (RBD) and mNb6 mutants with PFK incorporated at the indicated sites.
- FIG.6F Western blot comparison between mNb6(54PFK) and mNb6(54PFY) for cross-linking with the Spike protein’s RBD.
- FIG.6G Structure of ecGST (PDB code: 1A0F) showing residue T103 in one monomer for PFK/PFY incorporation and the proximal residue C10 in the other monomer for mutations.
- FIG.6H Western blot analysis of HEK-293T cell lysate of cells expressing ecGST mutants with PFK/PFY incorporated at site 103 and different mutations at site 10.
- FIG.7 shows the chemical synthesis of PFK.
- FIGS.8A-8B show that PFY was nontoxic to E. coli and HEK-293T cells at 4 mM and 2 mM concentration, respectively.
- FIG.9 shows the suppression of TAG in the EGFP-182TAG gene in E. coli by tRNA Pyl /mFSYRS or tRNA Pyl /NpYRS in the presence of different concentrations of PFY.
- OD600-normalized fluorescence intensity of EGFP measured from cells showed that mFSYRS was able to incorporate PFY more efficiently than NpYRS.
- FIGS.10A-10C show that PFY reactivity in proteins was proximity driven.
- FIG.10A Structure of nanobody SR4 in complex with the Spike protein of SARS-CoV-2 (PDB code: 7C8V).
- FIGS.10B-10C SDS-PAGE analysis of the cross-linking of the Spike protein’s receptor binding domain (RBD) with either SR4(54PFY) or SR4(57PFY).
- FIGS.11A-11B show the P(V)-S linkage resultant from PFY-Cys reaction was stable at 95 °C.
- FIG.11A SDS-PAGE analysis of MBP-Z(24PFY) cross-linking with Afb(7Cys). The 95 °C, 37 °C, and 4 °C samples were prepared using the same incubation and treatment temperatures as described in FIG. 4A. Na2SiO3 (2 mM) was added to boost the PFY reaction with Cys as described in FIG. 5.
- FIG.12 shows flow cytometric analysis of PFK incorporation into GFP in HeLa cells.
- HeLa-GFP(182TAG) reporter cells were transfected with the tRNA Pyl /PFKRS genes and grown in the absence or presence of 1 mM PFK for 24 h.
- FIG.13 is an SDS-PAGE analysis of affibody(36PFY) at left and affibody(36PFY/32R) at right crosslinking with MBP-Z (N6Y) protein.
- FIGS.14A-14B show primers for cloning as described in the examples.
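- PFY refers to a compound having the following structure: [structure not reproduced in this text].
- PFK refers to a compound having the following structure: [structure not reproduced in this text].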
- the stereoisomer should be considered as encompassed by the compound in each instance, whether or not it is specifically stated.
- the compounds described herein e.g., PFY and PFK
- the term “antibody” is used according to its commonly known meaning in the art. Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases.
- pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to V H -C H1 by a disulfide bond.
- F(ab)' 2 is used interchangeably with “Fab dimer.”
- the F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)' 2 dimer into an Fab' monomer.
- the Fab' monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.1993)).
- Fab fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (e.g., McCafferty et al., Nature 348:552-554 (1990)).
- a natural antibody molecule contains two identical pairs of polypeptide chains, each pair having one light chain and one heavy chain.
- Each light chain and heavy chain in turn consists of two regions: a variable (“V”) region involved in binding the target antigen, and a constant (“C”) region that interacts with other components of the immune system.
- the light and heavy chain variable regions come together in 3-dimensional space to form a variable region that binds the antigen (for example, a receptor on the surface of a cell).
- the complementarity determining regions CDRs
- an “antibody variant” as provided herein refers to a polypeptide capable of binding to a receptor protein or an antigen and including one or more structural domains of an antibody or fragment thereof.
- Non-limiting examples of antibody variants include single-domain antibodies (nanobodies), affibodies (polypeptides smaller than monoclonal antibodies and capable of binding receptor proteins or antigens with high affinity and imitating monoclonal antibodies), antigen-binding fragments (Fab), Fab dimers (monospecific Fab2, bispecific Fab2), trispecific Fab3, monovalent IgGs, single-chain variable fragments (scFv), bispecific diabodies, trispecific triabodies, scFv-Fc, minibodies, IgNAR, V-NAR, hcIgG, VhH, and peptibodies.
- a “peptibody” as provided herein refers to a peptide moiety attached (through a covalent or non-covalent linker) to the Fc domain of an antibody.
- a “single-domain antibody” or “nanobody” refers to an antibody fragment having a single monomeric variable antibody domain. Like a whole antibody, it is able to bind selectively to a specific antigen.
- the single domain antibody is a human or humanized single-domain antibody.
- a single-chain variable fragment (scFv) is typically a fusion protein of the variable regions of the heavy (VH) and light chains (VL) of immunoglobulins, connected with a short linker peptide of 10 to about 25 amino acids.
- the linker is usually rich in glycine for flexibility, as well as serine or threonine for solubility.
- the linker can either connect the N-terminus of the VH with the C-terminus of the VL, or vice versa.
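The linker properties stated above (10 to about 25 residues, rich in glycine, with serine or threonine for solubility) can be checked with a small helper. This is only an illustrative sketch: the function name `describe_linker` and the example (GGGGS)3 sequence are assumptions introduced here, not taken from the patent, and no numeric threshold for "rich in glycine" is implied.

```python
def describe_linker(seq: str) -> dict:
    """Summarize a candidate scFv linker against the properties named in the text.

    Hypothetical helper: it reports length and composition only and does not
    decide whether a linker is suitable.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("linker sequence must not be empty")
    length = len(seq)
    return {
        "length": length,
        # The text gives a typical range of 10 to about 25 amino acids.
        "length_in_typical_range": 10 <= length <= 25,
        "glycine_fraction": seq.count("G") / length,
        "ser_thr_fraction": (seq.count("S") + seq.count("T")) / length,
    }


# Example with a commonly used glycine/serine repeat (an assumption, not from the source).
example = describe_linker("GGGGS" * 3)
print(example["length"], example["length_in_typical_range"])
```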
- the term “target protein” refers to a targeting molecule having, e.g., a regulatory role in a cell.
- a “target protein” is a receptor protein, a cytosolic protein, a transcriptional factor, or an enzyme.
- a receptor protein is an extracellular domain receptor protein, a transmembrane domain receptor protein, or an intracellular domain receptor protein.
- the “target protein” is the binding target of the antibody or antibody variant described herein.
- the protein (e.g., antibody or antibody variant) described herein is capable of inhibiting or activating the biological activity of the target protein upon binding. In embodiments, the activity of the target protein is increased or decreased.
- “Receptor protein” or “membrane receptor” refers to a receptor (protein) that is embedded in the plasma membrane of a cell. In embodiments, the receptor protein is located in the extracellular domain of a cell, the transmembrane domain of a cell, or the intracellular domain of a cell. In embodiments, the receptor protein is a cell-surface receptor. In embodiments, the receptor protein is in the extracellular domain. In embodiments, the receptor protein is in the transmembrane domain.
- the receptor protein is an ion channel-linked receptor, an enzyme-linked receptor, or a G protein-coupled receptor. In embodiments, the receptor protein is a hormone receptor.
- the term “peptidyl moiety” as used herein refers to a protein, protein fragment, or peptide that may form part of a biomolecule or a biomolecule conjugate. In aspects, the peptidyl moiety forms part of a biomolecule (e.g., protein). In aspects, the peptidyl moiety forms part of a biomolecule (e.g., protein) conjugate. The peptidyl moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- nucleic acid refers to nucleotides (e.g., deoxyribonucleotides or ribonucleotides) and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof.
- polynucleotide refers, in the usual and customary sense, to a linear sequence of nucleotides.
- nucleotide refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof. Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and RNA. Examples of nucleic acid, e.g. polynucleotides contemplated herein include any types of RNA, e.g.
- a polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA).
- the term “polynucleotide sequence” is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
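As a small illustration of treating a polynucleotide as an alphabetical string, the sketch below validates a DNA-letter sequence and rewrites it in RNA letters by substituting uracil (U) for thymine (T), as described above. The function name `dna_to_rna_letters` is hypothetical, and the helper only changes the alphabet; it does not model transcription from a template strand.

```python
DNA_BASES = set("ACGT")


def dna_to_rna_letters(dna: str) -> str:
    """Return the same sequence written with U in place of T.

    Raises ValueError if the input contains letters other than A, C, G, T.
    """
    seq = dna.upper()
    unexpected = set(seq) - DNA_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


print(dna_to_rna_letters("ATGTAG"))  # AUGUAG
```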
- Polynucleotides may optionally include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleotides.
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
- non-naturally occurring amino acid and “unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature.
- the unnatural amino acid is an amino acid described herein, including embodiments thereof.
- Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
- amino acid side chain refers to the functional substituent contained on amino acids.
- an amino acid side chain may be the side chain of a naturally occurring amino acid.
- Naturally occurring amino acids are those encoded by the genetic code (e.g., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, or valine), as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- the amino acid side chain may be a non-natural amino acid side chain.
- the amino acid side chain is H, refers to the functional substituent of compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium, allylalanine, 2-aminoisobutyric acid.
- Non-natural amino acids are non-proteinogenic amino acids that either occur naturally or are chemically synthesized. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Non-limiting examples include exo-cis-3-aminobicyclo[2.2.1]hept-5-ene-2-carboxylic acid hydrochloride, cis-2-aminocycloheptane-carboxylic acid hydrochloride, cis-6-amino-3-cyclohexene-1-carboxylic acid hydrochloride, cis-2-amino-2-methylcyclohexanecarboxylic acid hydrochloride, cis-2-amino-2-methylcyclopentane-carboxylic acid hydrochloride, 2-(Boc-aminomethyl)benzoic acid, 2-(Boc-amino)octanedioic acid, Boc-4,5-dehydro-Leu-OH (dicyclohexylammonium), Boc-4-(Fmoc-amino)-L-phenylalanine, Boc-β-Homopyr-OH, Boc-(2-indanyl)-Gly-OH, 4-Bo
- the following eight groups each contain amino acids that are conservative substitutions for one another: (i) Alanine (A), Glycine (G); (ii) Aspartic acid (D), Glutamic acid (E); (iii) Asparagine (N), Glutamine (Q); (iv) Arginine (R), Lysine (K); (v) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); (vi) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); (vii) Serine (S), Threonine (T); and (viii) Cysteine (C), Methionine (M).
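The eight groups listed above can be encoded directly as sets of one-letter codes. The following is a minimal sketch; the names `CONSERVATIVE_GROUPS` and `is_conservative_substitution` are hypothetical, and treating identical residues as "not a substitution" is a choice made here for illustration. Note that methionine (M) appears in both group (v) and group (viii).

```python
# One set per group, in the order (i) to (viii) given in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # (i)    Alanine, Glycine
    set("DE"),    # (ii)   Aspartic acid, Glutamic acid
    set("NQ"),    # (iii)  Asparagine, Glutamine
    set("RK"),    # (iv)   Arginine, Lysine
    set("ILMV"),  # (v)    Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # (vi)   Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # (vii)  Serine, Threonine
    set("CM"),    # (viii) Cysteine, Methionine
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """Return True if both residues differ and share at least one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution (assumption)
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative_substitution("I", "V"))  # True
print(is_conservative_substitution("A", "D"))  # False
```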
- polypeptide refers to a polymer of amino acid residues.
- the polymer of amino acids may, in embodiments, be conjugated to a moiety that does not consist of amino acids.
- the terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
- a “fusion protein” refers to a chimeric protein encoding two or more separate protein sequences that are recombinantly expressed as a single moiety.
- amino acid or nucleotide base “position” is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5'-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence. For example, in a case where a variant has a deletion relative to an aligned reference sequence, there will be no amino acid in the variant that corresponds to a position in the reference sequence at the site of deletion.
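The relationship between reference numbering and test-sequence numbering can be sketched from a pairwise alignment. The code below assumes the two sequences are already aligned to equal length with "-" as the gap character; the function name `map_reference_positions` and the short example sequences are hypothetical.

```python
from typing import Dict, Optional


def map_reference_positions(
    aligned_ref: str, aligned_test: str, gap: str = "-"
) -> Dict[int, Optional[int]]:
    """Map each 1-based reference position to the 1-based test position.

    A reference position maps to None when the test sequence has a deletion
    (a gap) at that alignment column.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    test_pos = 0
    for ref_char, test_char in zip(aligned_ref, aligned_test):
        if test_char != gap:
            test_pos += 1  # count residues actually present in the test sequence
        if ref_char != gap:
            ref_pos += 1   # count residues present in the reference sequence
            mapping[ref_pos] = test_pos if test_char != gap else None
    return mapping


# The variant lacks the residue at reference position 3.
print(map_reference_positions("MKTAYIA", "MK-AYIA"))
```

In this example reference position 4 corresponds to position 3 when counting from the N-terminus of the variant, which is the numbering shift the definition describes.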
- Percentage of sequence identity is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
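The calculation described above (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched as follows. The sketch assumes the inputs are already optimally aligned, of equal length, and use "-" for gaps; the function name `percent_identity` is hypothetical, and the alignment step itself (e.g., by BLAST) is outside its scope.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned comparison window.

    Matched positions are columns where both sequences carry the same
    residue (a gap never counts as a match). The denominator is the total
    number of columns in the window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Four matching columns out of six in the window.
print(round(percent_identity("ACGT-A", "ACCTGA"), 2))  # 66.67
```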
- nucleic acids or polypeptide sequences refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, or at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (e.g., NCBI web site ncbi.nlm.nih.gov/BLAST/ or the like).
- sequences are then said to be “substantially identical.”
- This definition also refers to, or may be applied to, the complement of a test sequence.
- the definition also includes sequences that have deletions and/or additions, as well as those that have substitutions.
- the preferred algorithms can account for gaps and the like.
- identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
- pyrrolysyl-tRNA synthetase refers to an enzyme (including homologs, isoforms, and functional fragments thereof) with pyrrolysyl-tRNA synthetase activity.
- Pyrrolysyl-tRNA synthetase is an aminoacyl-tRNA synthetase that catalyzes the reaction necessary to attach the α-amino acid pyrrolysine to the cognate tRNA (tRNA Pyl ), thereby allowing incorporation of pyrrolysine during proteinogenesis at amber stop codons (i.e., UAG).
- the term includes any recombinant or naturally-occurring form of pyrrolysyl-tRNA synthetase or variants, homologs, or isoforms thereof that maintain pyrrolysyl-tRNA synthetase activity (e.g.
- the pyrrolysyl-tRNA synthetase comprises the amino acid sequence set forth as SEQ ID NO:9.
- the term “mutant pyrrolysyl-tRNA synthetase” or “mutant PylRS” refers to any pyrrolysyl-tRNA synthetase that has a different amino acid sequence from wild-type amino acid sequence.
- “tRNA Pyl ” and “tRNA Pyl CUA ” (i.e., tRNA(superscript Pyl)(subscript CUA)) are used interchangeably and both refer to a single-stranded RNA molecule containing about 70 to 90 nucleotides which folds via intrastrand base pairing to form a characteristic cloverleaf structure that carries a specific amino acid (e.g., compound of Formula (1) or embodiments thereof) and matches it to its corresponding codon (i.e., a codon complementary to the anticodon of the tRNA) on an mRNA during protein synthesis.
- the anticodon is CUA.
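The pairing between the amber codon UAG and the CUA anticodon follows from antiparallel Watson-Crick base pairing: the anticodon, written 5' to 3', is the reverse complement of the codon. A minimal sketch is shown below; the function name `anticodon_for` is hypothetical, and wobble pairing and modified bases are deliberately ignored.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') that pairs with an mRNA codon (5'->3').

    Uses strict Watson-Crick pairing only.
    """
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError("codon must be three RNA letters from A, C, G, U")
    # Reverse the codon, then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


print(anticodon_for("UAG"))  # CUA
```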
- substrate-binding site refers to residues located in the enzyme active site that form temporary bonds or interactions with the substrate.
- the substrate-binding site of pyrrolysyl-tRNA synthetase refers to residues located in the active site of pyrrolysyl-tRNA synthetase that form temporary bonds or interactions with the amino acid substrate.
- vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
- plasmid refers to a linear or circular double stranded DNA loop into which additional DNA segments can be ligated.
- vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
- plasmid and “vector” can be used interchangeably as the plasmid is the most commonly used form of vector.
- the disclosure is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
- Some viral vectors are capable of targeting a particular cell type either specifically or non-specifically.
- Exemplary vectors that can be used include, but are not limited to, pEvol vector, pMP vector, pET vector, pTak vector, pBad vector.
- complex refers to a composition that includes two or more components, where the components bind together to make a functional unit.
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and an amino acid substrate (e.g., the compound of Formula (1) or embodiments thereof; the compound of Formula (5) or embodiments thereof).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and a tRNA (e.g., tRNA Pyl ).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., PFY, PFK) and a tRNA (e.g., tRNA Pyl ).
- a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., the compound of Formula (1) or embodiments thereof), a polypeptide containing the compound of Formula (1) or embodiments thereof, and a tRNA (e.g., tRNA Pyl ).
- a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., the compound of Formula (5) or embodiments thereof), a polypeptide containing the compound of Formula (5) or embodiments thereof, and a tRNA (e.g., tRNA Pyl ).
- protein/protein complex refers to a composition that includes one protein-binding protein (e.g., comprising an unnatural amino acid as described herein) and one protein, where the protein-binding protein and protein are proximal to each other but not bound together; the protein-binding protein and protein are covalently bound together; or the protein-binding protein and protein are ionically bound together.
- the protein-binding protein and protein are proximal to each other but not bound together.
- the protein-binding protein and protein are covalently bonded together.
- the protein-binding protein and protein are ionically bonded together.
- the protein-binding protein and protein are covalently and ionically bonded together.
- the chemical reaction forming the protein/protein complex is a SuFEx reaction.
- the terms “transfection”, “transduction”, “transfecting” or “transducing” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule or a protein to a cell. Nucleic acids are introduced to a cell using non-viral or viral-based methods. The nucleic acid molecules may be gene sequences encoding complete proteins or functional portions thereof. Non-viral methods of transfection include any appropriate transfection method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell.
- Exemplary non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation, transfection through heat shock, magnetofection and electroporation.
- the nucleic acid molecules are introduced into a cell using electroporation following standard procedures well known in the art.
- any useful viral vector may be used in the methods described herein. Examples of viral vectors include, but are not limited to, retroviral, adenoviral, lentiviral and adeno-associated viral vectors.
- the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures well known in the art.
- “transfection” or “transduction” also refer to introducing proteins into a cell from the external environment. Typically, transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest.
- isolated when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It can be, for example, in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography.
- a protein that is the predominant species present in a preparation is substantially purified.
- therapeutic agent refers to any agent useful in treating and/or preventing a disease.
- Therapeutic agent includes, without limitation, small molecule drugs, proteins, nucleic acids (e.g., DNA, RNA), and the like.
- Small-molecule drugs refers to chemical compounds with low molecular weight that are capable of treating and/or preventing diseases.
- the proteins described herein are bonded to a therapeutic agent. Methods for covalently bonding therapeutic agents to proteins are well-known in the art.
- intermolecular linker refers to a linking group between two biomolecules.
- the peptidyl moiety of R 4 is a first protein and the peptidyl moiety of R 5 is a second (different) protein, such that the first protein and the second protein are covalently bonded.
- the first protein and the second protein can have the same sequence, e.g., providing an intermolecular linker between two different proteins having the same amino acid sequence.
- the first protein and the second protein are different proteins, e.g., providing an intermolecular linker between two different proteins, such as a nanobody and a receptor protein.
- intramolecular linker refers to a linking group within a biomolecule.
- when the compounds of Formula (4) or (8) (or embodiments thereof) are an intramolecular linker, the peptidyl moiety of R 4 and the peptidyl moiety of R 5 are in the same protein.
- a compound having an intramolecular linker may also be referred to as an intramolecularly conjugated protein.
- where substituent groups are specified by their conventional chemical formulae, written from left to right, they equally encompass the chemically identical substituents that would result from writing the structure from right to left, e.g., -CH2O- is equivalent to -OCH2-.
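The left-to-right convention above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a divalent group is represented as a tuple of fragment strings; the function name `same_linker` is hypothetical and does not come from the disclosure.

```python
# Illustrative sketch only: a divalent group is modelled as a tuple of
# fragments read left to right, e.g. -CH2O- becomes ("CH2", "O").
def same_linker(a: tuple, b: tuple) -> bool:
    """Return True if b is a written forwards or backwards."""
    return a == b or a == b[::-1]

# -CH2O- and -OCH2- are treated as the same substituent.
print(same_linker(("CH2", "O"), ("O", "CH2")))
# -CH2O- and -CH2S- are different substituents.
print(same_linker(("CH2", "O"), ("CH2", "S")))
```

A tuple comparison is enough here because only the reading direction changes, not the fragments themselves.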
- alkyl by itself or as part of another substituent, means, unless otherwise stated, a straight (i.e., unbranched) or branched carbon chain (or carbon), or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di- and multivalent radicals.
- the alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons).
- Alkyl is an uncyclized chain.
- saturated hydrocarbon radicals include, but are not limited to, groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl, n-octyl, and the like.
- An unsaturated alkyl group is one having one or more double bonds or triple bonds.
- Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers.
- An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (-O-).
- An alkyl moiety may be an alkenyl moiety.
- An alkyl moiety may be an alkynyl moiety.
- An alkyl moiety may be fully saturated.
- An alkenyl may include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- An alkynyl may include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- alkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl, as exemplified by, e.g., -CH2CH2CH2CH2-.
- an alkyl (or alkylene) group will have from 1 to 24 carbon atoms, with those groups having 10 or fewer carbon atoms being preferred herein.
- a “lower alkyl” or “lower alkylene” is a shorter chain alkyl or alkylene group, generally having eight or fewer carbon atoms.
- alkenylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkene.
- heteroalkyl by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized.
- heteroatom(s) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule.
- a heteroalkyl moiety may include one heteroatom.
- a heteroalkyl moiety may include two optionally different heteroatoms.
- a heteroalkyl moiety may include three optionally different heteroatoms.
- a heteroalkyl moiety may include four optionally different heteroatoms.
- a heteroalkyl moiety may include five optionally different heteroatoms.
- a heteroalkyl moiety may include up to 8 optionally different heteroatoms.
- the term “heteroalkenyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one double bond.
- a heteroalkenyl may optionally include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- a heteroalkynyl may optionally include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- heteroalkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl, as exemplified, but not limited by, -CH2-CH2-S-CH2-CH2- and -CH2-S-CH2-CH2-NH-CH2-.
- heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like).
- no orientation of the linking group is implied by the direction in which the formula of the linking group is written.
- heteroalkyl groups include those groups that are attached to the remainder of the molecule through a heteroatom, such as -C(O)R', -C(O)NR', -NR'R'', -OR', -SR', and/or -SO2R'.
- where “heteroalkyl” is recited, followed by recitations of specific heteroalkyl groups, such as -NR'R'' or the like, it will be understood that the terms heteroalkyl and -NR'R'' are not redundant or mutually exclusive.
- heteroalkyl should not be interpreted herein as excluding specific heteroalkyl groups, such as -NR'R'' or the like.
- cycloalkyl and heterocycloalkyl mean, unless otherwise stated, cyclic versions of “alkyl” and “heteroalkyl,” respectively. Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule.
- cycloalkyl examples include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, cycloheptyl, and the like.
- heterocycloalkyl examples include, but are not limited to, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, 2-piperazinyl, and the like.
- the term “cycloalkyl” means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system.
- monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated, but not aromatic.
- cycloalkyl groups are fully saturated.
- monocyclic cycloalkyls include cyclopropyl, cyclobutyl, cyclopentyl, cyclopentenyl, cyclohexyl, cyclohexenyl, cycloheptyl, and cyclooctyl.
- Bicyclic cycloalkyl ring systems are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- bicyclic ring systems include, but are not limited to, bicyclo[3.1.1]heptane, bicyclo[2.2.1]heptane, bicyclo[2.2.2]octane, bicyclo[3.2.2]nonane, bicyclo[3.3.1]nonane, and bicyclo[4.2.1]nonane.
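As a consistency check on the names listed above: in von Baeyer nomenclature the bracketed descriptor gives the number of carbon atoms in each of the three bridges, and the two bridgehead carbons are counted separately, so the ring system contains (sum of the bridge sizes + 2) carbons. The sketch below is illustrative only; the helper name `bicyclo_carbon_count` is hypothetical and not part of the disclosure.

```python
import re

# Hypothetical helper (not from the disclosure): count the ring carbons of a
# bicyclo[x.y.z]alkane from its von Baeyer bridge descriptor.
def bicyclo_carbon_count(name: str) -> int:
    match = re.search(r"bicyclo\[(\d+)\.(\d+)\.(\d+)\]", name)
    if match is None:
        raise ValueError(f"not a bicyclo[x.y.z] name: {name!r}")
    bridges = [int(group) for group in match.groups()]
    # Carbons in the three bridges plus the two bridgehead carbons.
    return sum(bridges) + 2

# Stem of each listed name expressed as a carbon count (heptane = 7, etc.).
LISTED = {
    "bicyclo[3.1.1]heptane": 7,
    "bicyclo[2.2.1]heptane": 7,
    "bicyclo[2.2.2]octane": 8,
    "bicyclo[3.2.2]nonane": 9,
    "bicyclo[3.3.1]nonane": 9,
    "bicyclo[4.2.1]nonane": 9,
}

for name, expected in LISTED.items():
    assert bicyclo_carbon_count(name) == expected
```

Each listed name passes the check, i.e. the descriptor and the alkane stem agree.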
- fused bicyclic cycloalkyl ring systems contain a monocyclic cycloalkyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkyl ring.
- cycloalkyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the fused bicyclic cycloalkyl is a 5 or 6 membered monocyclic cycloalkyl ring fused to either a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the fused bicyclic cycloalkyl is optionally substituted by one or two groups which are independently oxo or thia.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- a cycloalkyl is a cycloalkenyl.
- the term “cycloalkenyl” is used in accordance with its plain ordinary meaning.
- a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system.
- monocyclic cycloalkenyl ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups are unsaturated (i.e., containing at least one annular carbon-carbon double bond), but not aromatic.
- monocyclic cycloalkenyl ring systems include cyclopentenyl and cyclohexenyl.
- bicyclic cycloalkenyl rings are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkenyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- bicyclic cycloalkenyls include, but are not limited to, norbornenyl and bicyclo[2.2.2]oct-2-enyl.
- fused bicyclic cycloalkenyl ring systems contain a monocyclic cycloalkenyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkenyl ring.
- cycloalkenyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- a heterocycloalkyl is a heterocyclyl.
- heterocyclyl as used herein, means a monocyclic, bicyclic, or multicyclic heterocycle.
- the heterocyclyl monocyclic heterocycle is a 3, 4, 5, 6 or 7 membered ring containing at least one heteroatom independently selected from the group consisting of O, N, and S where the ring is saturated or unsaturated, but not aromatic.
- the 3 or 4 membered ring contains 1 heteroatom selected from the group consisting of O, N and S.
- the 5 membered ring can contain zero or one double bond and one, two or three heteroatoms selected from the group consisting of O, N and S.
- the 6 or 7 membered ring contains zero, one or two double bonds and one, two or three heteroatoms selected from the group consisting of O, N and S.
- the heterocyclyl monocyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the heterocyclyl monocyclic heterocycle.
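The ring-size rules stated above for a monocyclic heterocycle can be summarized as a small check. This is a minimal sketch under stated assumptions: the `Ring` record and the function `is_monocyclic_heterocyclyl` are hypothetical names introduced here, aromaticity is supplied by the caller as a flag rather than computed, and no limit is placed on double bonds in 3- or 4-membered rings because the text states none.

```python
from dataclasses import dataclass

ALLOWED_HETEROATOMS = {"O", "N", "S"}

@dataclass(frozen=True)
class Ring:
    size: int                 # number of ring members
    heteroatoms: tuple        # element symbols of the ring heteroatoms
    double_bonds: int = 0     # number of double bonds in the ring
    aromatic: bool = False    # supplied by the caller, not computed here

def is_monocyclic_heterocyclyl(ring: Ring) -> bool:
    """Apply the size, heteroatom and double-bond limits given in the text."""
    if ring.aromatic:
        return False
    if any(atom not in ALLOWED_HETEROATOMS for atom in ring.heteroatoms):
        return False
    n_het = len(ring.heteroatoms)
    if ring.size in (3, 4):
        return n_het == 1
    if ring.size == 5:
        return 1 <= n_het <= 3 and ring.double_bonds in (0, 1)
    if ring.size in (6, 7):
        return 1 <= n_het <= 3 and 0 <= ring.double_bonds <= 2
    return False

# Morpholine: 6-membered, one O and one N, saturated.
print(is_monocyclic_heterocyclyl(Ring(6, ("O", "N"))))
# Pyridine: aromatic, so it falls outside this definition.
print(is_monocyclic_heterocyclyl(Ring(6, ("N",), 3, aromatic=True)))
```

Representing the heteroatoms as a tuple of symbols keeps both the count and the identity of each heteroatom available to the check.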
- heterocyclyl monocyclic heterocycles include, but are not limited to, azetidinyl, azepanyl, aziridinyl, diazepanyl, 1,3-dioxanyl, 1,3-dioxolanyl, 1,3-dithiolanyl, 1,3-dithianyl, imidazolinyl, imidazolidinyl, isothiazolinyl, isothiazolidinyl, isoxazolinyl, isoxazolidinyl, morpholinyl, oxadiazolinyl, oxadiazolidinyl, oxazolinyl, oxazolidinyl, piperazinyl, piperidinyl, pyranyl, pyrazolinyl, pyrazolidinyl, pyrrolinyl, pyrrolidinyl, tetrahydrofuranyl, tetrahydrothienyl
- the heterocyclyl bicyclic heterocycle is a monocyclic heterocycle fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocycle, or a monocyclic heteroaryl.
- the heterocyclyl bicyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the monocyclic heterocycle portion of the bicyclic ring system.
- bicyclic heterocyclyls include, but are not limited to, 2,3-dihydrobenzofuran-2-yl, 2,3-dihydrobenzofuran-3-yl, indolin-1-yl, indolin-2-yl, indolin-3-yl, 2,3-dihydrobenzothien-2-yl, decahydroquinolinyl, decahydroisoquinolinyl, octahydro-1H-indolyl, and octahydrobenzofuranyl.
- heterocyclyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the bicyclic heterocyclyl is a 5 or 6 membered monocyclic heterocyclyl ring fused to a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the bicyclic heterocyclyl is optionally substituted by one or two groups which are independently oxo or thia.
- Multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic heterocyclyl is attached to the parent molecular moiety through any carbon atom or nitrogen atom contained within the base ring.
- multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- multicyclic heterocyclyl groups include, but are not limited to, 10H-phenothiazin-10-yl, 9,10-dihydroacridin-9-yl, 9,10-dihydroacridin-10-yl, 10H-phenoxazin-10-yl, 10,11-dihydro-5H-dibenzo[b,f]azepin-5-yl, 1,2,3,4-tetrahydropyrido[4,3-g]isoquinolin-2-yl, 12H-benzo[b]phenoxazin-12-yl, and dodecahydro-1H-carbazol-9-yl.
- halo or “halogen,” by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl.
- halo(C1-C4)alkyl includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
- acyl means, unless otherwise stated, -C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- aryl means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently.
- a fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring.
- heteroaryl refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized.
- heteroaryl includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring).
- a 5,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 5 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- a 6,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- a 6,5-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 5 members, and wherein at least one ring is a heteroaryl ring.
- a heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom.
- Non-limiting examples of aryl and heteroaryl groups include phenyl, naphthyl, pyrrolyl, pyrazolyl, pyridazinyl, triazinyl, pyrimidinyl, imidazolyl, pyrazinyl, purinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzothiazolyl, benzoxazolyl, benzimidazolyl, benzofuran, isobenzofuranyl, indolyl, isoindolyl, benzothiophenyl, isoquinolyl, quinoxalinyl, quinolyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imid
- Substituents for each of the above noted aryl and heteroaryl ring systems are selected from the group of acceptable substituents described below.
- a heteroaryl group substituent may be -O- bonded to a ring heteroatom nitrogen.
- a fused ring heterocycloalkyl-aryl is an aryl fused to a heterocycloalkyl.
- a fused ring heterocycloalkyl-heteroaryl is a heteroaryl fused to a heterocycloalkyl.
- a fused ring heterocycloalkyl-cycloalkyl is a heterocycloalkyl fused to a cycloalkyl.
- a fused ring heterocycloalkyl-heterocycloalkyl is a heterocycloalkyl fused to another heterocycloalkyl.
- Fused ring heterocycloalkyl-aryl, fused ring heterocycloalkyl-heteroaryl, fused ring heterocycloalkyl-cycloalkyl, or fused ring heterocycloalkyl-heterocycloalkyl may each independently be unsubstituted or substituted with one or more of the substituents described herein.
- Spirocyclic rings are two or more rings wherein adjacent rings are attached through a single atom.
- the individual rings within spirocyclic rings may be identical or different.
- Individual rings in spirocyclic rings may be substituted or unsubstituted and may have different substituents from other individual rings within a set of spirocyclic rings.
- Possible substituents for individual rings within spirocyclic rings are the possible substituents for the same ring when not part of spirocyclic rings (e.g. substituents for cycloalkyl or heterocycloalkyl rings).
- Spirocyclic rings may be substituted or unsubstituted cycloalkyl, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkyl or substituted or unsubstituted heterocycloalkylene and individual rings within a spirocyclic ring group may be any of the immediately previous list, including having all rings of one type (e.g. all rings being substituted heterocycloalkylene wherein each ring may be the same or different substituted heterocycloalkylene).
- heterocyclic spirocyclic rings means spirocyclic rings wherein at least one ring is a heterocyclic ring and wherein each ring may be a different ring.
- substituted spirocyclic rings means that at least one ring is substituted and each substituent may optionally be different.
- alkylsulfonyl means a moiety having the formula -S(O2)-R', where R' is a substituted or unsubstituted alkyl group as defined above. R' may have a specified number of carbons (e.g., “C1-C4 alkylsulfonyl”).
- alkylarylene refers to an arylene moiety covalently bonded to an alkylene moiety (also referred to herein as an alkylene linker).
- An alkylarylene moiety may be substituted (e.g. with a substituent group) on the alkylene moiety or the arylene linker (e.g.
- alkylarylene is unsubstituted.
- R, R', R'', R''', and R'''' each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl (e.g., aryl substituted with 1-3 halogens), substituted or unsubstituted heteroaryl, substituted or unsubstituted alkyl, alkoxy, or thioalkoxy groups, or arylalkyl groups.
- each of the R groups is independently selected as are each R', R'', R''', and R'''' group when more than one of these groups is present.
- when R' and R'' are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 4-, 5-, 6-, or 7-membered ring.
- -NR'R'' includes, but is not limited to, 1-pyrrolidinyl and 4-morpholinyl.
- alkyl is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., -CF3 and -CH2CF3) and acyl (e.g., -C(O)CH3, -C(O)CF3, -C(O)CH2OCH3, and the like).
- each of the R groups is independently selected as are each R', R'', R''', and R'''' groups when more than one of these groups is present.
- Substituents for rings (e.g., cycloalkyl, heterocycloalkyl, aryl, heteroaryl, cycloalkylene, heterocycloalkylene, arylene, or heteroarylene) may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent).
- the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings).
- the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different.
- when a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent), the attachment point may be any atom of the ring and, in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency.
- where a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms.
- where the ring heteroatoms are shown bound to one or more hydrogens (e.g., a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
- Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups.
- Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure.
- the ring-forming substituents are attached to adjacent members of the base structure.
- two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure.
- the ring-forming substituents are attached to a single member of the base structure.
- two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure.
- the ring-forming substituents are attached to non-adjacent members of the base structure.
- Two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally form a ring of the formula -T-C(O)-(CRR')q-U-, wherein T and U are independently -NR-, -O-, -CRR'-, or a single bond, and q is an integer of from 0 to 3.
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -A-(CH2)r-B-, wherein A and B are independently -CRR'-, -O-, -NR-, -S-, -S(O)-, -S(O)2-, -S(O)2NR'-, or a single bond, and r is an integer of from 1 to 4.
- One of the single bonds of the new ring so formed may optionally be replaced with a double bond.
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -(CRR')s-X'-(C''R''R''')d-, where s and d are independently integers of from 0 to 3, and X' is -O-, -NR'-, -S-, -S(O)-, -S(O)2-, or -S(O)2NR'-.
- R, R', R'', and R''' are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl.
- heteroatom or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
- a “substituent group,” as used herein, means a group selected from the following moieties: (A) oxo, halogen, -CCl3, -CBr3, -CF3, -CI3, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -NHC(O)NH2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl3, -OCF3, -OCBr3, -OCI3, -OCHCl2, -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl,
- each substituted group described in the compounds herein is substituted with at least one substituent group. More specifically, in embodiments, each substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene described in the compounds herein are substituted with at least one substituent group. In embodiments, at least one or all of these groups are substituted with at least one size-limited substituent group. In embodiments, at least one or all of these groups are substituted with at least one lower substituent group.
- each substituted or unsubstituted alkyl may be a substituted or unsubstituted C1-C20 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C 1 -C 20 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C 3 -C 8 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
- each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C8 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C 3 -C 7 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C 6 -C 10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 9 membered heteroarylene.
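The two sets of size ranges listed above can be compared side by side with a small lookup table. This is only an illustrative sketch: the names `BROADER_RANGES`, `NARROWER_RANGES`, and `within_range` are hypothetical and do not appear in the disclosure, and the sketch assumes that the "-ene" (divalent) forms share the ranges of the corresponding monovalent groups, as the lists above indicate.

```python
# Size ranges as (minimum, maximum), inclusive.
# For alkyl, cycloalkyl, and aryl the number is a carbon count (e.g., C1-C20);
# for the hetero groups it is a member count (e.g., 2 to 20 membered).
BROADER_RANGES = {
    "alkyl": (1, 20),
    "heteroalkyl": (2, 20),
    "cycloalkyl": (3, 8),
    "heterocycloalkyl": (3, 8),
    "aryl": (6, 10),
    "heteroaryl": (5, 10),
}

NARROWER_RANGES = {
    "alkyl": (1, 8),
    "heteroalkyl": (2, 8),
    "cycloalkyl": (3, 7),
    "heterocycloalkyl": (3, 7),
    "aryl": (6, 10),
    "heteroaryl": (5, 9),
}


def within_range(group, size, ranges):
    """Return True if a group of the given size falls inside the listed range."""
    low, high = ranges[group]
    return low <= size <= high


# A C12 alkyl fits the first set of ranges but not the second.
print(within_range("alkyl", 12, BROADER_RANGES))   # True
print(within_range("alkyl", 12, NARROWER_RANGES))  # False
```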
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alky
- a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
- a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one size-limited substituent group, wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different.
- each size-limited substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- each lower substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group is different.
- Certain compounds of the present disclosure possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure.
- the compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate.
- the present disclosure is meant to include compounds in racemic and optically pure forms.
- Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques.
- the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers.
- the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms.
- the term “tautomer,” as used herein, refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another. It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure. Unless otherwise stated, structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center.
- the compounds described herein may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds.
- the compounds may be radiolabeled with radioactive isotopes, such as for example tritium (³H), iodine-125 (¹²⁵I), or carbon-14 (¹⁴C). All isotopic variations of the compounds described herein, whether radioactive or not, are encompassed within the scope of the present disclosure.
- “Analog” or “analogue” is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound. Accordingly, an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound. [0106] The terms “a” or “an,” as used herein, mean one or more.
- substituted with a[n] means the specified group may be substituted with one or more of any or all of the named substituents.
- where a group, such as an alkyl or heteroaryl group, is “substituted with an unsubstituted C1-C20 alkyl, or unsubstituted 2 to 20 membered heteroalkyl,” the group may contain one or more unsubstituted C1-C20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls.
- where a moiety is substituted with an R substituent, the group may be referred to as “R-substituted.” Where a moiety is R-substituted, the moiety is substituted with at least one R substituent and each R substituent is optionally different. Where a particular R group is present in the description of a chemical genus (such as Formula (1)), a Roman alphabetic symbol may be used to distinguish each appearance of that particular R group. For example, where multiple R 3 substituents are present, each R 3 substituent may be distinguished as R 3A , R 3B , wherein each of R 3A , R 3B , is defined within the scope of the definition of R 3 and optionally differently.
- variable e.g., moiety or linker
- a compound or of a compound genus e.g., a genus described herein
- the unfilled valence(s) of the variable will be dictated by the context in which the variable is used.
- when a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or –CH3).
- variable is the divalent form of a standalone compound (e.g., if the variable is assigned to “PEG” or “polyethylene glycol” in an embodiment but the variable is connected by two separate bonds to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is a divalent (i.e., capable of forming two bonds through two unfilled valences) form of PEG instead of the standalone compound PEG).
- “bond” refers to direct bonds, such as covalent bonds (e.g., direct or a linking group), or indirect bonds, such as non-covalent bonds (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like).
- the term “electron-donating group” refers to a chemical moiety or substituent that can donate electron density into a conjugated pi-electron system, thereby making the pi electron system more nucleophilic.
- the terms “bind” and “bound” as used herein are used in accordance with their plain and ordinary meaning and refer to the association between atoms or molecules. The association can be direct or indirect. For example, bound atoms or molecules may be bound, e.g., by covalent bond, linker (e.g. a first linker or second linker), or non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g.
- the term “capable of binding” as used herein refers to a moiety (e.g., a single-domain antibody or a recombinant protein as described herein, i.e., comprising an unnatural amino acid side chain that is capable of binding to an amino acid residue on a different protein) that is able to measurably bind to a target.
- where a moiety is capable of binding a target, the moiety is capable of binding with a Kd of less than about 10 µM, 5 µM, 1 µM, 500 nM, 250 nM, 100 nM, 75 nM, 50 nM, 25 nM, 15 nM, 10 nM, 5 nM, 1 nM, or about 0.1 nM.
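Because the listed Kd thresholds mix micromolar and nanomolar units, comparing a measured value against one of them requires putting both on the same scale. The following is a minimal sketch of that comparison; the names `UNIT_TO_MOLAR`, `kd_in_molar`, and `binds_tighter_than` are hypothetical helpers chosen for illustration and are not part of the disclosure.

```python
# Conversion factors from the units used in the text to molar (mol/L).
UNIT_TO_MOLAR = {"uM": 1e-6, "nM": 1e-9}


def kd_in_molar(value, unit):
    """Convert a dissociation constant to molar."""
    return value * UNIT_TO_MOLAR[unit]


def binds_tighter_than(kd_value, kd_unit, threshold_value, threshold_unit):
    """Return True if the measured Kd is below the threshold (a lower Kd means tighter binding)."""
    return kd_in_molar(kd_value, kd_unit) < kd_in_molar(threshold_value, threshold_unit)


# A hypothetical measured Kd of 250 nM is below a 1 uM threshold
# but not below a 100 nM threshold.
print(binds_tighter_than(250, "nM", 1, "uM"))    # True
print(binds_tighter_than(250, "nM", 100, "nM"))  # False
```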
- Arginine refers to the amino acid having the following structure: arginine” and a
- the term “naturally-occurring arginine” refers to an arginine that naturally occurs at a given position in a protein.
- non-naturally occurring arginine refers to an arginine that is a point mutation of a different amino acid (e.g., Ala, Ile, Leu, Met, Val, Phe, Trp, Tyr, Asn, Cys, Gln, Ser, Thr, Asp, Glu, His, Lys, Gly, Pro) that naturally occurs in a protein.
- proximal is based on the three-dimensional structure of the protein. In embodiments, “proximal” means up to about 25 angstroms. In embodiments, “proximal” means up to about 20 angstroms.
- proximal means up to about 15 angstroms. In embodiments, “proximal” means up to about 10 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 25 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 20 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 15 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 12 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 10 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 8 angstroms.
- proximal means from about 1 angstrom to about 6 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 5 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 4 angstroms. [0118] In embodiments, the term “proximal” means that the naturally or non-naturally occurring arginine is within 1 to about 10 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 9 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 8 amino acid residues of the unnatural amino acid.
- the naturally or non-naturally occurring arginine is within 1 to about 7 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 6 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 5 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 4 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 1 to about 3 amino acid residues of the unnatural amino acid.
- the naturally or non-naturally occurring arginine is within 1 to about 2 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 2 to about 6 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 2 to about 5 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 2 to about 4 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 2 to about 3 amino acid residues of the unnatural amino acid.
- the naturally or non-naturally occurring arginine is within 1 amino acid residue of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 2 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 3 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 4 amino acid residues of the unnatural amino acid. In embodiments, the naturally or non-naturally occurring arginine is within 5 amino acid residues of the unnatural amino acid.
- the phrase “within 1” means that the naturally or non-naturally occurring arginine is next to the unnatural amino acid.
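The two senses of “proximal” described above (a distance in the three-dimensional structure, or a separation counted in amino acid residues) can be illustrated with a short sketch. The function names, the coordinate tuples, and the choice of which atoms the distance is measured between are assumptions made for illustration only; the default cutoffs of 10 angstroms and 5 residues are simply two of the listed embodiments.

```python
import math


def is_proximal_3d(uaa_xyz, arg_xyz, max_angstroms=10.0):
    """Return True if two points (coordinates in angstroms) lie within max_angstroms."""
    return math.dist(uaa_xyz, arg_xyz) <= max_angstroms


def is_proximal_in_sequence(uaa_index, arg_index, max_residues=5):
    """Return True if the arginine is within max_residues positions of the
    unnatural amino acid. A separation of 1 means the residues are adjacent."""
    separation = abs(uaa_index - arg_index)
    return 1 <= separation <= max_residues


# Hypothetical coordinates: the two points are 5 angstroms apart.
unnatural_aa = (0.0, 0.0, 0.0)
arginine = (3.0, 4.0, 0.0)
print(is_proximal_3d(unnatural_aa, arginine))            # True

# Hypothetical sequence positions.
print(is_proximal_in_sequence(42, 43, max_residues=1))   # True ("within 1")
print(is_proximal_in_sequence(42, 50))                   # False
```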
- the compound of Formula (1) is a compound of Formula (1A) or a stereoisomer thereof: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C 1-5 alkyl. The substituents are described in more detail below.
- the compound of Formula (1) is a compound of Formula (1B) or a stereoisomer thereof: , wherein x is an alkylene, or substituted or group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl. The substituents are described in more detail below.
- the compound of Formula (1) is a compound of Formula (1C) or a stereoisomer thereof: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- the compound of Formula (1) is a compound of Formula (1D) or a stereoisomer thereof: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- the substituents are described in more detail below.
- the compound of Formula (1) is a compound of Formula (1E) or a stereoisomer thereof: , wherein x is an alkylene, or substituted or a or substituted or unsubstituted heteroalkylene. The substituents are described in more detail below.
- the compound of Formula (1) is PFY: . [0127] In H 2 N .
- x is an substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 is substituted or unsubstituted C 1-5 alkyl.
- x1 is 0. In embodiments, x1 is 1. In embodiments, x1 is 2. In embodiments, x1 is 3. In embodiments, x1 is 4. In embodiments, x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene. In embodiments, R 2 is unsubstituted C1-5 alkyl. The substituents are described in more detail below.
- the compound of Formula (5) is a compound of Formula (5A) or a stereoisomer thereof: , wherein x is an substituted or unsubstituted or or R 2 is substituted or unsubstituted C 1-5 alkyl.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C1-5 alkyl. The substituents are described in more detail below.
- the compound of Formula (5) is a compound of Formula (5B) or a stereoisomer thereof: R 1 O O , wherein x is an substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene. The substituents are described in more detail below.
- the compound of Formula (5) is a compound of Formula (5C) or a stereoisomer thereof: C), wherein x is an int nd, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene. The substituents are described in more detail below.
- the compound of Formula (5) is a compound of Formula (5D) or a stereoisomer thereof: .
- L 4 is a bond.
- L 4 is -(CH2)1-5-. In embodiments, L 4 is –O-. In embodiments, L 4 is -O-(CH2)1-5-. When L 4 is -O-(CH2)1-5-, the oxygen atom is adjacent to the phenyl ring and the alkylene group is adjacent to the phosphorus atom.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C 1-5 alkyl. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid. [0138] In embodiments, the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C 1-5 alkyl. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid. [0140] In embodiments, the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an alkylene, or substituted or unsubstituted heteroalkylene. In embodiments, L 1 is a bond or substituted or unsubstituted heteroalkylene.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid. [0141] In embodiments, the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: . In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid. [0142] In embodiments, the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: . In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (6): , wherein x is an integer a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C 1-5 alkyl.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: A), wherein x is an integer f a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C 1-5 alkyl.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: , wherein x is an integer bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid. [0146] In embodiments, the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: C), wherein x is an integer a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0. In embodiments, x1 is 1. In embodiments, x1 is 2. In embodiments, x1 is 3. In embodiments, x1 is 4. In embodiments, x1 is 5. In embodiments, L 1 is a bond or substituted or unsubstituted heteroalkylene. The substituents are described in more detail below.
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: .
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein comprises an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of: .
- the protein further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, the protein further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- proteins of Formula (3) or a stereoisomer thereof wherein W is –H, moiety, or an amino acid - 1-5-, - 1-5-, ; integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 4 is a bond.
- L 4 is -(CH2)1-5-.
- L 4 is –O-.
- L 4 is -O-(CH2)1-5-.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3A) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl. In embodiments, L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3B) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3C) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- W is not –H when Y is –OH. The substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3D) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- W is not –H when Y is –OH.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3E) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; and L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- W is not –H when Y is –OH. The substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3F) or a stereoisomer thereof: , W is –H, a peptidyl moiety, or an amino acid moiety. In embodiments, W is not –H when Y is –OH. The substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (3) is a protein of Formula (3G) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety. The substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- proteins of Formula (7) or a stereoisomer thereof wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C 1-5 alkyl.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (7) is a protein of Formula (7A) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0. In embodiments, x1 is 1. In embodiments, x1 is 2. In embodiments, x1 is 3. In embodiments, x1 is 4. In embodiments, x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C1-5 alkyl.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (7) is a protein of Formula (7B) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (7) is a protein of Formula (7C) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- W is not –H when Y is –OH.
- the substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (7) is a protein of Formula (7D) or a stereoisomer thereof: W is –H, a peptidyl moiety, or an amino acid moiety; Y is —OH, a peptidyl moiety, or an amino acid moiety. In embodiments, W is not –H when Y is –OH.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein of Formula (7) is a protein of Formula (7E) or a stereoisomer thereof: , wherein W is –H, a peptidyl moiety, or an amino acid moiety. In embodiments, W is not –H when Y is –OH. The substituents are described in more detail below.
- W and/or Y comprises a non-naturally occurring arginine proximal to the unnatural amino acid and/or a naturally occurring arginine proximal to the unnatural amino acid.
- W or Y further comprises a non-naturally occurring arginine proximal to the unnatural amino acid. In embodiments, W and/or Y further comprises a naturally occurring arginine proximal to the unnatural amino acid.
- the protein is an antibody, an antibody variant, or a receptor protein. In embodiments, the protein is an antibody. In embodiments, the protein is an antibody variant. In embodiments, the protein is a receptor protein. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the receptor protein is any receptor protein described herein. [0164] In embodiments of the compounds described herein, the protein is a receptor protein.
- the receptor protein is a programmed death-ligand 1 (PD-L1) receptor, a programmed cell death protein 1 (PD-1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G
- the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor. [0166] In embodiments, the receptor protein is a PD-L1 receptor or a PD-1 receptor.
- the receptor protein is a PD-L1 receptor. In embodiments, the receptor protein is a PD-1 receptor. [0167] In embodiments, the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control. [0168] In embodiments, the receptor protein is a G protein-coupled receptor. In embodiments, the receptor protein is a receptor tyrosine kinase. In embodiments, the receptor protein is an ErbB receptor. In embodiments, the receptor protein is an epidermal growth factor receptor (EGFR). In embodiments, the receptor protein is epidermal growth factor receptor 1 (HER1).
- the receptor protein is epidermal growth factor receptor 2 (HER2). In embodiments, the receptor protein is epidermal growth factor receptor 3 (HER3). In embodiments, the receptor protein is epidermal growth factor receptor 4 (HER4).
- the proteins comprise an unnatural amino acid within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3.
- the protein is an antigen-binding fragment, an antibody, or an antibody variant.
- the protein is an antigen-binding fragment.
- the protein is a single-chain variable fragment. In embodiments, the protein is an antibody. In embodiments, the protein has one unnatural amino acid within CDR-L1.
- the protein has one unnatural amino acid within CDR-L2. In embodiments, the protein has one unnatural amino acid within CDR-L3. In embodiments, the protein has one unnatural amino acid within CDR-H1. In embodiments, the protein has one unnatural amino acid within CDR-H2. In embodiments, the protein has one unnatural amino acid within CDR-H3. In embodiments, the protein has two or more unnatural amino acids within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3. The two or more unnatural amino acids can be in the same or different CDR, and can be in the same or different chain (i.e., light or heavy).
- the proteins described herein further comprise a detectable agent.
- the detectable agent is a radioisotope.
- the proteins described herein are capable of forming protein conjugates.
- methods of binding a target protein on a cell or within a cell comprising contacting the cell with a protein described herein (e.g., peptidyl moiety), wherein the protein is capable of specifically binding to a target protein (e.g., peptidyl moiety) on the surface of the cell or within a cell, whereby the protein forms a covalent bond with the target protein, thereby forming a protein conjugate.
- the covalent bond is formed through a phosphorus fluoride exchange reaction.
- the covalent bond is formed through a proximity-enabled, phosphorus fluoride exchange (PFEx) reaction.
- the proximity-enabled PFEx reaction occurs at an acidic pH.
- the acidic pH is from 5 to 6.9.
- the acidic pH is from 5.5 to 6.5.
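As an illustrative aid only, the two acidic pH ranges recited above can be written as a simple range check. This is a minimal sketch: the function name `classify_ph` and its return labels are hypothetical, and treating both ranges as inclusive of their endpoints is an assumption; only the boundary values (5 to 6.9, and 5.5 to 6.5) come from the text.

```python
# Illustrative only: classify a reaction pH against the two acidic ranges
# recited for the proximity-enabled PFEx reaction.
# Assumption: both ranges are treated as inclusive of their endpoints.

BROAD_RANGE = (5.0, 6.9)    # "from 5 to 6.9"
NARROW_RANGE = (5.5, 6.5)   # "from 5.5 to 6.5"


def classify_ph(ph: float) -> str:
    """Return which recited acidic range (if any) contains the given pH."""
    if NARROW_RANGE[0] <= ph <= NARROW_RANGE[1]:
        return "narrow"   # the narrow range lies inside the broad range
    if BROAD_RANGE[0] <= ph <= BROAD_RANGE[1]:
        return "broad"
    return "outside"
```

For example, a pH of 6.0 falls in the narrower range, a pH of 5.2 falls only in the broader range, and a physiological pH of 7.4 falls outside both.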
- Contacting is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g., chemical compounds including proteins described herein and targets on a cell or within a cell) to become sufficiently proximal to react, interact or physically touch.
- Conjugates Provided herein are protein conjugates comprising a first peptidyl moiety linked to a second peptidyl moiety by a bioconjugate linker of the following formula: or wherein the first peptidyl moiety can be a protein as described herein and the second peptidyl moiety can be a target protein as described herein.
- L 4 is a bond, -(CH2)1-5-, -O-(CH2)1-5-, or –O-; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 4 is a bond.
- L 4 is -(CH 2 ) 1-5 -.
- L 4 is –O-.
- L 4 is -O-(CH2)1-5-.
- When L 4 is -O-(CH2)1-5-, the oxygen atom is adjacent to the phenyl ring and the alkylene group is adjacent to the phosphorus atom.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C 1-5 alkyl.
- L 2 is a bond and L 3 is N N , more detail below.
- the protein conjugate of Formula (4) is a protein conjugate of Formula (4A): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- the protein conjugate of Formula (4) is a protein conjugate of Formula (4B): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C 1-5 alkyl.
- the protein conjugate of Formula (4) is a protein conjugate of Formula (4C): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group. In embodiments, L 1 is a bond or substituted or unsubstituted heteroalkylene.
- L 2 is a bond and L 3 is N N , more [0180]
- the protein conjugate of Formula (4) is a protein conjugate of Formula (4D): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 and R 3 are each independently unsubstituted C1-5 alkyl.
- L 2 is a bond and L 3 is N N , more detail below.
- the protein conjugate of Formula (4) is a protein conjugate of Formula (4G): , wherein R 4 and R 5 are as defined herein.
- L 2 is a bond and L 3 is N , more detail below.
- x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen or an electron withdrawing group; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0. In embodiments, x1 is 1. In embodiments, x1 is 2. In embodiments, x1 is 3. In embodiments, x1 is 4. In embodiments, x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- R 2 is unsubstituted C 1-5 alkyl.
- L 2 is a bond and L 3 is N N , more detail below.
- the protein conjugate of Formula (8) is a protein conjugate of Formula (8A): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 2 is substituted or unsubstituted C1-5 alkyl.
- x1 is 0. In embodiments, x1 is 1. In embodiments, x1 is 2. In embodiments, x1 is 3. In embodiments, x1 is 4. In embodiments, x1 is 5. In embodiments, L 1 is a bond or substituted or unsubstituted heteroalkylene. In embodiments, R 2 is unsubstituted C1-5 alkyl. In embodiments, L 2 is a bond and L 3 is N , more detail below.
- the protein conjugate of Formula (8) is a protein conjugate of Formula (8B): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- L 2 is a bond and L 3 is N N , more detail below.
- the protein conjugate of Formula (8) is a protein conjugate of Formula (8C): , wherein R 4 and R 5 are as defined herein; x is an integer from 1 to 8; x1 is an integer from 0 to 5; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; and R 1 is hydrogen or an electron withdrawing group.
- x1 is 0.
- x1 is 1.
- x1 is 2.
- x1 is 3.
- x1 is 4.
- x1 is 5.
- L 1 is a bond or substituted or unsubstituted heteroalkylene.
- L 2 is a bond and L 3 is N , more detail below.
- the protein conjugate of Formula (8) is a protein conjugate of Formula (8D): , wherein R 4, R 5, and L 3 are as defined herein.
- the protein conjugate of Formula (8) is a protein conjugate of Formula (8E): , wherein R 4 and R 5 are as defined herein.
- L 2 is a bond and L 3 is N , more detail below.
- L 2 is a bond, -NR 2A -, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 2A )C(O)-, -C(O)N(R 2A )-, -NR 2A C(O)NR 2B -, -NR 2A C(NH)NR 2B -, -SO2N(R 2A )-, -N(R 2A )SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene,
- L 2 is a bond.
- L 3 is a bond, -N(R 3A )-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 3A )C(O)-, -C(O)N(R 3A )-, -NR 3A C(O)NR 3B -, -NR 3A C(NH)NR 3B -, -SO2N(R 3A )-, -N(R 3A )SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted
- L 3 is N N , In embodiments, L 3 , wherein –NH- is adjacent to the phosphorus. In embodiments, L 3 , wherein the N heteroatom is adjacent to the phosphorus. In embodiments, L 3 , wherein the O heteroatom is adjacent to the phosphorus. In embodiments, L 3 , wherein the –S- is adjacent to the phosphorus. [0192] In the compounds of Formulae (4), (4A)-(4G), and embodiments thereof, and Formulae (8), (8A)-(8H), and embodiments thereof, L 2 is a bond and L 3 is N , In embodiments, L 2 is a bond and L 3 , wherein –NH- is adjacent to the phosphorus.
- L 2 is a bond and L 3 , wherein the heteroatom is adjacent to the phosphorus.
- L 3 is , wherein the heteroatom is adjacent to the phosphorus.
- L 3 is , wherein the –S- is adjacent to the phosphorus.
- Substituents [0194] With reference to the compounds described herein, L 4 is a bond, -(CH 2 ) 1-5 -, -O-(CH 2 ) 1-5 -, or –O-. In embodiments, L 4 is -(CH 2 ) 1-5 -, -O-(CH 2 ) 1-5 -, or –O-.
- L 4 is -O-(CH 2 ) 1-5 - or –O-. In embodiments, L 4 is –O-. In embodiments, L 4 is a bond. In embodiments, L 4 is -(CH 2 ) 1-5 -. In embodiments, L 4 is -(CH 2 ) 1-4 -. In embodiments, L 4 is -(CH 2 ) 1-3 -. In embodiments, L 4 is -(CH 2 ) 1-2 -. In embodiments, L 4 is -CH 2 -. In embodiments, L 4 is -(CH 2 ) 2 -. In embodiments, L 4 is -(CH 2 ) 3 -. In embodiments, L 4 is -(CH 2 ) 4 -.
- L 4 is -(CH 2 ) 5 -. In embodiments, L 4 is -O-(CH 2 ) 1-5 -. In embodiments, L 4 is –O-(CH 2 ) 1-4 -. In embodiments, L 4 is –O-(CH 2 ) 1-3 -. In embodiments, L 4 is –O-(CH 2 ) 1-2 -. In embodiments, L 4 is –O-CH2-. In embodiments, L 4 is –O-(CH2)2-. In embodiments, L 4 is –O-(CH2)3-. In embodiments, L 4 is –O-(CH 2 ) 4 -. In embodiments, L 4 is –O-(CH 2 ) 5 -.
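The L 4 options recited above can be enumerated explicitly. The following is a minimal sketch for illustration only; the function name `enumerate_l4` and the plain-text labels are hypothetical, and the strings are notational shorthand rather than validated chemical representations.

```python
# Illustrative only: list the concrete L4 linkers covered by
# "a bond, -(CH2)1-5-, -O-(CH2)1-5-, or -O-".

def enumerate_l4(max_n: int = 5) -> list[str]:
    """Return a text label for each L4 option, with n running from 1 to max_n."""
    options = ["bond", "-O-"]
    # Alkylene linkers: -(CH2)n-
    for n in range(1, max_n + 1):
        options.append(f"-(CH2){n}-")
    # Oxyalkylene linkers: -O-(CH2)n-, with the oxygen adjacent to the phenyl ring
    for n in range(1, max_n + 1):
        options.append(f"-O-(CH2){n}-")
    return options


if __name__ == "__main__":
    for label in enumerate_l4():
        print(label)
```

With the default upper limit of 5, this yields twelve distinct options: the bond, the ether oxygen, five alkylene lengths, and five oxyalkylene lengths.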
- x is an integer from 0 to 8. In embodiments, x is an integer from 1 to 8. In embodiments, x is an integer from 1 to 7. In embodiments, x is an integer from 1 to 6. In embodiments, x is an integer from 1 to 5. In embodiments, x is an integer from 1 to 4. In embodiments, x is an integer from 1 to 3. In embodiments, x is an integer of 1 or 2. In embodiments, x is 1. In embodiments, x is 2. In embodiments, x is 3. In embodiments, x is 4. In embodiments, x is 5.
- R 2 and R 3 are each independently substituted or unsubstituted C 1-5 alkyl group. In embodiments, R 2 and R 3 are each independently unsubstituted C1-5 alkyl group. In embodiments, R 2 and R 3 are each independently unsubstituted C 1-4 alkyl group. In embodiments, R 2 and R 3 are each independently unsubstituted C 1-3 alkyl group. In embodiments, R 2 and R 3 are each independently unsubstituted C1-2 alkyl group. In embodiments, R 2 and R 3 are methyl.
- R 2 and R 3 are ethyl. In embodiments, R 2 and R 3 are propyl. In embodiments, R 2 and R 3 are isopropyl. In embodiments, R 2 and R 3 are butyl. In embodiments, R 2 and R 3 are isobutyl.
- R 2 and R 3 are substituted C 1-5 alkyl group
- the substituents are one or more of halogen, -CF 3 , -CBr 3 , -CCl 3 , -CI 3 , -CHF 2 , -CHBr2, -CHCl2, -CHI2, -CH2F, -CH2Br, -CH2Cl, -CH2I, -OCF3, -OCBr3, -OCCl3, -OCI3, -OCHF 2 , -OCHBr 2 , -OCHCl 2 , -OCHI 2 , -OCH 2 F, -OCH 2 Br, -OCH 2 Cl, -OCH 2 I, -CN, -OH, -NH 2 , -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -N(O)
- R 1 is hydrogen or an electron withdrawing group.
- R 1 is hydrogen.
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O)m1, -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO2R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B ,
- R 1 is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SOv1NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O)m1, -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , -NR 3 + , substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl.
- R 1 is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3, -OCH2X 1 , -OCHX 1 2, -CN, or -N(O)m1.
- R 1 is an electron-donating group or an electron-withdrawing group.
- R 1 is an electron-withdrawing group.
- R 1A and R 1B are hydrogen.
- R 1 is an electron-donating group.
- the electron-donating group is –Cl, -Br, -I, -CX2 3, -CHX2 2, -OCX1 3, -OCH2X1, -OCHX1 2, , -OCOR1A, -OC(O)R 1A , -OC(O)NR 1A R 1B , -SR 1A , -PR 1A R 1B -NHC(O)NR 1A R 1B , -NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, or substituted or unsub
- R 1 is unsubstituted 2 to 6 membered heteroalkyl. In embodiments, R 1 is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1 is –O(CH2)mCH3, and m is an integer from 0 to 6. In embodiments, R 1 is –O(CH2)mCH3, and m is an integer from 0 to 4. In embodiments, R 1 is –O(CH2)mCH3, and m is an integer from 0 to 3. In embodiments, R 1 is –O(CH2)mCH3, and m is an integer from 0 to 2. In embodiments, R 1 is –O(CH2)mCH3, and m is 0 or 1. In embodiments, R 1 is –OCH3.
- w is an integer from 1 to 5, and X 1 is halogen.
- w is 1.
- w is 2.
- w is 3.
- w is 4.
- w is 5.
- R 1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R 1A is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl.
- R 1A is hydrogen, substituted or unsubstituted C 1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen. In embodiments, R 1A is unsubstituted C1-4 alkyl. In embodiments, R 1A is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen and R 1B is hydrogen.
- R 1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl.
- R 1B is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl.
- R 1B is hydrogen, substituted or unsubstituted C 1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl.
- R 1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- R 1B is hydrogen.
- R 1B is unsubstituted C1-4 alkyl. In embodiments, R 1B is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen and R 1B is hydrogen. [0208] With reference to the compounds described herein, X 1 is independently –F, -Cl, -Br, or –I. In embodiments, X 1 is independently –F, -Cl, or -Br. In embodiments, X 1 is independently –F or -Cl. In embodiments, X 1 is –F. In embodiments, X 1 is -Cl. In embodiments, X 1 is -Br. In embodiments, X 1 is –I.
- n1 is an integer from 0 to 4. In embodiments n1 is an integer from 0 to 3. In embodiments n1 is an integer from 0 to 2. In embodiments n1 is 0. In embodiments n1 is 1. In embodiments n1 is 2. In embodiments n1 is 3. In embodiments n1 is 4. [0210] With reference to the compounds described herein, m1 is 1 or 2. In embodiments, m1 is 1. In embodiments, m1 is 2. [0211] With reference to the compounds described herein, v1 is 1 or 2. In embodiments, v1 is 1. In embodiments, v1 is 2.
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene. In embodiments, L 1 is a bond. In embodiments, L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C1-6 alkylene. In embodiments, L 1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L 1 is unsubstituted alkylene. In embodiments, L 1 is unsubstituted C1-6 alkylene. In embodiments, L 1 is unsubstituted C1-4 alkylene. In embodiments, L 1 is methylene.
- L 1 is ethylene. In embodiments, L 1 is propylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 8 membered heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. [0213] In embodiments, L 1 is –NH-C(O)-(CH 2 ) y - or –NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 6. In embodiments, y is an integer from 0 to 5. In embodiments, y is an integer from 0 to 4.
- y is an integer from 0 to 3. In embodiments, y is an integer from 0 to 2. In embodiments, y is an integer from 0 to 3. In embodiments, L 1 is –NH-C(O)-. In embodiments, L 1 is –NH-C(O)-(CH2)-. In embodiments, L 1 is –NH-C(O)-(CH2)2-. In embodiments, L 1 is –NH-C(O)-(CH2)3-. In embodiments, L 1 is –NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 3. In embodiments, L 1 is –NH-C(O)-O-.
- L 1 is –NH-C(O)-O-(CH2)-. In embodiments, L 1 is –NH-C(O)-O-(CH2)2-. In embodiments, L 1 is –NH-C(O)-O-(CH2)3-. In embodiments, the —NH- is adjacent the phenyl ring. [0214] In embodiments, L 1 is –C(O)NH-(CH 2 ) y - or –C(O)-O-NH-(CH 2 ) y -, and y is an integer from 0 to 6, wherein the –C(O)- is adjacent the phenyl ring. In embodiments, y is an integer from 0 to 5.
- y is an integer from 0 to 4. In embodiments, y is an integer from 0 to 3. In embodiments, y is an integer from 0 to 2. In embodiments, L 1 is -C(O)NH-. In embodiments, L 1 is –C(O)NH-(CH 2 )-. In embodiments, L 1 is –C(O)-NH-(CH2)2-. In embodiments, L 1 is –C(O)-NH-(CH2)3-. In embodiments, L 1 is -C(O)-O-NH-(CH 2 ) y -, and y is an integer from 0 to 3.
- L 1 is –C(O)-O-NH-. In embodiments, L 1 is -C(O)-O-NH-(CH2)-. In embodiments, L 1 is -C(O)-O-NH-(CH2)2-. In embodiments, L 1 is -C(O)-O-NH-(CH 2 ) 3 -. In embodiments, the –C(O)- is adjacent the phenyl ring.
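The L 1 heteroalkylene embodiments above differ only in the head group and in the methylene count y. As a purely illustrative sketch (the helper names and the plain-text notation with ASCII hyphens are assumptions introduced here, not part of the disclosure), the variants for y from 0 to 3 can be listed as follows:

```python
# Illustrative only: strings mirror the text notation used in this section.
# `methylene_chain` and `enumerate_l1_linkers` are hypothetical helper names.
HEAD_GROUPS = (
    "-NH-C(O)-",    # -NH- adjacent the phenyl ring
    "-NH-C(O)-O-",  # -NH- adjacent the phenyl ring
    "-C(O)-NH-",    # -C(O)- adjacent the phenyl ring
    "-C(O)-O-NH-",  # -C(O)- adjacent the phenyl ring
)


def methylene_chain(y: int) -> str:
    """Return the -(CH2)y- segment as text; empty when y is 0."""
    if y < 0:
        raise ValueError("y must be non-negative")
    if y == 0:
        return ""
    if y == 1:
        return "(CH2)-"
    return f"(CH2){y}-"


def enumerate_l1_linkers(max_y: int = 3) -> list[str]:
    """Combine every head group with y = 0..max_y methylene units."""
    return [
        head + methylene_chain(y)
        for head in HEAD_GROUPS
        for y in range(max_y + 1)
    ]


linkers = enumerate_l1_linkers()
for linker in linkers:
    print(linker)
```

Raising `max_y` to 6 covers the broadest range of y recited for these linkers.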
- L 2 is a bond, -NR 2A -, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 2A )C(O)-, -C(O)N(R 2A )-, -NR 2A C(O)NR 2B -, -NR 2A C(NH)NR 2B -, -SO2N(R 2A )-, -N(R 2A )SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
- L 2 is a bond, -NH-, -S-, -S(O) 2 -, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, L 12 -substituted or unsubstituted alkylene, L 12 -substituted or unsubstituted heteroalkylene, L 12 -substituted or unsubstituted cycloalkylene, L 12 -substituted or unsubstituted heterocycloalkylene, L 12 -substituted or unsubstituted arylene, or L 12 -substituted or unsubstituted heteroarylene.
- L 2 is a bond, -NH-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, or unsubstituted heteroarylene.
- L 2 is a bond.
- the alkylene is a C1-6 alkylene.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C 5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- R 2A and R 2B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- L 12 is halogen, -CF 3 , -CBr 3 , -CCl 3 , -CI3, -CHF2, -CHBr2, -CHCl2, -CHI2, -CH2F, -CH2Br, -CH2Cl, -CH2I, -OCF3, -OCBr3, -OCCl3, -OCI 3 , -OCHF 2 , -OCHBr 2 , -OCHCl 2 , -OCHI 2 , -OCH 2 F, -OCH 2 Br, -OCH 2 Cl, -OCH 2 I, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH 2 , -N(O) 2 , -
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- L 3 is a bond, -N(R 3A )-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 3A )C(O)-, -C(O)N(R 3A )-, -NR 3A C(O)NR 3B -, -C(S)-, -NR 3A C(NH)NR 3B -, -SO2N(R 3A )-, -N(R 3A )SO2-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, a combination
- L 3 is a bond, -NH-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, L 13 -substituted or unsubstituted alkylene, L 13 -substituted or unsubstituted heteroalkylene, L 13 - substituted or unsubstituted cycloalkylene, L 13 -substituted or unsubstituted heterocycloalkylene, L 13 -substituted or unsubstituted arylene, L 13 -substituted or unsubstituted heteroarylene, a combination of two thereof, or a
- the alkylene is a C1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C5-C6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- where L 3 is a combination of two thereof or a combination of three thereof, at least one member of the combination is C 5 -C 6 cycloalkylene, 5 or 6 membered heterocycloalkylene, C5-6 arylene, or 5 or 6 membered heteroarylene. In embodiments where L 3 is a combination of two thereof or a combination of three thereof, only one member of the combination is C5-C6 cycloalkylene, 5 or 6 membered heterocycloalkylene, C5-6 arylene, or 5 or 6 membered heteroarylene.
- R 3A and R 3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- the alkylene is a C1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C5-C6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- L 13 is halogen, -CF3, -CBr3, -CCl3, -CI 3 , -CHF 2 , -CHBr 2 , -CHCl 2 , -CHI 2 , -CH 2 F, -CH 2 Br, -CH 2 Cl, -CH 2 I, -OCF 3 , -OCBr 3 , -OCCl 3 , -OCI3, -OCHF2, -OCHBr2, -OCHCl2, -OCHI2, -OCH2F, -OCH2Br, -OCH2Cl, -OCH2I, -CN, -OH, -NH 2 , -COOH, -CONH 2 , -NO 2 , -SH, -SO 3 H, -SO 4 H, -SO 2 NH 2 , -NHNH 2 , -ONH 2 , -NHC(O)NH2,
- the alkylene is a C1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C5-C6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- W is –H, a peptidyl moiety, or an amino acid moiety; and Y is —OH, a peptidyl moiety, or an amino acid moiety.
- W is –H and Y is –OH.
- W is not –H when Y is –OH.
- W is a peptidyl moiety and Y is a peptidyl moiety.
- W is an amino acid moiety and Y is an amino acid moiety.
- W is a peptidyl moiety or an amino acid moiety and Y is a peptidyl moiety or an amino acid moiety.
- W is a peptidyl moiety and Y is —OH, a peptidyl moiety, or an amino acid moiety.
- W is –H, a peptidyl moiety, or an amino acid moiety; and Y is a peptidyl moiety.
- the peptidyl moiety of R 5 is a target protein.
- the peptidyl moiety of R 5 is a receptor protein, a cytosolic protein, a transcriptional factor, or an enzyme.
- the peptidyl moiety of R 5 is a receptor protein.
- the receptor protein is an extracellular domain receptor protein, a transmembrane domain receptor protein, or an intracellular domain receptor protein.
- the peptidyl moiety of R 5 is a cytosolic protein.
- the peptidyl moiety of R 5 is a transcriptional factor.
- the peptidyl moiety of R 5 is an enzyme. [0224] In embodiments of the compounds described herein, the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a target protein.
- the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a target protein, wherein the target protein comprises a lysine, histidine, tyrosine, or cysteine bonded to L 3 , where L 3 is a bond.
- R 4 comprises an antibody.
- R 4 comprises an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- the antibody variant is a single-chain variable fragment.
- the antibody variant is a single-domain antibody.
- the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment.
- the target protein is a receptor protein, a cytosolic protein, a transcriptional factor, or an enzyme.
- R 5 is a receptor protein. In embodiments, R 5 is an extracellular domain receptor protein. In embodiments, R 5 is a transmembrane domain receptor protein. In embodiments, R 5 is an intracellular domain receptor protein. In embodiments, R 5 is a cytosolic protein. In embodiments, R 5 is a transcriptional factor. In embodiments, R 5 is an enzyme.
- the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a receptor protein.
- the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a receptor protein, wherein the receptor protein comprises a lysine, histidine, tyrosine, or cysteine bonded to L 3 , where L 3 is a bond.
- R 4 comprises an antibody.
- R 4 comprises an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the receptor protein is any receptor protein described herein. [0226] In embodiments of the compounds described herein, the peptidyl moiety of R 4 comprises a receptor protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant.
- the peptidyl moiety of R 4 comprises a receptor protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant; wherein the antibody or antibody variant comprises a lysine, histidine, tyrosine or cysteine bonded to L 3 , where L 3 is a bond.
- R 5 comprises an antibody.
- R 5 comprises an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- the antibody variant is a single-chain variable fragment.
- the antibody variant is a single-domain antibody.
- the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the receptor protein is any receptor protein described herein. [0227] In embodiments, the biomolecules, proteins, and peptidyl moieties described herein comprise a receptor protein.
- the receptor protein is a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxy
- the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor. [0228] In embodiments, the receptor protein is a receptor expressed on a cancer cell.
- the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
- the receptor protein is a G protein-coupled receptor.
- the receptor protein is a receptor tyrosine kinase.
- the receptor protein is an ErbB receptor.
- the receptor protein is an epidermal growth factor receptor (EGFR).
- the receptor protein is epidermal growth factor receptor 1 (HER1).
- the receptor protein is epidermal growth factor receptor 2 (HER2).
- the receptor protein is epidermal growth factor receptor 3 (HER3).
- the receptor protein is epidermal growth factor receptor 4 (HER4).
- the disclosure provides cells comprising the compounds, compositions and complexes provided herein, including embodiments thereof.
- the cell comprises the compound of Formula (1), including any embodiment thereof.
- the cell comprises the compound of Formula (1A), including any embodiment thereof.
- the cell comprises the compound of Formula (1B), including any embodiment thereof.
- the cell comprises the compound of Formula (1C), including any embodiment thereof.
- the cell comprises the compound of Formula (1D), including any embodiment thereof.
- the cell comprises the compound of Formula (1E), including any embodiment thereof.
- the cell comprises PFY.
- the cell comprises PFK.
- the cell comprises the compound of Formula (5), including any embodiment thereof.
- the cell comprises the compound of Formula (5A), including any embodiment thereof.
- the cell comprises the compound of Formula (5B), including any embodiment thereof.
- the cell comprises the compound of Formula (5C), including any embodiment thereof.
- the cell comprises the compound of Formula (5D), including any embodiment thereof.
- the cell comprises the compound of Formula (5E), including any embodiment thereof.
- the cell further includes a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof.
- the cell further includes a vector as described herein, including embodiments thereof.
- the cell further includes a tRNA Pyl .
- the compound of Formula (1) (including embodiments (1A)-(1E), PFY, PFK) is biosynthesized inside the cell, thereby generating a cell containing the compound of Formula (1).
- the compound of Formula (1) is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing the compound of Formula (1).
- the cell comprises the compound of Formula (1).
- the cell comprises the compound of Formula (1) that is synthesized inside the cell.
- the cell comprises the compound of Formula (1) that is synthesized outside a cell, and that penetrates into the cell.
- the method of forming a protein comprising an unnatural amino acid comprises contacting a protein, a mutant pyrrolysyl-tRNA synthetase, a tRNA Pyl , and a compound of Formula (1) (including embodiments thereof), thereby producing the protein comprising the unnatural amino acid of Formula (1) (including embodiments thereof).
- the protein produced by the method will comprise the unnatural amino acid side chain of Formula (2) (including embodiments thereof).
- the mutant pyrrolysyl-tRNA synthetase used in the method of producing the biomolecule is any described herein or known in the art (e.g., SEQ ID NO:9).
- the tRNA Pyl used in the method of producing the protein is any described herein.
- the reaction is performed in vitro. In embodiments, the reaction is performed in vivo. In embodiments, the reaction is performed in one or more living cells. In embodiments, the reaction is performed in one or more living bacterial cells. In embodiments, the reaction is performed in one or more living mammalian cells. [0249]
- the compositions provided herein are useful for forming a protein comprising an unnatural amino acid (e.g., a compound of Formula (3) or embodiments thereof).
- the protein produced by the method will comprise the unnatural amino acid side chain of Formula (6) (including embodiments thereof).
- the mutant pyrrolysyl-tRNA synthetase used in the method of producing the biomolecule is any described herein or known in the art (e.g., SEQ ID NO:9).
- the tRNA Pyl used in the method of producing the protein is any described herein.
- the reaction is performed in vitro.
- the reaction is performed in vivo.
- the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells.
- the reaction is performed in one or more living mammalian cells.
- the protein is an antigen-binding fragment.
- the method further comprises contacting the protein with a target protein, thereby covalently bonding the protein to the target protein.
- a method of enhancing the bioreactivity and/or binding efficacy of a protein comprising an unnatural amino acid, the method comprising mutating an amino acid proximal to the unnatural amino acid to arginine, thereby enhancing the bioreactivity and/or binding efficacy of the protein; wherein the unnatural amino acid is any unnatural amino acid described herein (including embodiments thereof).
- a detectable moiety is a monovalent detectable agent or a detectable agent capable of forming a bond with another composition.
- paramagnetic ions that may be used as imaging agents in accordance with the embodiments of the disclosure include, e.g., ions of transition and lanthanide metals (e.g., metals having atomic numbers of 21-29, 42, 43, 44, or 57-71). These metals include ions of Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu.
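The atomic-number ranges quoted above can be expanded mechanically. The following is an illustrative sketch only: the symbol table is standard periodic-table data, and the variable names are assumptions introduced here. It checks that every metal named in the sentence falls within the stated ranges and reports which elements the ranges span without being named in the (non-exhaustive) list.

```python
# Illustrative sketch: expand the atomic-number ranges and compare them
# with the metals named in the text. All names here are hypothetical.
SYMBOLS = {
    21: "Sc", 22: "Ti", 23: "V", 24: "Cr", 25: "Mn", 26: "Fe",
    27: "Co", 28: "Ni", 29: "Cu", 42: "Mo", 43: "Tc", 44: "Ru",
    57: "La", 58: "Ce", 59: "Pr", 60: "Nd", 61: "Pm", 62: "Sm",
    63: "Eu", 64: "Gd", 65: "Tb", 66: "Dy", 67: "Ho", 68: "Er",
    69: "Tm", 70: "Yb", 71: "Lu",
}

# Inclusive ranges as recited: 21-29, 42, 43, 44, and 57-71.
RANGES = [(21, 29), (42, 44), (57, 71)]

# Metals explicitly named in the sentence.
LISTED = [
    "Cr", "V", "Mn", "Fe", "Co", "Ni", "Cu", "La", "Ce", "Pr", "Nd",
    "Pm", "Sm", "Eu", "Gd", "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu",
]

covered = [SYMBOLS[z] for low, high in RANGES for z in range(low, high + 1)]
outside_ranges = [s for s in LISTED if s not in covered]
not_named = [s for s in covered if s not in LISTED]

print("covered by ranges:", covered)
print("named but outside ranges:", outside_ranges)
print("in ranges but not named:", not_named)
```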
- the radioisotope is 131 I. In embodiments, the radioisotope is a positron-emitting radioisotope. In embodiments, the positron-emitting radioisotope is 11 C, 13 N, 15 O, 18 F, 64 Cu, 68 Ga, 78 Br, 82 Rb, 86 Y, 89 Zr, 90 Y, 22 Na, 26 Al, 40 K, 83 Sr, or 124 I. In embodiments, the positron-emitting radioisotope is 11 C. In embodiments, the positron-emitting radioisotope is 13 N. In embodiments, the positron-emitting radioisotope is 15 O.
- Such preparations can be sterilized and, if desired, mixed with auxiliary agents such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, and/or aromatic substances and the like that do not deleteriously react with the compounds of the disclosure.
- Pharmaceutically acceptable excipients can be used in pharmaceutical compositions for therapeutic purposes (e.g.
- the vaccines may be administered parenterally as injections (intravenous, intramuscular or subcutaneous).
- the amount of recombinant proteins used in a vaccine can depend upon a variety of factors including the route of administration, species, and use of booster administration. However, a person of ordinary skill in the art would immediately recognize appropriate and/or equivalent doses by looking at the dosages of approved whooping cough vaccines for guidance.
- the dosage and frequency (single or multiple doses) of the proteins administered to a subject can vary depending upon a variety of factors, for example, whether the mammal suffers from another disease, and its route of administration; size, age, sex, health, body weight, body mass index, and diet of the recipient; nature and extent of symptoms of the disease being treated, kind of concurrent treatment, complications from the disease being treated or other health-related problems.
- Other therapeutic regimens or agents can be used in conjunction with the methods and proteins (e.g., recombinant proteins, antibodies, antibody variants, single-domain antibodies) described herein. Adjustment and manipulation of established dosages (e.g., frequency and duration) are within the ability of the skilled artisan.
- the effective amount can be initially determined from cell culture assays.
- Target concentrations will be those concentrations of proteins that are capable of achieving the methods described herein, as measured using the methods described herein or known in the art.
- effective amounts of proteins for use in humans can also be determined from animal models.
- a dose for humans can be formulated to achieve a concentration that has been found to be effective in animals.
- the dosage in humans can be adjusted by monitoring effectiveness and adjusting the dosage upwards or downwards, as described above. Adjusting the dose to achieve maximal efficacy in humans based on the methods described above and other methods is well within the capabilities of the ordinarily skilled artisan.
- Embodiment 4 The compound of embodiment 3, wherein L 1 is -C(O)-NH-(CH 2 ) y -, -C(O)-O-NH-(CH2)y-, —NH-C(O)-(CH2)y- or –NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2.
- Embodiment 5. The compound of embodiment 3, wherein L 1 is –NH-C(O)- or -C(O)-NH-.
- Embodiment 6. The compound of any one of embodiments 1 to 5, wherein R 2 and R 3 are each independently unsubstituted C1-4 alkyl.
- Embodiment 8 The compound of any one of embodiments 1 to 7, wherein L 4 is bonded to the carbon atom para to the carbon atom to which L 1 is bonded.
- Embodiment 9. The compound of any one of embodiments 1 to 7, wherein L 4 is bonded to the carbon atom meta to the carbon atom to which L 1 is bonded.
- Embodiment 10. The compound of any one of embodiments 1 to 9, wherein L 4 is –O-.
- Embodiment 11 The compound of any one of embodiments 1 to 9, wherein L 4 is -O- (CH 2 ) 1-5 -.
- Embodiment 12 The compound of any one of embodiments 1 to 9, wherein L 4 is -(CH2)1-5-.
- Embodiment 13 The compound of any one of embodiments 1 to 9, wherein L 4 is a bond.
- Embodiment 14 The compound of any one of embodiments 1 to 13, wherein x is an integer from 1 to 4.
- Embodiment 15 The compound of any one of embodiments 1 to 14, wherein x1 is 1.
- Embodiment 16 The compound of any one of embodiments 1 to 15, wherein R 1 is hydrogen. [0287] Embodiment 17.
- the compound of Formula (I) is: [chemical structure not reproduced].
- [0293] of Formula (II) is: .
- [0294] is a bond and x is 1.
- Embodiment 25 The compound of embodiment 23, wherein x is 4 and L 1 is –C(O)NH-, wherein–C(O)- is adjacent the phenyl ring.
- Embodiment 26 The compound of embodiment 23, wherein x is 4 and L 1 is –C(O)NH-, wherein–C(O)- is adjacent the phenyl ring.
- Embodiment 27 The protein of embodiment 26, wherein L 1 is a bond.
- Embodiment 28 The protein of embodiment 26, wherein L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- Embodiment 29 The protein of embodiment 28, wherein L 1 is -C(O)-NH-(CH2)y-, —NH-C(O)-(CH 2 ) y - or –NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 30 The protein of embodiment 28, wherein L 1 is –NH-C(O)- or -C(O)-NH-.
- Embodiment 31 The protein of any one of embodiments 26 to 30, wherein R 2 and R 3 are each independently unsubstituted C1-4 alkyl.
- Embodiment 32.
- Embodiment 33 The protein of any one of embodiments 26 to 32, wherein L 4 is bonded to the carbon atom para to the carbon atom to which L 1 is bonded.
- Embodiment 34 The protein of any one of embodiments 26 to 32, wherein L 4 is bonded to the carbon atom meta to the carbon atom to which L 1 is bonded.
- Embodiment 35 The protein of any one of embodiments 26 to 34, wherein L 4 is –O-.
- Embodiment 36 The protein of any one of embodiments 26 to 34, wherein L 4 is -O- (CH 2 ) 1-5 -.
- Embodiment 37 The protein of any one of embodiments 26 to 34, wherein L 4 is - (CH 2 ) 1-5 -.
- Embodiment 38 The protein of any one of embodiments 26 to 34, wherein L 4 is a bond.
- Embodiment 39 The protein of any one of embodiments 26 to 38, wherein x is an integer from 1 to 4.
- Embodiment 40 The protein of any one of embodiments 26 to 39, wherein x1 is 1.
- Embodiment 41 The protein of any one of embodiments 26 to 40, wherein R 1 is hydrogen.
- Embodiment 42 The protein of any one of embodiments 26 to 34, wherein R 1 is hydrogen.
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SOv1NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O)m1, -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalky
- Embodiment 43 The protein of embodiment 42, wherein R 1 is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -N(O) m1 , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO2R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , or -NR 1A OR 1B .
- Embodiment 44 The protein of embodiment 42, wherein R 1 is unsubstituted heteroalkyl, wherein the heteroalkyl is –O(CH 2 ) 1-4 .
- Embodiment 45 The protein of embodiment 42, wherein R 1 is halogen.
- Embodiment 46 The protein of embodiment 26, wherein the unnatural amino acid side chain is: .
- Embodiment 47 the unnatural amino acid side chain is: .
- Embodiment 48 the unnatural amino acid side chain is: .
- Embodiment 49 L 1 is a bond and x is 1.
- Embodiment 50.
- Embodiment 51 The protein of any one of embodiments 26 to 50, wherein the protein is an antibody.
- Embodiment 52 The protein of any one of embodiments 26 to 50, wherein the protein is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- Embodiment 53 The protein of embodiment 51 or 52, wherein the unnatural amino acid is within a CDR region of the protein.
- Embodiment 54 The protein of embodiment 51 or 52, wherein the unnatural amino acid is within a CDR region of the protein.
- Embodiment 55 The protein of any one of embodiments 26 to 54, wherein the protein comprises: (i) an arginine proximal to the unnatural amino acid or (ii) a non-naturally occurring arginine proximal to the unnatural amino acid.
- Embodiment 56 The protein of any one of embodiments 26 to 54, wherein the protein comprises a non-naturally occurring arginine proximal to the unnatural amino acid.
- Embodiment 57.
- Embodiment 58 The protein of any one of embodiments 26 to 57, further comprising a detectable agent.
- Embodiment 59 The protein of embodiment 58, wherein the detectable agent is a radioisotope.
- Embodiment 60 A nucleic acid encoding the protein of any one of embodiments 26 to 59.
- Embodiment 61 A vector comprising a nucleic acid, wherein the nucleic acid encodes the protein of any one of embodiments 26 to 59.
- Embodiment 62 A vector comprising a nucleic acid, wherein the nucleic acid encodes the protein of any one of embodiments 26 to 59.
- Embodiment 63 The protein conjugate of embodiment 62, wherein L 2 is a bond.
- Embodiment 64 The protein conjugate of embodiment 62 or 63, wherein L 1 is a bond.
- Embodiment 65 The protein conjugate of embodiment 62 or 63, wherein L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- Embodiment 66.
- Embodiment 69 The protein conjugate of embodiment 68, wherein R 2 and R 3 are methyl.
- Embodiment 70 The protein conjugate of any one of embodiments 62 to 69, wherein L 4 is bonded to the carbon atom para to the carbon atom to which L 1 is bonded.
- Embodiment 71 The protein conjugate of any one of embodiments 62 to 69, wherein L 4 is bonded to the carbon atom meta to the carbon atom to which L 1 is bonded.
- Embodiment 72.
- Embodiment 73 The protein conjugate of any one of embodiments 62 to 71, wherein L 4 is -O-(CH 2 ) 1-5 -.
- Embodiment 74 The protein conjugate of any one of embodiments 62 to 71, wherein L 4 is -(CH 2 ) 1-5 -.
- Embodiment 75 The protein conjugate of any one of embodiments 62 to 71, wherein L 4 is a bond.
- Embodiment 76 The protein conjugate of any one of embodiments 62 to 75, wherein x is an integer from 1 to 4.
- Embodiment 77 The protein conjugate of any one of embodiments 62 to 76, wherein x1 is 1.
- Embodiment 78 The protein conjugate of any one of embodiments 62 to 77, wherein R 1 is hydrogen.
- Embodiment 79.
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SOn1R 1A , -SOv1NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O)m1, -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , substituted or unsubstituted alkyl, or substituted or unsubstit
- Embodiment 80 The protein conjugate of embodiment 79, wherein R 1 is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -N(O) m1 , - - - - - - 1 -4 .
- Embodiment 82 The protein conjugate of embodiment 79, wherein R 1 is halogen.
- Embodiment 83 The protein conjugate of embodiment 62, wherein the protein conjugate of Formula (4) is: .
- Embodiment 90 The protein conjugate of any one of embodiments 62 to 87, wherein L 2 is a bond and L 3 -R 5 is: .
- Embodiment 91 The protein conjugate of any one of embodiments 62 to 87, wherein L 2 is a bond and L 3 -R 5 is: .
- Embodiment 92 The protein conjugate of any one of embodiments 62 to 91, wherein the peptidyl moiety of R 5 comprises a target protein.
- Embodiment 93.
- Embodiment 94 The protein conjugate of any one of embodiments 62 to 91, wherein the peptidyl moiety of R 4 comprises an antibody, and the peptidyl moiety of R 5 comprises a target protein.
- Embodiment 95 The protein conjugate of embodiment 94 wherein the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- Embodiment 96.
- Embodiment 97 The protein conjugate of any one of embodiments 62 to 91, wherein the peptidyl moiety of R 4 comprises a target protein and the peptidyl moiety of R 5 comprises an antibody.
- Embodiment 98 The protein conjugate of embodiment 97, wherein the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- Embodiment 99.
- Embodiment 100 The protein conjugate of embodiment 99, wherein the receptor protein is expressed on a cancer cell.
- Embodiment 101 The protein conjugate of any one of embodiments 92 to 98, wherein the target protein is an extracellular domain receptor protein, a transmembrane domain receptor protein, or an intracellular domain receptor protein.
- Embodiment 102 The protein conjugate of any one of embodiments 92 to 98, wherein the target protein is a cytosolic protein, a transcriptional factor, or an enzyme.
- Embodiment 103.
- the target protein is a programmed death-ligand 1 receptor, a programmed cell death protein 1 receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor, a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor,
- Embodiment 104 A complex comprising the compound of any one of embodiments 1 to 25 and a pyrrolysyl-tRNA synthetase.
- Embodiment 105 The complex of embodiment 104, further comprising a tRNA Pyl.
- Embodiment 106 A cell comprising: (i) the compound of any one of embodiments 1 to 25, (ii) the protein of any one of embodiments 26 to 59, (iii) the nucleic acid of embodiment 60, (iv) the vector of embodiment 61, (v) the protein conjugate of any one of embodiments 62 to 103, or (vi) the complex of embodiment 104 or 105.
- Embodiment 107 Embodiment 107.
- Embodiment 108 A method of enhancing the bioreactivity or binding efficacy of the protein of any one of embodiments 26 to 59, the method comprising mutating a naturally-occurring amino acid in the protein to arginine, thereby producing a non-naturally occurring arginine, wherein the non-naturally occurring arginine is proximal to the unnatural amino acid.
- Embodiment 109 A method of enhancing the bioreactivity or binding efficacy of the protein of any one of embodiments 26 to 59, the method comprising mutating a naturally-occurring amino acid in the protein to arginine, thereby producing a non-naturally occurring arginine, wherein the non-naturally occurring arginine is proximal to the unnatural amino acid.
- a method of enhancing the bioreactivity or binding efficacy of a protein comprising: (i) mutating a first amino acid to an unnatural amino acid, and (ii) mutating a second amino acid proximal to the first amino acid to arginine, wherein the unnatural amino acid comprises a side chain of Formula (2) or Formula (6) (structures not reproduced here), wherein: L 4 is a bond, 1 to 8, x1 is an integer from 0 to 5, L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene, R 1 is hydrogen or an electron withdrawing group, and R 2 and R 3 are each independently substituted or unsubstituted C1-5 alkyl.
- Embodiment 110 The method of embodiment 108 or 109, wherein the naturally-occurring amino acid in the protein is Ala, Ile, Leu, Met, Val, Phe, Trp, Tyr, Asn, Cys, Gln, Ser, Thr, Asp, Glu, His, Lys, Gly, or Pro.
- EXAMPLES [0381] The following examples are intended to further illustrate certain embodiments of the disclosure. The examples are provided for the benefit of one of ordinary skill in the art and are not intended to limit the scope of the disclosure.
- Two new Uaas, PFY and PFK, each bearing a phosphoramidofluoridate group, have been designed and incorporated into proteins in E. coli and mammalian cells through genetic code expansion.
- Example 9 PFK expands protein cross-linking unreachable by PFY in vitro and in cells
- PFK could react with a target residue unreachable by PFY
- nanobody mNb6 to evaluate the in vitro cross-linking of mNb6 with its binding target: the Spike protein of SARS-CoV-2.
- Based on the structure of the mNb6-Spike complex (FIG. 6D), we decided to incorporate PFK into mNb6 at sites 50-59 individually to target Tyr351 of the Spike protein's receptor binding domain (RBD). Schoof et al., Science 370, 1473–1479 (2020).
- PFK-incorporated mNb6 mutant proteins were purified and incubated with the Spike RBD, followed by Western blot analysis. Robust cross-linking of mNb6 with the Spike RBD was detected when PFK was incorporated at site 54 but not at the other sites (FIG. 6E). Tandem MS analysis of the cross-linked proteins confirmed that PFK reacted with the target Tyr351 as expected. This site-specific cross-linking indicated that the PFK-based PFEx reaction was also proximity-driven and non-random. In contrast, when the shorter PFY was incorporated at site 54, no cross-linking of mNb6 with the Spike RBD was detected (FIG. 6F).
- Example 10 [0425] PFY crosslinks nucleophilic residues via proximity-enabled reactivity through the new click PFEx reaction. We discovered that Arg could also accelerate PFY-mediated protein crosslinking.
- the affibody-Z protein pair described in FIG. 5 was used here to show the effect. PFY was incorporated at site 36 of the affibody, and residue 32 was mutated to Arg. 24 μM of the affibody protein was incubated with 6 μM of MBP-Z(N6Y) protein for different times, followed by SDS-PAGE analysis under denaturing conditions.
- Affibody(36PFY) did not show apparent crosslinking with MBP-Z(N6Y) in 18 hours; in contrast, the Arg mutant Affibody(36PFY/32R) could crosslink MBP-Z(N6Y) in 1 hour with crosslinking efficiency increasing with incubation time.
- IDT (Integrated DNA Technologies)
- plasmids were sequenced by Azenta Life Sciences. All molecular biology reagents were obtained from Vazyme. His-HRP antibody, GFP monoclonal antibodies, and GAPDH-HRP antibody were obtained from ProteinTech Group. pEvol-mFSYRS was used as previously described.
- PhCF3 and PPh3 were used as internal standards, respectively.
- SEQ ID NO:1 - Affibody (36TAG) MVDNFNKELSVAGREIVTLPNLNDPQKKAFIRSLWUDPSQSANLLAEAKKLNDAQAPK GSHHHHHH, where U is the amber codon TAG introduced at the 36th position to encode Uaa incorporation.
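The stated amber position can be cross-checked against the printed sequence. The following is a minimal sketch, not part of the disclosure: the helper name `placeholder_positions` is hypothetical, and it assumes the space inside the printed sequence is a line-wrap artifact rather than a gap.

```python
# SEQ ID NO:1 as printed above (whitespace is treated as a line-wrap artifact).
SEQ_ID_1 = "MVDNFNKELSVAGREIVTLPNLNDPQKKAFIRSLWUDPSQSANLLAEAKKLNDAQAPK GSHHHHHH"


def placeholder_positions(seq: str, symbol: str = "U") -> list[int]:
    """Return the 1-based positions of `symbol` in `seq`, ignoring whitespace."""
    residues = "".join(seq.split())
    return [i for i, aa in enumerate(residues, start=1) if aa == symbol]


amber_sites = placeholder_positions(SEQ_ID_1)
print(amber_sites)  # 1-based position(s) of the amber placeholder
```

Under these assumptions the single placeholder falls at position 36, in agreement with the "36TAG" label.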
- MBP-Z (6X) [0431] pBAD-MBP-Z (6A) was cloned with primers MBP-Z-N6A-For and MBP-Z-N6A-Rev.
- pBAD-MBP-Z (6H) was cloned with primers MBP-Z-N6H-For and MBP-Z-N6H-Rev.
- pBAD-MBP-Z (6Y) was cloned with primers MBP-Z-N6Y-For and MBP-Z-N6Y-Rev.
- pBAD-MBP-Z (6K) was cloned with primers MBP-Z-N6K-For and MBP-Z-N6K-Rev.
- pBAD-MBP-Z (6T) was cloned with primers MBP-Z-N6T-For and MBP-Z-N6T-Rev.
- pBAD-MBP-Z (6S) was cloned with primers MBP-Z-N6S-For and MBP-Z-N6S-Rev.
- pBAD-MBP-Z (6C) was cloned with primers MBP-Z-N6C-For and MBP-Z-N6C-Rev.
- pBAD-MBP-Z (6N) was cloned with primers MBP-Z-N6-For and MBP-Z-N6-Rev.
- pBAD-MBP-Z (6W) was cloned with primers MBP-Z-N6W-For and MBP-Z-N6W-Rev.
- pBAD-MBP-Z (6M) was cloned with primers MBP-Z-N6M-For and MBP-Z-N6M-Rev.
- pBAD-MBP-Z (6D) was cloned with primers MBP-Z-N6D-For and MBP-Z-N6D-Rev.
- pBAD-MBP-Z (6E) was cloned with primers MBP-Z-N6E-For and MBP-Z-N6E-Rev.
- ecGST (103TAG106X107A).
- PCDNA3.1-ecGST (103TAG106Y107A) was cloned with primers pCDNA-ecGST-106Y107A-For and pCDNA-ecGST-106Y107A-Rev.
- PCDNA3.1-ecGST (103TAG106K107A) was cloned with primers pCDNA-ecGST-106K107A-For and pCDNA-ecGST-106K107A-Rev.
- PCDNA3.1-ecGST (103TAG106C107A) was cloned with primers pCDNA-ecGST-106C107A-For and pCDNA-ecGST-106C107A-Rev.
- SEQ ID NO:3 MKLFYKPGACSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLD DGTLLTEGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIAUELXAGFTPLFRPDTP EEYKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAYLFTVLRWAYAVKLNLEGL EHIAAFMQRMAERPEVQDALSAEGLKHHHHHH, where U is the amber codon TAG introduced at site 103 for Uaa incorporation and X is His106 mutated to A, Y, K, or C.
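The X placeholder in SEQ ID NO:3 stands for four separate constructs. The sketch below, offered only as an illustration, expands the template into those variants and checks the site numbering; the function `expand_variants` and the construct labels it prints are hypothetical conveniences, and the printed spaces are assumed to be line-wrap artifacts.

```python
# SEQ ID NO:3 as printed above, with the line-wrap spaces removed.
SEQ_ID_3 = (
    "MKLFYKPGACSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLD"
    "DGTLLTEGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIAUELXAGFTPLFRPDTP"
    "EEYKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAYLFTVLRWAYAVKLNLEGL"
    "EHIAAFMQRMAERPEVQDALSAEGLKHHHHHH"
)


def expand_variants(template: str, substitutions: str, placeholder: str = "X") -> dict[str, str]:
    """Map each substituted residue to the template with the placeholder filled in."""
    if template.count(placeholder) != 1:
        raise ValueError("template must contain exactly one placeholder")
    return {aa: template.replace(placeholder, aa) for aa in substitutions}


amber_site = SEQ_ID_3.index("U") + 1      # 1-based position of the amber codon
variable_site = SEQ_ID_3.index("X") + 1   # 1-based position of the His106 substitution
variants = expand_variants(SEQ_ID_3, "AYKC")

for aa in variants:
    print(f"ecGST ({amber_site}TAG{variable_site}{aa}107A)")
```

With the sequence as printed, the placeholder positions come out as 103 and 106, and position 107 is already alanine, which matches the "103TAG106X107A" naming.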
- ecGST (103TAG106A107A157A10X).
- PCDNA3.1-ecGST (103TAG106A107A157A10A) was cloned with primers pCDNA-ecGST-10A-For and pCDNA-ecGST-10A-Rev.
- PCDNA3.1-ecGST (103TAG106A107A157A10H) was cloned with primers pCDNA-ecGST-10H-For and pCDNA-ecGST-10H-Rev.
- PCDNA3.1-ecGST (103TAG106A107A157A10K) was cloned with primers pCDNA-ecGST-10K-For and pCDNA-ecGST-10K-Rev.
- PCDNA3.1-ecGST (103TAG106A107A157A10Y) was cloned with primers pCDNA-ecGST-10Y-For and pCDNA-ecGST-10Y-Rev.
- SEQ ID NO:4 MKLFYKPGAXSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLD DGTLLTEGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIAUELAAGFTPLFRPDTP EEYKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAALFTVLRWAYAVKLNLEGL EHIAAFMQRMAERPEVQDALSAEGLKHHHHHH, where U is the amber codon TAG introduced at site 103 for Uaa incorporation, and X is Cys10 mutated to A, H, K, or Y.
- SEQ ID NO:7 is SR4 QVQLVESGGGLVQAGGSLRLSCAASGFPVYSWNMWWYRQAPGKEREWVAAIESHGD STRYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCYVWVGHTYYGQGTQVT VSAGRAGEQKLISEEDLNSAVD, where H54 and S5 are the sites at which the amber codon TAG was introduced separately for Uaa incorporation.
- SEQ ID NO:8 is mNb6 QVQLVESGGGLVQAGGSLRLSCAASGYIFGRNAMGWYRQAPGKERELVAGITRRGSI TYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADPASPAYGDYWGQGT QVTVSSHHHHHH, where GITRRGSITY (sites 50-59) marks the positions at which the amber codon TAG was introduced individually for Uaa incorporation.
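The ten single-site constructs implied by the mNb6 scan can be enumerated as follows. This is a sketch under stated assumptions: "U" is used as the amber placeholder by analogy with SEQ ID NO:1, the variable names are hypothetical, and the printed spaces are treated as line-wrap artifacts.

```python
# SEQ ID NO:8 (mNb6) as printed above, with the line-wrap spaces removed.
MNB6 = (
    "QVQLVESGGGLVQAGGSLRLSCAASGYIFGRNAMGWYRQAPGKERELVAGITRRGSI"
    "TYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADPASPAYGDYWGQGT"
    "QVTVSSHHHHHH"
)

FIRST_SITE, LAST_SITE = 50, 59  # 1-based, inclusive scan window

# Residues covered by the scan (convert 1-based sites to a 0-based slice).
window = MNB6[FIRST_SITE - 1:LAST_SITE]

# One construct per site: the native residue is replaced by the amber placeholder "U".
scan = {
    site: MNB6[:site - 1] + "U" + MNB6[site:]
    for site in range(FIRST_SITE, LAST_SITE + 1)
}

for site in scan:
    print(f"mNb6 ({site}TAG): replaces {MNB6[site - 1]}{site}")
```

The window extracted this way reads GITRRGSITY, and the construct for site 54 replaces an arginine, which is the position reported above to give cross-linking with PFK.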
- pBad-EGFP (182TAG) was co-transformed with pEVOL-mFSYRS or pEVOL-NpYRS into DH10β E. coli chemically competent cells and plated on an LB agar plate supplemented with 100 μg/mL ampicillin and 34 μg/mL chloramphenicol. A single colony was picked, inoculated into 2 mL 2xYT-Amp100Cm34, and grown at 37 °C. When OD600 reached 0.6-0.8, 500 μL of cell culture was supplemented with 1 mM or 2 mM PFY and 0.2% arabinose, then incubated at 25 °C for 16 h.
- For the incorporation of PFK into mNB6 (54TAG), pBad-mNB6 (54TAG) was co-transformed with pEvol-PFKRS into DH10β E. coli competent cells. After transformation, a single colony was inoculated into 2 mL of 2xYT-Amp100Cm34 and grown at 37 °C for 8 h. Then, 0.6 mL of cell culture was diluted into 30 mL 2xYT medium and agitated vigorously at 37 °C. When OD600 reached 0.8-1.0, 0.2% arabinose was added to the culture with or without 2 mM PFY or PFK, and expression was carried out at 25 °C for 24 h.
- HEK-293T cells were seeded at 4 × 10^5 cells per well in a 6-well cell culture dish containing 2 mL of DMEM medium with 10% FBS and 1% P/S, and grown at 37 °C in a CO2 incubator overnight.
- the plasmid pMP-PFYRS (1.5 μg) was co-transfected with pcDNA 3.1-EGFP (182TAG) (1.5 μg) into target cells with 9 μL polyethyleneimine (1 mg/mL) in 2 mL DMEM medium.
- the cells were treated with or without 2 mM PFY.
- HEK-293T cells were seeded at 2 × 10^5 cells per well in a 12-well cell culture dish containing 1 mL of DMEM medium with 10% FBS and 1% P/S, and grown at 37 °C in a CO2 incubator overnight.
- the plasmid pMP-PFYRS (0.75 μg) was co-transfected with pcDNA 3.1-EGFP (182TAG) (0.75 μg) into target cells with 4.5 μL polyethyleneimine (1 mg/mL) in 1 mL DMEM medium.
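The 6-well and 12-well HEK-293T protocols above appear to be the same recipe at two scales. The sketch below checks that reading by computing the seeding density and the PEI:DNA mass ratio for each format; the `WellSetup` class and its field names are hypothetical, and the only numbers used are those stated in the text.

```python
from dataclasses import dataclass

PEI_UG_PER_UL = 1.0  # a 1 mg/mL PEI stock contains 1 ug of PEI per uL


@dataclass(frozen=True)
class WellSetup:
    cells_seeded: float  # cells per well
    medium_ml: float     # DMEM volume per well (mL)
    dna_ug: float        # total plasmid DNA per well (ug)
    pei_ul: float        # PEI stock volume per well (uL)

    @property
    def cells_per_ml(self) -> float:
        return self.cells_seeded / self.medium_ml

    @property
    def pei_to_dna_ratio(self) -> float:
        """PEI:DNA mass ratio (w/w)."""
        return self.pei_ul * PEI_UG_PER_UL / self.dna_ug


six_well = WellSetup(cells_seeded=4e5, medium_ml=2.0, dna_ug=1.5 + 1.5, pei_ul=9.0)
twelve_well = WellSetup(cells_seeded=2e5, medium_ml=1.0, dna_ug=0.75 + 0.75, pei_ul=4.5)

for name, well in (("6-well", six_well), ("12-well", twelve_well)):
    print(f"{name}: {well.cells_per_ml:.0f} cells/mL, PEI:DNA = {well.pei_to_dna_ratio:.1f}:1 (w/w)")
```

Both formats work out to 2 × 10^5 cells/mL and a 3:1 PEI:DNA mass ratio, so the 12-well quantities are simply the 6-well quantities halved.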
- plasmid pNEU-PFKRS (0.8 μg) was transfected into HeLa-GFP(182TAG) reporter cells using Lipofectamine 3000 transfection reagent, and the cells were treated with or without 1 mM PFK and incubated for an additional 24 h. After incubation, the cells were washed with PBS and analyzed on a BD LSRFortessa™ cell analyzer.
- MBP-Z (24PFY) and Affibody (4A7X) [0456] Purified 24 μM Affibody (4A7H/Y/K/C) was incubated with 6 μM MBP-Z (24PFY) in 6 μL PBS buffer; various concentrations of Na2SiO3 (final conc. 0, 1, 2, 5, 10 mM) were added, and the mixture was incubated at 37 °C for 16 h. Then 10 μL of 4x Laemmli sample buffer containing 100 mM DTT was added to the reaction mixture, followed by incubation at r.t. for 1 h. The samples were separated by SDS-PAGE.
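The quantities in the cross-linking reaction above can be turned into amounts with a short calculation. This is a rough sketch that rests on two assumptions not stated in the text: that 24 μM and 6 μM are final concentrations in the 6 μL reaction, and that the volume of added Na2SiO3 is negligible. The function name `picomoles` is hypothetical.

```python
def picomoles(conc_um: float, volume_ul: float) -> float:
    """1 uM equals 1 pmol/uL, so concentration (uM) x volume (uL) gives pmol."""
    return conc_um * volume_ul


REACTION_UL = 6.0        # PBS reaction volume from the text
SAMPLE_BUFFER_UL = 10.0  # 4x Laemmli buffer volume from the text
DTT_STOCK_MM = 100.0     # DTT concentration in the sample buffer

# Assumption: both protein concentrations are final values in the 6 uL reaction.
affibody_pmol = picomoles(24.0, REACTION_UL)
mbp_z_pmol = picomoles(6.0, REACTION_UL)
molar_excess = affibody_pmol / mbp_z_pmol

# Assumption: silicate volume is negligible, so the final volume is 16 uL.
dtt_final_mm = DTT_STOCK_MM * SAMPLE_BUFFER_UL / (REACTION_UL + SAMPLE_BUFFER_UL)

print(f"Affibody: {affibody_pmol:.0f} pmol, MBP-Z: {mbp_z_pmol:.0f} pmol")
print(f"Affibody molar excess: {molar_excess:.0f}x")
print(f"DTT after adding sample buffer: {dtt_final_mm:.1f} mM")
```

Under these assumptions the affibody is in 4-fold molar excess over MBP-Z, the same ratio as the 24 μM to 6 μM pairing used in Example 10.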
Abstract
Described herein are, inter alia, unnatural amino acids, proteins comprising unnatural amino acids, protein conjugates, and methods of making the unnatural amino acids, proteins, and protein conjugates. In embodiments, the unnatural amino acids are compounds having the following formula, or a stereoisomer thereof: wherein Z is sulfur or phosphorus and the remaining substituents are as described in the description.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363608411P | 2023-12-11 | 2023-12-11 | |
| US63/608,411 | 2023-12-11 | | |
| US202363615691P | 2023-12-28 | 2023-12-28 | |
| US63/615,691 | 2023-12-28 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025128629A1 (fr) | 2025-06-19 |
Family
ID=96058356
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/059463 (Pending; WO2025128629A1 (fr)) | Unnatural amino acids, bioreactive proteins, and uses thereof | 2023-12-11 | 2024-12-11 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025128629A1 (fr) |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190359965A1 (en) * | 2017-02-13 | 2019-11-28 | The Regents Of The University Of California | Site-specific generation of phosphorylated tyrosines in proteins |
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190359965A1 (en) * | 2017-02-13 | 2019-11-28 | The Regents Of The University Of California | Site-specific generation of phosphorylated tyrosines in proteins |
| US20220411778A1 (en) * | 2017-02-13 | 2022-12-29 | The Regents Of The University Of California | Mutant Aminoacyl tRNA Synthetase |
Non-Patent Citations (3)
| Title |
|---|
| GHADIMI SAIED, EBRAHIMI VALMOOZI ALI ASGHAR, POURAYOUBI MEHRDAD, ASAD SAMANI KEYVAN: "Structure-activity study of phosphoramido acid esters as acetylcholinesterase inhibitors", JOURNAL OF ENZYME INHIBITION AND MEDICINAL CHEMISTRY, INFORMA HEALTHCARE, GB, vol. 23, no. 4, 1 January 2008 (2008-01-01), pages 556 - 561, XP093329482, ISSN: 1475-6366, DOI: 10.1080/14756360701731981 * |
| STEFAN WAGNER; MATTEO ACCORSI; JÖRG RADEMANN: "Benzyl Mono-P-Fluorophosphonate and Benzyl Penta-P-Fluorophosphate Anions Are Physiologically Stable Phosphotyrosine Mimetics and Inhibitors of Protein Tyrosine Phosphatases", CHEMISTRY - A EUROPEAN JOURNAL, JOHN WILEY & SONS, INC, DE, vol. 23, no. 61, 11 October 2017 (2017-10-11), pages 15387 - 15395, XP071844625, ISSN: 0947-6539, DOI: 10.1002/chem.201701204 * |
| XIAOZHOU LUO, FU GUANGSEN, WANG RONGSHENG E, ZHU XUEYONG, ZAMBALDO CLAUDIO, LIU RENHE, LIU TAO, LYU XIAOXUAN, DU JINTANG, XUAN WEI: "Genetically encoding phosphotyrosine and its nonhydrolyzable analog in bacteria", NATURE CHEMICAL BIOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 13, no. 8, 1 August 2017 (2017-08-01), New York, pages 845 - 849, XP055555583, ISSN: 1552-4450, DOI: 10.1038/nchembio.2405 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24904765; Country of ref document: EP; Kind code of ref document: A1 |