M&C PE963057WO

COVALENT TAG

FIELD

The present disclosure concerns a compound of formula (I) and/or a tagged compound of formula (IA), each of which is capable of binding to a complementary hole-modified mutant bromodomain, also presented herein. The disclosure also concerns the use of the compounds in various applications, a method of covalently binding a target biomolecule with a compound or tagged compound, and a kit comprising: a compound or tagged compound, and a construct comprising a nucleic acid encoding a complementary hole-modified mutant bromodomain.

BACKGROUND

A protein tag is a polypeptide sequence that is grafted to a protein to impart functionality to the protein, such as facilitating or enabling the purification, identification, visualisation, quantification, solubilisation, and/or binding of the protein. Protein tags can also enable a protein to act as a capture agent in an assay or other technique to analyse a target molecule or biomolecule. Typically, the grafted protein tag is able to interact with another molecule (often referred to as a ligand) or entity to enable such functionality; a ligand may be specifically designed to interact with the protein tag. Recently, protein tags have found use in chemical-induced proximity (CIP), such as targeted protein degradation (TPD). These techniques may involve first “tagging” a protein with a protein tag (said tagging performed either endogenously or post-translationally), then contacting the tagged protein with a heterobifunctional molecule. In TPD, the heterobifunctional molecule may then recruit an E3 ubiquitin ligase to the tagged protein, promoting the protein’s ubiquitination and subsequent degradation.

An example of a protein tag is BromoTag, as described in WO 2023/047121 A1.
BromoTag is a hole-modified mutant Brd4 bromodomain, namely Brd4BD2L387A, which has a ‘hole’ in the binding site to enable highly specific and potent binding of a complementary ‘bumped’ ligand (the ‘bump-and-hole’ method). The bumped ligand may be comprised in or incorporated into a heterobifunctional molecule that comprises a second ligand to recruit an E3 ubiquitin ligase, the second ligand joined to the bumped ligand through a linker, thereby enabling TPD.

The binding of the bumped ligand to BromoTag occurs through non-covalent intermolecular interactions. Covalent protein tags, i.e., protein tags where the ligand interacts with the tag through the formation of a covalent bond, have also been developed. As covalent bond formation, in this context, is irreversible, covalent labels have an advantage in that they are wash resistant, enabling applications such as lysate
studies, gel-based assays, live-cell fluorescence imaging, and pull-down assays, for example. An example of a covalent protein tag is HaloTag, which comprises two reactive amino acid residues, namely aspartic acid and histidine, which can react with a haloalkane ligand. HaloTag suffers from being a particularly large tag (33 kDa); this, amongst other properties of HaloTag, makes it incompatible with many proteins. Additionally, the haloalkane ‘warhead’ of the ligand is relatively generic, resulting in a lack of specificity.

A further example of a covalent protein tag is SNAP-tag, which is an O6-alkylguanine-DNA alkyltransferase (AGT) engineered to react with O6-benzylguanine derivatives through a cysteine residue on the tag. Drawbacks to SNAP-tag include relatively slow binding kinetics, and a relatively large size (19 kDa), again resulting in incompatibility with some proteins.

An additional example of a covalent protein tag is SpyCatcher, which is an engineered CnaB2 domain of the FbaB protein from Streptococcus pyogenes. A lysine residue on SpyCatcher can react with an aspartic acid residue on the small peptide ligand SpyTag, engineered from the same protein, forming a new peptide bond. While SpyCatcher/SpyTag is useful for lysine targeting, the reliance on peptide bond formation results in slow reaction kinetics. Additionally, the ubiquity of lysines in proteins can result in limited selectivity for the SpyTag ligand.

There is a need in the art for alternative covalent protein tags, preferably covalent tags that are small (to improve compatibility), have fast reaction kinetics with a complementary ligand, and/or are highly specific to the complementary ligand. The present disclosure seeks to address one or more of these issues.

SUMMARY

The present investigators have adapted the bump-and-hole system of BromoTag to develop a covalent version of the tag.
Through clever design and extensive testing, the investigators have developed mutant bromodomains, presented herein, that are able to covalently bind to a series of newly developed complementary ligands, also presented herein, with unexpectedly fast kinetics and/or high specificity. The ligands are derived from the non-covalent BromoTag ligands, but comprise an electrophilic moiety strategically positioned on the ligand structure to give rise to covalent binding to a complementary mutant bromodomain, thereby improving binding specificity and kinetics.

Therefore, in a first aspect, there is provided a compound of formula (I):
(I), wherein
G is a 5- or 6-membered heteroarene or 6-membered arene, optionally substituted with one or two substituents selected from the group consisting of methyl, halo, hydroxy, thiol, halomethyl, amino, methoxy, methylamino, dimethylamino, ethyl, haloethyl, amido, isopropyl, tert-butyl, and methylthio;

E1 is selected from C1-4alkyl, C2-4alkenyl, C1-4haloalkyl, C2-4haloalkenyl, and an electrophile selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-4alkyl, OC(O)C2-4alkenyl, OC(O)C2-4alkynyl, OC(O)epoxidyl, OC(O)aziridinyl, C(O)OC1-4alkyl, C(O)OC2-4alkenyl, C(O)OC2-4alkynyl, C(O)O-epoxidyl, C(O)O-aziridinyl, N(Ra)CO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), epoxidyl, aziridinyl, episulfidyl, azacyclopropyl, cyclopropenonyl, C1-4alkylcyclopropenonyl, C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2, C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl), each of which is optionally substituted with one or more substituents selected from halo, cyano and amino;

E2, E3 and E4 are each independently selected from H, R3, and an electrophile selected from C2-4haloalkenyl, C1-4alkanal (such as C2-4alkanal), OC(O)C1-4alkyl, OC(O)C2-4alkenyl, OC(O)C2-4alkynyl, OC(O)epoxidyl, OC(O)aziridinyl, C(O)OC1-4alkyl, C(O)OC2-4alkenyl, C(O)OC2-4alkynyl, C(O)O-epoxidyl, C(O)O-aziridinyl, NRaCO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), epoxidyl, aziridinyl, episulfidyl, azacyclopropyl, cyclopropenonyl, C1-4alkylcyclopropenonyl, C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2,
C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl), each of which is optionally substituted with one or more substituents selected from halo, cyano and amino;
R1 is any one selected from the group consisting of C1-4alkyl, C1-4haloalkyl, H and halo;

R2 is H, C1-3alkyl, C1-3haloalkyl or halo;

R3 is independently selected from halo, hydroxyl, thiol, amido, NR4R5, C(O)NR4R5, C1-6alkyl, C1-6haloalkyl, C1-6alkoxy and C1-6alkylthio;

R4 and R5 are independently selected from H and C1-3alkyl;

Ra is H, C1-3alkyl or C1-3haloalkyl; and

D is a reactive group,

wherein at least one of E1 to E4 is an electrophile.

The ligands of the first aspect may be functionalised so that, when the ligand binds to a complementary mutant bromodomain, it is able to confer or enable some functionality to a molecule, for example a protein, tagged by the complementary mutant bromodomain. Functionalised ligands of a second aspect (described below) are referred to herein as “tagged compounds”. In these, the functionality arises from a molecule that is linked to the ligand by a linker molecule. The present investigators have demonstrated many functionalities (i.e., molecules) that can be linked to the ligands of the first aspect through a linker.

Therefore, in a second aspect, there is provided a tagged compound of formula (IA):
(IA), wherein G, E1, E2, E3, E4, R1, and R2 are as defined in the first aspect of the invention, D’ is the product of a reactive group, D, with a pro-linker to form D’-L or D’-L-B, L is a molecule capable of linking D’ to B and B is a molecule.

As described above, the investigators have designed and developed hole-modified mutant bromodomains that are capable of covalently binding complementary ligands of the first and second aspects. That is to say, the compounds of the first and second aspect of the disclosure and the mutant bromodomains disclosed herein are interrelated: they form a ‘ligand-mutant protein pair’. The mutant bromodomains are particularly small protein tags (~13 or ~12 kDa in size, e.g. ~13 kDa in size) and thus have wide compatibility.
Bromodomains are protein domains found in several proteins, notably the Bromo- and Extra-terminal domain (BET) family of proteins. The BET family comprises Brd2, Brd3, Brd4, and BrdT; each protein comprises two bromodomains, namely BD1 and BD2. The present investigators have found that there is high homology between the bromodomains of several bromodomain-containing proteins, namely Brd2, Brd3, Brd4, and BrdT. That is to say, there are conserved amino acid residues between each BD1 and BD2 of Brd2, Brd3, Brd4, and BrdT.

The amino acid sequences SEQ ID No. 1-4 relate to the wild type Bromo- and Extra-terminal domain (BET) family of proteins.

SEQ ID No. 1 refers to bromodomain-containing protein 4, or Brd4. Amino acid residues 351 to 460 of SEQ ID No. 1 describe bromodomain 2 of Brd4, or Brd4BD2. Amino acid residues 60 to 167 of SEQ ID No. 1 describe bromodomain 1 of Brd4, or Brd4BD1. Hence, sequences under part (i) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd4BD2. Similarly, sequences under part (ii) of the third aspect refer to hole-modified Brd4BD1 bromodomains.

SEQ ID No. 2 refers to bromodomain-containing protein 2, or Brd2. Amino acid residues 347 to 456 of SEQ ID No. 2 describe bromodomain 2 of Brd2, or Brd2BD2. Amino acid residues 76 to 183 of SEQ ID No. 2 describe bromodomain 1 of Brd2, or Brd2BD1. Hence, sequences under part (iii) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd2BD2. Similarly, sequences under part (iv) of the third aspect refer to hole-modified Brd2BD1 bromodomains.

SEQ ID No. 3 refers to bromodomain-containing protein 3, or Brd3. Amino acid residues 309 to 416 of SEQ ID No. 3 describe bromodomain 2 of Brd3, or Brd3BD2. Amino acid residues 36 to 143 of SEQ ID No. 3 describe bromodomain 1 of Brd3, or Brd3BD1.
Hence, sequences under part (v) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd3BD2. Similarly, sequences under part (vi) of the third aspect refer to hole-modified Brd3BD1 bromodomains.

SEQ ID No. 4 refers to bromodomain testis-specific protein, or BrdT. Amino acid residues 270 to 379 of SEQ ID No. 4 describe bromodomain 2 of BrdT, or BrdTBD2. Amino acid residues 29 to 136 of SEQ ID No. 4 describe bromodomain 1 of BrdT, or BrdTBD1. Hence, sequences under part (vii) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is BrdTBD2. Similarly, sequences under part (viii) of the third aspect refer to hole-modified BrdTBD1 bromodomains.

Therefore, in a third aspect, there is provided a hole-modified mutant bromodomain, which comprises an amino acid sequence selected from residues:

(i) 351 to 460 of SEQ ID No. 1, and:
(a) the mutation L387C;
(b) the mutations L387A or L387V, and E438C; and/or
(c) the mutations L387A or L387V, and M442C;

(ii) 60 to 167 of SEQ ID No. 1, and:
(a) the mutation L94C;
(b) the mutations L94A or L94V, and D145C; and/or
(c) the mutations L94A or L94V, and M149C;

(iii) 347 to 456 of SEQ ID No. 2, and:
(a) the mutation L383C;
(b) the mutations L383A or L383V, and D434C; and/or
(c) the mutations L383A or L383V, and M438C;

(iv) 76 to 183 of SEQ ID No. 2, and:
(a) the mutation L110C;
(b) the mutations L110A or L110V, and D161C; and/or
(c) the mutations L110A or L110V, and M165C;

(v) 309 to 416 of SEQ ID No. 3, and:
(a) the mutation L345C;
(b) the mutations L345A or L345V, and E396C; and/or
(c) the mutations L345A or L345V, and M400C;

(vi) 36 to 143 of SEQ ID No. 3, and:
(a) the mutation L70C;
(b) the mutations L70A or L70V, and D121C; and/or
(c) the mutations L70A or L70V, and M125C;

(vii) 270 to 379 of SEQ ID No. 4, and:
(a) the mutation L306C;
(b) the mutations L306A or L306V, and E357C; and/or
(c) the mutations L306A or L306V, and M361C;

(viii) 29 to 136 of SEQ ID No. 4, and:
(a) the mutation L63C;
(b) the mutations L63A or L63V, and D114C; and/or
(c) the mutations L63A or L63V, and M118C.

In a fourth aspect, there is provided a nucleic acid encoding a hole-modified mutant bromodomain as defined in the third aspect.

The present investigators have recognised and shown that the complementary mutant bromodomains and tagged compounds provide the foundation for a functional protein tag system, which may find wide utility in the chemical and biological sciences.
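By way of illustration only, the residue numbering used in the third aspect can be sketched in Python. Mutations such as L387A are numbered against the full-length protein, so a position must be offset by the first residue of the domain (351 for part (i)) before it indexes into the domain sequence. The function name `apply_mutations` and the placeholder sequence are hypothetical; the placeholder is not SEQ ID No. 1 and merely has the right length and the wild-type residues named in part (i).

```python
import re

def apply_mutations(domain_seq, first_residue, mutations):
    """Apply point mutations such as "L387A" to a domain sequence.

    domain_seq    -- one-letter amino acid sequence of the domain only
    first_residue -- full-length protein number of the first domain residue
    mutations     -- iterable of strings of the form <wild type><position><new>
    """
    residues = list(domain_seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        index = position - first_residue  # convert to a 0-based domain index
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the domain")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)

# Hypothetical placeholder for residues 351 to 460 (110 residues): glycine
# everywhere except the three wild-type positions named in part (i).
FIRST, LAST = 351, 460
placeholder = ["G"] * (LAST - FIRST + 1)
for position, residue in ((387, "L"), (438, "E"), (442, "M")):
    placeholder[position - FIRST] = residue
placeholder = "".join(placeholder)

# Part (i)(b): L387A together with E438C.
mutant = apply_mutations(placeholder, FIRST, ["L387A", "E438C"])
```

Checking the wild-type residue before substitution guards against applying a mutation with the wrong numbering scheme, for example Brd2 numbering on a Brd4 sequence.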
The investigators have demonstrated, without limitation, said utility in a variety of uses of the protein tag system presented herein.

Therefore, in a fifth aspect, there is provided a use of a tagged compound according to the second aspect in, for example, immunoassays, diagnostics, photocaging, immobilisation, solubilisation, targeted protein degradation, fluorescence resonance energy transfer, time-resolved fluorescence resonance energy transfer, induced proximity, fluorescence imaging, assays, sensing, and detection, wherein the sensing and detection is of a biomolecule or bioactive molecule.

Additionally, in a sixth aspect, there is provided a method of covalently binding a target biomolecule with a compound, the method comprising contacting the biomolecule with the compound, wherein: the compound is according to the first or second aspect, and the biomolecule comprises a hole-modified mutant bromodomain engineered to comprise a cysteine residue, positioned such that contacting the biomolecule with the compound allows the nucleophilic group of the cysteine residue to interact with the electrophile of the compound and form a covalent bond.

The present investigators have also recognised that the complementary mutant bromodomains and compounds can be provided in a kit that contains the necessary components to make use of the presently disclosed protein tag system.

Therefore, in a seventh aspect, there is provided a kit comprising, as separate components: (i) a compound according to the first or second aspect; and (ii) a nucleic acid, or construct comprising a nucleic acid encoding a hole-modified mutant bromodomain.

BRIEF DESCRIPTION OF THE FIGURES

Fig. 1: The broad principle of the system disclosed herein, showing tagged compounds (where “D” corresponds to D or D’-L-B herein) and the different proposed mutant bromodomains. Three mutants (M) were designed: M1 corresponds to Brd4BD2L387C, M2 corresponds to Brd4BD2L387A,E438C, M3 is Brd4BD2L387A,M442

Fig. 2: Computational docking results using the co-crystal of ET-JQ1-OMe and Brd2BD2L383V,D434C (prepared by computational mutagenesis of 6YTM). Brd2BD2L383V,D434C was used as a homologous model of Brd4BD2L387A,E438C. A representative example of computational docking using the chloroacetamide derivative of ET-JQ1-OMe (MR112) is depicted (“REVERSIBLE DOCKING”) (“COVALENT DOCKING”). D434 in Brd2BD2 corresponds to E438 in Brd4BD2, and L383 in Brd2BD2 is L387 in Brd4BD2.
Fig. 3: (A) Structure of biotin functionalised covalent ligand MR169; (B) Schematic representation of a tagged compound comprising biotin binding to a mutant bromodomain, thus binding to streptavidin; M = mutant; E = covalent ligand; (C) Sample separation by SDS-PAGE gel; the ratio of protein to ligand to streptavidin was 1:1.5:10.

Fig. 4: (A) Schematic representation of a tagged compound comprising an ‘always-on’ fluorescent label binding to a mutant bromodomain; M = mutant; F = fluorophore; E = covalent ligand; (B) the structure of tagged compound MR175, which comprises the fluorescent label Cyanine 5; (C) protein samples separated by gel electrophoresis; the fluorescence gel was imaged with an excitation wavelength of 645 nm and an emission wavelength of 665 nm.

Fig. 5: Relative fluorescence units (RFU) for Janelia Fluor® 635-tagged compound C10852L bound to Brd4BD2 mutants. Brd4BD2WT is used as a negative control, and 4% SDS is used as a positive control; the fluorescence gel was imaged with an excitation wavelength of 627 nm and an emission range of 660-720 nm.

Fig. 6: (A) Schematic representation of a tagged compound comprising an activatable fluorescent label binding to a mutant bromodomain; the fluorescence is activated upon binding; M = mutant; F = fluorophore; E = covalent ligand; (B) the structure of Janelia Fluor® 635-tagged compound C10852L; (C) protein samples separated by electrophoresis; the ligand C10852L is silent until bound to the short or long mutants of Brd4BD2L387A-E438C; the fluorescence gel was imaged with an excitation range of 625-650 nm and an emission range of 675-725 nm. The SDS gel serves to fully activate the fluorophore in the gel.

Fig. 7: (A) Schematic representation of the attachment of a fluorophore to a mutant bromodomain covalently bound to a tagged compound, wherein the tagged compound comprises a reactive alkyne handle, and the attachment is carried out using click chemistry, hence a click chemistry kit; MUT = mutant; F = fluorophore; E = covalent ligand; (B) components of the kit, including a fluorophore azide (Cy5 or BODIPY (“BDP”)) and the acrylamide covalent ligand MR155, which comprises a terminal alkyne as a click chemistry/reactive handle; (C) protein samples separated by electrophoresis, after mutant binding followed by click chemistry reaction.

Fig. 8: (A) Quantitative live-cell degradation kinetics of an endogenously tagged HiBiT-Brd4BD2L387A,E438C-Brd4 (also referred to herein as BromoCatch or BrdBD2L387AE438C) in HEK293 cells following treatment with DMSO and a 4-fold serial dilution of MZ1, AGB1, MR156, and MR170 over a range of 600 pM to 50 µM; RLU = relative light units; (B) summary table of degradation rate and Dmax values for all four test compounds; (C) summary table of DC50 at 9 h of treatment for each treated compound derived from the kinetic degradation profiles in (A).

Fig. 9: Differential scanning fluorimetry (DSF) of acrylamide ligand MR116 co-incubated with a Brd2 mutant, Brd2BD2L383V,D434C (Brd4BD2L387A,E438C analogue used for crystallisation work) to evaluate ligand-mutant protein pair stabilisation.

Fig. 10: Cocrystal structure of bromodomain covalently modified with ligand MR116. (A) Overlay of the Brd2BD2L383V/ET-JQ1-OMe (6YTM) and the Brd2BD2L383A,D434C/MR116 co-crystal structures (RMSDcs 0.43); (B) 2Fo-Fc electron density map around the covalent ligand in the Brd2BD2L383A,D434C/MR116 co-crystal, where continuous density can be observed around the area of the covalent bond between the cysteine residue and the ligand; (C) 2Fo-Fc electron density map zoomed in around the covalent bond. The density around the cysteine suggests two conformations of the thioether bond.

Fig. 11: (A) Probe MR202 structure; (B) In-gel assay with purified proteins Brd4BD2L387A,E438C (BromoCatch), Brd4BD2WT, or Brd4BD2L387A (BromoTag); (C) HEK293FT H2B-BromoCatch cell lysate experiment; MR202 specifically detects H2B-BromoCatch at 1 µM; the probe showed no unspecific binding in HEK293FT WT cells when incubated with 2.5 µM of the probe.

Fig. 12: In vitro and cellular validation of the functionalized MR116 Janelia-635 covalent probe. (A) Schematic of C10852S and “switch-on” binding mode; (B) Relative fluorescence intensity of C10852S in the presence of several species; (C) Fluorescence spectrum of C10852S (100 µM) in the presence of Brd4BD2WT vs. Brd4BD2L387A,E438C; (D) SDS gel; (E) Live cell confocal imaging of H2B-BromoCatch transiently transfected C10852S treated U2OS cells. 8-hour treatment of 100 nM C10852S or C10852S co-treatment with 25 µM ET-JQ1-OMe. Hoechst 33342 nuclear counterstain and JF-635 fluorescence detected on separate channels. N = 2 independent repeats; (F) Live cell confocal imaging of H2B-BromoCatch transiently transfected C10852S treated HEK293FT cells. 6-hour treatment of 100 nM C10852S or C10852S co-treatment with 25 µM ET-JQ1-OMe. Hoechst 33342 nuclear counterstain and JF-635 fluorescence detected. N = 3 independent repeats.

Fig. 13: Comparative experiment to evaluate the fluorogenicity of different Janelia Fluor 635 based probes based on the MR116 ligand and bearing different linkers (aliphatic or PEG based). Relative fluorescence intensity of C10852K, C10852L, C10852M, C10852N, C10852S, C10852T and C10852U in the presence of the species specified (at a 2.5-fold excess of probe) along the x-axis. Maximal fluorescence intensity is achieved with both long and short Brd4BD2L387A,E438C mutants and results demonstrate poor binding to Brd4BD2WT (low levels of fluorescence emission).
Fig. 14: (A) Live cell confocal imaging of H2B-Brd4BD2L387A,E438C in U2OS cells, 2 hour treatment at 200 nM with the listed compounds. Hoechst 33342 nuclear counterstain and JF-635 fluorescence detected on separate channels; (B) depicts a similar experiment with a 1 hour pre-incubation step with 20 µM MR116.

Fig. 15: Comparative experiment to evaluate the fluorogenicity of different Janelia Fluor 635 based probes where the acrylamide warhead MR116 was replaced by the fluoroacrylamide analogue MR135. The acrylamide based probes C10852L, C10852S, C10852T, were compared to their fluoroacrylamide analogues C10852AD, C10852AC, C10852AA, respectively, and an additional fluoroacrylamide with a PEG linker was also included (C10852AD). The relative fluorescence in the presence of the species specified along the x-axis demonstrates the effective fluorogenic response of all tested compounds in the presence of the short Brd4BD2L387A,E438C when compared to the corresponding negative controls (Brd4BD2WT). Solutions of the probe in 0.1% TFA in ethanol or 4% SDS were used as positive controls for full fluorophore switch on. TFA = trifluoroacetic acid and SDS = sodium dodecyl sulfate.

Fig. 16: Comparative experiment to evaluate the fluorogenicity of different Janelia Fluor 646 based probes based on fluoroacrylamide ligand MR135 and bearing different linkers: C10852R has a linker comprising three PEG moieties, C10852AH has a short aliphatic linker, and C10852AI has a linker comprising two PEG moieties. The fluorogenic character of these probes was tested in the presence of the species specified (at a 2.5-fold excess of probe) along the x-axis, including 0.1% TFA in ethanol as a positive control for full fluorophore switch on. TFA = trifluoroacetic acid.

DETAILED DESCRIPTION

Definitions

In the discussion that follows, reference is made to a number of terms, which have the meanings provided below, unless a context indicates to the contrary.
The nomenclature used herein for defining compounds, particularly the compounds according to the invention, is in general based on the rules of the IUPAC organisation for chemical compounds, specifically the “IUPAC Compendium of Chemical Terminology (Gold Book)”. For the avoidance of doubt, if a rule of the IUPAC organisation is in conflict with a definition provided herein, the definition herein is to prevail. Furthermore, if a compound structure is in conflict with the name provided for the structure, the structure is to prevail.

The term “comprising” or variants thereof is to be understood herein to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps,
but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

The term “consisting” or variants thereof is to be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, and the exclusion of any other element, integer or step or group of elements, integers or steps.

The term “arene” defines monocyclic or polycyclic aromatic hydrocarbons, where “aromatic” defines a cyclically conjugated molecular entity with a stability (due to delocalisation) significantly greater than that of a hypothetical localised structure. The Hückel rule is often used in the art to assess aromatic character; monocyclic planar (or almost planar) systems of trigonally (or sometimes digonally) hybridised atoms that contain (4n+2) π-electrons (where n is a non-negative integer) will exhibit aromatic character. The rule is generally limited to n = 0 to 5. Examples of arenes are benzene, naphthalene, indene, fluorene, anthracene, and phenanthrene. Benzene is a 6-membered arene. The term “arylene” is therefore understood to refer to divalent groups derived from arenes by the removal of two hydrogen atoms from any one or two carbon atoms. In some cases, arylene refers to phenylene (derived from benzene), naphthylene, fluorenylene, anthracenylene, and phenanthrenylene.

The term “heteroarene” defines a monocyclic or polycyclic aromatic hydrocarbon comprising one or more heteroatoms. Heteroarene may refer to pyrrole, imidazole, pyrazole, triazole, tetrazole, furan, thiophene, oxazole, isothiazole, thiazole, thiadiazole, pyridine, pyridazine, pyrimidine, pyrazine, triazine, indole, benzimidazole, azaindole, benzofuran, and benzothiophene. Examples of 5-membered heteroarenes include but are not limited to pyrrole, imidazole, pyrazole, triazole, tetrazole, furan, thiophene, oxazole, isothiazole, thiazole, and thiadiazole.
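As a purely illustrative sketch, the electron-count part of the Hückel rule quoted in the definition of “arene” above can be written as a short Python check. The function name `satisfies_huckel` and the `max_n` parameter are assumptions made for this example; the check covers only the (4n+2) count with n limited to 0 to 5, and does not assess planarity, conjugation or hybridisation, which the rule also requires.

```python
def satisfies_huckel(pi_electrons, max_n=5):
    """Return True if pi_electrons equals 4n + 2 for some n in 0..max_n."""
    if pi_electrons < 2:
        return False
    n, remainder = divmod(pi_electrons - 2, 4)
    return remainder == 0 and n <= max_n

# Pi-electron counts for some of the arenes named in the definition.
pi_counts = {"benzene": 6, "naphthalene": 10, "anthracene": 14}
results = {name: satisfies_huckel(count) for name, count in pi_counts.items()}
```

A count of 4 or 8 π-electrons fails the check, as does any count above 22 when `max_n` is left at 5.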
The term “heterocyclylene” defines a divalent group derived from a heterocycle by the removal of two hydrogen atoms from one or two atoms of the heterocycle. Examples of heterocyclylene groups include but are not limited to pyrrolylene, imidazolylene, pyrazolylene, triazolylene, tetrazolylene, furanylene, thiophenylene, oxazolylene, isothiazolylene, thiazolylene, thiadiazolylene, pyridinylene, pyridazinylene, pyrimidinylene, pyrazinylene, triazinylene, indolylene, benzimidazolylene, azaindolylene, benzofuranylene, benzothiophenylene, pyrrolidinylene, pyrrolinylene, tetrahydrofuranylene, tetrahydrothiophenylene, piperidinylene, piperazinylene, tetrahydropyranylene, thianylene, dithianylene, morpholinylene, and thiomorpholinylene.

The term “alkyl” is well known in the art and defines univalent groups derived from alkanes by removal of a hydrogen atom from any carbon atom, wherein the term “alkane” is intended to define acyclic branched or unbranched hydrocarbons having the
general formula CnH2n+2, wherein n is an integer ≥1. Alkyl groups may be C1-6alkyl groups. In some cases, alkyl groups are C1-4alkyl groups, or C1-3alkyl groups. C1-4alkyl refers to any selected from the group consisting of methyl, ethyl, n-propyl, iso-propyl, n-butyl, sec-butyl, iso-butyl and tert-butyl. C1-3alkyl refers to any selected from the group consisting of methyl, ethyl, n-propyl, and iso-propyl.

The term “cycloalkyl” refers to a univalent alkyl group derived from a cyclic alkane by removal of a hydrogen atom. Cycloalkyl groups may comprise one or more rings, and include fused or spiro-cyclic groups. Cycloalkyl groups may be C6-10cycloalkyl groups. C6-10cycloalkyl may refer to, but is not limited to, cyclohexyl, cycloheptyl, cyclooctyl, cyclononyl, and cyclodecyl. The term “cycloalkylene” therefore refers to divalent groups derived from a cyclic alkane by removal of two hydrogen atoms, and, for example, includes cyclohexylene, cycloheptylene, etc.

The term “halo” refers to a halogen radical. Typically, halo refers to any selected from fluoro, bromo, chloro and iodo. In some cases, halo refers to fluoro.

The term “haloalkyl” is also well known and defines univalent groups derived from alkyl groups by replacement of one or more hydrogen atoms from one or more carbon atoms with a halo group. Haloalkyl groups may comprise one or more different types of halo, for example, one or more independently selected from fluoro, chloro, bromo and iodo. In some cases, the haloalkyl is a fluoroalkyl. Haloalkyl groups may be C1-6haloalkyl groups. In some cases, haloalkyl groups are C1-6haloalkyl groups, C1-4haloalkyl groups, C1-3haloalkyl groups, or C1-2haloalkyl groups. Non-limiting examples of haloalkyl groups are trifluoromethyl, trifluoroethyl, perfluoroethyl, chloroethyl, bromoethyl, iodoethyl, chlorofluoroethyl, bromofluoroethyl, and iodofluoroethyl.
The term “alkenyl” defines univalent groups derived from alkenes by removal of a hydrogen atom from any carbon atom, wherein the term “alkene” is intended to define acyclic branched or unbranched hydrocarbons having one carbon-carbon double bond and the general formula CnH2n, where n is an integer ≥2. C2-4alkenyl refers to any one selected from the group consisting of ethenyl, prop-1-enyl, prop-2-enyl, 1-methyl-ethenyl, but-1-enyl, but-2-enyl, but-3-enyl, 1-methyl-prop-1-enyl, 1-methyl-prop-2-enyl, 2-methyl-prop-1-enyl, and 2-methyl-prop-2-enyl. The term “ethenyl” therefore defines the univalent group derived from ethene by removal of a hydrogen atom from any carbon atom, wherein ethene is the simplest alkene, sometimes depicted as H2C=CH2. Ethene may sometimes be referred to as “ethylene”. Ethenyl may sometimes be referred to as “vinyl”.

The term “alkenylene” is well known in the art and defines divalent groups derived from alkenes by removal of two hydrogen atoms from any one or two carbon atoms.
Alkenylene may refer to, but is not limited to, ethenylene, prop-1-enylene, prop-2-enylene, 1-methyl-ethenylene, but-1-enylene, but-2-enylene, 1-methyl-prop-1-enylene, 1-methyl-prop-2-enylene, 2-methyl-prop-1-enylene, 2-methyl-prop-2-enylene, pent-1-enylene, pent-2-enylene, hex-1-enylene, hex-2-enylene, and hex-3-enylene. The term “cycloalkenylene” is therefore understood to refer to divalent groups derived from cycloalkenes by removal of two hydrogen atoms from any one or two carbon atoms. Cycloalkenylene may refer to, but is not limited to, cyclopentenylene, cyclopentadienylene, cyclohexenylene, cycloheptenylene, cyclooctenylene, and cyclooctadienylene.

The term “haloalkenyl” therefore defines univalent groups derived from alkenes by replacement of one or more hydrogen atoms from one or more carbon atoms with a halo group. Haloalkenyl groups may comprise one or more different types of halo, for example, one or more independently selected from fluoro, chloro, bromo and iodo. In some cases, the haloalkenyl is a fluoroalkenyl. Haloalkenyl groups may be C2-4haloalkenyl groups. Non-limiting examples of C2-4haloalkenyl groups are haloethenyl (such as fluoroethenyl, or chloroethenyl), halopropenyl (such as bromopropenyl, or chloropropenyl), and halobutenyl.

The term “alkynyl” defines univalent groups derived from alkynes by removal of a hydrogen atom from any carbon atom, wherein the term “alkyne” is intended to define acyclic branched or unbranched hydrocarbons having one carbon-carbon triple bond and the general formula CnH2n−2, where n is an integer ≥ 2. C2-4alkynyl refers to any one selected from the group consisting of ethynyl, 1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl, 3-butynyl, and 1-methyl-prop-2-ynyl. The term “alkynylene” therefore defines divalent groups derived from alkynes by removal of two hydrogen atoms from any one or two carbon atoms.
Alkynylene may refer to, but is not limited to, ethynylene, 1-propynylene, 2-propynylene, 1-butynylene, 2-butynylene, 1-methyl-but-1-ynylene, 1-pentynylene, 2-pentynylene, 1-hexynylene, 2-hexynylene, and 3-hexynylene.

The term “alkanal” is well known in the art and defines univalent groups derived from aldehydes by removal of a hydrogen atom from any non-carbonyl carbon atom, wherein the term “aldehyde” is intended to define acyclic branched or unbranched hydrocarbons comprising an aldehyde functional group C(O)H, sometimes referred to as a formyl group. Alkanal groups may be C1-4alkanal (such as C2-4alkanal) groups. C1-4alkanal refers to any one selected from the group consisting of methanal (formyl), ethanal, propanal, 2-methylethanal, butanal, 2-ethylethanal, 2-methylpropanal, 3-methylpropanal, and 2,2-dimethylethanal.

The term “alkoxy” defines univalent groups derived from alcohols by removal of a hydrogen atom from an –OH group, wherein the term “alcohol” is intended to define
groups derived from alkanes by the replacement of a hydrogen atom with a hydroxy group. C1-6alkoxy refers to, but is not limited to, methoxy, ethoxy, n-propoxy, iso-propoxy, n-butoxy, sec-butoxy, iso-butoxy, tert-butoxy, pent-1-oxy, pent-2-oxy, pent-3-oxy, neo-pentoxy, hex-1-oxy, hex-2-oxy, and hex-3-oxy. C1-4alkoxy refers to any one selected from the group consisting of methoxy, ethoxy, n-propoxy, iso-propoxy, n-butoxy, sec-butoxy, iso-butoxy and tert-butoxy.

The term “amino” refers herein to an –NR1R2 group, wherein R1 and R2 are independently H or hydrocarbon-derived substituents, such as alkyl, alkenyl, alkynyl, carbocycle, and their substituted counterparts. R1 and R2 may be unsubstituted. In some cases, R1 and R2 are joined together. For example, R1 and R2 may be joined together in such a way that the nitrogen atom of –NR1R2 is part of a ring system. When both R1 and R2 are H, the compound comprising the amino group may be referred to as a primary amine. When only one of R1 and R2 is H, the compound may be referred to as a secondary amine. When neither R1 nor R2 is H, the compound may be referred to as a tertiary amine.

The term “amido” refers herein to a –C(O)NR1R2 group, wherein R1 and R2 are independently H or hydrocarbon-derived substituents, such as alkyl, alkenyl, alkynyl, carbocycle, and their substituted counterparts. R1 and R2 may be unsubstituted. In some cases, R1 and R2 are joined together. For example, R1 and R2 may be joined together in such a way that the nitrogen atom of –C(O)NR1R2 is part of a ring system. When both R1 and R2 are H, the compound comprising the amido group may be referred to as a primary amide. When only one of R1 and R2 is H, the compound may be referred to as a secondary amide. When neither R1 nor R2 is H, the compound may be referred to as a tertiary amide.

The term “hydroxy” is well known in the art and defines the univalent group derived from water by the removal of one hydrogen atom.
Hydroxy groups are often depicted as –OH.

The term “thio” is well known in the art and defines the univalent group derived from hydrogen sulfide by the removal of one hydrogen atom. Thio groups are often depicted as –SH. The term “alkylthio” is therefore understood to refer to an alkyl group wherein one of the hydrogen atoms is replaced with a thio group. Alkylthiols are sometimes referred to as “mercaptans”. C1-6alkylthio refers to, but is not limited to, methylthio, ethylthio, n-propylthio, iso-propylthio, n-butylthio, sec-butylthio, iso-butylthio, tert-butylthio, pentylthio, and hexylthio. C1-4alkylthio refers to any one selected from the group consisting of methylthio, ethylthio, n-propylthio, iso-propylthio, n-butylthio, sec-butylthio, iso-butylthio and tert-butylthio.
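By way of illustration only, the primary, secondary and tertiary naming given above for amino and amido groups depends solely on how many of R1 and R2 are H. The short Python sketch below expresses that rule. The function name `classify_by_hydrogens` and the use of plain strings to stand for substituents are assumptions made for this example and form no part of the disclosure.

```python
def classify_by_hydrogens(r1: str, r2: str) -> str:
    """Classify an -NR1R2 or -C(O)NR1R2 group by counting its H substituents.

    Substituents are passed as plain strings: "H" stands for hydrogen and any
    other string (for example "methyl") stands for a hydrocarbon-derived
    substituent. This string encoding is a hypothetical convenience only.
    """
    n_hydrogen = sum(1 for substituent in (r1, r2) if substituent == "H")
    if n_hydrogen == 2:
        return "primary"    # both R1 and R2 are H
    if n_hydrogen == 1:
        return "secondary"  # exactly one of R1 and R2 is H
    return "tertiary"       # neither R1 nor R2 is H


if __name__ == "__main__":
    for pair in (("H", "H"), ("H", "methyl"), ("methyl", "ethyl")):
        print(pair, classify_by_hydrogens(*pair))
```

The same function serves for both amines and amides because the definitions above apply an identical counting rule to each.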
The term “epoxidyl” defines the univalent group derived from the 3-membered O-heterocycle ethylene oxide (or “oxirane”) by removal of a hydrogen atom from any one carbon atom. Similarly, the term “episulfidyl” defines the univalent group derived from the 3-membered S-heterocycle ethylene sulfide (or “thiirane”) by removal of a hydrogen atom from any one carbon atom. Furthermore, the term “azacyclopropyl” defines the univalent group derived from the 3-membered N-heterocycle azacyclopropane (or “aziridine”) by removal of a hydrogen atom from any one carbon atom. The term azacyclopropyl may be used interchangeably with “aziridinyl”. Additionally, the term “cyclopropenonyl” defines the univalent group derived from the 3-membered singly oxo-substituted carbocycle (cyclopropenone) by removal of a hydrogen atom from any one carbon atom.

The term “cyano” is well known in the art and refers to the –CN group, otherwise represented as –C≡N, or referred to as “nitrile”, derived from hydrogen cyanide by removal of the hydrogen atom.

The term “biological molecule” or “biomolecule” is well known in the art and relates to any molecule present in organisms that is involved in one or more (typically biological) processes. Biological molecules may include but are not limited to proteins, peptides, antibodies, antigens, carbohydrates, lipids, nucleic acids, polynucleotides, vitamins, amino acids, and hormones. Biological molecules may be extracted from their natural source. They may be produced by synthetic or biotechnological means. Biological molecules may be of an unnatural origin, have no known biological purpose, or may not be known to be involved in any biological process. They may be engineered or produced in such a way to differ from their natural counterparts. In some cases, biological molecules are proteins.
The term “binding” is well known in the biochemical arts and relates to intermolecular interactions formed between a ligand (sometimes “substrate” or “binder”) and a biomolecule, to form a complex. The term “complex” is also well known and relates to a stable association between two or more molecules to form a single unit. The ligand molecule may be a small molecule, or it may be a macromolecule, such as a protein. The ligand may be a natural ligand for a biological molecule, or it may be unnatural. The ligand may also be a part of a molecule. That is to say, a ligand molecule may comprise parts that bind and other parts that do not, or parts that bind to particular biological molecules and other parts that bind to other biological molecules. The ligand may only bind to one or a few biological molecules, or it may be promiscuous. Non-covalent binding typically occurs due to one or more non-covalent intermolecular forces, including but not limited to ionic bonding, hydrogen bonding, van der Waals forces, London
dispersion forces, dipole-dipole interactions, ion-dipole interactions, salt bridges, π-π interactions, ion-π interactions, hydrophobic effects, hydrophilic effects and halogen bonding. The term “covalent binding” is therefore understood to relate to covalent intermolecular interactions formed between a ligand and a biomolecule. Covalent binding is often the result of a reaction between a nucleophilic group on a first binding partner, and an electrophilic group on a second binding partner, resulting in shared electrons and the formation of a covalent bond.

As a result of the ligand binding to a biological molecule, the biological molecule may undergo some conformational change, recruit another biological molecule, be prevented from binding another ligand or substrate, or otherwise be affected in one or more ways.

The term “contacting” is used herein to refer to any one or more of the acts of combining, such as reacting, mixing, stirring, slurrying, blending, dissolving, incubating, passing over, flowing over, or otherwise, in any order, and for any length of time.

The term “kit” is used herein to refer to a product containing the different components necessary for making the compounds or carrying out the uses and methods of the present invention. The different components may be provided within packaging so as to allow their transport and storage. The kit may comprise the different components in separate vessels or containers.

For the avoidance of doubt, a wavy line in a chemical structure bisects the bond linking the moiety shown to the rest of the compound.

The term “stereoisomer” is used herein to refer to isomers that possess identical molecular formulae and sequence of bonded atoms, but which differ in the arrangement of their atoms in space.

The term “enantiomer” defines one of a pair of molecular entities that are mirror images of each other and non-superimposable, i.e.
cannot be brought into coincidence by translation and rigid rotation transformations. Enantiomers are chiral molecules, i.e. are distinguishable from their mirror image.

The term “racemic” is used herein to pertain to a racemate. A racemate defines a substantially equimolar mixture of a pair of enantiomers, which typically comprises a pair of enantiomers in a ratio of about 1:1.

The term “diastereoisomers” (also known as diastereomers) defines stereoisomers that are not related as mirror images.

The term “solvate” is used herein to refer to a complex comprising a solute, such as a compound or salt of the compound, and a solvent. If the solvent is water, the solvate
may be termed a hydrate, for example a mono-hydrate, di-hydrate, tri-hydrate etc., depending on the number of water molecules present per molecule of substrate.

The term “isotope” is used herein to define a variant of a particular chemical element, in which the nucleus necessarily has the same atomic number but has a different mass number owing to it possessing a different number of neutrons.

Compounds

As described above, in a first aspect, there is provided a compound of formula (I):
(I),
wherein
G is a 5- or 6-membered heteroarene or 6-membered arene, optionally substituted with one or two substituents selected from the group consisting of methyl, halo, hydroxy, thiol, halomethyl, amino, methoxy, methylamino, dimethylamino, ethyl, haloethyl, amido, isopropyl, tert-butyl, and methylthio;
E1 is selected from C1-4alkyl, C2-4alkenyl, C1-4haloalkyl, C2-4haloalkenyl, and an electrophile selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-4alkyl, OC(O)C2-4alkenyl, OC(O)C2-4alkynyl, OC(O)epoxidyl, OC(O)aziridinyl, C(O)OC1-4alkyl, C(O)OC2-4alkenyl, C(O)OC2-4alkynyl, C(O)O-epoxidyl, C(O)O-aziridinyl, N(Ra)CO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), epoxidyl, aziridinyl, episulfidyl, azacyclopropyl, cyclopropenonyl, C1-4alkylcyclopropenonyl, C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2, C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl), each of which is optionally substituted with one or more substituents selected from halo, cyano and amino;
E2, E3 and E4 are each independently selected from H, R3, and an electrophile selected from C2-4haloalkenyl, C1-4alkanal (such as C2-4alkanal), OC(O)C1-4alkyl, OC(O)C2-4alkenyl, OC(O)C2-4alkynyl, OC(O)epoxidyl, OC(O)aziridinyl, C(O)OC1-4alkyl,
C(O)OC2-4alkenyl, C(O)OC2-4alkynyl, C(O)O-epoxidyl, C(O)O-aziridinyl, NRaCO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), epoxidyl, aziridinyl, episulfidyl, azacyclopropyl, cyclopropenonyl, C1-4alkylcyclopropenonyl, C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2, C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl), each of which is optionally substituted with one or more substituents selected from halo, cyano and amino;
R1 is any one selected from the group consisting of C1-4alkyl, C1-4haloalkyl, H and halo;
R2 is H, C1-3alkyl, C1-3haloalkyl or halo;
R3 is independently selected from halo, hydroxyl, thiol, amido, NR4R5, C(O)NR4R5, C1-6alkyl, C1-6haloalkyl, C1-6alkoxy and C1-6alkylthio;
R4 and R5 are independently selected from H and C1-3alkyl;
Ra is H, C1-3alkyl or C1-3haloalkyl; and
D is a reactive group,
wherein at least one of E1 to E4 is an electrophile.

In some embodiments, G is a 5-membered heteroarene, optionally substituted with one or two substituents selected from the group consisting of methyl, halo, hydroxy, thiol, halomethyl, amino, methoxy, methylamino, dimethylamino, ethyl, haloethyl, amido, isopropyl, tert-butyl, and methylthio. In some embodiments, G is a 6-membered arene or heteroarene optionally substituted with methyl, halo, hydroxy, methoxy, and thiol. Typically, G is substituted with one or more substituents selected from the group consisting of methyl, halo, hydroxy, thiol, halomethyl and amino. In some embodiments, G is an optionally substituted 5-membered heteroarene, such as pyrrole, imidazole, pyrazole, triazole, tetrazole, furan, thiophene, oxazole, isothiazole, thiazole, and thiadiazole.
Typically, G is an optionally substituted thiophene. In some embodiments, G is substituted two times with methyl.

In some embodiments, E1 is selected from C1-4alkyl and C1-4haloalkyl; one of E2 and E4 is an electrophile and the other is selected from H and R3; and E3 is selected from H and R3. In some embodiments, R3 is selected from fluoro, hydroxyl, thiol, amido, NR4R5, C(O)NR4R5, C1-4alkyl, C1-4fluoroalkyl, C1-4alkoxy and C1-4alkylthio. In some embodiments, the electrophile is selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl. In some embodiments, the
electrophile is selected from C2-4alkanal, OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl, such as NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl and NRaC(O)fluoroethenyl. In some embodiments, the electrophile is selected from NH(CO)C1-2haloalkyl such as NH(CO)CH2Cl, NHC(O)ethenyl, and NHC(O)fluoroethenyl. In some embodiments, the electrophile is NH(CO)CH2Cl or NHC(O)ethenyl. In more specific embodiments, E1 is ethyl; the one of E2 or E4 that is not an electrophile is H; and E3 is H. Typically, the electrophile is selected from NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), and epoxidyl.

In some embodiments, E1 is selected from C1-4alkyl and C1-4haloalkyl; E2 and E4 are each independently selected from H and R3; and E3 is an electrophile. In some embodiments, R3 of E2 and E4 is independently selected from fluoro, hydroxyl, thiol, amido, NR4R5, C(O)NR4R5, C1-4alkyl, C1-4fluoroalkyl, C1-4alkoxy and C1-4alkylthio. In some embodiments, E3 is selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl. In some embodiments, E3 is selected from C2-4alkanal, OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl, such as NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl and NRaC(O)fluoroethenyl. In some embodiments, E3 is selected from NH(CO)C1-2haloalkyl such as NH(CO)CH2Cl, NHC(O)ethenyl, and NHC(O)fluoroethenyl. In some embodiments, E3 is NH(CO)CH2Cl or NHC(O)ethenyl. In more specific embodiments, E1 is ethyl; and E2 and E4 are H. Typically, E3 is selected from NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), and epoxidyl.

In some embodiments, E1 is an electrophile; and E2, E3 and E4 are each independently selected from H and R3.
In some embodiments, E1 is selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl. In more specific embodiments, E2 and E4 are each independently selected from H and R3; and E3 is halo. In further specific embodiments, E2 and E4 are H and R3 is chloro. Typically, E1 is selected from C2-4alkenyl, OCH2(epoxidyl), CH2C(O)H, OC(O)CH2Cl, and N(H)C(O)CH2Cl. More typically, E1 is selected from CH2C(O)H, OC(O)CH2Cl, and N(H)C(O)CH2Cl. Even more typically, E1 is CH2C(O)H or OC(O)CH2Cl.

In some embodiments, R1 is any one selected from the group consisting of C1-4alkyl, such as methyl; and C1-4fluoroalkyl, such as trifluoromethyl. In more specific embodiments, R1 is methyl or trifluoromethyl. Typically, R1 is methyl.
In some embodiments, R2 is selected from the group consisting of H; C1-3alkyl, such as methyl, ethyl, or propyl; C1-3haloalkyl, such as trifluoromethyl; or halo, such as fluoro. In more specific embodiments, R2 is H or halo. Typically, R2 is H.

In some embodiments, where R3 is NR4R5 or C(O)NR4R5, R4 and R5 are independently selected from H; and C1-3alkyl, such as methyl, ethyl, or propyl. In more specific embodiments, R4 and R5 are independently selected from H and methyl.

In some embodiments, the electrophile of each of E1, E2, E3 and E4 is independently selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl, NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl; where E1 is an electrophile, E2, E3, and E4 are each independently H or R3, typically E2 and E4 are H and E3 is chloro; where one of E2, E3, or E4 is an electrophile, E1 is C1-4alkyl, typically ethyl, and the two of E2, E3, and E4 that are not an electrophile are H or R3, typically, H. Typically, one of E2, E3, and E4 is an electrophile, E1 is C1-4alkyl, typically ethyl, and the two of E2, E3, and E4 that are not an electrophile are H or R3, typically, H. Typically, E3 is an electrophile, E1 is C1-4alkyl, typically ethyl, and E2 and E4 are H or R3, typically, H.
In some embodiments, where E1 is an electrophile selected from N(Ra)CO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2, C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl); or where E2, E3 or E4 is an electrophile selected from NRaCO(C1-4alkyl), NRaCO(C2-4alkenyl), NRaCO(C2-4alkynyl), NRaCO(bicyclo[1.1.0]butane), NRaCO(epoxidyl), NRaCO(aziridinyl), NRaSO2(C1-4alkyl), NRaSO2(C2-4alkenyl), C(O)C(O)N(Ra)2, C(O)NRa(C1-4alkyl), C(O)NRa(C2-4alkenyl), C(O)NRa(C2-4alkynyl), C(O)NRa(epoxidyl), C(O)NRa(aziridinyl), C1-4alkylC(O)N(Ra)2, C2-4alkenylC(O)N(Ra)2, C2-4alkynylC(O)N(Ra)2, SO2NRa(C1-4alkyl), SO2NRa(C2-4alkenyl), SO2NRa(C2-4alkynyl), SO2NRa(epoxidyl), and SO2NRa(aziridinyl); each Ra is independently selected from H; C1-3alkyl, such as methyl, ethyl, and propyl; and C1-3haloalkyl, such as trifluoromethyl. In more specific embodiments, each Ra is independently H or methyl. Typically, each Ra is H.

As described above, D is a reactive group. The term “reactive group” refers to any group that is capable of reacting with a second compound (typically a pro-linker compound) in order to form a bond with the second compound. Provided that D is capable of linking compound (I) with the second compound, for example with a pro-linker
molecule in order to form a bond to a linker, the precise identity of D is not important and the compound of formula (I) need not be limited to specific D groups. To react D with a second compound, D may first be reacted with reagents that convert D into a moiety that is more susceptible to reactivity with the second compound. In other words, D need not react directly with the second compound, but may indirectly react and link to the second compound. In some embodiments, D is any one selected from the group consisting of (CH2)pC(O)O(CH2)pCH3, (CH2)pC(O)O(CH2)psuccinimidyl, (CH2)pC(O)OH, (CH2)pC(O)Cl, (CH2)pC(O)Br, (CH2)qNH2, (CH2)qC(O)NH2, (CH2)qN(C1-3alkyl)H, (CH2)qC(O)N(C1-3alkyl)H, (CH2)qSH, (CH2)qOH, (CH2)qBr, (CH2)qI, (CH2)qN3, (CH2)qCCH, (CH2)pmaleimidyl, (CH2)ptetrazinyl, (CH2)pC6-10cycloalkylenyl, (CH2)pdibenzocyclooctynyl, (CH2)pC6-10cycloalkylynyl, C(O)pentafluorophenyl, and C(O)tetrafluorophenyl; wherein each p is independently an integer from 0 to 4 or is 0 and q is an integer from 1 to 4 or is 1. In some embodiments, D is any one selected from the group consisting of (CH2)pC(O)O(CH2)pCH3, (CH2)pC(O)O(CH2)psuccinimidyl, (CH2)pC(O)OH, (CH2)pC(O)Cl, (CH2)pC(O)Br, (CH2)qNH2, (CH2)qN(C1-3alkyl)H, (CH2)qSH, (CH2)qOH, (CH2)qBr, (CH2)qI, (CH2)qN3, (CH2)qCCH, (CH2)pmaleimidyl, (CH2)ptetrazinyl, (CH2)pC6-10cycloalkylenyl, (CH2)pdibenzocyclooctynyl, (CH2)pC6-10cycloalkylynyl, C(O)pentafluorophenyl, and C(O)tetrafluorophenyl; wherein each p is independently an integer from 0 to 4 or is 0 and q is an integer from 1 to 4 or is 1.

In particular embodiments, D comprises a carbonyl group. For example, D may comprise a -C(O)-RD moiety, wherein RD is a leaving group such as an alkoxy (e.g. C1-6alkoxy), halo or hydroxy group, wherein the alkyl moiety of the alkoxy is optionally substituted with one or more selected from halo, succinimide and C1-4alkoxy. In more specific embodiments, D is C(O)OCH3 or C(O)OH.
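As a non-authoritative aid to reading the index notation used for D above, the following Python sketch expands two of the listed generic groups, (CH2)pC(O)OH with p from 0 to 4 and (CH2)qNH2 with q from 1 to 4, into their individual members. The helper names `methylene_chain` and `expand` and the plain-text formula strings are assumptions introduced only for this illustration.

```python
from typing import List


def methylene_chain(n: int) -> str:
    """Return a text label for a chain of n methylene units (n >= 0)."""
    if n < 0:
        raise ValueError("the number of CH2 units cannot be negative")
    if n == 0:
        return ""            # no methylene spacer at all
    if n == 1:
        return "CH2"
    return f"(CH2){n}"


def expand(tail: str, indices: range) -> List[str]:
    """Prefix `tail` with each permitted methylene chain length in turn."""
    return [methylene_chain(n) + tail for n in indices]


P_VALUES = range(0, 5)  # p is an integer from 0 to 4
Q_VALUES = range(1, 5)  # q is an integer from 1 to 4

if __name__ == "__main__":
    print(expand("C(O)OH", P_VALUES))
    print(expand("NH2", Q_VALUES))
```

With p = 0 the first expansion yields C(O)OH itself, which is one of the more specific embodiments of D named above.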
In some embodiments, the compound is of formula (Ii):
(Ii), wherein E1 to E4 and D are as defined above. In some embodiments, the compound is selected from formulae (I1) to (I24):
[Chemical structures of formulae (I1) to (I24) not reproduced.]

As described above, in a second aspect, there is provided a tagged compound of formula (IA):
(IA),
wherein G, E1, E2, E3, E4, R1, and R2 are as defined in the first aspect of the invention, D’ is the product of a reactive group, D, with a pro-linker to form D’-L or D’-L-B, L is a molecule capable of linking D’ to B and B is a molecule.

For the avoidance of doubt, the embodiments described herein in relation to formula (I) apply mutatis mutandis to formula (IA). For example, in some embodiments, G is an optionally substituted 5-membered heterocycle, typically substituted thiophene; R1 is methyl or trifluoromethyl, typically methyl; R2 is halo or H, typically H; and the electrophile of each of E1, E2, E3 and E4 is independently selected from C1-4alkanal (such as C2-4alkanal), OC(O)C1-2haloalkyl, OC(O)ethenyl, NRaC(O)C1-2haloalkyl,
NRaC(O)ethenyl, NRaSO2(ethenyl), NRaC(O)fluoroethenyl and epoxidyl; where E1 is an electrophile, E2, E3, and E4 are each independently H or R3, typically E2 and E4 are H and E3 is chloro; where one of E2, E3, or E4 is an electrophile, E1 is C1-4alkyl, typically ethyl, and the two of E2, E3, and E4 that are not an electrophile are H or R3, typically, H.

As described above, D’ is the product of a reactive group, D, with a pro-linker to form D’-L or D’-L-B. A pro-linker is defined herein as a molecule that is capable of reacting with D’ to form D’-L or D’-L-B, where L is a molecule capable of linking D’ to B. Provided that D’ is capable of linking to L, the precise identity of D’ is not important and the compound of formula (IA) need not be limited to specific D’ groups. In some embodiments, however, D’ is any one selected from the group consisting of (CH2)pC(O), (CH2)qNH, (CH2)qS, (CH2)qO, (CH2)q and 1,2,3-triazolylene, wherein p is an integer from 0 to 4 or is 0 and q is an integer from 1 to 4 or is 1. Typically, D’ comprises a carbonyl. Thus, in some embodiments, D’ is (CH2)pC(O), such as C(O).

For the avoidance of doubt, the compounds of the second aspect may be synthesised in any order. The reactive group, D, may react with a pro-linker L to form D’-L, or the reactive group, D, may react with a pro-linker bonded to a molecule, B, i.e. with L-B, to form D’-L-B.

In some embodiments, the tagged compound is of formula (IAi):
(IAi),
wherein E1 to E4 and D’ are as defined above. In some embodiments, the tagged compound is selected from formulae (IA1) to (IA12):
[Chemical structures of formulae (IA1) to (IA12) not reproduced.]
In some embodiments, L is of formula (VIA)
(VIA),
wherein the wavy lines indicate the positions of attachment;
X1 is optionally present and is any one selected from the group consisting of O(CH2)s, NH(CH2)s and C(O)(CH2)s;
X2 is optionally present and is selected from the group consisting of O(CH2)uC(O), (CH2)uNH, (CH2)uO and (CH2)uC(O);
each L’ is independently selected from the group consisting of O(CH2)t, CH2, alkenylene, alkynylene, heterocyclylene, arylene, cycloalkylene, cycloalkenylene, ferrocenylene, and divalent amino acid, each optionally substituted with one or more substituents selected from C1-4alkyl, C1-4haloalkyl, halo, cyano, C1-4alkoxy and amino;
L’’ is optionally present and is selected from -O-, -S-, -NRa- and -N=N-;
each Ra is independently selected from H, C1-3alkyl and C1-3haloalkyl;
r is an integer from 1 to 20, s is an integer from 0 to 4, u is an integer from 0 to 4 and t is an integer from 1 to 4.

Typically, the bond denoted by an asterisk is linked to D’. In some embodiments, X1 is present. In more specific embodiments, X1 is NH(CH2)s or O(CH2)s, wherein s is 0 to 4, often 2. Typically, X1 is NH(CH2)2. In some embodiments, X2 is present. In more specific embodiments, X2 is (CH2)uNH or (CH2)uO, wherein u is 0 to 4, often 0. Typically, X2 is NH. In some embodiments, each L’ is independently selected from the group consisting of O(CH2)t, CH2, alkenylene, alkynylene, heterocyclylene, arylene, and divalent amino acid. In some embodiments, the heterocyclylene is 6-membered, such as a 6-membered N-heterocyclylene. In some embodiments, the arylene is a phenylene. In some embodiments, each L’ is independently selected from the group consisting of O(CH2)t, CH2, heterocyclylene such as 6-membered heterocyclylene and arylene such as phenylene. In more specific embodiments, L’ is O(CH2)t, wherein t is 2 to 4. Typically, L’ is O(CH2)3. In some embodiments, L’’ is present, and is –NRa-, wherein each Ra is independently selected from H; C1-3alkyl, such as methyl; and C1-3haloalkyl, such as trifluoromethyl. Typically, L’’ is not present. In some embodiments, r is 1 to 10. In more specific embodiments, r is 1 to 5. Typically, r is 1.

In some embodiments, X1 is selected from HN(CH2)2 and O(CH2)2, each L’ is independently selected from the group consisting of O(CH2)t, CH2, heterocyclylene such as 6-membered heterocyclylene and arylene such as phenylene, and X2 is (CH2)uNH, wherein t is 2 to 4 and u is 0 to 4.

In some embodiments, X1 is selected from HN(CH2)2 and O(CH2)2, L’ is O(CH2)t, and X2 is (CH2)uNH, wherein t is 2 to 4 and u is 0 to 4.

[Chemical structures of formulae (VI1), (VI2), (VI3) and (VI4) not reproduced.]
wherein: each Ra is independently selected from H, C1-3alkyl and C1-3haloalkyl; and each X is independently selected from CH and N. For the avoidance of doubt, the wavy lines indicate the positions of attachment. Typically, the bond denoted by an asterisk is linked to D’. In more specific embodiments, L is of formula (VI1), wherein each Ra is independently selected from H and C1-3alkyl, such as methyl. Typically, Ra is H. Typically, L is of formula (VI1), and Ra is H. In some embodiments, L is of formula (VI2) or (VI3), wherein Ra is methyl or H, typically H. In some embodiments, L is of formula (VI4), wherein each X is independently CH or N.
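Purely as an illustrative sketch, the integer ranges stated above for formula (VIA) (r from 1 to 20, s from 0 to 4, t from 1 to 4 and u from 0 to 4) can be captured in a small range check. The class `LinkerIndices`, the function `out_of_range` and the dictionary `GENERAL_RANGES` are hypothetical names chosen for this example. The check covers only the broadest ranges, not the narrower embodiments.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Broadest inclusive ranges stated for formula (VIA); narrower embodiments
# (for example r from 1 to 10, or t from 2 to 4) are deliberately not encoded.
GENERAL_RANGES: Dict[str, Tuple[int, int]] = {
    "r": (1, 20),
    "s": (0, 4),
    "t": (1, 4),
    "u": (0, 4),
}


@dataclass(frozen=True)
class LinkerIndices:
    """Hypothetical container for the four integer indices of formula (VIA)."""
    r: int
    s: int
    t: int
    u: int


def out_of_range(indices: LinkerIndices) -> List[str]:
    """Return a message for every index lying outside its general range."""
    problems = []
    for name, (low, high) in GENERAL_RANGES.items():
        value = getattr(indices, name)
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} to {high}")
    return problems


if __name__ == "__main__":
    # Indices of the "typical" choices described above:
    # r = 1, X1 = NH(CH2)2 (s = 2), L' = O(CH2)3 (t = 3), X2 = NH (u = 0).
    print(out_of_range(LinkerIndices(r=1, s=2, t=3, u=0)))
    print(out_of_range(LinkerIndices(r=21, s=2, t=3, u=0)))
```

An empty list indicates that all four indices fall within the general ranges. It says nothing about whether a particular combination of X1, L’, L’’ and X2 is chemically sensible.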
In some embodiments, the linker is selected from formulae (VIi) to (VIviii):
[Chemical structures of formulae (VIi) to (VIviii) not reproduced.]
As described above, B is a molecule. Provided that B is capable of 1) linking to L, and 2) conferring or imparting some functionality on the tagged compound of formula (IA) (i.e., acting as a tag of some kind), the precise identity of B is not important and the compound of formula (IA) need not be limited to specific B molecules. Nonetheless, the present inventors have identified particular B molecules that serve a variety of uses in the chemical and biological sciences.

In some embodiments B comprises: a detectable label, such as an optically detectable label, e.g. a fluorescent, chemiluminescent or dye group; a radio-label; biotin; horseradish peroxidase; a photolabile group; a FRET donor or acceptor; and/or a reactive handle capable of reacting with a biomolecule for conjugation, such as an alkyne or azide; and/or B is capable of: immobilising the compound onto a bead, resin or solid surface, such as a His-tag; solubilising the compound; and/or binding to a biomolecule, optionally a protein tag, an E3 ubiquitin ligase, an antibody or a BET protein.

A detectable label is understood to be a molecule that is detectable through analytical means, e.g. through spectroscopic, gel electrophoresis, or microscopic techniques or through the use of mass spectrometry or radiation detection. An optically detectable label is understood to be a molecule that is detectable through some spectroscopic, visual, or microscopic means, that is to say, observable through spectroscopy, visual inspection, or microscopy via absorption and/or emission of light. Suitable detectable labels can include fluorescent molecules, fluorophores, radioisotopes, chromophores, enzymes, substrates, colorants, chemiluminescent molecules, bioluminescent molecules, and the like.

In some embodiments, the optically detectable label is a fluorescent group, which is understood to be a molecule or moiety that can absorb light at a particular wavelength
and emit light at another wavelength. The skilled person is aware of many fluorescent groups that are suitable for use as fluorescent tags. Examples of fluorophores include hydroxycoumarin, methoxycoumarin, Alexa fluor 350, DY-415, aminocoumarin, Cy2, FAM, Alexa fluor 488, fluorescein FITC, Alexa fluor 430, JOE [6-JOE, SE (6-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein, succinimidyl ester)], VIC (2'-chloro-7'phenyl-1,4-dichloro-6-carboxy-fluorescein) fluorophores, Alexa fluor 532, HEX, Cy3, TRITC, Alexa fluor 546, Alexa fluor 555, R-phycoerythrin (PE), rhodamine Red-X, TAMRA, Cy3.5 581, Rox, Alexa fluor 568, Red 613, Texas Red, Alexa fluor 594, Alexa fluor 633, allophycocyanin, Alexa fluor 633, Cy5, Alexa fluor 660, Cy5.5, TruRed, Alexa fluor 680 and Cy7.

A photolabile group may alter and/or inhibit a function of the tagged compound or may alter and/or inhibit a function of a target molecule, such as a protein or polypeptide comprising one or more hole-modified mutant bromodomains as described herein. The function of the tagged compound or target molecule may be restored upon exposure of the photolabile group to a predetermined wavelength of light such that the photolabile group is altered in some way such that function is restored. This may be an alteration in the structure and/or orientation of the photolabile group, or cleavage of the photolabile group from the tagged compound or protein or peptide.

In some embodiments, B comprises biotin. Biotin is a well-known molecule in the biochemical arts, and is often used for tagging and capturing molecules, due to its high affinity for avidin and streptavidin. For example, a molecule tagged with biotin may be captured – and subsequently isolated, purified, analysed, and/or detected etc. – by an avidin/streptavidin coated surface, such as beads, resins, solid surfaces, slides, chips, columns, and nanoparticles.
In some embodiments, B comprises a photolabile group that alters and/or inhibits a function of the tagged compound or may alter and/or inhibit a function of a target molecule, such as a protein or polypeptide comprising one or more hole-modified mutant bromodomains as described herein, wherein the function is restored upon exposure of the photolabile group to a predetermined wavelength of light such that the photolabile group is cleaved from the tagged compound. Such photolabile groups are often used in a technique sometimes referred to as photocaging. Examples of photolabile groups include (2-nitrophenyl)ethyl groups, 2-nitrobenzyl-based groups, carbonyl-based groups, benzyl-based groups, coumarin-based groups, ortho-nitrobenzyl (NB)-based groups, or any variation or combination thereof.

In some embodiments, B comprises a FRET donor or acceptor. FRET (Förster Resonance Energy Transfer, or Fluorescence Resonance Energy Transfer) is an
interaction between molecules, wherein a donor molecule, excited by absorption of light, transfers energy to an acceptor molecule when sufficiently proximal, which enables the acceptor molecule to emit light. Such emission is detectable through fluorescence microscopy or spectroscopy. Examples of FRET donors include GFP, CFP, fluorescein, Alexa Fluor dyes, lanthanide chelates and quantum dots. Examples of FRET acceptors include YFP, rhodamine, tetramethylrhodamine (TMR), Alexa Fluor dyes, mCherry, Texas Red, and quantum dots.
In some embodiments, B comprises a reactive handle capable of reacting with a biomolecule for conjugation. Conjugation is understood to refer to linking by the formation of a covalent bond. The biomolecule may be any suitable biomolecule, often a peptide or protein. A reactive handle is understood to be a moiety or functional group that can undergo reaction with another group to form a covalent bond. Typically, the reactive handle is a particularly reactive functional group that may undergo facile and selective reactions with a particular complementary functional group or set of functional groups. In some embodiments, the reactive handle is a functional group often used in click chemistry. Click chemistry is a term used to group particularly efficient and selective chemical reactions used to rapidly generate new covalent bonds, linking two molecules together. Examples of click chemistry reactions include the copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC), the strain-promoted azide-alkyne cycloaddition (SPAAC), the thiol-ene reaction, and the Diels–Alder reaction.
Examples of reactive handles, some of which are click chemistry functional groups, include but are not limited to azides, alkynes, thiols, alkenes, aldehydes, hydrazines, cyclooctynes, tetrazines, nitrile oxides, alkoxyamines, isothiocyanates, sulfonate esters, epoxides, carboxylic acids, esters, succinimidyl esters, pentafluorophenyl esters, tetrafluorophenyl esters, aziridines, isocyanates, and maleimides. Preferred groups include alkynes and azides.
In some embodiments, B is capable of immobilising the compound onto a bead, resin or solid surface. The term immobilising is understood to refer to the attaching or fixing of molecules, such as biomolecules, onto a substrate of any kind. The substrate may be a bead, resin, solid surface, slide, chip, column, or nanoparticle. Immobilisation is often achieved through one or more intermolecular interactions between the biomolecule and the substrate, for example, covalent and/or non-covalent interactions. Where the intermolecular interaction is a covalent interaction, one or more covalent bonds may be formed between the biomolecule and the substrate, typically through the chemical reaction of a reactive group on the biomolecule with a reactive group on the substrate. Where the intermolecular interaction is a non-covalent interaction, it may occur through any one or more of ionic bonding, hydrogen bonding, Van der Waals forces,
London dispersion forces, dipole-dipole interactions, ion-dipole interactions, salt bridges, π-π interactions, ion-π interactions, hydrophobic effects, hydrophilic effects and halogen bonding. A specific example of an immobilisation system is the exploitation of the non-covalent interactions formed between biotin and avidin/streptavidin, whereby biotin may be conjugated to a biomolecule, and the avidin/streptavidin may be coated on a substrate.
In some embodiments, B is capable of solubilising the compound. That is to say, B may comprise functional groups that increase the solubility of the tagged compound in either polar or non-polar solvents. Examples of polar solvents include water, ethanol, methanol, acetone, DMSO, acetonitrile, and isopropanol; examples of solubilising groups for polar solvents include groups that may become charged at a particular pH, such as hydroxy, carboxyl, amino, phosphate, phosphonium, and sulfate; such groups may exist or be installed in a salt form. Examples of non-polar solvents include chloroform, hexane, diethyl ether, toluene, benzene, xylene, and dichloromethane; examples of solubilising groups for non-polar solvents include alkyl chains, aromatic rings, fluorinated groups, silicone-based groups (for example, polysiloxanes, such as polydimethylsiloxane (PDMS)), hydrophilic amino acid residues, and PEG chains.
In some embodiments, B is capable of binding to a biomolecule. The term binding is generally understood to refer to intermolecular interactions that enable a stable association between two or more molecules to form a single unit. Binding may be through non-covalent or covalent means, or a combination of covalent and non-covalent means. Examples of non-covalent interactions are provided above, as are examples of chemical reactions and functional groups that may form covalent interactions. In some embodiments, B is capable of binding to a protein tag.
As described above, a protein tag is a polypeptide sequence that is grafted to a protein to impart functionality to the protein; the functionality may simply be that the tag can bind to a particular ligand, and that this binding may enable myriad further functionalities. Therefore, in some embodiments, B is a ligand that can bind to a protein tag. Non-limiting examples of protein tags include BromoTag, or Brd4BD2L387A, also Brd4BD1L94V; HaloTag; SNAP-tag; CLIP-tag; SpyCatcher; eDHFR; dTAG, or FKBP12F36V. Non-limiting examples of ligands complementary to said protein tags include, respectively, JQ1 derivatives, such as ET-JQ1; haloalkane derivatives, such as chloroalkanes; O-benzylguanine derivatives; O-benzylcytosine derivatives; SpyTag and derivatives; trimethoprim and derivatives; AP1867 and derivatives. In some embodiments, the biomolecule that B is capable of binding to is an E3 ubiquitin ligase. Ligands capable of binding to an E3 ubiquitin ligase include but are not limited to the following and their derivatives: thalidomide,
lenalidomide, pomalidomide, CC-885, eragidomide, iberdomide, cemsidomide, golcadomide, ALV2, VH032, VH-298, TD-106, LCL161, VHL-IN-1, hydroxyproline, and VL-269. The skilled person is aware of many E3 ubiquitin ligase ligands, such as those found in “E3 Ligase Ligands in Successful PROTACs: An Overview of Syntheses and Linker Attachment Points”, Bricelj et al., Front. Chem., 2021, 9:707317. A ligand that is capable of binding to an E3 ubiquitin ligase may be useful in targeted protein degradation (TPD). As mentioned above, TPD is a technique whereby a target protein is brought into proximity with an E3 ubiquitin ligase, such that the target protein is ubiquitinated, thus marking the target protein for degradation. TPD is a promising modality for novel therapeutic approaches to treat diseases.
In some embodiments, the biomolecule that B is capable of binding to is a BET protein. The BET (bromo- and extra-terminal domain) family of proteins comprise bromodomains, to which many ligands are known to bind. Examples of ligands that bind to BET proteins include the following and their derivatives: JQ1, GSK525762A (I-BET-762), OTX015, BMS-986158, TEN-010, CPI-0610, INCB54329, BAY1238097, FT-1101, ABBV-075, BI 894999, GS-5829, GSK1210151A (I-BET-151), CPI-203, RVX-208, XD46, MS436, PFI-1, RVX2135, ZEN3365, XD14, ARV-771, MZ-1, PLX5117, EP11313 and EP11336.
In some embodiments, B is a molecule comprising a fluorescent group, optionally a fluorogenic group or photoactivatable fluorescent group and/or a fluorescent group comprising a fluorophore that is excitable by light with a wavelength of about 400 nm to about 800 nm.
In some embodiments, B is:
(i) selected from structure (VIIa), (VIIb) and (VIIc):
[Structures (VIIa), (VIIb) and (VIIc)]
wherein:
Z is selected from Si(C1-4alkyl)2, C(C1-4alkyl)2, O, P(O)(phenyl), and SO2;
each Rb is independently selected from C1-4alkyl and C1-4haloalkyl or each N(Rb)2 is azetidine, optionally substituted one or two times with halo;
each Rc is halo;
each Rf is halo;
n1 is 0 to 3;
n2 is 0 to 3;
Rd is selected from C(O) and S-C1-4alkyl-C(O); and
Re is selected from S(O)2N(C1-4alkyl)2 and cyano;
(ii) a cyanine dye;
(iii) nitrobenzodiazole; or
(iv) an Alexa Fluor dye.
In some embodiments, n2 is 0.
In some embodiments, B is selected from structure (VIIa’), (VIIb’) and (VIIc’):
[Structures (VIIa’), (VIIb’) and (VIIc’)]
In some embodiments where B is structure (VIIa) or (VIIa’), Z is selected from C(C1-4alkyl)2, O, P(O)(phenyl), SO2, and Si(C1-4alkyl)2; Rb is C1-4alkyl such as methyl, C1-4haloalkyl such as trifluoromethyl, or N(Rb)2 is azetidine, optionally substituted one or more times with fluoro; each Rc is fluoro; Rd is C(O); and n1 is 0 to 3. In more specific embodiments, Z is Si(C1-4alkyl)2, such as Si(methyl)2; Rb is methyl, or N(Rb)2 is azetidine, optionally substituted one or more times with fluoro; Rd is C(O); and n1 is 0.
In some embodiments where B is structure (VIIb) or (VIIb’), Z is selected from Si(C1-4alkyl)2, C(C1-4alkyl)2, and O; each Rb is independently selected from C1-4alkyl such as methyl and C1-4haloalkyl such as trifluoromethyl; Rd is C(O); and Re is selected from
S(O)2N(C1-4alkyl)2 (such as S(O)2N(methyl)2) and cyano. In more specific embodiments, Z is C(C1-4alkyl)2 (such as C(methyl)2) or O; Rb is C1-4alkyl, such as methyl; Rd is C(O); and Re is S(O)2N(C1-4alkyl)2, such as S(O)2N(methyl)2.
In some embodiments where B is structure (VIIc) or (VIIc’), Z is selected from Si(C1-4alkyl)2, C(C1-4alkyl)2, and O; each Rb is independently selected from C1-4alkyl such as methyl and C1-4haloalkyl such as trifluoromethyl; Rd is C(O); and Re is selected from S(O)2N(C1-4alkyl)2 (such as S(O)2N(methyl)2) and cyano. In more specific embodiments, Z is Si(C1-4alkyl)2, such as Si(methyl)2; Rb is C1-4alkyl, such as methyl; Rd is C(O); and Re is cyano.
In some embodiments, B is selected from a cyanine dye, such as Cy5; nitrobenzodiazole; and an Alexa Fluor dye.
In some embodiments, B is selected from formulae (VII1) to (VII8):
[Structures (VII1) to (VII8)]
wherein Z1 is selected from SiMe2, CHMe2, O, P(O)Ph and SO2; Z2 is selected from O and CHMe2; and A− is a counterion.
In some embodiments, where B is selected from formulae (VII1) to (VII5), B is of formulae (VII1’) to (VII5’):
[Structures (VII1’) to (VII5’)]
wherein Z1 is selected from SiMe2, CHMe2, O, P(O)Ph and SO2; and Z2 is selected from O and CHMe2.
In some embodiments, B is selected from formulae (VIIIa) to (VIIIh) and (VII8):
[Structures (VIIIa) to (VIIIh)]
In some embodiments, A− is halide, such as chloride, bromide or iodide.
In some embodiments where B is a molecule comprising a fluorescent group, the compound is selected from formulae (IX1) to (IX17):
[Structures (IX1) to (IX17)]
Each of the compounds of formulae (IX1) to (IX16) may be referred to with an alternative name, specifically:
IX1: MR202
IX2: MR215
IX3: C10852K
IX4: C10852L
IX5: C10852M
IX6: C10852N
IX7: C10852R
IX8: C10852S
IX9: C10852T
IX10: C10852U
IX11: C10852AA
IX12: C10852AB
IX13: C10852AC
IX14: C10852AD
IX15: C10852AH
IX16: C10852AI
In some cases, ‘C10852’ is replaced with ‘Jan’, e.g., C10852K may be referred to as “Jan K”, or the compound is simply referred to by its designating letter, e.g., C10852K may be referred to simply as “K”.
In some embodiments, B is of formula (VII7). In some embodiments, B is of formula (VII8). As described above, A− is a counterion. Provided that A− is an anion capable of balancing the charge of the complex depicted in (VII7), the precise identity of A− is not important. Nonetheless, in some embodiments, A− is selected from halide (such as fluoride, chloride, bromide, or iodide), acetate, nitrate, sulfate, phosphate, bicarbonate, formate, oxalate, perchlorate, thiocyanate, cyanide, azide, benzoate, citrate, tartrate, carbonate, chromate, dichromate, hypochlorite, hydroxide, molybdate, tungstate, vanadate, arsenate, selenate, perrhenate, borate, salicylate, malonate, succinate, lactate, malate, ascorbate, picrate, tetraborate, ferrocyanide, ferricyanide, plumbate, oleate, linoleate, propionate, butyrate, valerate, caproate, caprylate, caprate, laurate, myristate, palmitate, stearate, arachidate, behenate, lignocerate, cerotate, montanate, melittate, abietate, tetrafluoroborate, and hexafluorophosphate. Typically, A− is selected from chloride, bromide, iodide, trifluoroacetate, and tetrafluoroborate. In some embodiments, where B is of formula (VII7), the species exists as a zwitterion and no counterion is required.
For the avoidance of doubt, all tautomeric forms and depictions of the compounds disclosed herein are included within the scope of the invention.
For example, compounds according to formula (VIIa) may exist in a tautomeric form wherein the carboxylic acid comprises a negative charge, and an amino group comprises a positive charge, i.e., the compound is zwitterionic, as depicted below:
[Structure: zwitterionic form of formula (VIIa)]
The formation of zwitterions may be dependent upon the pH of the system and the nature of the compound.
The skilled person will recognise the different possible tautomeric forms of the compounds of the disclosure, and will similarly recognise that by depicting only one form, any other form is not excluded from the scope of the invention.
In some embodiments where B is a molecule comprising biotin, the compound is selected from formulae (IX17) or (IX18):
[Structures (IX17) and (IX18)]
Compound (IX17) may be referred to as “MR131”. Compound (IX18) may be referred to as “MR169”.
In some embodiments, where the tagged compound is a PROTAC, it is selected from formulae (IX19) or (IX20):
[Structures (IX19) and (IX20)]
Compound (IX19) may be referred to as “MR156”. Compound (IX20) may be referred to as “MR170”.
The compounds of the invention, both of the first and second aspects, exist in different stereoisomeric (namely enantiomeric and diastereoisomeric) forms owing to at least the stereogenic carbon atoms identified with asterisks below:
[Structure with stereogenic carbon atoms marked by asterisks]
All stereoisomeric forms and mixtures thereof, including enantiomers and racemic mixtures, are included within the scope of the invention. Individual stereoisomers of compounds of formula (I), i.e. compounds comprising less than 5%, 2% or 1% (e.g. less than 1%) of the other stereoisomer, are also included. Mixtures of stereoisomers in any proportion, for example a racemic mixture comprising substantially equal amounts of two enantiomers, are also included within the invention.
Diastereoisomers may be separated using conventional techniques, e.g. chromatography or fractional crystallisation. The various stereoisomers may be isolated
by separation of a racemic or other mixture of the compounds using conventional techniques, e.g. fractional crystallisation or HPLC. Alternatively, the desired optical isomers may be made by reaction of the appropriate optically active starting materials under conditions which do not cause racemisation or epimerisation.
Often, the compounds of the disclosure are diastereomerically pure. Typically, the compounds of the disclosure are the following diastereomer:
[Structure: preferred diastereomer]
Often, the compounds of the disclosure are enantiomerically pure. Typically, the compounds of the invention are of the following absolute stereochemistry:
[Structure: preferred absolute stereochemistry]
Bromodomains
As described above, in a third aspect there is provided a hole-modified mutant bromodomain, wherein the bromodomain comprises an amino acid sequence selected from residues:
(i) 351 to 460 of SEQ ID No. 1, and:
(a) the mutation L387C;
(b) the mutations L387A or L387V, and E438C; and/or
(c) the mutations L387A or L387V, and M442C;
(ii) 60 to 167 of SEQ ID No. 1, and:
(a) the mutation L94C;
(b) the mutations L94A or L94V, and D145C; and/or
(c) the mutations L94A or L94V, and M149C;
(iii) 347 to 456 of SEQ ID No. 2, and:
(a) the mutation L383C;
(b) the mutations L383A or L383V, and D434C; and/or
(c) the mutations L383A or L383V, and M438C;
(iv) 76 to 183 of SEQ ID No. 2, and:
(a) the mutation L110C;
(b) the mutations L110A or L110V, and D161C; and/or
(c) the mutations L110A or L110V, and M165C;
(v) 309 to 416 of SEQ ID No. 3, and:
(a) the mutation L345C;
(b) the mutations L345A or L345V, and E396C; and/or
(c) the mutations L345A or L345V, and M400C;
(vi) 36 to 143 of SEQ ID No. 3, and:
(a) the mutation L70C;
(b) the mutations L70A or L70V, and D121C; and/or
(c) the mutations L70A or L70V, and M125C;
(vii) 270 to 379 of SEQ ID No. 4, and:
(a) the mutation L306C;
(b) the mutations L306A or L306V, and E357C; and/or
(c) the mutations L306A or L306V, and M361C;
(viii) 29 to 136 of SEQ ID No. 4, and:
(a) the mutation L63C;
(b) the mutations L63A or L63V, and D114C; and/or
(c) the mutations L63A or L63V, and M118C.
A hole-modified mutant bromodomain is understood to refer to a bromodomain that comprises a mutation such that there is a ‘hole’ in a site where substrates may bind. That is to say, one or more amino acid residues have been replaced with alternative residues that may accommodate a substrate that, compared to substrates of the wild type domain, comprises a ‘bump’ (hence the ‘bump-and-hole’ technique).
A bromodomain is a protein domain found in several proteins, notably the Bromo- and Extra-terminal domain (BET) family of proteins. The BET family comprises Brd2, Brd3, Brd4, and BrdT; each protein comprises two bromodomains, namely BD1 and BD2.
The present investigators have found that there is high homology between the bromodomains of several bromodomain-containing proteins, namely Brd2, Brd3, Brd4, and BrdT. That is to say, there are conserved amino acid residues between each BD1 and BD2 of Brd2, Brd3, Brd4, and BrdT. The amino acid sequences SEQ ID No. 1-4 relate to the wild type Bromo- and Extra-terminal domain (BET) family of proteins.
SEQ ID No. 1 refers to bromodomain-containing protein 4, or Brd4. Amino acid residues 351 to 460 of SEQ ID No. 1 describe bromodomain 2 of Brd4, or Brd4BD2.
Amino acid residues 60 to 167 of SEQ ID No. 1 describe bromodomain 1 of Brd4, or Brd4BD1. Hence, sequences under part (i) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd4BD2. Similarly, sequences under part (ii) of the third aspect refer to hole-modified Brd4BD1 bromodomains.
SEQ ID No. 2 refers to bromodomain-containing protein 2, or Brd2. Amino acid residues 347 to 456 of SEQ ID No. 2 describe bromodomain 2 of Brd2, or Brd2BD2. Amino acid residues 76 to 183 of SEQ ID No. 2 describe bromodomain 1 of Brd2, or Brd2BD1. Hence, sequences under part (iii) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd2BD2. Similarly, sequences under part (iv) of the third aspect refer to hole-modified Brd2BD1 bromodomains.
SEQ ID No. 3 refers to bromodomain-containing protein 3, or Brd3. Amino acid residues 309 to 416 of SEQ ID No. 3 describe bromodomain 2 of Brd3, or Brd3BD2. Amino acid residues 36 to 143 of SEQ ID No. 3 describe bromodomain 1 of Brd3, or Brd3BD1. Hence, sequences under part (v) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is Brd3BD2. Similarly, sequences under part (vi) of the third aspect refer to hole-modified Brd3BD1 bromodomains.
SEQ ID No. 4 refers to bromodomain testis-specific protein, or BrdT. Amino acid residues 270 to 379 of SEQ ID No. 4 describe bromodomain 2 of BrdT, or BrdTBD2. Amino acid residues 29 to 136 of SEQ ID No. 4 describe bromodomain 1 of BrdT, or BrdTBD1. Hence, sequences under part (vii) of the third aspect refer to hole-modified mutant bromodomains, wherein the bromodomain is BrdTBD2. Similarly, sequences under part (viii) of the third aspect refer to hole-modified BrdTBD1 bromodomains.
The wild type (WT) bromodomains may be written with a superscript ‘WT’ denotation, for example, Brd4BD2WT, which comprises amino acid residues 351 to 460 of SEQ ID No. 1.
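For orientation only, the residue ranges and point mutations recited for parts (i) to (viii) of the third aspect can be collected in a small lookup table. The following Python sketch is illustrative and forms no part of the disclosure: the names `BROMODOMAINS`, `parse_mutation` and `mutation_in_domain` are hypothetical, and only the residue numbers and residue letters are taken from the text above.

```python
import re

# Residue ranges (inclusive, numbered against the full-length SEQ ID) and the
# hole / covalent-handle residues recited in parts (i) to (viii).
# The dictionary layout and key names are illustrative assumptions.
BROMODOMAINS = {
    "Brd4BD2": {"seq_id": 1, "range": (351, 460), "hole": "L387", "de_handle": "E438", "m_handle": "M442"},
    "Brd4BD1": {"seq_id": 1, "range": (60, 167), "hole": "L94", "de_handle": "D145", "m_handle": "M149"},
    "Brd2BD2": {"seq_id": 2, "range": (347, 456), "hole": "L383", "de_handle": "D434", "m_handle": "M438"},
    "Brd2BD1": {"seq_id": 2, "range": (76, 183), "hole": "L110", "de_handle": "D161", "m_handle": "M165"},
    "Brd3BD2": {"seq_id": 3, "range": (309, 416), "hole": "L345", "de_handle": "E396", "m_handle": "M400"},
    "Brd3BD1": {"seq_id": 3, "range": (36, 143), "hole": "L70", "de_handle": "D121", "m_handle": "M125"},
    "BrdTBD2": {"seq_id": 4, "range": (270, 379), "hole": "L306", "de_handle": "E357", "m_handle": "M361"},
    "BrdTBD1": {"seq_id": 4, "range": (29, 136), "hole": "L63", "de_handle": "D114", "m_handle": "M118"},
}

# One-letter wild-type residue, residue number, one-letter replacement residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'L387A' into (wild-type residue, position, new residue)."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def mutation_in_domain(domain, label):
    """Return True if the mutated position lies inside the domain's residue range."""
    start, end = BROMODOMAINS[domain]["range"]
    _, position, _ = parse_mutation(label)
    return start <= position <= end
```

Under these assumptions, `mutation_in_domain("Brd4BD2", "E438C")` is true, whereas the same label checked against `"Brd4BD1"` is false, since residue 438 lies outside residues 60 to 167.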
The abovementioned 8 bromodomains encompassed by SEQ ID No. 1 to 4 each comprise a particular leucine (L) residue that can be modified to form a ‘hole’. That is to say, the residue can be mutated to a smaller residue that results in said hole in which a ‘bumped’ ligand can bind. In some embodiments, the bromodomain comprises a leucine to alanine (L to A) mutation or a leucine to valine (L to V) mutation of the hole residue. Each (b) and (c) mutant bromodomain of the third aspect is based on this hole-modification. Typically, the bromodomain comprises an L to A mutation of the hole residue. Furthermore, the (b) and (c) mutant bromodomains above comprise an additional mutation that introduces a covalent handle, namely the thiol group of a cysteine residue; these are the so-called ‘double-mutants’. Mutant bromodomains of the (b) group comprise a ‘D/E covalent handle’, whereby an aspartic acid (D) or glutamic acid
(E) residue is mutated to cysteine, i.e., a D to C or E to C mutation. Whether the mutation is D to C or E to C is dependent on the specific bromodomain, and relates to the position of the residue. Specifically, the residue to be mutated to a cysteine should be in a position such that contacting the mutant bromodomain with the compounds of the first and second aspect allows the cysteine residue to interact with the electrophile of the compound and form a covalent bond. Mutant bromodomains of the (c) group comprise an ‘M covalent handle’, whereby a methionine (M) residue is mutated to cysteine, i.e., an M to C mutation. Again, the specific methionine residue to be mutated should be in a position such that contacting the mutant bromodomain with the compounds of the first and second aspect allows the cysteine residue to interact with the electrophile of the compound and form a covalent bond.
The double-mutants are designed to be able to form covalent bonds with compounds where E2, E3, or E4 is an electrophile. That is to say, in some embodiments where the bromodomain comprises an amino acid sequence selected from any (b) or (c), E2, E3, or E4 is an electrophile. More specifically, mutant bromodomains from the (b) group are designed to best interact with compounds where E3 is an electrophile, and (c) group bromodomains with compounds where E2 or E4 is an electrophile. That is to say, typically where the bromodomain comprises an amino acid sequence selected from any (b), E3 is an electrophile. Furthermore, typically where the bromodomain comprises an amino acid sequence selected from any (c), E2 or E4 is an electrophile.
In some embodiments, the bromodomain comprises a leucine to cysteine (L to C) mutation of the hole residue. That is to say, the ‘hole’ residue has been modified to comprise a nucleophilic group, namely a thiol group on the cysteine residue.
Each (a) mutant bromodomain of the third aspect is based on this hole-modification; these are the so-called ‘single-mutants’. The single mutants are designed to interact with compounds that comprise an electrophile in the ‘bump’ position of the compound. That is to say, where the bromodomain comprises an amino acid sequence selected from any (a), E1 is an electrophile.
In some embodiments, the bromodomain is suitable for binding the compounds or tagged compounds of the first or second aspect of the invention, typically through covalent binding.
In some embodiments, the bromodomain comprises an (a) mutation and a (b) and/or (c) mutation. That is to say, the mutant bromodomain may comprise an L to C mutation of the hole residue, and a D/E covalent handle, i.e., a D to C or E to C mutation, and/or an M covalent handle, i.e., an M to C mutation. In some embodiments, the mutant bromodomain comprises the amino acid sequence selected from residues as defined in
part (ii), that is to say the bromodomain comprises an amino acid sequence selected from residues 60 to 167 of SEQ ID No. 1, and: (a) the mutation L94C; and: (b) D145C and/or (c) M149C. In other words: in some embodiments, wherein the mutant bromodomain comprises the amino acid sequence selected from residues 60 to 167 of SEQ ID No. 1, and the mutation L94C, the mutant bromodomain additionally comprises the mutations D145C and/or M149C, typically D145C or M149C.
In some embodiments, the bromodomain comprises an amino acid sequence selected from residues:
(i) 351 to 460 of SEQ ID No. 1, and:
(a) the mutation L387C;
(b) the mutations L387A or L387V, and E438C; or
(c) the mutations L387A or L387V, and M442C;
(ii) 60 to 167 of SEQ ID No. 1, and:
(a) the mutation L94C;
(b) the mutations L94A or L94V, and D145C; or
(c) the mutations L94A or L94V, and M149C;
(iii) 347 to 456 of SEQ ID No. 2, and:
(a) the mutation L383C;
(b) the mutations L383A or L383V, and D434C; or
(c) the mutations L383A or L383V, and M438C;
(iv) 76 to 183 of SEQ ID No. 2, and:
(a) the mutation L110C;
(b) the mutations L110A or L110V, and D161C; or
(c) the mutations L110A or L110V, and M165C;
(v) 309 to 416 of SEQ ID No. 3, and:
(a) the mutation L345C;
(b) the mutations L345A or L345V, and E396C; or
(c) the mutations L345A or L345V, and M400C;
(vi) 36 to 143 of SEQ ID No. 3, and:
(a) the mutation L70C;
(b) the mutations L70A or L70V, and D121C; or
(c) the mutations L70A or L70V, and M125C;
(vii) 270 to 379 of SEQ ID No. 4, and:
(a) the mutation L306C;
(b) the mutations L306A or L306V, and E357C; or
(c) the mutations L306A or L306V, and M361C;
(viii) 29 to 136 of SEQ ID No. 4, and:
(a) the mutation L63C;
(b) the mutations L63A or L63V, and D114C; or
(c) the mutations L63A or L63V, and M118C.
In some embodiments, the mutant bromodomain comprises an L to A mutation of the ‘hole’ residue.
In some embodiments, the bromodomain comprises the amino acid sequence selected from residues as defined in parts (i), (iii), (v), or (vii). That is to say, the bromodomain comprises a BD2 bromodomain, typically a Brd4BD2 bromodomain (i) or a Brd2BD2 bromodomain (iii), more typically a Brd4BD2 bromodomain.
In some embodiments, the bromodomain comprises the amino acid sequence selected from residues as defined in part (iii), that is to say the bromodomain comprises an amino acid sequence selected from residues 347 to 456 of SEQ ID No. 2, and the mutations (b) or (c). Typically, the bromodomain comprises the mutation L383A or L383V, typically L383V; and D434C or M438C, typically D434C.
In some embodiments, the bromodomain comprises the amino acid sequence selected from residues as defined in part (i), that is to say the bromodomain comprises an amino acid sequence selected from residues 351 to 460 of SEQ ID No. 1, and: (a) the mutation L387C; (b) the mutations L387A or L387V, and E438C; or (c) the mutations L387A or L387V, and M442C. Typically, the bromodomain comprises the mutations described in (b) or (c). More typically, the bromodomain comprises the mutations described in (b) or (c) and an L to A mutation of the ‘hole’ residue. That is to say, in some embodiments, the bromodomain comprises an amino acid sequence selected from residues 351 to 460 of SEQ ID No. 1 and: the mutations L387A and E438C; or the mutations L387A and M442C.
The hole-modified mutant bromodomains of the present disclosure may be used as a protein tag. That is to say, they may be grafted to a protein. In some embodiments, the hole-modified mutant bromodomain is grafted to the N- or C-terminus of the protein.
In some embodiments, the tag may be integrated within the sequence of the protein: an “internal tag”. Suitable methods of grafting or incorporating a protein tag to a protein include molecular biological techniques known in the art (including CRISPR/Cas9, homologous recombination and transposon-mediated systems) and non-homologous end joining techniques known in the art.
As described above, in a fourth aspect, there is provided a nucleic acid encoding a hole-modified mutant bromodomain according to the third aspect.
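The mutation labels used for the third aspect are numbered against the full-length protein sequence, so applying one to an isolated domain sequence requires subtracting the number of the first residue of the domain. The following Python sketch illustrates that bookkeeping under stated assumptions: the function name `apply_mutations` is hypothetical, and the 110-residue string is a placeholder constructed only so that L, E and M fall at positions 387, 438 and 442; it is not SEQ ID No. 1 and carries no biological meaning.

```python
def apply_mutations(domain_seq, first_residue, labels):
    """Apply point mutations (e.g. 'L387A') to a domain sequence.

    domain_seq    -- one-letter amino acid sequence of the isolated domain
    first_residue -- full-length residue number of domain_seq[0]
    labels        -- mutation labels numbered against the full-length protein
    """
    residues = list(domain_seq)
    for label in labels:
        wild_type, position, mutant = label[0], int(label[1:-1]), label[-1]
        index = position - first_residue  # convert to a 0-based domain index
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position lies outside the domain")
        if residues[index] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} but found {residues[index]}"
            )
        residues[index] = mutant
    return "".join(residues)


# Placeholder for residues 351 to 460 (110 residues); NOT the real sequence.
PLACEHOLDER = "G" * 36 + "L" + "G" * 50 + "E" + "G" * 3 + "M" + "G" * 18

# A (b)-type double mutant: hole mutation plus D/E covalent handle.
double_mutant = apply_mutations(PLACEHOLDER, 351, ["L387A", "E438C"])
```

The wild-type check is a deliberate design choice: a label whose first letter does not match the residue actually present (for example `D438C` against a glutamic acid) is rejected rather than silently applied, which guards against mixing up the numbering of different bromodomains.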
In some embodiments, the nucleic acid may be provided in the form of a nucleic acid construct comprising the nucleic acid in an expressible form. The nucleic acid construct may be provided in the form of a vector, for example. Suitable vectors include plasmids, bacterial vectors, viral vectors, phage vectors, insect vectors, yeast vectors, mammalian vectors, BACs, YACs, or any other suitable vector. The nucleic acid may be expressed from various cell lines, or clones. In some embodiments, the vector is a vector that replicates in only one type of organism (e.g., bacteria, yeast, insects, mammals, etc.) or a vector that replicates in only one species. Some vectors can have a broad host range. Some vectors may have different functional sequences (e.g., origin of replication, selectable markers, etc.) that are functional in different organisms. These can be used to shuttle a vector (and any nucleic acid cloned into the vector) between two different types of organisms, such as between bacteria and mammals, yeast and mammals, and the like. In some embodiments, the type of vector used may be determined by the type of host cell chosen.
Uses, methods, and kits
As described above, in a fifth aspect, there is provided a use of a tagged compound according to the second aspect in, for example, immunoassays, diagnostics, photocaging, immobilisation, solubilisation, targeted protein degradation, fluorescence resonance energy transfer, time-resolved fluorescence resonance energy transfer, induced proximity, fluorescence imaging, assays, sensing, and detection, wherein the sensing and detection is of a biomolecule or bioactive molecule. For the avoidance of doubt, all embodiments described herein, in relation to the second aspect of the invention, apply mutatis mutandis to the fifth aspect of the invention.
An immunoassay is an assay wherein the detection of a species, typically a biomolecule (such as a protein), is based on the formation of an immune complex, e.g., a complex formed between an antibody and an antigen. For example, the tagged compound according to the second aspect may comprise biotin, and/or may be capable of immobilising the compound onto a bead (such as a Luminex bead), resin or solid surface.
Photocaging is a technique whereby a tagged compound or a target molecule, such as a protein or polypeptide comprising one or more hole-modified mutant bromodomains as described herein, is modified with a photolabile group, wherein the photolabile group alters and/or inhibits a function of the tagged compound or the target molecule, and wherein the function is restored upon exposure of the photolabile group to a predetermined wavelength of light such that the photolabile group is cleaved from the protein. Photocaging can be utilised in studies of biological systems, by enabling
precise control over the function of a protein. In some embodiments, the tagged compound according to the second aspect for use in photocaging comprises a photolabile group, wherein the photolabile group alters or prevents the binding of the tagged compound to the target molecule/protein, and wherein the function of the tagged compound is restored upon exposure of the photolabile group to a predetermined wavelength of light, which alters the photolabile group in a way that restores function. This may be an alteration in the structure and/or orientation of the photolabile group, or cleavage of the photolabile group from the tagged compound. For example, where a tagged compound comprises a photolabile group, it may not be able to covalently bind to a target molecule, such as a protein or peptide, and may only be able to bind to the target molecule when the photolabile group is exposed to a predetermined wavelength of light, thereby altering it in some way, for example structurally or in its orientation, or is cleaved from the tagged compound. Alternatively, where a tagged compound comprises a photolabile group, it may covalently bind to a protein tagged with a hole-modified mutant bromodomain; and that protein may lose its function when it is bound to the tagged compound, and regain its function when the photolabile group is exposed to a predetermined wavelength of light. In some embodiments, the photolabile group is selected from MNI (4-methoxy-7-nitroindolinyl), CNB (α-carboxy-2-nitrobenzyl), MDNI (4-methoxy-5,7-dinitroindolinyl), DMNPE (4,5-dimethoxy-2-nitroacetophenone), NPE (1-(2-nitrophenyl)ethyl), NV (2-nitroveratryl), and CNV (carboxynitroveratryl).
The terms “immobilisation”, “solubilisation”, “targeted protein degradation”, “fluorescence resonance energy transfer”, “fluorescence imaging” and related terms have been defined or discussed above, thus their use in the present context may be construed accordingly. “Time-resolved fluorescence resonance energy transfer” differs from FRET in that it utilises a long lifetime donor (such as a lanthanide complex) to reduce the background signal noise in complex samples: detection of the signal starts after a delay to allow the background noise to decay. On the other hand, standard FRET uses conventional dyes for the donor and does not enable a time delay.

Many types of assay are known to the skilled person, such as ELISA (enzyme-linked immunosorbent assay), PCR (polymerase chain reaction) assay, Western blot, cell viability assays, microarrays, and other assays described herein. Additionally, the skilled person will recognise which compounds of the second aspect are suitable for use in a particular assay.

With respect to the present disclosure, the term "diagnostics" refers to methods and processes involved in identifying the presence, absence, or specifics of a disease,
condition, or other biological state in a patient or sample. Compounds of the second aspect suitable for use in diagnostics may comprise contrast agents, tracers, probes, or other biochemical reagents known to the skilled person, including those described herein.

The term “induced proximity”, sometimes “chemical induced proximity” (CIP) (mentioned above), refers to a technique that can facilitate or enable the interaction of two or more biological molecules, such as proteins, that may not necessarily interact, or only interact to an impractical extent. Using small molecules (proximity-inducing compounds/chimera, PIC) and/or engineered proteins, CIP can artificially induce the proximity of biological molecules, triggering specific cellular processes or reactions. This is often achieved in practice by tagging one or more target proteins with a protein tag, and contacting the target proteins with a bifunctional molecule that comprises ligands for the target proteins or the tags. Thus, when the ligands bind to the tag(s) or protein(s), the target proteins are necessarily brought into proximity by the fact that the ligands are comprised within a small molecule. An example of a specific type of CIP is TPD, described above. Hence, in some embodiments, the tagged compound according to the second aspect for use in induced proximity comprises B, wherein B is capable of binding to: a biomolecule, optionally a protein tag. Particular examples of B molecules that are capable of binding to the above-mentioned entities are described above.

As described above, the sensing and detection is of a biomolecule or bioactive molecule. In some embodiments, the biomolecule is a protein, carbohydrate, nutrient, polysaccharide, glycoprotein, hormone, receptor, antigen, antibody, virus, growth factor, lipoprotein, or any other biomolecule known to the skilled person. Typically, the biomolecule is a protein.
In some embodiments, the bioactive molecule (i.e., typically a small molecule that may have some effect on a biological system, e.g., upon binding to a protein) is a metabolite, substrate, inhibitor, drug, and/or nutrient. In some embodiments, the biomolecule is tagged with a polypeptide comprising one or more hole-modified mutant bromodomains. In more specific embodiments, the hole-modified mutant bromodomain comprises the amino acid sequence of the third aspect of the invention.

As described above, in a sixth aspect, there is provided a method of covalently binding a target biomolecule with a compound, the method comprising contacting the biomolecule with the compound, wherein: the compound is according to the first or second aspect, and the biomolecule comprises a hole-modified mutant bromodomain engineered to comprise a cysteine residue, positioned such that contacting the
biomolecule with the compound allows the cysteine residue to interact with the electrophile of the compound and form a covalent bond.

The term “contacting”, as defined above, is understood to relate to any one or more of the acts of combining, such as reacting, mixing, stirring, slurrying, blending, dissolving, incubating, passing over, flowing over, or otherwise, in any order, and for any length of time. In some embodiments, the contacting comprises incubating. In some embodiments, the contacting comprises incubating in a buffer solution, wherein the buffer solution comprises a buffer agent. Typically, the buffer agent is HEPES (4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid). In some embodiments, the buffer solution additionally comprises NaCl and/or TCEP (tris(2-carboxyethyl)phosphine).

In some embodiments, the target biomolecule is a protein, optionally a tagged protein. In some embodiments, the biomolecule comprises one or more hole-modified mutant bromodomain(s) from the Bromo- and Extra-terminal domain (BET) proteins, Brd2, Brd3, Brd4 and BrdT, or a bromodomain-containing fragment thereof. In more specific embodiments, the hole-modified mutant bromodomain comprises the amino acid sequence according to the third aspect of the invention.

In a further aspect, there is provided a use of a hole-modified mutant bromodomain according to the third aspect for binding a compound according to the first or second aspect. In another aspect, there is provided a use of a nucleic acid encoding a hole-modified mutant bromodomain according to the fourth aspect for the production of a hole-modified mutant bromodomain according to the third aspect. In some embodiments, the hole-modified mutant bromodomain is for binding a compound according to the first or second aspect.
As described above, in a seventh aspect, there is provided a kit comprising, as separate components: (i) a compound according to the first or second aspect; and (ii) a nucleic acid, or construct comprising a nucleic acid, encoding a hole-modified mutant bromodomain.

In some embodiments, the kit comprises (i) a compound according to the first or second aspect; and (ii) one or more reagents suitable for conjugating the compound to a biomolecule, optionally an antibody. In some embodiments, the kit comprises (i) a compound according to the first or second aspect; and (ii) an anti-BromoTag antibody detection reagent.

In some embodiments, the hole-modified mutant bromodomain(s) is from the Bromo- and Extra-terminal domain (BET) proteins, Brd2, Brd3, Brd4 and BrdT, or a bromodomain-containing fragment thereof. In more specific embodiments, the hole-modified mutant bromodomain comprises the amino acid sequence according to the third aspect of the invention.

In some embodiments, the nucleic acid may be provided in the form of a nucleic acid construct comprising the nucleic acid in an expressible form. The nucleic acid construct may be provided in the form of a vector, for example. Suitable vectors include plasmids, bacterial vectors, viral vectors, phage vectors, insect vectors, yeast vectors, mammalian vectors, BACs, YACs, or any other suitable vector. In some embodiments, the vector is a vector that replicates in only one type of organism (e.g., bacteria, yeast, insects, mammals, etc.) or a vector that replicates in only one species. Some vectors can have a broad host range. Some vectors may have different functional sequences (e.g., origin of replication, selectable markers, etc.) that are functional in different organisms. These can be used to shuttle a vector (and any nucleic acid cloned into the vector) between two different types of organisms, such as between bacteria and mammals, yeast and mammals, and the like. In some embodiments, the type of vector used may be determined by the type of host cell chosen.

In some embodiments, the compound is a tagged compound of the second aspect of the invention, wherein B comprises a reactive handle. Suitable examples of reactive handles have been described above, but nonetheless include azides, alkynes, thiols, alkenes, aldehydes, hydrazines, cyclooctynes, tetrazines, nitrile oxides, alkoxyamines, isothiocyanates, sulfonate esters, epoxides, carboxylic acids, esters, and maleimides. In more specific embodiments, the reactive handle is an alkyne or azide. In some embodiments, the kit further comprises a biomolecule capable of reacting with the reactive handle of the tagged compound.
For example, the biomolecule may comprise a complementary reactive handle suitable for reacting with the reactive handle on the tagged compound, for example via a click reaction, as described above.

In a further aspect, there is provided a kit comprising, as separate components: (i) a tagged compound according to the second aspect, wherein B comprises a reactive handle capable of reacting with a biomolecule for conjugation, such as an alkyne or azide; and (ii) a biomolecule capable of reacting with the reactive handle of the tagged compound.
EXAMPLES

Compounds

General Experimental Information: Unless otherwise stated, all reagents and solvents were purchased from commercial sources and used without further purification. Nuclear magnetic resonance spectra were recorded on a Bruker Ascend 500 MHz spectrometer or a Bruker Avance III HD spectrometer, operating at 500 MHz and 400 MHz for 1H NMR, respectively, and 100 MHz for 13C NMR. 1H NMR and 13C NMR chemical shifts (δ) are reported in parts per million (ppm) and are referenced to residual protium in solvent and to the carbon resonances of the residual solvent peak respectively. DEPT and correlation spectra were run in conjunction to aid assignment. Coupling constants (J) are quoted in Hertz (Hz), and the following abbreviations were used to report multiplicity: s = singlet, d = doublet, dd = doublet of doublets, ddd = double doublet of doublets, t = triplet, q = quartet, m = multiplet, br s = broad singlet. Liquid chromatography-mass spectrometry (LC-MS) was carried out on a Shimadzu HPLC/MS 2020 equipped with a Hypersil Gold column (1.9 µm particle size, 50 × 2.1 mm), photodiode array detector and ESI detector, or by using Agilent InfinityLab LC/MSD systems. Purification by flash column chromatography was carried out using Fisher Scientific silica gel 60Å (35-70 μm), or by using Biotage fSelekt, Biotage Isolera, Grace Reveleris, Buchi Pure, or Teledyne Isco Combiflash systems. Thin layer chromatography was performed on glass plates pre-coated with silica gel (Analtech, UNIPLATE™ 250 μm / UV254), with visualization being achieved using UV light (254 nm) and/or by staining with alkaline potassium permanganate dip.
Compound MR100: A solution of methyl ET-JQ1-OMe (described in Bond et al., J. Med. Chem., 2021, 64, 15477) (1.00 equiv., 40 mg, 0.0903 mmol), potassium vinyltrifluoroborate (3.00 equiv., 36 mg, 0.271 mmol), XPhos Pd G2 (0.300 equiv., 21 mg, 0.0271 mmol) and N,N-diisopropylethylamine (5.00 equiv., 0.079 mL, 0.452 mmol) in DMF (0.6315 mL) and water (0.0631 mL) was heated to 100 °C. The reaction was monitored by LC-MS and only the mass of the product was observed after 24 h, with a similar retention time to that of the starting material. After being stirred for 24 h, the mixture was filtered through celite, the solvent was evaporated and the residue was redissolved in DCM. The organic layer was washed with water (2 x 50 mL) and brine (20 x 50 mL) and dried with MgSO4. The crude material was purified by normal phase column chromatography using a 0/100 to 5/95 MeOH/DCM gradient to give a yellow powder (35 mg, 100%).
ESI-MS m/z [M+H]+ 435.0.
1H NMR (500 MHz, CDCl3) δ 7.32 – 7.24 (m, 4H), 6.64 (dd, J = 17.6, 10.9 Hz, 1H), 5.72 (d, J = 17.6 Hz, 1H), 5.24 (d, J = 10.9 Hz, 1H), 4.17 (d, J = 10.9 Hz, 1H), 3.93 (td, J = 10.8, 3.6 Hz, 1H), 3.79 (s, 3H), 2.60 (s, 3H), 2.34 (s, 3H), 2.16-2.07 (m, 1H), 1.64-1.55 (m, 4H), 0.96 (t, J = 7.4 Hz, 3H).
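By way of illustration only, the reagent quantities recited for Compound MR100 can be cross-checked from the stated equivalents. The following is a minimal sketch, not part of the experimental procedure: the function names are hypothetical, and the molecular weight and density of N,N-diisopropylethylamine are literature values assumed here rather than values taken from the present disclosure.

```python
# Sketch: convert equivalents into millimoles and, for a liquid reagent,
# into a dispensing volume. Function names are hypothetical helpers.

# Assumed literature values for N,N-diisopropylethylamine (not from the source).
DIPEA_MW_G_PER_MOL = 129.24
DIPEA_DENSITY_G_PER_ML = 0.742


def reagent_mmol(limiting_mmol: float, equiv: float) -> float:
    """Millimoles of a reagent used at `equiv` relative to the limiting reagent."""
    return limiting_mmol * equiv


def liquid_volume_ml(mmol: float, mw_g_per_mol: float, density_g_per_ml: float) -> float:
    """Volume in mL of a neat liquid that contains `mmol` of material."""
    mass_mg = mmol * mw_g_per_mol          # mmol x g/mol = mg
    volume_ul = mass_mg / density_g_per_ml  # mg / (g/mL) = microlitres
    return volume_ul / 1000.0


# Limiting reagent amount as recited for Compound MR100.
limiting_mmol = 0.0903

vinyl_bf3k_mmol = reagent_mmol(limiting_mmol, 3.00)
catalyst_mmol = reagent_mmol(limiting_mmol, 0.300)
dipea_mmol = reagent_mmol(limiting_mmol, 5.00)
dipea_ml = liquid_volume_ml(dipea_mmol, DIPEA_MW_G_PER_MOL, DIPEA_DENSITY_G_PER_ML)

print(f"potassium vinyltrifluoroborate: {vinyl_bf3k_mmol:.3f} mmol")
print(f"XPhos Pd G2: {catalyst_mmol:.4f} mmol")
print(f"DIPEA: {dipea_mmol:.3f} mmol, {dipea_ml:.3f} mL")
```

Under these assumptions the computed amounts agree, to within rounding, with the recited 0.271 mmol, 0.0271 mmol, and 0.452 mmol (0.079 mL) respectively.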
Compound MR70: To a stirred solution of MR100 (1.00 equiv., 20 mg, 0.0460 mmol) and sodium periodate (3.00 equiv., 30 mg, 0.138 mmol) in acetone (0.5 mL) and water (0.1 mL) was added osmium tetroxide (0.0500 equiv., 15 µL, 0.0023 mmol, 4% w/v solution in water). The reaction was stirred for 2 h, then diluted with EtOAc. The organic layer was washed with water, dried with MgSO4 and concentrated under reduced pressure. The crude material was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 MeCN/H2O 0.1% NH4OH gradient to give a white powder (18 mg, quantitative yield).
ESI-MS [M+H]+ m/z: 437.0.
1H NMR (500 MHz, CDCl3) δ 10.07 (s, 1H), 7.88 (d, J = 8.5 Hz, 2H), 7.57 (d, J = 8.1 Hz, 2H), 4.32 (d, J = 11.0 Hz, 1H), 4.05 (td, J = 10.7, 3.7 Hz, 1H), 3.90 (s, 3H), 2.71 (s, 3H), 2.45 (s, 3H), 2.26 – 2.18 (m, 1H), 1.77 – 1.66 (m, 4H), 1.06 (t, J = 7.4 Hz, 3H).
Compound MR101: A mixture of ET-JQ1-OMe (1.00 equiv., 70 mg, 0.158 mmol), XPhos Pd G2 (0.300 equiv., 37 mg, 0.0475 mmol), XPhos (0.300 equiv., 23 mg, 0.0475 mmol), caesium carbonate (5.00 equiv., 258 mg, 0.792 mmol), and prop-2-yn-1-amine (10.0 equiv., 0.10 mL, 1.58 mmol) was suspended in THF (1.5 mL). The reaction was heated to 100 °C; 5 mL of THF was added every 15 min to compensate for evaporation. After stirring for 1 h, the reaction was cooled to 90 °C, whilst maintaining THF addition. Then the mixture was stirred for 2 h, cooled to 60 °C and stirred overnight. The mixture was suspended in DCM and filtered through celite. The organic layer was washed with water and dried with MgSO4. The crude was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 MeCN/H2O 0.1% NH4OH gradient to give a white powder (3 mg, 20%).
ESI-MS [M+H]+ m/z: 462.1.
1H NMR (500 MHz, CDCl3) δ 7.39 – 7.28 (m, 4H), 4.23 (d, J = 10.9 Hz, 1H), 3.98 (td, J = 10.7, 3.6 Hz, 1H), 3.84 (s, 3H), 3.65 (s, 2H), 2.65 (s, 3H), 2.40 (s, 3H), 2.20 – 2.13 (m, 1H), 1.71 – 1.60 (m, 4H), 1.01 (t, J = 7.4 Hz, 3H).
Compound MR104: To a stirred solution of MR100 (30 mg, 0.05 mmol) in dichloromethane (1 mL) was added m-chloroperbenzoic acid (m-CPBA, 12 mg, 0.05 mmol) and the reaction was left stirring overnight. The mixture was diluted with DCM (50 mL) and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4, and the solvent was evaporated under reduced pressure. The crude was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 MeCN/H2O 0.1% NH4OH gradient to give a white powder (3 mg, 10%).
ESI-MS [M+H]+ m/z: 451.0.
1H NMR (500 MHz, CDCl3) δ 7.40 – 7.32 (m, 2H), 7.32 (d, J = 8.5 Hz, 1H), 7.23 (d, J = 7.6 Hz, 1H), 4.23 (d, J = 11.0 Hz, 1H), 3.98 (td, J = 10.9, 3.6 Hz, 1H), 3.86 – 3.85 (m, 1H), 3.84 (s, 3H), 3.16 – 3.13 (m, 1H), 2.74 (td, J = 5.8, 2.5 Hz, 1H), 2.69 (s, 1H), 2.66 (d, J = 2.9 Hz, 3H), 2.40 (s, 3H), 2.21 – 2.13 (m, 1H), 1.71 – 1.59 (m, 4H), 1.01 (t, J = 7.4 Hz, 3H).
Compound ET-JQ1-OMe-aniline: To a stirred solution of ET-JQ1-OMe (2.50 g, 5.64 mmol) in 1,4-dioxane (125 mL) was added benzophenone imine (1.22 g, 6.77 mmol) and potassium phosphate tribasic (2.99 g, 14.11 mmol), and argon was bubbled through the stirring mixture for 30 minutes. After this time, tBuXPhos (0.36 g, 0.85 mmol) and Pd2(dba)3 (0.26 g, 0.28 mmol) were added, and after passage of argon for a further 5 minutes, the reaction mixture was heated at 85 °C overnight. The reaction mixture was cooled to 55 °C and partially concentrated under reduced pressure, the subsequent residue being partitioned between water and ethyl acetate. The organic phase was separated, and the aqueous component was extracted with ethyl acetate. The combined organic extracts were washed with brine, dried over anhydrous magnesium sulfate and concentrated under reduced pressure to a brown oil. The crude imine was dissolved in THF (50 mL) and treated with 1 M hydrochloric acid (50 mL), and the resulting solution was stirred at ambient temperature for 2 h before being partially concentrated under reduced pressure. The aqueous residue was further diluted with water and washed with ethyl acetate. The aqueous phase was then adjusted to pH 8-9 by slow addition of saturated aqueous NaHCO3 solution and extracted with ethyl acetate (3 portions). These organic extracts were combined, dried over anhydrous magnesium sulfate and concentrated under reduced pressure to a pale-yellow foam. Purification by flash column chromatography, eluting with 1-2% 7 M methanolic ammonia/DCM, afforded a yellow foam. Lyophilisation from MeCN/H2O (1:2) afforded the title compound as a pale yellow solid (1.20 g, 50%).
ESI-MS [M+Na]+ m/z: 446.2.
1H NMR (400 MHz, CDCl3) δ: 7.19 (d, J= 8.3 Hz, 2H), 6.58 (d, J= 8.8 Hz, 2H), 4.18 (d, J= 11.0 Hz, 1H), 3.97 (td, J= 10.7, 3.7 Hz, 1H), 3.83 (s, 3H), 2.64 (s, 3H), 2.40 (s, 3H), 2.22-2.11 (m, 1H), 1.74-1.72 (m, 3H), 1.68-1.59 (m, 1H), 1.01 (t, J= 7.4 Hz, 3H).
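For illustration, the recited yield of ET-JQ1-OMe-aniline can be cross-checked from the scale of the reaction and the observed sodium adduct. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the neutral mass is estimated by subtracting an approximate sodium mass (22.99, an assumed value) from the recited m/z, and this estimate is used in place of the average molecular weight, which is not stated in the present disclosure.

```python
# Sketch: estimate percent yield from the limiting reagent amount, the isolated
# mass, and a neutral mass inferred from the [M+Na]+ ion. Helper names are
# hypothetical; the sodium mass is an assumed approximate value.

SODIUM_MASS = 22.99  # approximate mass contribution of the sodium adduct


def neutral_mass_from_sodium_adduct(mz: float) -> float:
    """Approximate neutral mass M from a singly charged [M+Na]+ ion."""
    return mz - SODIUM_MASS


def percent_yield(isolated_mg: float, limiting_mmol: float, product_mw: float) -> float:
    """Isolated mass as a percentage of the theoretical mass of product."""
    theoretical_mg = limiting_mmol * product_mw  # mmol x g/mol = mg
    return 100.0 * isolated_mg / theoretical_mg


# Values as recited: 5.64 mmol of ET-JQ1-OMe, 1.20 g isolated, [M+Na]+ 446.2.
product_mw = neutral_mass_from_sodium_adduct(446.2)
yield_pct = percent_yield(isolated_mg=1200.0, limiting_mmol=5.64, product_mw=product_mw)

print(f"estimated neutral mass: {product_mw:.1f}")
print(f"estimated yield: {yield_pct:.0f}%")
```

Under these assumptions the estimate is consistent with the recited yield of 50%.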
Compound MR112: ET-JQ1-OMe-aniline (1.00 equiv., 75 mg, 0.177 mmol) was dissolved in DCM (1.9 mL) and N,N-diisopropylethylamine (1.00 equiv., 0.031 mL, 0.177 mmol) was added. Following stirring for 1 min, 2-chloroacetyl chloride (3.00 equiv., 0.043 mL, 0.531 mmol) was added dropwise and the reaction was stirred for another 2 hours. After 2 hours, the reaction was quenched with a few drops of MeOH and the solvent evaporated in vacuo. The residue was partitioned between water and dichloromethane, and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4 and concentrated in vacuo. The crude was purified by normal phase chromatography using a gradient of DCM/MeOH 100/0 to 90/10 to give a white solid (60 mg, 67%).
ESI-MS [M+H]+ m/z: 500.1.
1H NMR (500 MHz, CDCl3) δ 8.49 (s, 1H), 7.51 (d, J = 9.0 Hz, 2H), 7.31 (d, J = 8.6 Hz, 2H), 4.17 (d, J = 11.0 Hz, 1H), 4.12 (s, 2H), 3.91 (td, J = 10.7, 3.7 Hz, 1H), 3.78 (s, 3H), 2.60 (s, 3H), 2.34 (s, 3H), 2.13 – 2.05 (m, 1H), 1.65 – 1.53 (m, 4H), 0.95 (t, J = 7.4 Hz, 3H).
Compound MR121: ET-JQ1-OMe-aniline (1 equiv., 75 mg, 0.18 mmol) was dissolved in DCM (1.9 mL) and pyridine (1.00 equiv., 0.031 mL, 0.177 mmol) was added and the reaction was left stirring at 0 °C. Following stirring for 1 min, vinyl sulfonyl chloride (1.00 equiv., 0.017 mL, 0.177 mmol) was added dropwise and the reaction was stirred for another 2 hours. After 2 hours, the reaction was quenched with a few drops of MeOH and the solvent evaporated in vacuo. The residue was partitioned between dichloromethane and water, and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4 and concentrated in vacuo to give a white crystalline solid (20 mg, 30%).
ESI-MS [M+H]+ m/z: 514.0.
1H NMR (500 MHz, CDCl3) δ 7.44 (s, 1H), 7.35 (d, J = 8.2 Hz, 2H), 7.21 – 7.16 (m, 1H), 6.53 (dd, J = 16.5, 9.9 Hz, 1H), 6.27 (d, J = 16.6 Hz, 1H), 5.95 (d, J = 9.9 Hz, 1H), 4.25 (d, J = 11.0 Hz, 1H), 3.99 (td, J = 10.7, 3.7 Hz, 1H), 3.86 (s, 3H), 2.69 (s, 3H), 2.43 (s, 3H), 2.24 – 2.12 (m, 1H), 1.74 – 1.61 (m, 4H), 1.04 (t, J = 7.4 Hz, 3H).
Compound MR116: ET-JQ1-OMe-aniline (1 equiv., 75 mg, 0.18 mmol) was dissolved in DCM (1.9 mL) and triethylamine (1.20 equiv., 0.030 mL, 0.212 mmol) was added. Following stirring for 1 min, acryloyl chloride (1.50 equiv., 0.021 mL, 0.266 mmol) was added dropwise and the reaction was stirred for another 2 hours at room temperature. The reaction was then diluted with DCM and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4 and concentrated in vacuo. The crude material was purified by normal phase column chromatography using a 0/100 to 5/95 MeOH/DCM gradient to give a yellow powder (70 mg, 82%).
ESI-MS [M+H]+ m/z: 478.1.
1H NMR (500 MHz, CDCl3) δ 8.46 (s, 1H), 7.58 (d, J = 8.3 Hz, 2H), 7.26 (d, J = 8.8 Hz, 2H), 6.36 (dd, J = 16.9, 1.6 Hz, 1H), 6.27 (dd, J = 16.8, 9.9 Hz, 1H), 5.65 (dd, J = 9.9, 1.7 Hz, 1H), 4.16 (d, J = 10.9 Hz, 1H), 3.89 (td, J = 10.8, 3.7 Hz, 1H), 3.76 (s, 3H), 2.58 (s, 3H), 2.33 (s, 3H), 2.12 – 2.04 (m, 1H), 1.62 – 1.53 (m, 4H), 0.93 (t, J = 7.4 Hz, 3H).
Compound MR135: A solution of 2-fluoroprop-2-enoic acid (1.2 equiv., 0.011 mg), HATU (1.2 equiv., 43 mg, 0.11 mmol) and triethylamine (1 equiv., 0.01 mL, 0.9 mmol) in DMF (1 mL) was mixed for 3 minutes. The resulting solution was added to a solution of ET-JQ1-OMe-aniline (1 equiv., 40 mg, 0.1 mmol) and triethylamine (1 equiv., 0.01 mL, 0.9 mmol) in DMF (1.4 mL). The solvent was evaporated in vacuo, the crude material was dissolved in DCM, and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL) and dried with MgSO4. The crude was purified by normal phase column chromatography using a 0/100 to 5/95 MeOH/DCM gradient to give the title compound as a white crystalline solid (35 mg, 75% yield).
ESI-MS [M+H]+ m/z: 496.3.
1H NMR (500 MHz, CDCl3) δ 8.25 (d, J = 4.9 Hz, 1H), 7.94 (s, 1H), 7.59 – 7.54 (m, 2H), 7.34 – 7.29 (m, 2H), 5.25 – 5.14 (m, 1H), 4.17 (d, J = 11.0 Hz, 1H), 3.92 (td, J = 10.7, 3.7 Hz, 1H), 3.78 (s, 3H), 2.59 (s, 3H), 2.34 (s, 3H), 2.16 – 2.05 (m, 1H), 1.65 – 1.53 (m, 4H), 0.94 (t, J = 7.4 Hz, 3H).
(2-Amino-4,5-dimethylthiophen-3-yl)(3-chlorophenyl)methanone: To a stirred suspension of 3-chlorobenzoylacetonitrile (40.00 g, 222.71 mmol) in EtOH (900 mL) was added methyl ethyl ketone (24.94 mL, 278.39 mmol), morpholine (3.90 mL, 44.54 mmol) and then sulfur (62.84 g, 244.98 mmol), and the resulting reaction mixture was stirred at 70 °C overnight. Upon cooling to ambient temperature, the reaction mixture was poured into brine (2500 mL) and extracted with ethyl acetate (3 x 1000 mL). The combined organic extracts were washed with brine (1000 mL), dried over anhydrous magnesium sulfate and concentrated under reduced pressure to an orange oil. Trituration with TBME overnight at ambient temperature afforded a yellow solid (16.91 g). Purification of the mother liquors by dry flash chromatography, eluting with 0-20% EtOAc/DCM, followed by trituration with TBME and petroleum ether 40:60 afforded a second crop of crude product (6.11 g). The combined material was converted to the oxalic acid salt by treatment with oxalic acid dihydrate (10.90 g, 86.6 mmol) in MeOH (200 mL) and water (60 mL), and after concentration under reduced pressure, the salt was recrystallized from refluxing acetonitrile (330 mL) to give a crystalline solid (16.35 g). Treatment with 0.5 M NaOH (aq.) (400 mL), DCM (300 mL) and TBME (400 mL) gave a biphasic mixture. The organic component was separated, and the aqueous phase extracted with TBME (2 x 200 mL). The combined organic extracts were dried over anhydrous magnesium sulfate and concentrated under reduced pressure to afford the title compound as a green solid (13.59 g, 23%).
ESI-MS [M+H]+ m/z: 266.1.
1H NMR (400 MHz, CDCl3) δ: 7.52-7.32 (m, 4H), 2.13 (d, J= 0.7 Hz, 3H), 1.54 (d, J= 0.7 Hz, 3H).
Methyl (R)-2-((S)-5-(3-chlorophenyl)-6,7-dimethyl-2-oxo-2,3-dihydro-1H-thieno[2,3-e][1,4]diazepin-3-yl)butanoate: To a stirred suspension of (2-amino-4,5-dimethylthiophen-3-yl)(3-chlorophenyl)methanone (8.59 g, 32.31 mmol) in toluene (40 mL) at ambient temperature was added freshly dried and activated 4Å molecular sieves (30 g) followed by TFA (4.80 mL, 64.52 mmol). The resulting red solution was stirred for 5 minutes, after which time a solution of methyl (R)-2-((S)-2,5-dioxooxazolidin-4-yl)butanoate (6.50 g, 32.31 mmol) in toluene (20 mL) was added dropwise, and the reaction mixture was heated at 60 °C for 2.5 hours. Triethylamine (13.51 mL, 96.93 mmol) was added, and the reaction mixture was heated at 80 °C overnight. Upon cooling to ambient temperature, the reaction mixture was filtered, the filter cake being washed with DCM, and the filtrate was concentrated under reduced pressure. The residue was partitioned between saturated aqueous NaHCO3 solution and DCM. The organic phase was separated, and the aqueous component was extracted with DCM. The combined organic extracts were washed with brine, dried over anhydrous magnesium sulfate and concentrated under reduced pressure. Purification by flash column chromatography, eluting with 0-10% EtOAc/DCM, and then for a second time, eluting with 2-10% EtOAc/cyclohexane, afforded the title compound as a yellow solid (5.20 g, 38%).
ESI-MS: [M+H]+ m/z: 405.0.
1H NMR (400 MHz, CDCl3) δ: 8.63 (br s, 1H), 7.42-7.36 (m, 2H), 7.29-7.22 (m, 2H), 3.83 (s, 3H), 3.71-3.61 (m, 1H), 2.30 (d, J= 0.5 Hz, 3H), 1.96-1.85 (m, 1H), 1.65-1.53 (m, 5H), 1.02 (t, J= 7.4 Hz, 3H).
Compound m-chloro-ET-JQ1-OMe: To a stirred solution of methyl (R)-2-((S)-5-(3-chlorophenyl)-6,7-dimethyl-2-oxo-2,3-dihydro-1H-thieno[2,3-e][1,4]diazepin-3-yl)butanoate (5.00 g, 12.35 mmol) in THF (60 mL) at −78 °C was added dropwise potassium tert-butoxide (24.70 mL, 24.70 mmol, 1 M solution in THF), and stirring was maintained at this temperature for 30 minutes. Diethyl chlorophosphate (3.55 mL, 24.70 mmol) was added dropwise, and upon completion of the addition, the reaction mixture was allowed to warm to ambient temperature over 90 minutes, with an additional portion of THF (8 mL) being added during this time. Acetohydrazide (2.74 g, 37.05 mmol) was added portion-wise, and stirring was maintained at ambient temperature for 1 hour. n-Butanol (90 mL) was added and the reaction mixture was heated at 90 °C overnight, before being re-cooled to ambient temperature and concentrated under reduced pressure. The residue was partitioned between saturated aqueous NaHCO3 solution and DCM. The organic phase was separated, and the aqueous component was extracted with DCM. The combined organic extracts were washed with brine, dried over anhydrous magnesium sulfate and concentrated under reduced pressure. Purification by flash column chromatography, eluting with 5-75% EtOAc/heptane, followed by re-concentration from cyclohexane/TBME mixture and drying under high vacuum at 60 °C afforded the title compound as a beige solid (2.80 g, 51%).
ESI-MS: [M+H]+ m/z: 443.2.
1H NMR (400 MHz, CDCl3) δ: 7.40-7.36 (m, 2H), 7.30-7.23 (m, 2H), 4.25 (d, J= 11.0 Hz, 1H), 3.99 (td, J= 10.7, 3.7 Hz, 1H), 3.87 (s, 3H), 2.68 (s, 3H), 2.42 (d, J= 0.6 Hz, 3H), 2.23-2.11 (m, 1H), 1.74-1.61 (m, 4H), 1.02 (t, J= 7.4 Hz, 3H).
Compound MR108: A solution of m-chloro-ET-JQ1-OMe (1.00 equiv., 100 mg, 0.241 mmol), potassium vinyltrifluoroborate (3.00 equiv., 96 mg, 0.723 mmol), XPhos Pd G2 (0.300 equiv., 56.9 mg, 0.0723 mmol) and N,N-diisopropylethylamine (4.00 equiv., 0.167 mL, 0.964 mmol) in DMF (2 mL) and water (0.5 mL) was heated to 100 °C. The reaction was monitored by LC-MS and only the mass of the product was observed after 24 hours, with a similar retention time to that of the starting material. After being stirred for 24 h, the mixture was filtered and washed with water and saturated aqueous NaCl. The organic layer was washed with water (2 x 50 mL) and brine (20 x 50 mL) and dried with MgSO4. The crude material was purified by normal phase column chromatography using a MeOH/DCM gradient (0 to 12%; the product eluted at around 8%; UV detection) to give a yellow powder (90 mg, 91%).
ESI-MS [M+H]+ m/z: 437.0.
1H NMR (500 MHz, CDCl3) δ 7.47 – 7.41 (m, 2H), 7.31 – 7.25 (m, 1H), 7.23 – 7.17 (m, 1H), 6.68 (dd, J = 17.6, 10.9 Hz, 1H), 5.74 – 5.66 (m, 1H), 5.26 (d, J = 10.8 Hz, 1H), 4.24 (d, J = 11.0 Hz, 1H), 4.01 (td, J = 10.8, 3.6 Hz, 1H), 3.87 (s, 3H), 2.67 (s, 3H), 2.40 (s, 3H), 2.22 – 2.14 (m, 1H), 1.71 – 1.61 (m, 4H), 1.03 (t, J = 7.4 Hz, 3H).
Compound MR115: To a solution of MR108 (1.00 equiv., 28 mg, 0.0660 mmol) and sodium periodate (3.00 equiv., 42 mg, 0.198 mmol) in acetone (0.5 mL) and water (0.1 mL) was added osmium tetroxide (0.0500 equiv., 0.033 mL, 0.0023 mmol) as a 4% aqueous solution. The reaction was stirred for 2 h, then diluted with EtOAc. The organic layer was washed with water, dried with MgSO4 and evaporated in vacuo. The crude was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 MeCN/H2O 0.1% NH4OH gradient to give a white powder (27 mg, 93.8%).
ESI-MS [M+H]+ m/z: 437.2.
1H NMR (500 MHz, CDCl3) δ 10.00 (s, 1H), 7.92 (d, J = 7.6 Hz, 1H), 7.84 (s, 1H), 7.72 (d, J = 7.8 Hz, 1H), 7.54 (t, J = 7.7 Hz, 1H), 4.28 (d, J = 11.0 Hz, 1H), 4.05 – 3.97 (m, 1H), 3.89 (s, 3H), 2.69 (s, 3H), 2.42 (s, 3H), 2.20 – 2.14 (m, 1H), 1.75-1.65 (m, 4H), 1.03 (t, J = 7.5 Hz, 3H).
Compound MR111: To a stirred solution of MR108 (30 mg, 0.05 mmol) in dichloromethane (1 mL) was added m-chloroperbenzoic acid (m-CPBA, 12 mg) and the reaction was left stirring overnight. The mixture was diluted with DCM (50 mL) and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL) and dried with MgSO4. The crude was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 MeCN/H2O 0.1% NH4OH gradient to give a white powder (2.5 mg, 7%).
ESI-MS [M+H]+ m/z: 451.2.
1H NMR (500 MHz, CDCl3) δ 7.27 – 7.18 (m, 4H), 4.20 – 4.14 (m, 1H), 3.98 – 3.89 (m, 1H), 3.82 – 3.75 (m, 4H), 3.09 – 3.02 (m, 1H), 2.72 – 2.66 (m, 1H), 2.65 – 2.58 (m, 3H), 2.34 (s, 3H), 2.15 – 2.06 (m, 1H), 1.65 – 1.52 (m, 4H), 0.95 (t, J = 7.4 Hz, 3H).
Compound MR109: A mixture of ET-JQ1-OMe (1.0 equiv., 50 mg, 0.11 mmol), XPhos (0.300 equiv., 16 mg, 0.034 mmol), caesium carbonate (5.00 equiv., 184 mg, 0.566 mmol), XPhos Pd G2 (0.3 equiv., 26.7 mg, 0.034 mmol) and prop-2-yn-1-amine (10.0 equiv., 0.07 mL, 1.13 mmol) was suspended in THF (1.5 mL). The reaction was heated to 100 °C. After stirring for 1 h, the reaction was cooled to 90 °C, with further THF being added as needed to prevent the mixture from drying out. Then the mixture was stirred for 2 h, cooled to 60 °C and stirred overnight. The mixture was suspended in dichloromethane and filtered through celite. The organic layer was washed with water and dried with MgSO4. The crude was purified by preparative reverse phase column chromatography using a 5/95 to 95/5 ACN/H2O 0.1% formic acid gradient to give the compound as a white powder (7 mg, 13%).
ESI-MS [M+H]+ m/z: 462.1.
1H NMR (500 MHz, CDCl3) δ 7.36 (d, J = 8.2 Hz, 2H), 7.31 (d, J = 8.0 Hz, 2H), 4.23 (d, J = 10.9 Hz, 1H), 3.98 (td, J = 10.7, 3.6 Hz, 1H), 3.84 (s, 3H), 3.65 (s, 2H), 2.65 (s, 3H), 2.40 (s, 3H), 2.17 – 2.05 (m, 1H), 1.62 – 1.55 (m, 4H), 1.01 (t, J = 7.4 Hz, 3H).
Compound ET-JQ1-OMe-m-aniline: To a stirred solution of m-chloro-ET-JQ1-OMe (1.60 g, 3.61 mmol) in 1,4-dioxane (50 mL) was added benzophenone imine (0.79 g, 4.33 mmol) and potassium phosphate tribasic (1.92 g, 9.03 mmol), and the reaction mixture was degassed and back-filled with argon. tBuXPhos (0.23 g, 0.54 mmol) and Pd2(dba)3 (0.17 g, 0.18 mmol) were added, and after further degassing with back-filling of argon, the reaction mixture was heated at 85 °C overnight. The reaction mixture was cooled to 55 °C and partially concentrated under reduced pressure, the subsequent residue being partitioned between water and ethyl acetate. The organic phase was separated, and the aqueous component was extracted with ethyl acetate. The combined organic extracts were washed with brine, dried over anhydrous magnesium sulfate and concentrated under reduced pressure to a brown oil. The crude imine was dissolved in THF (30 mL) and treated with 1 M hydrochloric acid (30 mL), and the resulting solution was stirred at ambient temperature for 2 h before being partially concentrated under reduced pressure. The aqueous residue was further diluted with water and washed with ethyl acetate. The aqueous phase was then adjusted to pH 8-9 by slow addition of saturated aqueous NaHCO3 solution and extracted with ethyl acetate (3 portions). These organic extracts were combined, dried over anhydrous magnesium sulfate and concentrated under reduced pressure. Purification by flash column chromatography, eluting with 0-2.5% 7 M methanolic ammonia/DCM, afforded a yellow foam. This process was repeated, followed by trituration with TBME/cyclohexane, affording the title compound as a yellow solid (0.76 g, 50%).
ESI-MS: [M+Na]+ m/z: 446.2.
1H NMR (400 MHz, CDCl3) δ: 7.10 (t, J= 7.8 Hz, 1H), 6.75-6.69 (m, 2H), 6.66 (d, J= 7.6 Hz, 1H), 4.22 (d, J= 11.0 Hz, 1H), 3.99 (td, J= 10.7, 3.7 Hz, 1H), 3.84 (s, 3H), 2.66 (s, 3H), 2.40 (s, 3H), 2.24-2.13 (m, 1H), 1.73-1.61 (m, 4H), 1.01 (t, J= 7.4 Hz, 3H).
Compound MR118: ET-JQ1-OMe-m-aniline (1.00 equiv., 60 mg, 0.142 mmol) was dissolved in DCM (1.4 mL) and N,N-diisopropylethylamine (1.00 equiv., 0.025 mL, 0.142 mmol) was added. Following stirring for 1 min, 2-chloroacetyl chloride (3.00 equiv., 0.043 mL, 0.531 mmol) was added dropwise and the reaction was stirred for another 2 hours. After 2 hours, the reaction was quenched with a few drops of MeOH and the solvent was evaporated in vacuo. The residue was partitioned between dichloromethane and water, and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4 and concentrated in vacuo to give a white solid (50 mg, 71%).
1H NMR (500 MHz, CDCl3) δ 8.46 (s, 1H), 7.66 (d, J = 9.5 Hz, 1H), 7.58 (s, 1H), 7.30 (t, J = 7.9 Hz, 1H), 7.11 (d, J = 7.7 Hz, 1H), 4.24 (d, J = 11.0 Hz, 1H), 4.17 (s, 2H), 3.98 (td, J = 10.7, 3.7 Hz, 1H), 3.84 (s, 3H), 2.66 (s, 3H), 2.41 (s, 3H), 2.16-2.10 (m, 1H), 1.73 – 1.59 (m, 4H), 1.00 (t, J = 7.4 Hz, 3H).
Compound MR117: ET-JQ1-OMe-m-aniline (1 equiv., 60 mg, 0.14 mmol) was dissolved in DCM (1.42 mL), pyridine (1.00 equiv., 0.023 mL, 0.28 mmol) was added and the reaction was left stirring at 0 °C. Following stirring for 1 min, vinyl sulfonyl chloride (1.00 equiv., 0.012 mL, 0.12 mmol) was added dropwise and the reaction was stirred for another 2 hours. After 2 hours, the reaction was quenched with a few drops of MeOH and the solvent was evaporated in vacuo. The residue was dissolved in dichloromethane and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4 and concentrated in vacuo to give a white crystalline solid (43 mg, 54% yield).
ESI-MS [M+H]+ m/z : 514.0.
1H NMR (500 MHz, CDCl3) δ 7.43 (s, 1H), 7.31 – 7.20 (m, 2H), 7.08 (d, J = 7.1 Hz, 1H), 6.52 (dd, J = 16.5, 9.9 Hz, 1H), 6.25 (d, J = 16.6 Hz, 1H), 5.93 (d, J = 9.9 Hz, 1H), 4.25 (d, J = 11.0 Hz, 1H), 3.96 (td, J = 10.7, 3.7 Hz, 1H), 3.85 (s, 3H), 2.66 (s, 3H), 2.40 (s, 3H), 2.23 – 2.01 (m, 1H), 1.73 – 1.50 (m, 4H), 1.01 (t, J = 7.4 Hz, 3H).
Compound MR119: ET-JQ1-OMe-m-aniline (1 equiv., 60 mg, 0.14 mmol) was dissolved in DCM (1.4 mL) and triethylamine (1.20 equiv., 0.024 mL, 0.2 mmol) was added. Following stirring for 1 min, acryloyl chloride (1.5 equiv., 0.010 mL, 0.13 mmol) was added dropwise and the reaction was stirred for 2 hours. The reaction was quenched with MeOH and the solvent was evaporated. The crude material was then dissolved in DCM and the organic layer was washed with water (2 x 50 mL) and brine (2 x 50 mL), then dried with MgSO4. The crude material was further purified by normal phase column chromatography using a 0/100 to 5/95 MeOH/DCM gradient to give a yellow powder (40 mg, 59% yield).
ESI-MS [M+H]+ m/z : 478.1.
1H NMR (500 MHz, CDCl3) δ 7.70 – 7.63 (m, 2H), 7.51 (s, 1H), 7.22 (t, J = 7.9 Hz, 1H), 7.02 (d, J = 7.7 Hz, 1H), 6.34 (dd, J = 16.8, 1.3 Hz, 1H), 6.19 (dd, J = 16.9, 10.2 Hz, 1H), 5.68 (dd, J = 10.2, 1.3 Hz, 1H), 4.18 (d, J = 11.0 Hz, 1H), 3.91 (td, J = 10.7, 3.7 Hz, 1H), 3.77 (s, 3H), 2.59 (s, 3H), 2.34 (s, 3H), 2.17 – 2.05 (m, 1H), 1.64 – 1.56 (m, 4H), 0.94 (t, J = 7.4 Hz, 3H).
Compound MR85: To a solution of MR75 (see below) in THF was added a freshly prepared solution of NaOMe (10 equiv., 3.29 mmol) in MeOH (2.5 mL). The mixture was then heated at 120 °C for 40 min. After 40 min, acetic acid was added to quench the reaction and the mixture of diastereomers was extracted with DCM (x 3). LC-MS showed the presence of both diastereomers, and ester hydrolysis to the free acid as a minor side reaction. To re-esterify, the crude was treated with thionyl chloride (10 equiv.) in methanol (2.5 mL). The two diastereomers were separated by preparative reverse phase column chromatography using a 5/95 to 95/5 ACN/H2O (0.1% formic acid) gradient to give methyl (±)-(S*)-2-((R*)-4-(3-fluoroacrylamidophenyl)-2,3,9-trimethyl-6H-thieno[3,2-f][1,2,4]triazolo[4,3-a][1,4]diazepin-6-yl)butanoate as a white solid (26% yield for the desired diastereomer (±)-(1S*-2R*)).
ESI-MS [M+H]+ m/z : 455.1.
1H NMR (500 MHz, CDCl3) δ 7.43 – 7.31 (m, 4H), 5.96 – 5.84 (m, 1H), 5.12 – 5.04 (m, 2H), 4.31 (d, J = 10.9 Hz, 1H), 4.21 – 4.13 (m, 2H), 3.84 (s, 3H), 2.69 (s, 3H), 2.53 – 2.42 (m, 4H), 1.71 (s, 3H).
Compound MR107: To a stirred solution of MR85 (15 mg, 0.033 mmol) and sodium periodate (3.00 equiv., 21 mg, 0.0989 mmol) in acetone was added water, followed by dropwise addition of osmium tetroxide (0.05 equiv., 0.010 mL, 0.00165 mmol) as a 4% aqueous solution. The reaction was stirred for 2 h, then diluted with EtOAc and washed with NaHCO3. The crude was purified by reverse phase column chromatography using a 5/95 to 95/5 ACN/H2O (0.1% formic acid) gradient to give (±)-methyl-(R*)-2-((S*)-4-(4-chlorophenyl)-2,3,9-trimethyl-6H-thieno[3,2-f][1,2,4]triazolo[4,3-a][1,4]diazepin-6-yl)-4-oxobutanoate as a white solid (3 mg, 15%).
ESI-MS [M+H]+ m/z : 457.1.
1H NMR (500 MHz, CDCl3) δ 9.93 (s, 1H), 7.37 (d, J = 8.5 Hz, 2H), 7.33 (d, J = 9.0 Hz, 2H), 4.58 (d, J = 7.8 Hz, 1H), 4.26 – 4.20 (m, 1H), 3.76 (s, 3H), 3.50 – 3.44 (m, 1H), 2.69 – 2.64 (m, 4H), 2.42 (s, 3H), 1.70 (s, 3H).
Compound MR89: JQ1-OMe (1 equiv., 100 mg, 0.24 mmol) was dissolved in THF (2.5 mL) and cooled to −78 °C. A solution of 0.5 M KHMDS in toluene (1.5 equiv., 0.72 mL, 0.36 mmol) was added dropwise to the flask and the mixture was stirred at −78 °C for 30 min. The colour went from pale yellow to deep green. A solution of Davis reagent (2-(phenylsulfonyl)-3-phenyloxaziridine, 1.5 equiv., 82 mg, 0.36 mmol) was added dropwise and the mixture was left stirring at −78 °C for 1 h. A few drops of acetic acid were added to stop the reaction, the solvent was evaporated in vacuo and the residue was partitioned between saturated aqueous NaHCO3 and dichloromethane. The aqueous phase was extracted 3 times with dichloromethane. The combined organic layers were dried with MgSO4 and concentrated under reduced pressure. The resulting crude was purified by reverse phase column chromatography using a 5/95 to 95/5 ACN/H2O (0.1% formic acid) gradient to give the title compound as a brown solid (28 mg, 27%).
ESI-MS [M+H]+ m/z : 431.1.
Diastereomer 1: 1H NMR (500 MHz, CDCl3) δ 7.40 (d, J = 8.2 Hz, 2H), 7.32 (d, J = 8.7 Hz, 2H), 5.32 (d, J = 2.6 Hz, 1H), 4.60 (d, J = 2.6 Hz, 1H), 3.88 (s, 3H), 2.68 (s, 3H), 2.41 (s, 3H), 1.67 (s, 3H).
Diastereomer 2: 1H NMR (500 MHz, CDCl3) δ 7.43 (d, J = 8.2 Hz, 2H), 7.33 (d, J = 8.3 Hz, 2H), 5.26 (d, J = 5.0 Hz, 1H), 4.61 (d, J = 5.0 Hz, 1H), 3.83 (s, 3H), 2.67 (s, 3H), 2.41 (s, 3H), 1.68 (s, 3H).
Compound MR94: A mixture of diastereomers MR89 (1.0 equiv., 40 mg, 0.1 mmol) was dissolved in dichloromethane (1 mL) and reacted with chloroacetyl chloride (2.0 equiv., 14 µL, 0.2 mmol) and pyridine (2.0 equiv., 15 µL, 0.2 mmol). The reaction was left stirring at room temperature for 2 h and monitored by LC-MS. Once the reaction was complete, the solvent was removed in vacuo and the resulting mixture of diastereomers was purified by normal phase chromatography using a gradient of DCM/MeOH 100/0 to 90/10 to give the title compound as a white solid (diastereomer ratio was calculated to be 9:1 based on NMR peak ratio). This mixture was taken forward as a mixture for in vitro experiments.
ESI-MS [M+H]+ m/z : 507.1.
Major diastereomer 1H NMR (500 MHz, CDCl3) δ 7.38 (d, J = 8.4 Hz, 2H), 7.34 – 7.29 (m, 2H), 6.35 (d, J = 4.3 Hz, 1H), 4.75 (d, J = 4.3 Hz, 1H), 4.36 – 4.25 (m, 2H), 3.84 (s, 3H), 2.65 (s, 3H), 2.40 (s, 3H), 1.67 (s, 3H).
Compound MR75: JQ1-OMe (1 equiv., 1 g) was dissolved in THF (25 mL) and cooled to −78 °C under an inert atmosphere. A solution of 0.5 M KHMDS in toluene (1.5 equiv., 7.2 mL, 3.62 mmol) was added dropwise to the flask and the mixture was stirred at −78 °C for 10 min. The colour went from pale yellow to deep green to purple/black. 3-Iodoprop-1-ene (4.00 equiv., 0.88 mL, 9.64 mmol) was added dropwise and the reaction was left at −78 °C for 1 h. A few drops of acetic acid were added to stop the reaction and the solvent was evaporated. The residue was partitioned between saturated aqueous NaHCO3 and dichloromethane and the aqueous phase was extracted 3 times with dichloromethane. The combined organic layers were dried with MgSO4 and concentrated in vacuo. The crude was purified by normal phase chromatography using a gradient of DCM/MeOH 100/0 to 90/10 to give the product as a yellowish solid (800 mg, 73%).
ESI-MS [M+H]+ m/z : 455.1.
1H NMR (500 MHz, CDCl3) δ 7.36 (d, J = 8.4 Hz, 2H), 7.28 (s, 2H), 5.84 – 5.72 (m, 1H), 5.10 – 4.94 (m, 2H), 4.26 (d, J = 11.0 Hz, 1H), 3.97 – 3.88 (m, 1H), 3.63 (s, 3H), 2.68 – 2.52 (m, 3H), 2.37 – 2.29 (m, 4H), 1.61 (s, 3H).
Compound MR126: ET-JQ1-OMe-aniline (1 equiv., 50 mg) was dissolved in MeOH (1.2 mL), di-tert-butyl dicarbonate (1.05 equiv., 27 mg) and DMAP (0.1 equiv., 1 mg) were added and the reaction was left stirring at 50 °C overnight. The solvent was evaporated in vacuo and the residue was resuspended in DCM. The organic layer was washed with NaHCO3 (2 x 50 mL) and brine (2 x 50 mL), dried with MgSO4, then concentrated in vacuo. The crude was purified by normal phase column chromatography using a 0/100 to 5/95 MeOH/DCM gradient to give a white solid (40 mg, 65%).
ESI-MS: [M+H]+ m/z : 524.5.
1H NMR (500 MHz, CDCl3) δ 7.40 – 7.29 (m, 4H), 6.81 (s, 1H), 4.22 (d, J = 10.9 Hz, 1H), 4.0 (td, J = 10.8, 3.7 Hz, 1H), 3.85 (s, 3H), 2.68 (s, 3H), 2.41 (s, 3H), 1.87 – 1.81 (m, 1H), 1.70 (s, 3H), 1.52 (s, 9H), 1.03 (t, J = 7.4 Hz, 3H).
Compound MR137: MR126 (1.00 equiv., 20 mg, 0.0382 mmol) was dissolved in THF (1 mL) and MeOH (0.3 mL), a solution of LiOH (0.5 M, aq., 0.3 mL) was added and the reaction was stirred at 45 °C overnight. LC-MS (basic) showed conversion of the ester into the acid overnight. The solvent was evaporated and the crude material was dissolved in DCM. Water was added and the organic material was extracted with DCM after acidifying the aqueous layer to pH 5-6. Conversion observed in LC-MS was 100%; no epimerisation was observed. The crude material was taken forward without further purification.
ESI-MS: [M+H]+ m/z : 510.3.
Compound MR129: MR137 (1 equiv., 20 mg, 0.039 mmol) was dissolved in anhydrous DMF and N,N-diisopropylethylamine (3 equiv., 205 µL, 0.12 mmol) was added, followed by HATU (1.2 equiv., 17.9 mg, 0.047 mmol), and the resulting mixture was stirred for 5 min before adding N-(14-amino-3,6,9,12-tetraoxatetradecyl)-5-((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4-yl)pentanamide (2.0 equiv., 36 mg, 0.078 mmol). The reaction was left stirring for 16 h at room temperature. The solvent was removed in vacuo and the crude material was purified by normal phase column chromatography using a 0/100 to 10/90 MeOH/DCM gradient to give MR129 as a white solid (30 mg, 80%).
ESI-MS [M+2H]2+ m/z : 477.9.
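The ESI-MS values in this section are reported for differently charged ions ([M+H]+, [M+2H]2+). The following minimal Python sketch shows how such values can be interconverted and how the doubly charged ion reported for MR129 can be cross-checked against the [M+H]+ value reported for its precursor MR137. The function names, the proton mass constant and the molar mass used for the biotin-PEG amine (462.6, calculated here for C20H38N4O6S) are assumptions introduced for illustration and do not appear in the source; mixing MS-derived and average masses makes this an approximate check only.

```python
# Illustrative sketch only: helper names and the amine mass are assumptions.

PROTON = 1.00728  # mass of a proton in Da
WATER = 18.015    # mass of H2O lost on amide bond formation


def mz_from_neutral(neutral_mass: float, charge: int) -> float:
    """m/z of an [M+nH]n+ ion for a given neutral mass and positive charge n."""
    return (neutral_mass + charge * PROTON) / charge


def neutral_from_mz(mz: float, charge: int) -> float:
    """Neutral mass back-calculated from the m/z of an [M+nH]n+ ion."""
    return charge * (mz - PROTON)


acid_mass = neutral_from_mz(510.3, 1)   # MR137, reported [M+H]+ 510.3
amine_mass = 462.6                      # assumption: biotin-PEG amine, C20H38N4O6S
amide_mass = acid_mass + amine_mass - WATER
predicted_mz = mz_from_neutral(amide_mass, 2)
print(f"predicted [M+2H]2+ for the amide: {predicted_mz:.1f}")
```

Under these assumptions the predicted value lies within about 0.1 m/z units of the reported 477.9.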
Compound MR131: MR129 (1.0 equiv., 30 mg, 0.031 mmol) was dissolved in a solution of 20% TFA in dichloromethane (1 mL) and stirred for 30 min. Once Boc removal was confirmed by LC-MS, the solvent was removed in vacuo. The crude aniline (1.0 equiv., 23 mg, 0.027 mmol) was re-dissolved in dichloromethane (1 mL) and N,N-diisopropylethylamine (2.0 equiv., 10 µL, 0.05 mmol) was added. Following stirring for 1 min, 2-chloroacetyl chloride (1.5 equiv., 0.003 mL, 0.04 mmol) was added dropwise and the reaction was stirred for another 2 h. After 2 h, the reaction was quenched with a few drops of MeOH, the solvent was evaporated in vacuo and the crude was purified by preparative reverse phase HPLC using a gradient of 5/95 to 95/5 of MeCN/H2O (0.1% formic acid) to give MR131 as a white powder (21 mg, 84% over two steps).
ESI-MS: [M+H]+ m/z : 930.4.
1H NMR (500 MHz, MeOD) δ 8.45 (s, 1H), 7.54 (d, J = 8.6 Hz, 2H), 7.32 (d, J = 8.4 Hz, 2H), 4.48 (s, 4H), 4.40 – 4.33 (m, 1H), 4.21 – 4.15 (m, 1H), 4.14 – 4.08 (m, 1H), 4.09 (s, 2H), 3.67 – 3.31 (m, 13H), 3.08 (s, 9H), 3.15 – 2.97 (m, 1H), 2.83 – 2.76 (m, 1H), 2.60 – 2.55 (m, 3H), 2.35 (s, 3H), 2.12 – 2.04 (m, 2H), 1.90 – 1.85 (m, 1H), 1.62 (s, 3H), 1.59 – 1.41 (m, 3H), 1.36 – 1.23 (m, 2H), 1.19 (s, 2H), 0.94 (t, J = 7.4 Hz, 3H).
Compound MR169: MR129 (1.0 equiv., 10 mg, 0.012 mmol) was dissolved in a solution of 20% TFA in dichloromethane (1 mL) and stirred for 30 min. Once Boc deprotection was confirmed by LC-MS, the solvent was removed in vacuo. The crude aniline (1.0 equiv., 8 mg, 0.012 mmol) was re-dissolved in dichloromethane (1 mL) and N,N-diisopropylethylamine (2.0 equiv., 10 µL, 0.01 mmol) was added. Following stirring for 1 min, acryloyl chloride (2.0 equiv., 0.002 mL, 0.02 mmol) was added dropwise and the reaction was stirred for another 30 min. After 30 min, the reaction was quenched with a few drops of MeOH, the solvent was evaporated in vacuo and the crude was purified by preparative reverse phase HPLC using a gradient of 5/95 to 95/5 of MeCN/H2O (0.1% formic acid) to give the title compound as a white powder (3 mg, 28% over two steps).
ESI-MS: [M+H]+ m/z : 910.1.
1H NMR (500 MHz, CDCl3) δ 9.27 (s, 1H), 8.03 – 7.99 (m, 1H), 7.66-7.60 (m, 3H), 7.31 – 7.26 (m, 2H), 6.88 (s, 1H), 6.38 – 6.33 (m, 2H), 5.73 (s, 1H), 5.68-5.63 (m, 1H), 4.80 (s, 1H), 4.24 (s, 1H), 4.15 (d, J = 10.2 Hz, 1H), 3.87 (s, 1H), 3.68 – 3.49 (m, 14H), 3.57 (s, 5H), 3.52 – 3.43 (m, 2H), 3.36 – 3.25 (m, 1H), 2.86 – 2.78 (m, 1H), 2.74 – 2.67 (m, 1H), 2.59 (s, 3H), 2.57 – 2.51 (m, 1H), 2.32 (s, 3H), 2.06 – 2.00 (m, 2H), 1.94 – 1.88 (m, 2H), 1.51 – 1.42 (m, 1H), 1.35 – 1.29 (m, 2H), 1.25 – 1.16 (m, 4H), 0.95 (t, J = 7.3 Hz, 3H).
Compound MR152: MR137 (1 equiv., 17 mg, 0.033 mmol) was dissolved in anhydrous DMF and N,N-diisopropylethylamine (4.0 equiv., 23 µL, 0.13 mmol) was added, followed by HATU (1.0 equiv., 12.7 mg, 0.0334 mmol), and the resulting mixture was stirred for 5 min before adding 2-(2-(2-(prop-2-yn-1-yloxy)ethoxy)ethoxy)ethan-1-amine (1.3 equiv., 8.1 mg, 0.043 mmol). The reaction was left stirring for 1 h at room temperature. The solvent was removed in vacuo and the crude was purified by normal phase column chromatography using a 0/100 to 10/90 MeOH/DCM gradient to give MR152 as a yellow powder (10 mg, 44%).
ESI-MS: [M+H]+ m/z : 679.1.
Compound MR155: MR152 (10 mg, 0.5 equiv., 0.017 mmol) was dissolved in a solution of 20% TFA in dichloromethane (1 mL) and stirred for 30 min. Once Boc removal was confirmed by LC-MS, the solvent was removed in vacuo. A mixture of acrylic acid (1.0 equiv., 2.4 mg, 0.033 mmol), HATU (1 equiv., 12.7 mg, 0.033 mmol) and N,N-diisopropylethylamine (2.5 equiv., 15 µL, 0.083 mmol) in dichloromethane (0.5 mL) was stirred for 5 minutes before adding a solution of the crude aniline (0.5 equiv., 8 mg, 0.017 mmol) in dichloromethane (0.5 mL). Following stirring for 1 h, the solvent was evaporated in vacuo and the crude material was purified by preparative reverse phase HPLC using a gradient of 5/95 to 95/5 of MeCN/H2O (0.1% formic acid) to give the title compound as a white powder (3 mg, 15% over two steps).
ESI-MS: [M+H]+ m/z : 633.4.
1H NMR (500 MHz, CDCl3) δ 7.73 (s, 1H), 7.55 (d, J = 8.4 Hz, 2H), 7.36 – 7.31 (m, 2H), 6.72 (s, 1H), 6.36 (d, J = 16.8 Hz, 1H), 6.20 (dd, J = 16.8, 10.3 Hz, 1H), 5.69 (d, J = 10.1 Hz, 1H), 4.19 – 4.08 (m, 4H), 3.74 – 3.47 (m, 9H), 3.48 – 3.35 (m, 1H), 2.58 (s, 3H), 2.40 – 2.35 (m, 1H), 2.32 (s, 3H), 2.03 – 1.94 (m, 1H), 1.70 – 1.61 (m, 1H), 1.60 (s, 3H), 0.96 (t, J = 7.3 Hz, 3H).
Compound MR172: 2-((1E,3E)-5-((E)-1-(5-Carboxypentyl)-3,3-dimethylindolin-2-ylidene)penta-1,3-dien-1-yl)-1,3,3-trimethyl-3H-indol-1-ium (1.00 equiv., 11 mg, 0.0231 mmol) was dissolved in dichloromethane (1 mL) and N,N-diisopropylethylamine (2.0 equiv., 0.008 mL, 0.04 mmol) was added, followed by HATU (1.10 equiv., 9.7 mg, 0.0254 mmol), and the reaction was stirred for 5 min before adding a mixture of tert-butyl (2-(2-(2-(2-aminoethoxy)ethoxy)ethoxy)ethyl)carbamate (2.00 equiv., 0.013 mL, 0.0462 mmol) and N,N-diisopropylethylamine (2.0 equiv., 0.008 mL, 0.04 mmol). The solvent was removed in vacuo and the product was isolated by normal phase column chromatography with a gradient from 0/100 to 20/80 MeOH/DCM; the pure compound was eluted by addition of 0.1% formic acid at 20/80 MeOH/DCM. The resulting Boc protected intermediate (1.0 equiv., 15 mg, 0.0198 mmol) was treated with a solution of 20% TFA in dichloromethane for 15 min. Following full Boc deprotection, monitored by LC-MS, the solvent was removed in vacuo to give the title compound as a dark blue film (10 mg, 64% over two steps). This product was taken forward without further purification.
ESI-MS: [M]+ m/z : 657.9.
Compound MR173: MR137 (1 equiv., 9 mg, 0.018 mmol) was dissolved in anhydrous DMF and N,N-diisopropylethylamine (2.5 equiv., 0.012 mL, 0.07 mmol) was added, followed by HATU (1.3 equiv., 6.8 mg, 0.018 mmol), and the resulting mixture was stirred for 5 min before adding MR172 (1.0 equiv., 9 mg, 0.014 mmol). The reaction was left stirring for 16 h at room temperature. The solvent was removed in vacuo and the crude material was purified by normal phase column chromatography using a 0/100 to 10/90 MeOH/DCM gradient to give the title compound as a white solid (8 mg, 51%).
ESI-MS: [M]+ m/z : 1149.5.
Compound MR175: MR173 (1 equiv., 9 mg, 0.0078 mmol) was treated with a solution of 20% TFA in dichloromethane to give the aniline intermediate after solvent removal in vacuo (8 mg, 0.0078 mmol). The crude aniline was dissolved in dichloromethane (1 mL) and to this solution N,N-diisopropylethylamine (3.0 equiv., 0.004 mL, 0.023 mmol) was added, followed by dropwise addition of a solution of acryloyl chloride (1.5 equiv., 1 µL, 0.0114 mmol) in dichloromethane (0.2 mL). The reaction was stirred for 30 min and then quenched by addition of a small amount of MeOH. The solvent was removed in vacuo and the crude was purified by preparative reverse phase chromatography using a gradient of 5/95 to 95/5 MeCN/H2O (0.1% formic acid) to give a dark blue powder (4 mg, 47% over two steps).
ESI-MS: [M+H]+ m/z : 1104.5.
1H NMR (500 MHz, CDCl3) δ 11.11 (s, 1H), 8.37 (s, 1H), 8.08 (d, J = 8.5 Hz, 2H), 7.88 – 7.78 (m, 2H), 7.43 – 7.28 (m, 7H), 7.24 – 7.13 (m, 5H), 7.12 – 6.97 (m, 3H), 6.87 – 6.78 (m, 1H), 6.44 – 6.32 (m, 2H), 6.30 – 6.23 (m, 1H), 5.60 – 5.55 (m, 1H), 4.21 (d, J = 9.5 Hz, 1H), 3.99 (s, 2H), 3.72 – 3.43 (m, 11H), 3.40 – 3.35 (m, 2H), 2.64 (s, 3H), 2.38 (s, 3H), 2.15 – 2.08 (m, 2H), 2.06 – 2.03 (m, 2H), 1.78 – 1.58 (m, 25H), 1.02 (t, J = 7.4 Hz, 3H).
Compound MR153: MR137 (1 equiv., 10 mg, 0.020 mmol) was dissolved in anhydrous DMF and N,N-diisopropylethylamine (4 equiv., 0.014 mL, 0.08 mmol) was added, followed by HATU (1.0 equiv., 7.5 mg, 0.020 mmol), and the resulting mixture was stirred for 5 min before adding 4-((2-(2-(2-aminoethoxy)ethoxy)ethyl)amino)-2-(2,6-dioxopiperidin-3-yl)isoindoline-1,3-dione (1.2 equiv., 9.5 mg, 0.023 mmol). The reaction was left stirring for 4 h at room temperature and conversion was monitored by LC-MS. The solvent was removed in vacuo and the crude material was purified by normal phase column chromatography using a 0/100 to 10/90 MeOH/DCM gradient to give the title compound as a white solid (15 mg, 85%).
ESI-MS: [M+H]+ m/z : 896.7.
Compound MR156: MR153 (14 mg, 0.45 equiv., 0.017 mmol) was dissolved in a solution of 20% TFA in dichloromethane (1 mL) and stirred for 30 min. Once Boc removal was confirmed by LC-MS, the solvent was removed in vacuo. A mixture of acrylic acid (1.0 equiv., 2.75 mg, 0.038 mmol), HATU (1.1 equiv., 15.9 mg, 0.042 mmol) and N,N-diisopropylethylamine (2.5 equiv., 15 µL, 0.083 mmol) in dichloromethane (0.5 mL) was stirred for 5 minutes before adding a solution of the crude aniline (0.5 equiv., 8 mg, 0.017 mmol) and N,N-diisopropylethylamine (2.5 equiv., 15 µL, 0.083 mmol) in dichloromethane (0.5 mL). Following stirring for 1 h, the solvent was evaporated in vacuo and the crude material was purified by preparative reverse phase HPLC using a gradient of 5/95 to 95/5 of MeCN/H2O (0.1% formic acid) to give the title compound as a white powder (7 mg, 21% over two steps).
ESI-MS: [M+H]+ m/z : 850.2.
1H NMR (500 MHz, CDCl3) δ 7.73 (s, 1H), 7.55 (d, J = 8.4 Hz, 2H), 7.33 (d, J = 8.6 Hz, 2H), 6.76 – 6.64 (m, 2H), 6.36 (dd, J = 16.9, 1.4 Hz, 1H), 6.20 (dd, J = 16.8, 10.2 Hz, 1H), 5.69 (dd, J = 10.2, 1.3 Hz, 1H), 4.17 – 4.10 (m, 3H), 3.69 – 3.49 (m, 15H), 3.43 (td, J = 10.2, 3.6 Hz, 1H), 2.58 (s, 3H), 2.37 (t, J = 2.4 Hz, 1H), 2.32 (s, 3H), 2.06 – 1.92 (m, 1H), 1.70 – 1.61 (m, 1H), 1.60 (s, 3H), 0.96 (t, J = 7.4 Hz, 3H).
Compound MR162: MR137 (1 equiv., 10 mg, 0.020 mmol) was dissolved in anhydrous dichloromethane and N,N-diisopropylethylamine (4 equiv., 0.014 mL, 0.08 mmol) was added, followed by HATU (1.0 equiv., 7.5 mg, 0.020 mmol), and the resulting mixture was stirred for 5 min before adding 4-((2-(2-(2-aminoethoxy)ethoxy)ethyl)amino)-2-(2,6-dioxopiperidin-3-yl)isoindoline-1,3-dione (1.2 equiv., 9.5 mg, 0.014 mmol). The reaction was left stirring for 4 h at room temperature and conversion was monitored by LC-MS. The solvent was removed in vacuo and the crude material was purified by normal phase column chromatography using a 0/100 to 10/90 MeOH/DCM gradient to give the title compound as a white solid (15 mg, 76%).
ESI-MS: [M+2H]2+ m/z : 556.9.
Compound MR170: MR162 (1 equiv., 15 mg, 0.0135 mmol) was treated with a solution of 20% TFA in dichloromethane to give the crude aniline after solvent removal in vacuo. The crude aniline was dissolved in dichloromethane (1 mL) and to this solution N,N-diisopropylethylamine (3 equiv., 0.004 mL, 0.022 mmol) was added, followed by dropwise addition of a solution of acryloyl chloride (1.1 equiv., 0.001 mL, 0.015 mmol) in dichloromethane (0.2 mL). The reaction was stirred for 30 min and then quenched by addition of a small amount of MeOH. The solvent was removed in vacuo and the crude material was purified by preparative reverse phase chromatography using a gradient of 5/95 to 95/5 MeCN/H2O (0.1% formic acid) to give the title compound as a white powder (5 mg, 34% over two steps).
ESI-MS: [M+H]+ m/z : 1066.3.
1H NMR (500 MHz, CDCl3) δ 8.60 (s, 1H), 7.90 (s, 1H), 7.77 (t, J = 5.5 Hz, 1H), 7.44 – 7.39 (m, 1H), 7.31 – 7.22 (m, 5H), 7.21 – 7.18 (m, 2H), 7.18 – 7.14 (m, 2H), 6.35 (dd, J = 16.8, 1.3 Hz, 1H), 6.17 (dd, J = 16.9, 10.2 Hz, 1H), 5.69 (dd, J = 10.2, 1.3 Hz, 1H), 4.74 – 4.64 (m, 2H), 4.43 – 4.30 (m, 1H), 4.15 (d, J = 9.9 Hz, 1H), 4.06 – 3.97 (m, 3H), 3.87 – 3.79 (m, 1H), 3.67 – 3.47 (m, 15H), 3.47 – 3.39 (m, 1H), 2.56 (s, 3H), 2.44 (s, 3H), 2.33 – 2.29 (m, 3H), 2.27 – 2.18 (m, 1H), 2.11 – 2.01 (m, 1H), 1.90 – 1.81 (m, 1H), 1.62 – 1.52 (m, 4H), 0.97 – 0.88 (m, 12H).
Compound ET-JQ1-OH-acrylamide: Prepared in two steps from MR126: N-Boc deprotection and amidation according to the procedure for MR169, and ester hydrolysis according to the procedure for MR137.
Typical Procedures for compounds comprising fluorogenic groups: Linker-biomolecule conjugates typically can be synthesised, isolated and purified directly from commercially available starting materials in one or two steps using conventional protocols within the remit of the skilled person. Typically, a linker-fluorophore conjugate could be prepared by treatment of a solution of the relevant fluorophore NHS ester with the relevant diamine. Upon completion of the reaction, the mixture is concentrated under reduced pressure and purified by flash column chromatography to afford the product.
Compound JF635-ethylenediamine: Ethylene diamine was reacted with 4-(((2,5-dioxopyrrolidin-1-yl)oxy)carbonyl)-2-(3-(3-fluoroazetidin-1-ium-1-ylidene)-7-(3-fluoroazetidin-1-yl)-5,5-dimethyl-3,5-dihydrodibenzo[b,e]silin-10-yl)benzoate (Janelia Fluor® 635, NHS ester) according to the above method, affording the product as a yellow solid.
1H NMR (400 MHz, DMSO-d6) δ 8.70 – 8.60 (m, 1H), 8.08 (dd, J = 8.1, 0.9 Hz, 1H), 8.02 (d, J = 8.0 Hz, 1H), 7.70 – 7.64 (m, 1H), 6.82 (d, J = 2.6 Hz, 2H), 6.67 (d, J = 8.7 Hz, 2H), 6.42 (dd, J = 8.7, 2.6 Hz, 2H), 5.47 (dtt, J = 57.6, 5.5, 2.8 Hz, 2H), 4.17 (ddd, J = 20.8, 9.5, 5.8 Hz, 4H), 3.91 (ddd, J = 24.3, 9.4, 2.1 Hz, 4H), 3.25 – 3.17 (m, 2H), 2.63 (t, J = 6.5 Hz, 2H), 1.85 – 1.52 (m, 2H), 0.63 (s, 3H), 0.52 (s, 3H); LCMS purity = 98.7%; m/z (ES+): 575.10 [M+H+]+.
General Procedure for Compounds comprising fluorogenic groups: To a stirred solution of the relevant (+)-JQ1 bump, NHS ester moiety, which can be formed in situ or isolated independently, in dichloromethane was added a solution of the relevant amine-reactive linker-fluorophore conjugate in dichloromethane, optionally together with DMAP and/or triethylamine at ambient temperature. Upon completion of the reaction, the reaction mixture was concentrated under reduced pressure and purified by flash column chromatography or preparative HPLC to afford the product. The isolated product may be the free parent compound or a salt, typically formic acid salt or TFA salt. That is to say, the compound may have a counterion, sometimes referred to herein as A−, defined above. As such, in some cases, A− is formate or trifluoroacetate.
Compound C10852L: Following the above general procedure afforded the title compound as an off-white solid (8 mg, 48%).
1H NMR (400 MHz, DMSO-d6): δ 10.30 (s, 1H), 8.86 (s, 1H), 8.54 – 8.45 (m, 1H), 8.09 (dd, J = 8.1, 1.2 Hz, 1H), 8.03 (d, J = 8.1 Hz, 1H), 7.73 – 7.70 (m, 1H), 7.66 (d, J = 8.9 Hz, 2H), 7.29 (d, J = 8.6 Hz, 2H), 6.80 (dd, J = 5.9, 2.7 Hz, 2H), 6.64 (d, J = 8.7 Hz, 1H), 6.61 (d, J = 8.7 Hz, 1H), 6.41 (dd, J = 17.0, 10.1 Hz, 1H), 6.35 (dd, J = 8.8, 2.6 Hz, 2H), 6.24 (dd, J = 17.0, 2.0 Hz, 1H), 5.76 (dd, J = 10.1, 2.0 Hz, 1H), 5.57 – 5.35 (m, 2H), 4.21 – 4.08 (m, 4H), 4.04 (d, J = 10.8 Hz, 1H), 3.96 – 3.81 (m, 4H), 3.53 – 3.36 (m, 5H), 2.58 (s, 3H), 2.39 (d, J = 0.5 Hz, 3H), 1.87 – 1.76 (m, 1H), 1.53 (d, J = 0.5 Hz, 3H), 1.42 – 1.30 (m, 1H), 0.81 (t, J = 7.4 Hz, 3H), 0.62 (s, 3H), 0.51 (s, 3H); LCMS purity = 98.6%; m/z (ES+): 1020.20 [M+H+]+.
Other compounds comprising fluorogenic groups, e.g., MR202, MR215, C10852K, C10852M, C10852N, C10852R, C10852S, C10852T, C10852U, C10852AA, C10852AB, C10852AC, C10852AD, C10852AH, and C10852AI, can be synthesised in a similar fashion to C10852L above. In some cases, alternative conventional amide coupling chemistries can be deployed to similar effect, for example through pre-formation and isolation of an NHS ester, or via amide coupling reagents (such as HATU and the like). Such synthetic procedures are within the remit of the skilled person.
LCMS data for compounds comprising fluorogenic groups are detailed below:
Compound MR202: LCMS purity = 99.3%; m/z (ES+): 525.9 [M+2H+]+2;
Compound MR215: LCMS purity = 97.5%; m/z (ES+): 574.5 [M+2H+]+2;
Compound C10852K: LCMS purity = 96.9%; m/z (ES+): 1076.30 [M+H+]+;
Compound C10852M: LCMS purity = 97.1%; m/z (ES+): 1096.40 [M+H+]+;
Compound C10852N: LCMS purity = 98.6%; m/z (ES+): 1143.30 [M+H+]+;
Compound C10852R: LCMS purity = 98.4%; m/z (ES+): 1116.40 [M+H+]+;
Compound C10852S: LCMS purity = 96.6%; m/z (ES+): 1152.30 [M+H+]+;
Compound C10852T: LCMS purity = 98.3%; m/z (ES+): 1064.30 [M+H+]+;
Compound C10852U: LCMS purity = 99.0%; m/z (ES+): 1196.30 [M+H+]+;
Compound C10852AA: LCMS purity = 99.8%; m/z (ES+): 1082.20 [M+H+]+;
Compound C10852AB: LCMS purity = 99.8%; m/z (ES+): 1126.20 [M+H+]+;
Compound C10852AC: LCMS purity = 99.8%; m/z (ES+): 1170.30 [M+H+]+;
Compound C10852AD: LCMS purity = 99.8%; m/z (ES+): 1038.20 [M+H+]+;
Compound C10852AH: LCMS purity = 98.9%; m/z (ES+): 1002.20 [M+H+]+;
Compound C10852AI: LCMS purity = 99.3%; m/z (ES+): 1090.30 [M+H+]+.
Biology
Mutant design: The Bromodomain and Extra-Terminal Domain (BET) family proteins share one or more copies of a common alpha helical domain, referred to simply as a ‘bromodomain’. These domains lack any catalytic function, but natively bind acetylated lysine (LysAc) on histones as readers of epigenetic code. Bromodomains themselves have favourable properties for structural and biophysical studies: they express well in E. coli; they are soluble in the mM range; they are amenable to crystallisation; and they are stable.
With use of the ‘bump-and-hole’ approach, BromoTag was developed using a single point mutation (L to A) (the ‘hole’) in a single bromodomain (demonstrated in Brd2 and Brd4), designed to accommodate an ethyl group (the ‘bump’) on the bromodomain ligand JQ1 (described in Bond et al., J. Med. Chem., 2021, 64, 15477).
Among the many bromodomain-containing proteins, a closely related group is represented by Brd2, Brd3, Brd4, and BrdT, each comprising two bromodomains. The present investigators carried out multiple sequence alignment of these bromodomains, which showed that the ‘hole’ of the L to A mutation is conserved. From there, the investigators identified that they could: switch from an L to A mutation to an L to C mutation to introduce a covalent handle; or alternatively, they could find an additional conserved residue suitable for covalently binding to a different part of the ligand.
Using a combination of existing crystal structures and structural prediction with AlphaFold2, the present investigators aligned all eight bromodomains from the BET family. The domains were nearly superimposable (Cα-RMSD less than 1 Å), as were the positions of the conserved leucine residue and the potential covalent handles. Whilst individual bromodomains are well defined, DNA construct design requires precise boundaries for the N- and C-termini. All bromodomains, variants of BromoTag, and the mutant bromodomains presented herein encompass alpha helices at the N- and C-termini; the constructs mainly vary in how much of the unstructured regions is retained at either end.
The known structures of bromodomains are typically solved by X-ray crystallography, with a significant proportion coming from Brd4BD1. Oftentimes, longer versions of a bromodomain sequence are reported (sometimes with a hexa-histidine tag or protease site); however, only the helical part of the bromodomain is accounted for in the density. Importantly, the C-terminal regions often agree much better than the N-terminus for defining bromodomain constructs in both BD1 and BD2. Furthermore, the current sequence for BromoTag starting at residue 333 and ending at 460 has ~17 unstructured residues from 333 to 350 in Brd4. For DNA construct design, this gave rise to ‘short’ and ‘long’ bromodomains. For example, Brd4BD2-mutants may have short and long versions, comprising residues 351-460 and 333-460, respectively.
The hole mutation (L387A) and covalent handle mutations (E438C or M442C) were introduced to Brd4BD2 on both the short and long versions. The short Brd4BD2 has a mass of about 13 kDa. All versions allow for expression as 6xHis-TEV, His-SUMO, and in many other vectors.
Cloning with restriction enzymes using the 5’ BamHI site results in an additional Gly-Ser at the N-terminus with the TEV protease sequence, or a single Ser using SUMO vectors for which the Gly of BamHI is encoded as the terminal Gly of SUMO. A mutant Brd4 bromodomain has also been successfully prepared from a HiBiT-BromoTag(L387A+E438C)-Brd4 clone.
Plasmids and peptides: To generate plasmids for recombinant expression in E. coli, DNA encoding the respective Brd4BD2 mutant was PCR amplified using primers encoding 5’ BamHI and 3’ EcoRI restriction sites. The PCR fragments were double digested with BamHI and EcoRI and gel purified from a 1% (w/v) agarose gel. The extracted Brd4BD2 fragments were ligated into pRSF-DUET1 vectors with N-terminal 6xHis-TEV or His-SUMO. All sequences were confirmed by conventional sequencing. Additional mutations were introduced using site directed mutagenesis with the relevant primer pairs for the desired point mutation.
Differential scanning fluorimetry: DSF experiments were performed on a Bio-Rad CFX96 RT-PCR machine or in a Prometheus Panta (NanoTemper Technologies GmbH). For assays in the Bio-Rad CFX96 RT-PCR machine, ligand (50 µM) or DMSO stock was incubated with 50 µL of Brd4BD2WT, Brd4BD2L387A, Brd4BD2L387A,E438C, Brd4BD2L387C or Brd4BD2L387A,M442C (10 µM) in 25 mM HEPES, 100 mM NaCl, 1 mM TCEP for 2 hours at RT, followed by addition of SYPRO Orange (1 µL) to give a final dilution of 5× SYPRO Orange. The temperature was ramped up in 1 °C steps from 20 or 25 °C to 95 °C with 30 s incubation at each step. Melting curves were analysed by determining the minimum of the first derivative using the Bio-Rad CFX Manager. For analysis in the Prometheus Panta, a similar procedure was used, but no addition of SYPRO Orange was required: samples containing a mixture of the protein (10 µM) and ligand (50 µM) were taken into capillaries for subsequent reading by the Prometheus Panta, and the temperature was similarly ramped up in 1 °C steps from 25 to 95 °C with 30 s incubation at each step. Values of Tm were obtained from the melting curve at 350 nm emission (inherent fluorescence emission of aromatic amino acids).

INTACT MS (ESI-MS): Brd4BD2, Brd4BD2L387A, Brd4BD2L387A,E438C, Brd4BD2L387C and Brd4BD2L387A,M442C (50 µM) were incubated with a 1:1 or 1:2 stoichiometric concentration of ligands in 20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM TCEP. At 2 hours, the sample was either taken directly for LC-MS analysis without further manipulation or precipitated by the addition of 4 volumes of cold methanol. The precipitated protein was pelleted by centrifugation and resuspended in an aqueous solution of 15% acetonitrile and 0.1% TFA. Samples were separated by HPLC on a C3 column using a 10–75% gradient of acetonitrile and analysed using an Agilent 6130 quadrupole MS.
Spectra were deconvoluted and integrated using Agilent LC/MSD ChemStation.

Compound stability in DMSO and buffer: Stability of compounds was evaluated in 100% DMSO or DMSO/HEPES buffer. For all samples, stability of the compound was confirmed by LC-MS traces. DMSO stocks were evaluated after 20 days, while DMSO/buffer stocks were evaluated after 5 days of incubation.

Protein expression and purification: Human Brd4BD2 mutants were expressed with an N-terminal His6-TEV tag in E. coli BL21(DE3) grown at 37 °C in LB supplemented with 50 µg/ml kanamycin. Once OD600 reached 0.8, protein expression was induced with 0.5 mM IPTG, with 50 µM MgCl2 added at induction, and the cultures were grown overnight at 18 °C. After harvesting, the cells were resuspended in buffer containing 120 mM sodium phosphate, 500 mM NaCl and 40 mM imidazole; MgCl2 (1 mM) and DNase I (10 µg/mL) were added, and the cells were lysed using a Continuous Flow Cell Disruptor (Constant Systems) at 30,000 psi. Cell lysates were clarified by centrifugation at 18,000 g for 30 minutes at 4 °C. The lysate was filtered and loaded onto a HisTrap FF affinity column (GE Healthcare) and eluted with 120 mM sodium phosphate buffer, 500 mM NaCl and 500 mM imidazole. The proteins were then dialysed against 25 mM HEPES, pH 7.5, 150 mM NaCl and 1 mM TCEP with TEV protease (1:100 ratio) overnight. This mixture was passed through a HisTrap FF column, concentrated in a 3,500 MWCO centrifugal unit (Amicon) and loaded on a Superdex 16/600 size exclusion column pre-equilibrated in 25 mM HEPES, 150 mM NaCl, 1 mM TCEP, pH 7.5. Pure variants of Brd4 eluted at ~0.74 column volumes (cv) and were confirmed using SDS-PAGE. The final proteins were all >95% pure, and were concentrated and stored at -80 °C until further use. All chromatography purification steps were performed using a Bio-Rad NGC system at 4 °C.

Click test: Click probe copper-catalysed azide-alkyne cycloaddition (CuAAC) assays were performed in 1.5 mL Eppendorf tubes at 37 °C or at room temperature, in the dark and with gentle shaking, unless otherwise stated. The mutant protein Brd4BD2L387A,E438C (50 µM, 200 µL) was incubated with the alkyne-containing ligand MR155 (75 µM) for 1 hour in 25 mM HEPES pH 7.5, 100 mM NaCl, 1 mM TCEP. To this mixture, BODIPY-azide or Cy5-azide (300 µM) was added, followed by a premixed solution of THPTA/CuSO4 (5:1, 0.5 mM/0.1 mM) in 25 mM HEPES pH 7.5, 100 mM NaCl, 1 mM TCEP, and then sodium ascorbate (NaAsc, 5 mM). The reaction was mixed at 37 °C for 1 h, at which point a sample was taken, and the reaction was left at room temperature for another 16 h, when a second sample was taken. Samples were diluted 5-fold and prepared for gel electrophoresis by mixing the sample (10 µL) with a mixture of LDS/DTT (10 µL). Prepared samples (10 µL) were separated on NuPAGE™ 4–12% Bis-Tris precast polyacrylamide gels.
The gels were then imaged with a ChemiDoc (BODIPY fluorescence was detected in the Alexa 488 channel, excitation 460–490 nm and emission 518–546 nm; Cyanine 5 fluorescence was detected in the Cyanine 5 channel, excitation 625–650 nm and emission 675–725 nm). The gels were subsequently stained with InstantBlue Coomassie® stain, destained, and imaged using a Bio-Rad molecular imager.

Computational docking: Models of the mutants L94/C, E438/C or M442C were generated by introducing the mutation with the Maestro editing tools, using the X-ray structure of Brd2(2)L383V co-crystallised with ET-JQ1-OMe (6YTM) as a template, depicted in Fig. 2. The Brd2BD2L383V co-crystal structure and the corresponding mutants were prepared using single point mutations and the Protein Preparation Wizard from Schrödinger, and the corresponding grids were generated with Glide. Ligands were prepared (LigPrep) and docked reversibly (Glide) or covalently (CovDock) in either the mutant or the reference crystal structure. The five best-scoring poses from each docked ligand were filtered and analysed visually with Maestro. Root mean square deviation (RMSD) was obtained with respect to the reference crystal structure for the common structures.

Protein samples: Protein samples were separated by gel electrophoresis on NuPAGE™ 4–12% Bis-Tris precast polyacrylamide gels (Thermo Fisher) in MES running buffer unless otherwise stated. Following electrophoresis, gels were imaged with a ChemiDoc MP Imaging System for fluorescence read-out at the appropriate wavelength (for fluorescent labelling experiments) or stained for 15 min with InstantBlue™ Coomassie® stain (Abcam), destained, and imaged using a Bio-Rad molecular imager. Bio-Rad Image Lab 6.1 was used to process the data. Protein concentrations were measured prior to use in assays using a NanoDrop Microvolume Spectrophotometer.

Fluorescence polarisation: Stock solutions of reaction components, including Brd4BD2L387A,E438C and FAM-JQ1 probe (Tocris Biotechne), were prepared in FP assay buffer (25 mM HEPES pH 7.5, 100 mM NaCl, 1.0 mM TCEP). Components were added to Corning 384-well solid black polystyrene microplates to a final volume of 15 µl. Components were mixed by spinning down plates at 50 g for 1 min, and the plate was covered and incubated for 5 min at room temperature before analysis on a PHERAstar FS (BMG LABTECH) with fluorescence excitation and emission wavelengths of 485 and 520 nm, respectively, and a settling time of 0.3 s. A final concentration of 10 nM of FAM-JQ1 probe was used, while Brd4BD2L387A,E438C was titrated by a 2-fold dilution factor from 33 µM to 1 nM. The fluorescence polarisation (anisotropy) of a 2-fold serial dilution of 300 nM shortBrd4BD2L387A,E438C in 10 nM of the probe in 100 mM HEPES pH 7.5, 100 mM NaCl, 1 mM TCEP, 2% DMSO was measured to determine the IC50 of the ligands.
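The 2-fold protein titration described above can be laid out with a few lines of code. The sketch below only generates the concentration series; the function name is hypothetical, and the top concentration (33 µM), dilution factor (2) and approximate lower limit (1 nM) are the values given in the text.

```python
def serial_dilution(top, factor, floor):
    """Concentrations from `top` downwards, dividing by `factor` each step,
    keeping every value that is still at or above `floor`."""
    series = []
    concentration = top
    while concentration >= floor:
        series.append(concentration)
        concentration /= factor
    return series


# Protein titration in nM: 33 µM top concentration, 2-fold steps, ~1 nM floor.
protein_nM = serial_dilution(33_000.0, 2.0, 1.0)

for point, concentration in enumerate(protein_nM, start=1):
    print(f"point {point:2d}: {concentration:10.2f} nM")
```

Under these assumptions the series has 16 points, the last being just above 1 nM, which is consistent with a titration described as running from 33 µM to 1 nM.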
Protein labelling with biotin covalent ligand and streptavidin trapping assay: Brd4BD2L387A and Brd4BD2L387A,E438C (50 µM) were incubated with the acrylamide-based biotin-containing covalent ligand MR169 (75 µM) or the chloroacetamide-based biotin-containing ligand MR131 (75 µM) in 20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM TCEP for 1 to 2 hours. At 2 hours, a solution of streptavidin in 20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM TCEP (45 µL, 10 equiv., 270 µM) was added to this mixture (5 µL) and incubated for 2 hours. After 3 hours of incubation with streptavidin, samples were taken and prepared for gel electrophoresis by mixing the sample (10 µL) with a mixture of LDS/DTT (10 µL). Prepared samples (10 µL) were separated on NuPAGE™ 4–12% Bis-Tris precast polyacrylamide gels. The gels were subsequently stained with InstantBlue™ Coomassie® stain, destained, and imaged using a Bio-Rad gel molecular imager.
Protein labelling with fluorescent covalent ligands: Brd4BD2WT, Brd4BD2L387A, Brd4BD2L387A,E438C, Brd4BD2L387C and Brd4BD2L387A,M442C (10 µM) were co-incubated with either DMSO or increasing concentrations of the Cyanine 5 or TAMRA labelled acrylamide compounds (MR175, MR202) or the Janelia Fluor functionalised covalent ligands (C10852 series) in 20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM TCEP. Concentrations ranged from sub-stoichiometric up to excess (2.5 to 50 µM). Samples (10 µL) were prepared for gel electrophoresis by mixing with a solution of LDS/DTT (10 µL). Prepared samples (10 µL) were separated on NuPAGE™ 4–12% Bis-Tris precast polyacrylamide gels. The in-gel fluorescence was then imaged with a ChemiDoc (Cyanine 5 fluorescence in the Cy5 channel, excitation 625–650 nm and emission 675–725 nm). The gels were subsequently stained with InstantBlue Coomassie® stain, destained, and imaged using a Bio-Rad gel molecular imager.

Evaluation of fluorogenicity of Janelia covalent ligands: Brd4BD2WT, Brd4BD2L387A, Brd4BD2L387A,E438C, Brd4BD2L387C and Brd4BD2L387A,M442C (10 µM) were co-incubated with increasing concentrations of the Janelia 635 functionalised covalent ligand C10852L or C10852S, from sub-stoichiometric up to excess (2.5 to 50 µM). Relative fluorescence units (RFU) were measured using a GloMax plate reader (Promega) (excitation: 645 nm; emission: 665 nm).

Crystallization of BRD2-BD2L383A,D434C

For crystallization, the 1.2 mM stock of BRD2-BD2L383A,D434C was diluted into crystallization buffer (25 mM HEPES, 120 mM NaCl, 0.5 mM TCEP, pH 7.5) to a final concentration of 100 µM in 1 ml. From a 10 mM stock, 12 µl of MR116 was added in 2 µl increments, with gentle mixing in between, to reach a final concentration of 120 µM (1.2-fold molar excess). BRD2-BD2L383A,D434C and MR116 were allowed to react at ambient temperature for 120 min.
The reaction was spun at top speed in a microcentrifuge for 10 minutes, and the 1 ml of supernatant was injected into a 5 ml sample loop and loaded on a 16/600 Superdex 75 pg size exclusion column (SEC) pre-equilibrated in crystallization buffer. The run was carried out on a Bio-Rad NGC equipped with a multi-wavelength detector and monitored at 215 nm, 255 nm, and 280 nm. Fractions with UV absorbance at 280 nm were analysed using NuPAGE Bis-Tris mini protein 4-12% gels (Invitrogen) with MES running buffer, run at 200 V for 34 min. The gel was stained using colloidal Coomassie Instant Blue Stain (ISB1L, Abcam), revealing pure BRD2-BD2L383A,D434C. The additional UV absorbance at 215 nm suggested the presence of MR116, which was confirmed from the ΔTm of over 25 °C (Tm BRD2-BD2L383A,D434C = 45.37 °C and Tm MR116/BRD2-BD2L383A,D434C = 71.72 °C) using nanoDSF (20 to 95 °C, at 1 °C/min) on the Prometheus Panta (NanoTemper). With this verification, MR116/BRD2-BD2L383A,D434C was concentrated to 610 µM (8.2 mg/ml) using a 3,000 MWCO centrifugal concentrator (Amicon). Because MR116 did not exhibit UV absorbance at 280 nm on the SEC run, the concentration was determined using the molar extinction coefficient at 280 nm (ε280) = 15,930 M⁻¹ cm⁻¹. Aliquots of MR116/BRD2-BD2L383A,D434C were stored at -80 °C until use.

Initial crystal screens of MR116/BRD2-BD2L383A,D434C (8.2 mg/ml) were set against commercially available screens (JCSG+, PACT, ProPLEX, Morpheus-I, PGA, Index-HT, PEG/Ion, MIDAS, Classics, and BCS) in 96-well MRC 2-drop sitting well plates with 50 µl reservoirs, using a Mosquito (SPT Labtech) dispensing 200 nl of reservoir and 200 nl of MR116/BRD2-BD2L383A,D434C for each drop. Plates were sealed and crystals were monitored at 19 °C in a Rock Imager (Formulatrix). Numerous conditions yielded suitable crystals within 1-3 days. Crystals were harvested with Dual Thickness MicroLoops (MiTeGen) ranging from 50-200 µm. Our reported structure was obtained from a single ~100 µm crystal after 2 days in the Morpheus-I condition E9 (PEG 500 MME 40% (v/v), PEG 20,000 20% (w/v), 0.12 M ethylene glycols, 0.1 M Tris-BICINE pH 8.5) and was harvested without additional cryoprotection.

Macromolecular X-ray Data collection

Data sets for all crystals were collected at Diamond Light Source (Didcot, UK) on MX beamline I24 under bag proposal MX-35324-32. A total of 3,600 images were recorded on the EIGER2 X CdTe 9M detector positioned at a distance corresponding to a maximum resolution of 1.1 Å, with a total exposure time of 0.005 sec/image, 0.1° oscillation over 360°, a wavelength of 0.6199 Å, and a beam size of 20x20 µm. Data sets were automatically processed through SynchWeb/ISPyB (ref) for indexing (XDS), scaling, and merging.
The crystal was processed in space group P 222 with unit cell dimensions of a = 32.05 Å, b = 52.12 Å, c = 71.63 Å and angles of α = β = γ = 90°. We estimate that this crystal achieved diffraction at the limit of the beamline and detector, but the highest resolution shells were incomplete. Therefore, we reprocessed the images with a resolution cutoff of 1.3 Å to obtain a complete dataset for structure determination.

Structure determination of MR116/BRD2-BD2L383A,D434C

To facilitate molecular replacement, we generated an atomic model of BRD2-BD2L383A,D434C without MR116 using Boltz-1 and automatic MSA generation. Crystallographic analysis was carried out in CCP4 v8. Initial inspection of the crystal gave a Matthews coefficient of 2.28 and a high probability of 1 copy of BRD2-BD2L383A,D434C per asymmetric unit, with a solvent content of 46%. The CCP4 pipeline (data reduction to complete structure with ligand fitting) was run with Aimless for data reduction, Phaser for molecular replacement using the Boltz-1 model, Buccaneer for automated model building, and REFMAC5 for refinement. This initial attempt gave R-factors of Rwork = 0.20 and Rfree = 0.23 and clear density for MR116. The SMILES string of the saturated propionamide analogue ([NH]C(CC)=O) was used to generate restraints for the ligand; to give the correct carbon valency for the subsequent linkage, one hydrogen of the terminal methyl group of the propionamide was manually removed in COOT. The covalent bond between the beta carbon of the propionamide and the sulfur of C434 of BRD2-BD2L383A,D434C was formed using AceDRG in CCP4. This was used as a covalent restraint for further refinement. After several rounds of manual model building in COOT and refinement in REFMAC5, the final R-factors were Rfree = 0.18 and Rwork = 0.14 for the predominant bond conformation. An alternate conformation of the covalent bond was built and further refined with REFMAC5 to give a final Rfree = 0.18 and Rwork = 0.15. The polder map (omit map) was made in PHENIX, omitting the ligand MR116, and then visualised in UCSF ChimeraX.

Data collection and refinement statistics:
Bond angles (°) 2.07
*Values in parentheses are for highest-resolution shell.
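The Matthews coefficient and solvent content quoted in the structure determination can be checked approximately from numbers given in the text. The sketch below makes several assumptions that are not stated in the source: the molecular weight is back-calculated from the stated 8.2 mg/ml at 610 µM, the orthorhombic cell is taken to contain four asymmetric units with one protein copy each, and the solvent fraction uses the common approximation 1 - 1.23/Vm. The function names are illustrative.

```python
def molecular_weight_da(mg_per_ml, micromolar):
    """Molecular weight (Da) implied by a mass and a molar concentration."""
    return mg_per_ml / (micromolar * 1e-6)


def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms when all angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, asu_per_cell, copies_per_asu, mw_da):
    """Vm in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * copies_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (assumes a typical protein density)."""
    return 1.0 - 1.23 / vm


mw = molecular_weight_da(8.2, 610.0)                  # ~13.4 kDa (assumption)
volume = orthorhombic_cell_volume(32.05, 52.12, 71.63)
vm = matthews_coefficient(volume, asu_per_cell=4, copies_per_asu=1, mw_da=mw)

print(f"MW ~ {mw:.0f} Da, Vm ~ {vm:.2f} A^3/Da, solvent ~ {solvent_fraction(vm):.0%}")
```

Under these assumptions Vm comes out close to, though not exactly at, the reported 2.28; the difference presumably reflects the exact molecular weight used by the software. Inserting the reported Vm of 2.28 into the same approximation gives a solvent content of about 46%, matching the stated value.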
HiBiT live-cell kinetic degradation assay of MR156 and MR170

A validated endogenous HiBiT-BromoCatch-Brd4 cell line was used to perform a live-cell kinetic degradation assay to determine the activity of the functionalised degrader probes for BromoCatch, MR156 and MR170. BromoCatch-Brd4 HEK293 cells were plated at a density of 1 × 10⁶ cells into a single well of a 6-well plate in 2 mL of DMEM:10% FBS and left to adhere for 16 hours in a humidified incubator with 5% CO2 at 37 °C. The BromoCatch-Brd4 HEK293 cells were then transfected with 2 µg of pCMV LgBiT expression vector (Promega, N2681) using FuGENE HD at a ratio of 1:3 DNA:transfection reagent. The DNA:transfection reagent mixture was incubated for 20 minutes at room temperature, then added dropwise to the BromoCatch-Brd4 HEK293 cells and left to incubate in a humidified incubator with 5% CO2 at 37 °C overnight.

The following day, the transfected BromoCatch-Brd4 HEK293 cells were trypsinised and plated at a density of 2 × 10⁴ cells per well into a non-adherent white-walled 96-well plate in 100 μL of DMEM containing 20 μM Endurazine (Promega), and plates were incubated at 37 °C, 5% CO2, for 2 h. The plate was then imaged for luminescence using a GloMax Discover (Promega) as a baseline, after which MR156, MR170, MZ1, or AGB1 was added at a range of concentrations between 50 µM and 1 µM. A breathable film was placed over the plate, which was imaged continuously every ~5 min for a period of 11 h on a GloMax Discover (Promega) set to 37 °C. The data produced from this experiment were subsequently analysed in GraphPad Prism 10.2.3 using a one-phase decay model to extract the Dmax and degradation T1/2 of each bifunctional compound, as well as the DC50 values for each degrader.
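For orientation, the quantities extracted from a one-phase decay fit can be written out explicitly. The sketch below uses the standard form of the model, Y(t) = plateau + (Y0 - plateau)·exp(-k·t); it does not reproduce the fitting itself, and the parameter values are hypothetical placeholders, not measured data. Defining Dmax as the fractional loss of signal at the plateau is an assumption about how the luminescence was normalised.

```python
import math


def one_phase_decay(t, y0, plateau, k):
    """Signal at time t for a one-phase exponential decay."""
    return plateau + (y0 - plateau) * math.exp(-k * t)


def half_life(k):
    """Time taken to cover half the span between the start and the plateau."""
    return math.log(2) / k


def dmax(plateau, y0=1.0):
    """Maximum fractional degradation relative to the starting signal."""
    return (y0 - plateau) / y0


# Hypothetical parameters for illustration only (fractional RLU; k in 1/h).
Y0, PLATEAU, K = 1.0, 0.2, 0.5

t_half = half_life(K)
print(f"T1/2 = {t_half:.2f} h, Dmax = {dmax(PLATEAU, Y0):.0%}")
print(f"signal at T1/2 = {one_phase_decay(t_half, Y0, PLATEAU, K):.2f}")
```

At t = T1/2 the signal sits halfway between the starting value and the plateau, which is a convenient sanity check on any fitted curve.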
NanoBRET competition ligand binding assay of electrophilic warheads

In order to perform the NanoBRET competition assay, HEK293 cells were first transfected with the pCMV Brd4BD2L387A,E438C-Nanoluciferase vector. One day prior to initiation of the experiment, 1 mL of HEK293 cells was plated at a density of 2 × 10⁶ cells per mL into one well of a 6-well plate, and the cells were incubated at 37 °C, 5% CO2 overnight. To perform the transfection, 2 µg of the Brd4BD2L387A,E438C-Nanoluciferase vector was first added to 100 µL of Opti-MEM media, and 6 µL of FuGENE HD was added for a 1:3 DNA:FuGENE HD ratio; the mixture was gently mixed and left at room temperature for 20 minutes. After this incubation period, the mixture was added dropwise onto the HEK293 cells plated the day prior, which were then left to incubate in a humidified incubator with 5% CO2 at 37 °C for at least 16 hours prior to initiation of experimentation.
For the live cell mode of NanoBRET, the HEK293 cells transfected with the Brd4BD2L387A,E438C-Nanoluciferase vector were trypsinised following standard procedure, counted and resuspended in Opti-MEM media containing 10% FBS at a density of 2.36 × 10⁵ cells per mL in 10 mL of media. 85 µL of this cell-containing media was then plated into every well of a non-adherent white-walled 96-well plate to give a final density of ~2 × 10⁴ cells per well. 5 µL of Opti-MEM media containing 20 µM ET-JQ1-BODIPY probe was added to each well and the plate was incubated for 5 minutes at room temperature on a plate shaker. Following this, 10 µL of a 10x solution of electrophilic warheads in Opti-MEM was added to the cells, with a concentration range from 100 µM to 2 pM, with ET-JQ1-OMe and DMSO (vehicle) used as positive controls. The plate was subsequently incubated at 5% CO2 at 37 °C for 10 minutes. 50 µL of a 3x solution of NanoBRET Nano-Glo® Substrate plus extracellular Nanoluciferase inhibitor was added to each experimental condition, and the plate was imaged on a PHERAstar plate reader for both luminescence and fluorescence using donor emission (460 nm) and acceptor emission (618 nm). The relative BRET ratio was calculated by normalising the data. The data produced from this experiment were subsequently analysed in GraphPad Prism 10.2.3, with competitive binding curves being produced using a "log(inhibitor) vs. response (three parameters)" model to extract the IC50 (nM) for each electrophilic warhead, normalised to the DMSO (vehicle) control.

NanoBRET Residence time of electrophilic warheads

In order to perform the NanoBRET residence time assay, HEK293 cells were first transfected with the pCMV Brd4BD2L387A,E438C-Nanoluciferase vector. One day prior to initiation of the experiment, 1 mL of HEK293 cells was plated at a density of 2 × 10⁶ cells per mL into one well of a 6-well plate, and the cells were incubated at 37 °C, 5% CO2 overnight.
To perform the transfection, 2 µg of the Brd4BD2L387A,E438C-Nanoluciferase vector was first added to 100 µL of Opti-MEM media, and 6 µL of FuGENE HD was added for a 1:3 DNA:FuGENE HD ratio; the mixture was gently mixed and left at room temperature for 20 minutes. After this incubation period, the mixture was added dropwise onto the HEK293 cells plated the day prior, which were then left to incubate in a humidified incubator with 5% CO2 at 37 °C for at least 16 hours prior to initiation of experimentation.

The HEK293 cells transfected with the Brd4BD2L387A,E438C-Nanoluciferase vector were subsequently trypsinised following standard procedure, counted and resuspended in Opti-MEM media containing 10% FBS at a density of 2 × 10⁵ cells per mL, with 1 mL of media in each of 5 separate 15 mL conical tubes. These tubes of cells were then treated with a saturating concentration (250 nM) of MR112, MR116 or ET-JQ1-OMe, with an equivalent volume of DMSO, or left untreated as a non-treatment control. The tubes were then left to equilibrate for 2 hours in a humidified incubator with 5% CO2 at 37 °C. After the 2 hours, cells were spun down and washed with warm Opti-MEM:10% FBS media; this was performed 3 times. Cells were then plated at a density of 2 × 10⁴ cells in 90 µL per well of a white-walled 96-well plate. 100 μl of 2X NanoBRET™ Nano-Glo® Substrate plus Extracellular NanoLuc® Inhibitor Solution was added to each well of the 96-well plate. The plate was then imaged on a GloMax Discover (Promega) plate reader for both luminescence and fluorescence using donor emission (460 nm) and acceptor emission (618 nm) as a baseline measurement. 10 µL of a 20x ET-JQ1-BODIPY tracer (final concentration of 25 µM) was added to each well and the plate was quickly shaken for 10 seconds on a plate shaker. A breathable film was then placed on the plate and the cells were continuously imaged for 120 minutes at 37 °C. The data produced from this experiment were subsequently analysed in GraphPad Prism 10.2.3, with BRET curves being produced using a "One-phase decay" model normalised to the DMSO (vehicle) and non-treatment controls.

Confocal live cell imaging of U2OS, HEK293 FT using C10852S

To perform confocal live cell imaging of C10852S, U2OS and HEK293FT cells were first transfected with the pCMV H2B-BromoCatch vector, based on the smaller 12 kDa BromoCatch tag. Two days prior to initiation of the experiment, 1 mL of cells was plated at a density of 2 × 10⁶ cells per mL into one well of a 6-well plate, and the cells were incubated at 37 °C, 5% CO2 overnight. To perform the transfection, 2 µg of the H2B-BromoCatch vector was first added to 100 µL of Opti-MEM media, and 6 µL of FuGENE HD was added for a 1:3 DNA:FuGENE HD ratio; the mixture was gently mixed and left at room temperature for 20 minutes.
After this incubation period, the mixture was added dropwise onto the cells plated the day prior, which were then left to incubate in a humidified incubator with 5% CO2 at 37 °C for at least 16 hours prior to initiation of experimentation. In parallel, 2 × 10⁶ cells of HEK293FT and U2OS were plated but not transfected with the H2B-BromoCatch vector, to act as a control.

Both the transfected and un-transfected cells were then trypsinised and plated at a density of 4 × 10⁴ cells per well of an ibidi µ-Slide 18 Well chamber microscopy slide and left to adhere for 16 hours. Following adherence, the DMEM media was replaced with Opti-MEM:10% FBS containing either DMSO, 100 nM C10852S, or 100 nM C10852S + 25 µM ET-JQ1-OMe, and the transfected cells were left to incubate for 6 hours (HEK293FT) or 8 hours (U2OS) in a humidified incubator with 5% CO2 at 37 °C. In the un-transfected control, 100 nM C10852S in Opti-MEM:10% FBS was added and the cells were left in a humidified incubator with 5% CO2 at 37 °C for the same length of time as the transfected cells. The cells were then washed gently in warm Opti-MEM:10% FBS media three times, and Hoechst 33342 was added to each well 30 minutes prior to imaging at the manufacturer's specified concentration. Confocal microscopy images were acquired on a Zeiss 880 Airyscan Confocal Microscope using a Plan-Apochromat 63x/1.4 objective lens and equipped with 488/561/633-nm excitation laser lines. Post-acquisition analysis was conducted with Fiji (version 1.54j).

For confocal live cell imaging of C10852L, S, U and T, U2OS cells were transfected with H2B-BromoCatch, based on the 15 kDa tag, and plated in ibidi µ-Slide 18 Well chambered slides as described above. At initiation of experimentation, U2OS cells were treated with 200 nM of each probe or DMSO. In parallel, some wells were pre-incubated for 1 h with 20 µM MR116, washed, and then treated with 200 nM of each probe or DMSO and left to incubate in a humidified incubator with 5% CO2 at 37 °C for 2 hours. The cells were then washed gently in warm Opti-MEM:10% FBS media three times, and Hoechst 33342 was added to each well 30 minutes prior to imaging at the manufacturer's specified concentration. Confocal microscopy images were acquired on a Zeiss 880 Airyscan Confocal Microscope using a Plan-Apochromat 63x/1.4 objective lens and equipped with 488/561/633-nm excitation laser lines. Post-acquisition analysis was conducted with Fiji (version 1.54j).

Results

The present system expands on the established ‘bump-and-hole’ technology, where a ‘hole’ is introduced in the protein tag (mutant) and covalency is added by including a nucleophilic residue (a cysteine), with the complementary ligand containing a ‘bump’ to fill the hole as well as an electrophilic warhead to react with the desired cysteine. These two-fold modifications achieve exquisite specificity and orthogonality of the ligand-mutant protein pair.
Furthermore, the tag was reduced in size, which is of high interest for protein fusion labels because it increases the compatibility of the system. The present investigators have disclosed herein: the chemical synthesis of a wide variety of ligands comprising electrophilic groups in different positions; the expression of complementary stable mutant protein constructs; the demonstration of successful interaction (i.e., covalent bond formation) between said ligands and mutant proteins; and the exemplification of a variety of functional ‘tagged’ compounds (Fig. 1).

Table 1 below summarises the covalency results of several compounds and mutant bromodomains. The percentage modification was measured using INTACT-MS. Brd4BD2 comprises amino acid residues 351 to 460 of SEQ ID No. 1, that is to say, it is a wild type bromodomain. Brd4BD2L387A is a hole-modified mutant bromodomain known as BromoTag. Brd4BD2L387A,E438C and Brd4BD2L387A,M442C both comprise the amino acid residues 351 to 460 of SEQ ID No. 1, but also comprise the listed mutations; that is to say, they correspond to the amino acid sequences of the third aspect of the invention, namely sequences (i)(b) and (i)(c), respectively. The data presented in Table 1 demonstrate that the inventors have prepared several effective ligand-mutant protein pairs. Additionally, several of the prepared ligands are capable of forming a covalent bond with more than one mutant protein. The so-called ‘double-mutants’ Brd4BD2L387A,E438C and Brd4BD2L387A,M442C, each comprising two mutations over the wild type Brd4BD2, are highly effective at forming a covalent bond with several ligands.

Table 1: Covalency results. N.d. = not detected.
MR116 <10% <5% >95% >95%
MR135 N.d. N.d. >95% >95%
Table 2 below summarises results from a thermal shift assay of ligand:protein pairs, as measured by differential scanning fluorimetry (DSF), to demonstrate the stabilising effect of the ligands for the different proteins. The difference in melting temperature (ΔTm) is relative to the protein in the absence of the ligand. The most stabilising ligands were those which provided the greatest ΔTm. The melting temperatures (Tm) for wild type Brd4BD2, BromoTag Brd4BD2L387A, single-mutant Brd4BD2L387C, and double-mutants Brd4BD2L387A,E438C and Brd4BD2L387A,M442C are 51, 50, 49, 49, and 49 °C, respectively (in the absence of any ligand), demonstrating that the mutant proteins have high stability.

Table 2: Stability results. Tm = melting temperature. MR135 measured in Prometheus Panta nanoDSF analyser.
MR111 2.3 ±0.6 8 ±0 24.3 ±0.6 6.7 ±0.6
MR117 3.0 ±0 10 ±0 21.7 ±0.6 7 ±0
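The way a Tm, and hence a ΔTm, is read from a melting curve by the first-derivative method can be illustrated with a short sketch. It assumes that the derivative is expressed as -dF/dT, as is common in qPCR instrument software, so that the unfolding transition of a dye-based curve appears as a minimum. The curves are synthetic sigmoids generated purely for illustration and are not measured data; the apo midpoint of 49 °C is taken from the text, the ligand-bound midpoint is hypothetical, and the function names are illustrative.

```python
import math


def synthetic_melt_curve(tm, temps, width=2.0):
    """Idealised sigmoidal unfolding curve centred on `tm` (illustration only)."""
    return [1.0 / (1.0 + math.exp((tm - t) / width)) for t in temps]


def melting_temperature(temps, fluorescence):
    """Temperature at the minimum of -dF/dT, using central differences."""
    best_temp, best_value = None, float("inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if -slope < best_value:
            best_value, best_temp = -slope, temps[i]
    return best_temp


temps = list(range(25, 96))                      # 25 to 95 degC in 1 degC steps
apo = synthetic_melt_curve(49.0, temps)          # apo midpoint from the text
bound = synthetic_melt_curve(73.0, temps)        # hypothetical ligand-bound curve

tm_apo = melting_temperature(temps, apo)
tm_bound = melting_temperature(temps, bound)
delta_tm = tm_bound - tm_apo
print(f"Tm apo = {tm_apo} degC, Tm bound = {tm_bound} degC, dTm = {delta_tm} degC")
```

With 1 °C steps the estimate is limited to the nearest degree; instrument software typically interpolates or fits the transition for finer resolution.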
Table 3 shows the calculated half-maximal inhibitory concentrations (IC50) for a selection of ligands acting on Brd4BD2L387A,E438C using fluorescence polarisation. Brd4BD2L387A,E438C was first titrated against a 10 nM fluorescent probe to provide a Kd of the probe for the mutant and to determine the probe and protein concentrations to use in the competition assay, which was then used to determine the IC50 values of the inhibitors. These data further demonstrate the high affinity of the ligands disclosed herein for the mutant bromodomains, as many are more potent than the established inhibitor ET-JQ1.

Table 3: Calculated IC50 values for certain inhibitors.
Table 4 below shows the covalency results of several compounds with the ‘single-mutant’ Brd4BD2L387C, i.e., amino acid residues 351 to 460 of SEQ ID No. 1 with the mutation L387C, namely sequence (i)(a). These ligand:protein pairs comprise ligands wherein E1 is an electrophile. The data presented in Table 4 demonstrate that the inventors have prepared several effective ligand-mutant protein pairs.

Table 4: Covalency results. N.d. = not detected.
Modification
Compound Brd4BD2L387A Brd4BD2L387C
Furthermore, Table 5 shows results from a thermal shift assay of ligand:protein pairs, as measured by differential scanning fluorimetry (DSF), to demonstrate the stabilising effect of the ligands for the Brd4BD2L387C mutant.

Table 5: Stability results. Tm = melting temperature.
Both biotin ligands MR131 and MR169 (Fig. 3A) were successfully covalently bound to the mutant bromodomain Brd4BD2L387A,E438C, as demonstrated by the increase in mass of the molecular ion measured by INTACT-MS corresponding to the tag. Fig. 3B shows a general scheme for how a biotin ligand bound to a mutant bromodomain can bind to streptavidin; the protein or mutants were pre-incubated with the biotin ligand (MR169) for 1 hour at room temperature, followed by addition of 10 equivalents of streptavidin. Fig. 3C depicts sample separation by SDS-PAGE gel, showing the disappearance of the band (white arrows) corresponding to both short and long mutants Brd4BD2L387A,E438C when incubated with the ligand and streptavidin. No band disappearance was observed when ligand and streptavidin were incubated with Brd4BD2WT or Brd4BD2L387A, or in the absence of the ligand or streptavidin. This example shows that proteins may be functionalised, in this case biotinylated, through the complementary ligand-mutant bromodomain pairs presented herein.

Fig. 4A shows a general scheme for labelling a mutant bromodomain with a tagged compound comprising an ‘always-on’ fluorescent label. Fig. 4B shows the structure of MR175, a tagged compound of the present disclosure wherein the compound comprises the fluorescent group Cyanine 5. Brd4BD2WT, Brd4BD2L387A, Brd4BD2L387A,E438C or Brd4BD2L387A,M442C were incubated with increasing concentrations of MR175 ligand, from sub-stoichiometric up to excess equivalents, for 1 h. Fig. 4C depicts the protein samples separated by gel electrophoresis; the maximal fluorescence signal remains constant above a stoichiometric ratio of ligand to protein, with any excess unreacted ligand detected at the bottom of the gel. This demonstrates that tagged compounds comprising fluorescent groups can successfully bind to mutant bromodomains, conferring fluorescent properties on the mutant. No labelling was observed for Brd4BD2WT or Brd4BD2L387A, neither of which comprises the necessary nucleophilic group to react with the electrophilic warhead of the tagged compound.

Fig. 5 and Fig. 6 depict the tagged compound C10852L binding to mutant bromodomains. C10852L, with a structure depicted in Fig. 6B, comprises the fluorescent group Janelia Fluor® 635. Fig. 6A shows a general scheme, wherein the ligand is ‘silent’ until bound to the mutant bromodomains. Fig. 5 shows the fluorogenicity of the mutant-ligand pairs of C10852L with Brd4BD2WT, Brd4BD2L387A, Brd4BD2L387A,E438C and Brd4BD2L387A,M442C at a variety of concentrations; proteins (10 µM) were incubated with increasing concentrations of the fluorogenic covalent ligand C10852L, from sub-stoichiometric up to excess, for 1 or 8 h. As expected, the fluorogenic ligand is also capable of binding Brd4BD2L387A reversibly, which results in a switch-on in signal similar to that observed for short-Brd4BD2L387A,E438C and long-Brd4BD2L387A,E438C. Nonetheless, the fluorogenicity conferred on the nucleophilic mutant bromodomains is clear. Fig. 6C depicts the protein samples separated by gel electrophoresis; the maximal fluorescence signal remains constant above a stoichiometric ratio of ligand to protein, with any excess unreacted ligand detected at the bottom of the gel. No labelling was observed for Brd4BD2WT or Brd4BD2L387A, with all the unreacted ligand detected at the bottom of the gel.
These results further demonstrate that tagged compounds comprising fluorescent groups can successfully bind to mutant bromodomains.

Additionally, the presently disclosed system was used to develop a Click chemistry kit (Fig. 7A). An alkyne-containing ligand MR155 (60 µM) (Fig. 7B) was first reacted with mutant bromodomains, including Brd4BD2L387A,E438C (50 µM), and incubated for 1 h in HEPES buffer. This enabled the formation of a tagged compound, covalently bound to a mutant bromodomain, wherein the tag comprises a reactive handle. The compound was then reacted with BDP-Azide or Cy5-Azide (300 µM) (structures shown in Fig. 7B), with THPTA/CuSO4 (5:1, 0.5 mM/0.1 mM). The reaction was mixed at 37 °C for 1 h, and then brought to room temperature for another 16 h. After 16 h, samples were taken and separated by gel electrophoresis, as depicted in Fig. 7C. The results of the gel show the fluorogenicity conferred on the mutant bromodomains bound to the tagged
compounds through the click chemistry methods, demonstrating that the reactive handle successfully reacted with the fluorogenic species comprising BDP or Cy5.

Fig. 8 depicts the targeted protein degradation (TPD) of endogenous HiBiT-Brd4BD2L387A,E438C-Brd4 in HEK293 cells, to demonstrate the capability of tagged compounds disclosed herein, comprising a binder for an E3 ubiquitin ligase, to degrade a tagged protein (Brd4, in this case). Four compounds were tested: positive controls MZ1 and AGB1, and tagged compounds MR156 (recruiting CRBN) and MR170 (recruiting VHL). Fig. 8A shows the results of quantitative live-cell degradation kinetics of HEK293 cells following treatment with DMSO and a 4-fold serial dilution of MZ1, AGB1, MR156, and MR170 over a range of 600 pM to 50 µM. Luminescence (RLU = relative light units) was continuously monitored over an 11 h time period and was plotted normalised to the DMSO control as Fractional RLU. The luminescence, which originates from luciferase, is proportional to the protein's integrity, as luciferase only emits light when the protein remains intact; thus lower RLU indicates higher protein degradation. Data are presented as mean values with error bars representing the SD of two independent repeats. Fig. 8B shows a summary table of degradation rate and Dmax (maximum degradation) values for HiBiT-Brd4BD2L387A,E438C-Brd4 in HEK293. Fig. 8C shows a summary of DC50 values calculated from curve fitting normalised to the DMSO control for each compound concentration. The results demonstrate successful degradation using the tagged compounds disclosed herein. The TPD of endogenous HiBiT-long-Brd4BD2L387A,E438C has also been achieved by the present inventors using MR170 and MR156.

Fig. 9 depicts a melting curve, measured by DSF, of the ligand-mutant protein pair of MR116 co-incubated with a Brd2 mutant, Brd2BD2L383A,D434C.
Brd2BD2L383A,D434C was used for crystallographic studies as a protein homologous to Brd4BD2L387A,E438C. The stability of the Brd2BD2 homologues is comparable to that of their Brd4BD2 counterparts.

To determine the intracellular target engagement of several compounds disclosed herein against the Brd4BD2L387A,E438C tag (sometimes referred to herein as “BromoCatch”), the NanoBRET target engagement assay was utilised. Table 6 below depicts the results of these NanoBRET target engagement assays, displaying the IC50 values of the compounds against BromoCatch-NanoLuciferase in live cells. To do this, a BromoCatch-NanoLuciferase construct was designed, based on the original 15 kDa BromoTag design to ensure stability in cells, and an ET-JQ1-BODIPY tracer was utilised as a tracer for the active site of BromoCatch. A range of covalent compound concentrations (2 pM to 100 µM) was used to displace the tracer, which was read out as a loss of BRET signal in the assay. The cells harbouring the BromoCatch-NanoLuciferase
were first preincubated with 1 µM ET-JQ1-BODIPY, before addition of the compounds. The cells were then left to incubate for 10 minutes prior to readout of the BRET signal. Despite the short incubation time of this assay, it was determined that ligands such as MR112 and MR116 had intracellular IC50 values approximately 6-fold and 3-fold lower, respectively, than that of the reversible ligand ET-JQ1-OMe, indicating the rapid uptake of these ligands (intracellular IC50 values of 14 and 31 nM, respectively, compared to 88 nM for ET-JQ1-OMe).

To provide evidence that ligands MR112 and MR116 were effective due to their covalent binding nature, the inventors utilised a NanoBRET-based residence time assay. As such, the BRET method was reconfigured to measure the relative rates of compound dissociation from BromoCatch-NanoLuciferase. HEK293 cells expressing BromoCatch-NanoLuciferase were first equilibrated with a saturating concentration of 250 nM of MR112 or MR116. The non-covalent inhibitor ET-JQ1-OMe was implemented as a control. HEK293 cells were treated with DMSO (vehicle), ET-JQ1 at 250 nM, MR112 at 250 nM or MR116 at 250 nM for 2 hours, followed by ligand removal and immediate addition of a saturating concentration of 25 µM ET-JQ1-BODIPY tracer. An increase in BRET signal is indicative of tracer binding to the BromoCatch-NanoLuciferase, which cannot occur until after the compound has dissociated. The BRET signal was monitored for 120 minutes, after which the assay was stopped. No increase was observed over the 120-minute experiment for either condition in which MR112 or MR116 was pre-incubated, indicating that the ET-JQ1-BODIPY was unable to displace either ligand and hence that the tag was covalently engaged by the covalent warhead (electrophile) intracellularly.

Table 6: NanoBRET target engagement assay of several compounds disclosed herein against “BromoCatch-NanoLuciferase” (Brd4BD2L387A,E438C tethered to NanoLuciferase)
displaying the IC50 values of the compounds against BromoCatch-NanoLuciferase in live cells.

Entry | Live IC50 (nM)
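The relative potencies implied by the intracellular IC50 values quoted in the description for MR112, MR116 and ET-JQ1-OMe can be worked through as simple arithmetic. The following Python sketch is illustrative only; the function name `fold_more_potent` is hypothetical, and the ratio of IC50 values is used here merely as a rough measure of relative potency under the assay conditions described.

```python
# Intracellular IC50 values (nM) as quoted in the description.
IC50_NM = {
    "MR112": 14.0,
    "MR116": 31.0,
    "ET-JQ1-OMe": 88.0,  # reversible reference ligand
}


def fold_more_potent(ic50_test_nm: float, ic50_reference_nm: float) -> float:
    """Ratio of reference IC50 to test IC50 (values > 1 mean the test ligand is more potent)."""
    if ic50_test_nm <= 0 or ic50_reference_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_reference_nm / ic50_test_nm


reference = IC50_NM["ET-JQ1-OMe"]
for name in ("MR112", "MR116"):
    ratio = fold_more_potent(IC50_NM[name], reference)
    print(f"{name}: {ratio:.1f}-fold lower IC50 than ET-JQ1-OMe")
```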
Fig. 10 depicts the X-ray crystal structure of MR116 bound to Brd2BD2L383A,D434C (333-460). The co-crystal structure showed excellent overlap with 6YTM (Brd2BD2L383V co-crystallised with ET-JQ1-OMe), with an RMSD value for the common structure of 0.43, implying that the binding mode is not affected and the covalent bond is efficiently formed.

Fig. 11 demonstrates further exemplification of the compounds herein for application in fluorescent labelling/imaging. To build an always-on fluorescent probe, the present inventors functionalized the acrylamide compound MR116 via the established carboxylic acid exit vector with a TAMRA fluorophore via a PEG3 linker. The TAMRA probe (MR202) selectively labelled the recombinant Brd4BD2L387A,E438C protein in a dose-dependent manner, displaying full orthogonality over the wild type or the reversible BromoTag protein. The recombinant proteins (10 µM) were co-incubated with increasing concentrations of MR202 (2.5 to 50 µM). The samples were separated by SDS gel electrophoresis and the fluorescence signal was detected. The probe demonstrated high selectivity for Brd4BD2L387A,E438C, and the maximal fluorescence intensity was achieved at stoichiometric levels of protein to probe (1:1, at 10 µM). At sub-stoichiometric levels, the probe is fully consumed in the presence of the cysteine-containing mutant, while above stoichiometric ratios, the fluorescence intensity does not increase, and the excess probe can be detected at the bottom of the gel. No fluorescence signal was observed for Brd4BD2WT or Brd4BD2L387A, demonstrating no labelling of these bromodomains (due to the lack of nucleophilic mutation), with the probe detected at the bottom of the gel. To validate the utility of MR202 for fluorescent detection of intracellular proteins in cell lysates, H2B-BromoCatch constructs were transiently expressed in HEK293-FT cells. The transfected cells were subsequently treated for 2 hours with MR202 (100 nM), and
the cell lysates evaluated by western blots. MR202 specifically labelled the H2B tagged protein with the TAMRA probe, showing no non-specific labelling of any other targets in either WT or transfected cell lines, even when the compound was in a large excess (2.5 µM). Successful transfection and specific H2B detection were confirmed by anti-H2B antibody detection, where only cell lysates from transfected cell lines show the corresponding band for the H2B-BromoCatch construct (31 kDa), while WT HEK293-FT cell lines only express the WT-H2B.

The inventors also functionalized the acrylamide compound MR116 via the established carboxylic acid exit vector with a Cy5 fluorophore via a PEG3 linker (see the description of Fig. 4). The Cy5 probe (MR175) selectively labelled the recombinant Brd4BD2L387A,E438C protein and exhibited effective fluorescence when bound.

As an alternative to the “always on” nature of TAMRA, the present inventors designed a “fluorogenic” or “switch on” probe using a Janelia Fluor 635 NIR fluorophore. This development is depicted in Fig. 12. This fluorophore emits in the far red region, which may overcome problems of cell autofluorescence in the blue and green region of the spectrum, which can affect the signal-to-noise ratio in live imaging experiments. To build this probe, the present inventors exchanged the 5-TAMRA fluorophore in MR202 for Janelia Fluor 635, whilst maintaining the same linker. The JF635 probe C10852S was co-incubated with recombinant protein and the results obtained were consistent with those obtained with MR202. The JF635 fluorophore can be fully switched on by SDS (sodium dodecyl sulfate), which is a useful feature for evaluating the probe (by fluorescence detection) by SDS gel electrophoresis. Additionally, the fluorogenic efficiency was quantified by measuring the fluorescence signal increase in the presence of the recombinant proteins in solution.
The present inventors evaluated the fluorescence emission of C10852S in 50 mM HEPES buffer or in the presence of Brd4BD2L387A,E438C, Brd4BD2L387A or Brd4BD2WT (10 µM). The probe displayed highly specific “switch on” in the presence of the Brd4BD2L387A,E438C cysteine-containing mutant, with low levels of background observed when the probe was in the presence of buffer only, or in the presence of the wild-type protein Brd4BD2WT. The non-covalent tag Brd4BD2L387A also provided switch on when co-incubated with C10852S, consistent with efficient reversible binding, but the maximal signal was substantially higher for the covalent mutant. A solution of 0.1% TFA in ethanol was used as a positive control to fully switch on the JF635.

To demonstrate that BromoCatch can be successfully used to fluorescently label tagged proteins in live cells, the present inventors transfected the 12.8 kDa H2B-BromoCatch fusion protein construct in both U2OS and HEK293FT cells. U2OS and HEK293FT cells were transiently transfected and plated on ibidi µ-Slide 18 Well
microscope slide, left to adhere, and subsequently labelled with 100 nM of the C10852S compound. Confocal imaging showed, in the majority of transfected cells, an increasing intensity of fluorescence upon treatment with 100 nM C10852S over a 6 hour treatment window in HEK293FT cells, and over an 8 hour window in U2OS cells (Fig. 12 E & F). Confocal microscopy images were acquired on a Zeiss 880 Airyscan Confocal Microscope using a Plan-Apochromat 63x/1.4 objective lens. Post-acquisition analysis was conducted with Fiji (version 1.54j). Fluorescent localization was observed strictly in the nucleus, as determined by counterstaining with Hoechst 33342 nuclear stain. It was observed that fluorescence was effectively quenched with the addition of a saturating concentration of 25 µM ET-JQ1-OMe. Further, a lack of switch on of the Janelia Fluor 635 in non-transfected HEK293FT and U2OS cells was observed, indicating (without being bound by theory) that the JF635 can only activate once C10852S engages with the H2B-BromoCatch in cells, highlighting the selectivity of the compounds herein.

The fluorogenicity of further “switch on” fluorescent probes is also disclosed herein (see Fig. 13). In particular, these probes are based on the acrylamide ligand MR116 and the fluorophore Janelia Fluor 635, and bear different linkers (aliphatic or PEG based). Results demonstrate that an increase in fluorescence intensity was observed upon contacting Brd4BD2L387A,E438C for probes C10852K (Jan K), C10852L (Jan L), C10852M (Jan M), C10852N (Jan N), C10852S (Jan S), C10852T (Jan T) and C10852U (Jan U), with low or negligible emissions when contacting the wild-type protein Brd4BD2WT. The inventors have found that the fluorescence exhibited by the C10852L (Jan L) probe is concentration-dependent and that this probe is able to successfully label recombinant Brd4BD2L387A,E438C, exhibiting negligible background signal when the probe is unbound.
The inventors also found that probes C10852L (Jan L), C10852S (Jan S), C10852T (Jan T) and C10852U (Jan U) were particularly effective probes of the mutant bromodomains disclosed herein when used in vitro. The same probes were then tested for their use as fluorogenic probes in live cells. Fig. 14 (A) and (B) show live-cell confocal images of H2B-BromoCatch (based on the original 15 kDa BromoTag design) in U2OS cells treated with the fluorogenic probes. Fig. 14 (A) shows that the probes were successfully switched on within the U2OS cells and exhibited effective fluorescence after a 2 hour treatment with 200 nM of the fluorogenic probes. Fig. 14 (B) is a control experiment: in this, the cells were pre-incubated for 1 hour with 20 µM MR116 before treatment with the probes. In the 1-hour pre-treatment, MR116 is able to covalently bind to the mutant bromodomains, such that the probes that were then added could not themselves bind to the mutant bromodomains and be “switched on”. The lack of fluorescence shows that in
Fig. 14 (A) it is the binding of the probes to the mutant bromodomains that switches the probes on and provides fluorescence.

The alternative fluoroacrylamide warhead for Janelia Fluor 635 based probes was tested by using MR135 as the BromoCatch covalent binder. In Fig. 15, the fluorogenicity of the acrylamide-based probes C10852L (Jan L), C10852S (Jan S) and C10852T (Jan T) was compared to that of the fluoroacrylamide analogues C10852AA (Jan AA), C10852AC (Jan AC) and C10852AD (Jan AD); C10852AB (Jan AB) was tested as an additional PEG2 probe (not synthesised in the acrylamide series). Results demonstrate that the fluoroacrylamide is an appropriate warhead with comparable reactivity to that of the acrylamide analogues. An increase in fluorescence intensity was observed upon contacting the probes with the Brd4BD2L387A,E438C disclosed herein compared to when contacting the compounds with the Brd4BD2WT. The plots labelled TFA and 4% SDS were included as a control experiment: the probes will “switch on” when exposed to these solutions.

Fig. 16 demonstrates that Janelia Fluor 646 fluorophores can also be used to build BromoCatch-specific fluorogenic probes. The acrylamide probe C10852R (Jan R) and fluoroacrylamide-based probes C10852AH (Jan AH) and C10852AI (Jan AI) showed an increase in fluorescence intensity upon contacting mutant Brd4BD2L387A,E438C disclosed herein compared to when contacting the compounds with the Brd4BD2WT. The probes in 0.1% TFA in ethanol were included as a control experiment: the probes will “switch on” in this acidic condition.
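The comparison of probe emission against buffer, wild-type protein and a fully switched-on positive control can be expressed numerically. The following Python sketch is one possible way of doing so and is not taken from the experiments described: the function names `percent_of_positive_control` and `turn_on_ratio` are hypothetical, and the fluorescence readings are invented example values in arbitrary units.

```python
def percent_of_positive_control(signal: float, buffer_only: float,
                                positive_control: float) -> float:
    """Background-subtracted signal as a percentage of the fully switched-on control."""
    span = positive_control - buffer_only
    if span <= 0:
        raise ValueError("positive control must exceed the buffer-only background")
    return 100.0 * (signal - buffer_only) / span


def turn_on_ratio(signal_with_mutant: float, signal_with_wild_type: float) -> float:
    """Fold increase in emission with the nucleophilic mutant relative to wild type."""
    if signal_with_wild_type <= 0:
        raise ValueError("wild-type signal must be positive")
    return signal_with_mutant / signal_with_wild_type


# Hypothetical readings (arbitrary fluorescence units), for illustration only.
readings = {
    "buffer only": 50.0,
    "Brd4BD2WT": 100.0,
    "Brd4BD2L387A,E438C": 800.0,
    "positive control (e.g. 0.1% TFA in ethanol)": 1050.0,
}

mutant = readings["Brd4BD2L387A,E438C"]
pct = percent_of_positive_control(
    mutant,
    readings["buffer only"],
    readings["positive control (e.g. 0.1% TFA in ethanol)"],
)
ratio = turn_on_ratio(mutant, readings["Brd4BD2WT"])
print(f"switch-on: {pct:.0f}% of positive control, {ratio:.1f}-fold over wild type")
```

Normalising to the positive control in this way would allow probes with different linkers or warheads to be compared on a common scale, on the assumption that the control represents the fully switched-on fluorophore.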
Sequences

Number | Name | Amino acid sequence

Q QY Y M E T IV QP PI P H TI PD VS D K P S V Y S QK P P E P LP Q V V P EA QE R RP K RA SR Q R E RI VS F N A A P
PA QPLAKKKGVKRKADTTTPTPTAILAPGSPASPPG
SLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQ W E V M E A G L S V Y LE A G A SE L A TP P D A Y RE A VS Q K K E Q S L SR YL YT S E K A EK E L F F EA LQ P DE
Q SREPSLSNSNPDEIEIDFETLKASTLRELEKYVSA
CLRKRPLKPPAKKIMMSKEELHSQKKQELEKRLL ES F IG N M S RT SF KD F