WO2011102796A1 - Novel synthetic zinc finger proteins and their spatial design - Google Patents
Novel synthetic zinc finger proteins and their spatial design
- Publication number
- WO2011102796A1 (PCT/SE2011/050174)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- library
- zinc finger
- protein
- target
- zfps
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1044—Preparation or screening of libraries displayed on scaffold proteins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4702—Regulators; Modulating activity
Definitions
- the present invention relates to the field of synthetic zinc finger proteins, including methods for their design, preparation and selection.
- the invention also relates to the generation of zinc finger libraries and to methods of making and to uses of said libraries, said zinc finger proteins and/or a nucleic acid encoding said zinc finger proteins.
- ZFPs zinc finger proteins
- The class of proteins known as zinc finger proteins (ZFPs) was originally discovered by Klug et al. in studies of the transcription factor TFIIIA (Choo and Klug, PNAS vol. 91, 1994, pp. 11163-11167).
- ZFPs comprise a domain known as the zinc finger (ZF) domain, which in its turn comprises several (usually three or more than three) independently folded motifs known as ZF motifs.
- ZF domains are known to mediate interaction with DNA.
- ZF domains may also mediate interaction with RNA or protein.
- the secondary structure of the ZF motif is characterized by being arranged in a finger-like structure and it coordinates one or more zinc ions.
- the zinc ions are important for the structural stability of the molecule.
- the ZF motifs may be divided into three types according to the type and order of the zinc co-ordinating residues. ZFPs of the classical Cys2His2 type are estimated to account for about 3% of the genes in the human genome.
- the naturally occurring ZF motif is responsible for bringing a protein into contact with a target molecule. This characteristic of ZF motifs has been exploited in synthetic proteins comprising ZF domains designed to target a selected DNA sequence. By fusing an engineered
- an artificial transcription factor is made which may selectively target and switch on, off or otherwise modulate the activity of a specific gene. Designing such artificial transcription factors capable of modulating gene expression directly at the DNA level is an attractive new approach to controlling cellular processes.
- each finger is selected from a library in the context of its neighbouring finger (Greisman, H. A. & Pabo, C. O. A general strategy for selecting high-affinity zinc finger proteins for diverse DNA target sites. Science 275, 657-61 (1997)).
- Isalan, Klug, and Choo described a bipartite optimization strategy wherein two one-and-a-half ZFs are selected for their binding to the target sequence (Isalan, M., Klug, A. & Choo, Y.
- the parallel method published by Joung and colleagues uses the Sp1 protein as a scaffold, and inserts alpha helices known to have affinity towards the base-pairs in the target sequence (Hurt, J. A., Thibodeau, S. A., Hirsh, A. S., Pabo, C. O. & Joung, J. K.
- the fourth method is known as the modular method. It attempts to apply the principle of selection.
- ZF tandems are constructed from a panel of 140 pre-determined ZF modules (collected from the literature) in accordance with the target DNA sequence of interest.
- ZF modules with known specificities are aligned in new orders and made to bind to new selected target DNA sequences.
- Libraries constructed using ZF monomers with pre-defined binding chemistries are thus biased in terms of their complexity and applicability.
- Partial randomisation yields biased ZF libraries that are limited in terms of variety of the ZF candidates in it. Thus, it limits the chances of designing candidate ZF tandems with superior binding affinities to desired target DNA sequences. Partial randomisation similarly limits chances for targeting of all possible combinations of nucleotides in a genome, because of the introduced bias in the library.
- ZFP libraries are screened against rigidly selected DNA target sequences, where no flexibility to the core binding sequences is given. This is a biased approach that cannot efficiently yield ZFP candidates with optimal binding qualities, since they may not exist in the library.
- ZFPs designed using these methods cannot effectively bind to the target DNA sequences in the cell, since the target sequence may be epigenetically inaccessible. In the best case, this makes the process of ZFP design more laborious and expensive.
- the construction of synthetic ZFPs also involves other problems. The ideal transcription factor would have high specificity and affinity to the target sequence.
- tandem repeats of the ZF motif such as 6 or 9 repeats can be strung after each other, leading to ZF tandems which could theoretically recognize target sequences of up to 18 base pairs.
- the natural periodicity of the ZF repeat does not exactly match the periodicity of the bases on the DNA double helix and thus extensions of the peptide beyond three fingers tend to get out of register.
- the conventional methods of construction mentioned above do not take this shift into account, and are therefore less suitable for the generation of synthetic ZFPs longer than three ZF modules.
- the preferred target sequence may or may not consist of a contiguous sequence. This is most often the case when the target sequence is in a protein. In this case it is desirable to create synthetic ZFPs which tolerate gaps in the target binding sequence. Again, the above-mentioned methods do not provide for allowing gaps of varying sizes in the binding site. Yet another limitation of the conventional methods is that they yield ZFPs which
- ZFPs can recognize and bind to other proteins and peptides with significant affinity and specificity; these ZFPs can have one, two or more ZF modules.
- Protein-ZFP type of interaction cannot be characterized. Proteins are biological macromolecules of much higher complexity than nucleic acids. Therefore, unlike targeting DNA sequences, it is impossible to start with a limited number of pre-defined ZF modules in order to generate ZFPs that are specific to a target protein / peptide.
- ZFPs specifically targeting proteins and peptides of interest can be potentially used to stabilize unstable, intrinsically disordered proteins; to cause mis- / un-folding of proteins; to inhibit interaction of proteins with the partner compounds; and other purposes with potential therapeutic applications.
- the invention in one embodiment relates to a method for the synthesis of zinc finger proteins comprising the steps of: selecting a scaffolding protein, generating a zinc finger monomer library through artificial gene synthesis, providing a multimerization vector, placing said library into said multimerization vector, selecting the minimal/flexible target sequence from the sequence of interest, defining the spatial factor for the target sequence, and multimerizing the monomer library in order to generate sub-libraries of multimeric zinc finger proteins, wherein no cell transformation is involved in said multimerization.
- the said method may further comprise the steps of introducing a spacer element between two candidate ZF tandems wherein i.) two zinc finger monomers are chosen as anchors, ii.) a new sub-library randomizing the spacer region between these anchors is generated and iii.) said new sub-library is screened against the gapped target sequence.
- Methods of the invention yield ZF monomers and/or tandems, which display high and specific binding to sequences irrespective of the nucleotide content.
- methods of the invention yield ZF monomers and/or tandems, which display high and specific binding to target proteins and peptides.
- the high affinity and specificity is achieved in part by the use of length-adjustable linkers, to yield spatially enabled monomers and their tandems.
- the method utilizes different strategies for end-joining of ZF monomers during their multimerization.
- the core of the invention is summarized with the following components:
- Figure 1 Diagram illustrating spatial design. Striped regions indicate the adjustable linker region. Cross-hatched regions indicate the Randomized binding region. N indicates the number of monomers present. A indicates a library of spatially enabled ZF monomers, with n number of members. B indicates the multimerization of monomers. The distance x may or may not be equal to distance y which in turn may or may not be equal to the distance z. C indicates two possible spatially enabled ZF tandems which are outcomes of the
- the distance a may or may not be equal to distance b which in turn may or may not be equal to the distance c.
- Figure 2 Flowchart depicting the steps of an example of the method. Hexagons indicate steps which both provide information from which conclusions may be drawn. See Example 8 for description of the steps.
- Figure 3 Example of construction and multimerization of ZF library.
- A- indicates a ZFC vector.
- B indicates ZF monomer library (artificially synthesized gene with preferred pre-definitions). Upright bars indicate randomized residues in the binding and linker regions. The number of randomized residues is pre-determined by the user.
- the 3-mer library can be further multimerized to yield larger libraries, i.e. 4-mer, 5-mer, 6-mer etc.; 10) The multimerized library can now be used for selection through Ribosome Display (coupled to transcription) and further evaluation. Note: Digestion with AgeI and Kpn2I yields compatible sticky ends.
- Table 4 DNA targeting using ZF monomer library. Table 4 shows the binding of generated ZF 3-mers to different target sequences. The experiment is described in Example 10.
- Table 5 DNA targeting using ZF monomer library with introduction of Spatial Factor (SF). Table 5 shows the binding of generated ZF 3-mers to different target sequences. The experiment is described in Example 11.
- SF Spatial Factor
- Table 6 DNA targeting using ZF monomer library with introduction of spacers Table 6 shows the binding of generated ZF 3-mers to different target sequences. The experiment is described in Example 12.
- FIG. 4 Multimerization of ZF monomers using scanning end-joining
- A- indicates a ZFC vector.
- B indicates ZF monomer library (artificially synthesized gene with preferred pre-definitions). Striped and hatched regions indicate linkers and spacers, respectively.
- C ZF 1-mer library;
- D ZF 1-mer library with scanning linkers;
- E ZF 2-mer library with scanning linkers; non-palindromic restriction sites can scan against each other and end-join at various positions thus generating a variety of spacers;
- F ZF 2-mer library with variable spacer;
- nnn is the variety of spacers generated due to scanning end-joining. Steps: 1) Digestion with restriction enzymes ⁇ and BglII; 2) Ligation; 3) Digestion with non-palindromic enzyme RE2, dephosphorylation; 4) Ligation with ZF unit library digested with non-palindromic restriction enzymes RE1 and/or RE2; 5) Closing ligation gaps using DNA polymerase, and elimination of out-of-frame multimers; 6) Continue multimerization for higher library complexity; 7) Proceed to screening and selection. For the other annotations see Figure 3.
- Figure 5 Complexity of ZFP library
- Figure 5 demonstrates the complexity of a ZFP library generated using ligation with one-to-one complementarity, scanning end-joining, and scanning end-joining in combination with anchoring ZFs. The example is described in Example 18.
- Figure 6 Multimerization of ZF monomers
- Figure 6 demonstrates and compares multimerization of ZF monomers using classical ligation with one-to-one palindromic complementarity, and scanning end-joining using non-palindromic linkers. The experiment is described in Example 19.
- Figure 7 shows how Spatial Factor contributes to selection of ZFPs with the most optimal binding properties to a target DNA sequence of interest. The experiment is described in Example 20.
- Figure 8 Flowchart of library screening against a target protein or peptide.
- Figure 8 demonstrates how the ZFP libraries generated using this invention can be screened against a target protein or peptide. The experiment is described in Example 21. This scheme is analogous to the scheme in Figure 2, which is described in Example 8.
- Figure 9 Screening with the use of anchoring ZFs for DNA targeting.
- Figure 9 shows how the invention can be used with anchoring ZFs for selection of spacers for spanning gaps between several binding subsites, in combination with scanning end-joining. The experiment is demonstrated in Example 22.
- Figure 10 Screening with the use of anchoring ZFs for protein targeting.
- Figure 10 demonstrates how ZFPs of the invention can be applied for targeting of proteins using anchoring ZFPs. The experiment is demonstrated in Example 23.
- Figure 11 Targeting proteins using ZFPs.
- Figure 11 demonstrates how ZFPs of the invention can be applied for targeting proteins with the purpose of interfering with protein folding, stabilization and interaction with its partners. The experiment is demonstrated in Example 24.
- Table 7 Protein / peptide targeting using ZF library. Table 7 shows the binding of generated ZF 2-mers to different target proteins. Table 7 also shows the binding of derived ZF 4-mers to the same target proteins. The experiment is described in Example 25.
- ZF corresponds to "zinc finger".
- ZFP corresponds to "zinc finger protein".
- the term "zinc finger protein", or ZFP, refers to a protein comprising a ZF domain.
- the ZFP may or may not also comprise further domains, for example an effector domain such as a repressor or an activator.
- ZFPs may bind DNA, RNA or protein via their ZF domains.
- zinc finger domain refers to the domain within a ZFP which comprises one or more ZF motifs.
- the terms "zinc finger motif", "zinc finger module" and "zinc finger monomer" are used interchangeably and refer to the protein sequence comprising an interaction domain which coordinates one or more zinc ions. The terms refer also to a single ZF module.
- the ZF monomer may or may not comprise further amino acids, such as flanking amino acids, linker amino acids, or spacer amino acids, (q.v.).
- ZF tandem may also comprise a single ZF monomer.
- This term is also referred to as "zinc finger polypeptide” in the art. This term can be used inter-changeably with the term “zinc finger protein", which is described above.
- linker refers to a sequence of amino acids between ZF monomers; a linker may or may not contain a spacer within it.
- spacer refers to a sequence of amino acids optionally inserted into the linker between ZF monomers with the purpose of providing structural flexibility to linked ZF tandems in binding to target molecules.
- Recognition subsites or “subsites” refers to 3-basepair triplets of a target DNA sequence, such that each ZF module has affinity to a specific recognition subsite.
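The subsite definition above can be illustrated with a minimal Python sketch that cuts a contiguous DNA target into 3-basepair recognition subsites, one per ZF module. The function name `split_into_subsites` and the example sequence are hypothetical, introduced here only for illustration; strand orientation and gapped targets are not modelled.

```python
def split_into_subsites(target: str, subsite_len: int = 3) -> list[str]:
    """Split a contiguous DNA target into consecutive recognition subsites."""
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    if len(target) % subsite_len != 0:
        raise ValueError("target length must be a multiple of the subsite length")
    return [target[i:i + subsite_len] for i in range(0, len(target), subsite_len)]


# Hypothetical 9-bp target: three subsites, so three ZF modules would be needed.
subsites = split_into_subsites("GCGTGGGCG")
print(subsites)       # ['GCG', 'TGG', 'GCG']
print(len(subsites))  # 3
```

Under the same one-module-per-triplet assumption, an 18-bp target would call for a tandem of six ZF modules.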
- artificial transcription factor refers to a fusion protein comprising at least one ZF motif and an effector domain.
- flexible target sequence (also referred to as “minimal / flexible DNA sequence”) refers to a target sequence, which includes extra nucleotides/residues on its flanks for the purpose of spatial design of ZF tandems.
- spatial design refers to unbiased and free-style construction of zinc-finger tandems where the key engineering parameters (such as length and composition of linker, length and composition of scaffolding ZF monomer etc.), that are believed to determine quality of zinc finger tandems, are not pre-determined, or largely neglected, and instead chosen randomly from pool of many possible combinations.
- SF refers to the degree of flexibility in a chosen (DNA or protein) sequence, i.e. it is the number of nucleotides / base pairs that are added to a target sequence for the purpose of allowing flexibility in targeting.
- SF defines the length of the flexible target sequence and thus defines flexibility for screening for potential ZF candidate tandems.
- the higher the SF, the more flexible the core target sequence will be, and thus the more comprehensive the master ZF monomer library will be, and the more robust the selection process will be.
- self-defining ability refers to the ability of a selected minimal/ flexible target DNA sequence to select from a library of many possible combinations the ZF tandems that have the most optimal binding properties, i.e. specificity and affinity.
- ligation refers to the molecular process by means of which ZF monomers are stitched together during the process of multimerization.
- end-joining refers to the molecular process by means of which ZF monomers are stitched together during the process of multimerization. This term can be used interchangeably with the term "ligation”.
- scanning end-joining refers to the process by means of which non-palindromic (and usually non-complementary) DNA sticky ends can be end-joined. Scanning end-joining enables multimerization of ZF monomers at various ligation positions in relation to each other.
- scanning linker refers to ZF linker region carrying a non-palindromic sticky end, which can lead to "scanning end-joining".
- gaps refers to a distance in DNA / gene or protein / peptide, which interrupts the binding subsites of two sets of ZFs. Such gaps are to be spanned by spacers.
- cell may refer to any type of cell, prokaryotic or eukaryotic, including plant cells, and includes cultured and primary cells.
- a cell culture may refer to the culture of any type of cell, such as primary cells and/or immortalised cells.
- the present invention provides methods which in their design largely disregard any co-operative effect between ZF modules and instead generate ZF libraries on the monomer level.
- the methods of the invention rely on statistical power of the ZF libraries to select the best spatial design of ZF tandem which provides the best binding to the target sequence.
- This invention enables construction of ZFP libraries with superior complexities, in most cases higher than 10^10 variants, such as more than 10^10, more than 10^11, or more than 10^12; between 10^10 and 10^12; between 10^10 and 10^15, such as between 10^10 and 10^20 variants; such as more than 10^15 variants, such as more than 10^20 variants. This implies approximately 5x10^6 times as many transcription factors as there are genes in, for example, the human genome.
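The ratio quoted above can be checked with a short calculation. The sketch below assumes a library of 10^11 variants (one of the complexities named in the text) and roughly 20,000 human genes; the gene count is an assumption introduced here and is not stated in the source.

```python
library_variants = 10**11      # one of the library complexities named in the text
assumed_human_genes = 20_000   # assumption for illustration, not from the source

# Number of library variants available per gene.
ratio = library_variants // assumed_human_genes
print(f"{ratio:.1e}")  # 5.0e+06
```

Under these assumptions the result agrees with the "approximately 5x10^6" figure; other library sizes in the stated range scale the ratio proportionally.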
- the invention is used to generate highly complex libraries of artificial ZFP transcription factors, out of which one can select the optimal ZFPs specific to any DNA sequence / gene and protein / peptide. It has surprisingly been found that using a library of ZF monomers randomized at the monomer level enables the resulting ZF tandems to bind all nucleotides, including purine and pyrimidine residues. Thus ZF tandems generated using this approach can bind DNA sequences represented as NNN, whereas previously ZFPs were said to bind mainly GNN or ANN. By virtue of the high library complexity, this invention similarly enables screening and selection for ZFPs that are specific to target proteins / peptides of interest. A further advantage of the invention is that it eliminates the need for laborious and time-consuming cell-based steps by instead using ligation-based steps. This greatly improves the speed and efficiency of constructing ZFPs.
- the spatial factors of the methods of the invention are two-fold.
- the monomers are spatially enabled by introduction of flanking length-adjustable spacers.
- the spatial factor applied to the target sequence is in the form of including extra flanking basepairs to influence the selection process.
- Introduction of length-adjustable spacers becomes possible during the multimerization process thanks to non-palindromic restriction sequences, which generate scanning linkers that can lead to scanning end-joining, and thus to variety inside the linker region.
- a target sequence is selected, and a library of ZF monomers is created and multimerized.
- the library is screened against the target sequence and the best binding candidates selected. These can be further joined, with spacers, and the process iterated.
- the unique features include: very high complexity of the ZFP library (higher than 10^10 variants); unbiased multimerization of ZF monomers; and the possibility of scanning end-joining and thus length-adjustable spacers that can be inserted between linkers.
- a target protein or peptide is selected, and a library of ZF monomers is created, that can optionally be multimerized.
- the library is screened against the target and the best binding candidates are selected.
- the initial steps of the methods involve the selection of a target sequence. For example, a sequence in the promoter of a gene of interest may be chosen. In other cases, a target sequence from a protein or RNA may be chosen. This selected sequence is the core target sequence.
- the spatial factor refers to the inclusion of base pairs adjoining the core target sequence to create the flexible target sequence. This in effect loosens the rigidity with which the sequence is screened against the ZF library and provides more flexibility when the ZF library screens for best binding ZF tandems.
- Spatial factor is one of the central features of this invention.
- long segments of a DNA sequence of interest can be screened against the multimerized ZF library in search for ZFP candidates with the most optimal binding properties.
- This feature eliminates the need for laborious screening for "open" chromatin regions that are accessible for artificial transcription factors.
- this feature provides the possibility to screen for potential binding locations inside a gene of interest depending on cellular context. In most cases, 2-3 amino acid residues are allowed per 1 base pair of the corresponding spacer DNA in the target sequence. However, any number may be used.
- SF is selected from the nucleotides that are adjacent to either side of the target sequence of interest.
- the choice of SF nucleotides i.e. how many nucleotides are chosen from left and right side
- when SF=1, the extra nucleotide can be added to either flank of the core target sequence.
- when SF>1, all of the extra nucleotides can be added to either flank of the core target sequence, or they can be split and added to both flanks.
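The ways of distributing the SF nucleotides over the two flanks can be enumerated with a minimal Python sketch. It assumes the extra nucleotides are taken from the sequence immediately adjacent to the core target, as described above; the function name `flexible_targets` and the example promoter fragment are hypothetical.

```python
def flexible_targets(context: str, core_start: int, core_len: int, sf: int):
    """Return (left, right, sequence) for each split of sf flank nucleotides."""
    results = []
    core_end = core_start + core_len
    for left in range(sf + 1):
        right = sf - left
        # Skip splits that would run past the available flanking sequence.
        if core_start - left < 0 or core_end + right > len(context):
            continue
        results.append((left, right, context[core_start - left:core_end + right]))
    return results


context = "TTACGCGTGGGCGATC"  # hypothetical promoter fragment
core = "GCGTGGGCG"            # hypothetical core target sequence
start = context.index(core)

for left, right, seq in flexible_targets(context, start, len(core), sf=2):
    print(left, right, seq)
# 0 2 GCGTGGGCGAT
# 1 1 CGCGTGGGCGA
# 2 0 ACGCGTGGGCG
```

Each flexible target sequence has the length of the core plus SF, so SF=2 on a 9-bp core yields 11-bp candidates.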
- SF does not affect complexity of the generated ZF monomer library, since additional randomized residues need not be introduced.
- choosing SF>0 provides an option of introducing a spacer into a linker in between ZF monomers, so as to provide extra flexibility for accessing the preferred target sequence. This option can be used either instead of adding extra nucleotides to a core target sequence or in combination with it.
- the complexity of ZF monomer library increases by the factor of introduced residues. For example, if a preferred spacer is one amino acid long, then the complexity of the corresponding ZF monomer library will increase by a factor of 64 (4x4x4) due to introduction of additional NNN.
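The factor of 64 follows from the four possible nucleotides at each of the three positions of a randomized NNN codon. A minimal Python sketch of how this factor grows with spacer length is given below; the function name `nnn_complexity_factor` is hypothetical, and the count is at the DNA level (the number of distinct amino acid sequences is smaller because the genetic code is degenerate).

```python
def nnn_complexity_factor(randomized_residues: int) -> int:
    """DNA-level complexity factor for fully randomized NNN codons."""
    if randomized_residues < 0:
        raise ValueError("number of randomized residues must be non-negative")
    return (4 ** 3) ** randomized_residues


for k in range(1, 4):
    print(k, nnn_complexity_factor(k))
# 1 64
# 2 4096
# 3 262144
```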
- spatial factor is set as "undefined".
- the monomers of the library can be schematically represented as:
- X is any amino acid residue
- X_L is any amino acid residue in the linker region
- X_S is any amino acid residue in the spacer
- X_R is any amino acid residue in the randomized internal region
- n is any number between 1-30
- n is any number between 0-300
- Subscript numbers represent the interval of allowable amino acid residues.
- Two internal areas may carry randomised residues (X_R1 and X_R2).
- the linker region and the spacer region may also be randomised, see below.
- the nature of the target molecule (DNA, RNA, or protein) is taken into account when choosing the scaffold molecule on which to build the ZF monomer library.
- naturally occurring zinc-finger proteins are used as templates for construction of ZF monomer libraries.
- template proteins are chosen from the group comprising WT1, Sp1, Zif268, FOG1, Ikaros, LIM, and MYND.
- other naturally occurring zinc-finger proteins may be used.
- the invention does not completely rely on a naturally occurring ZFP as a scaffold for library generation. Instead, the library is generated by chemical synthesis of ZF monomers which are 'adapted' to resemble different types of ZF families such as LIM,
- Such adaptations may be, for example, limited randomization of some parts of the proteins while preserving the native amino acid sequence in the linker region and regions preceding or following the amino acids that determine binding specificity.
- the user may choose to randomize internal residues at positions in the alpha helical region of the ZF backbone known to determine the binding specificity of that particular ZF module.
- the positions of the alpha helices are conventionally numbered from -1 to 9. Any number of positions including null, and any combination of positions may be selected for randomization.
- the linker region between ZF monomer units also comprises spacers (X_L and X_S respectively in the schema above).
- the user may choose to randomize residues in either or both of these regions.
- Methods of the invention thus may comprise the step of introducing a spacer element between two candidate ZF tandems wherein i.) two zinc finger monomers are chosen as anchors, ii.) a new sub-library randomizing the spacer region between these anchors is generated and iii.) said new sub-library is screened against the gapped target sequence.
- the size and content of the linkers that bridge ZF monomers are preferably pre-determined and are selected according to the intended use of the ZF tandem to be constructed (i.e.
- pre-determined linkers may be chosen from a panel of naturally occurring linkers.
- Such naturally occurring linkers typically consist of five residues, such as: TGEKP (SEQ ID NO: 46), TGSKP (SEQ ID NO: 53), GFRDP (SEQ ID NO: 54), GYRDP (SEQ ID NO: 55), GYENP (SEQ ID NO: 56), SCDDV (SEQ ID NO: 57), GLKNP (SEQ ID NO: 58), GYFNP (SEQ ID NO: 59), TPGNP (SEQ ID NO: 60), GDSGP (SEQ ID NO: 61) etc.
- longer linkers may also be preferred.
- the linkers may be included during artificial synthesis of the ZF monomer. Spacers suitable for proteins and RNA may be either selected through randomisation or taken from the prior art.
- the spacers which may be included in the linker are also called length-adjustable spacers.
- length-adjustable spacers placed inside the linkers can be either predetermined or be introduced through complete or partial randomisation of this region.
- Randomisation of the linker region can be done by introduction of spacers at the initial level during construction of the ZF monomer library, but can also be introduced at a later stage at the joining of two tandems. In some methods, randomisation is introduced at both levels. Both methods are described.
- Length-adjustable spacers can comprise any length, combination and repetition of preferably these amino acids: K, T, S, G, D, Q, E and P.
- the length-adjustable spacers comprise amino acid stretches of from 1 to 300 residues in length.
- length-adjustable spacers are inserted between the 3rd and 4th residues of a linker, for example: TGE(G)nKP (SEQ ID NO: 62), TGE(KTG)nKP (SEQ ID NO: 63), TGE(KTS)nKP (SEQ ID NO: 64), TGE(GGGS)RP (SEQ ID NO: 65).
- length-adjustable spacers may be inserted at any position in the linker.
- length-adjustable spacers consisting of 2-3 amino acid residues are used.
- length-adjustable spacers are (G)n (SEQ ID NO: 66), (KTS)n (SEQ ID NO: 67), (KTG)n (SEQ ID NO: 68), (GGGS)n (SEQ ID NO: 69), (GSEP)n (SEQ ID NO: 70), (DGGGS)n (SEQ ID NO: 71), (LRQKDGRP)n (SEQ ID NO: 72), (LRQKDGGGSRP)n (SEQ ID NO: 73) etc.
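As an illustration of how a repeated spacer unit extends a linker, the following minimal Python sketch inserts n copies of a spacer unit after the third residue of a five-residue linker. The function name `insert_spacer` and its default insertion position are hypothetical choices made here; the text also allows insertion at any other position in the linker.

```python
def insert_spacer(linker: str, unit: str, repeats: int, position: int = 3) -> str:
    """Insert `repeats` copies of `unit` into `linker` after `position` residues."""
    if not 0 <= position <= len(linker):
        raise ValueError("position must lie within the linker")
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return linker[:position] + unit * repeats + linker[position:]


# TGEKP linker with (GGGS)n spacers of increasing length.
for n in range(3):
    print(n, insert_spacer("TGEKP", "GGGS", n))
# 0 TGEKP
# 1 TGEGGGSKP
# 2 TGEGGGSGGGSKP
```

Varying `repeats` gives a family of linkers of different lengths from a single spacer unit, which is the sense in which the spacers are length-adjustable.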
- length-adjustable spacers are added to the ends of the ZF monomers during construction of the monomers by artificial gene synthesis. The length-adjustable spacers will also then be included during the multimerization process.
- length-adjustable spacers can be included after the initial round of selection of best binding ZF monomers / tandems. In this embodiment, the selected ZF monomers/ tandems are used as anchors for binding to the target molecules, while spacers are randomized. This approach is more suitable to generate ZF tandems for the purpose of adjoining target proteins.
- length adjustable spacers are introduced both at the initial construction of the ZF monomer and after the initial round of selection.
- introduction of length-adjustable spacers is enabled through insertion of non-palindromic restriction sites during the synthesis of the ZF library.
- Upon digestion with the corresponding restriction enzymes, these sites generate long non-complementary overhangs that enable scanning end-joining during multimerization of ZF monomers. Scanning end-joining of these sites in turn generates variety inside the spacer region.
- restriction enzymes recognizing non-palindromic sequences are: BstXI, DraIII, EcoNI, PflMI, SfiI, Alw, ApaBI, BstAPI, BglI, DrdI, BslI, MwoI etc.
- length-adjustable spacers inside linkers are important for spanning gaps between potential ZF binding sites in a target protein or peptide. Variety of such spacers provides the possibility for selecting the most optimal combination of ZFs to spacers.
- the ZF monomer is preferably generated through artificial gene synthesis. (See
- the ZF monomer is introduced into a specialised vector creating a library of ZF monomers (see Figure 3).
- Said specialised vector is suitable for the ensuing multimerization of the ZF monomers, creating a library of ZF multimers.
- the vector comprises elements such that when independent ZF monomers are mixed and matched, they nevertheless encode continuous ZF tandems without frame-shifts or other interruptions when recombined.
- the multimerization vector comprises a specialized combination of restriction sites (such as for example AgeI, Kpn2I, BglII) at the flanks of ZF monomers, such that they not only yield compatible sticky ends, but, when recombined, also encode the amino acids that are compatible with the structure and biology of ZFPs.
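The idea of compatible sticky ends that recombine into coding sequence can be sketched in Python using the standard recognition sites of AgeI (A^CCGGT), Kpn2I (T^CCGGA) and BglII (A^GATCT). The helper names and the choice of reading frame (codons starting at the first base of the joined site) are assumptions made for illustration; the source does not specify the frame used in the vector.

```python
from itertools import product

# Recognition site and top-strand cut position (bases left of the cut).
SITES = {"AgeI": ("ACCGGT", 1), "Kpn2I": ("TCCGGA", 1), "BglII": ("AGATCT", 1)}

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def overhang(enzyme: str) -> str:
    """5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def junction(upstream: str, downstream: str) -> str:
    """Sequence formed when an upstream-cut end is ligated to a downstream-cut end."""
    if overhang(upstream) != overhang(downstream):
        raise ValueError(f"{upstream} and {downstream} ends are not compatible")
    up_site, up_cut = SITES[upstream]
    down_site, down_cut = SITES[downstream]
    return up_site[:up_cut] + down_site[down_cut:]


def translate(dna: str) -> str:
    """Translate in the assumed frame, starting at the first base."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


hybrid = junction("AgeI", "Kpn2I")
print(overhang("AgeI"), overhang("Kpn2I"))  # CCGG CCGG
print(hybrid, translate(hybrid))            # ACCGGA TG
```

In this assumed frame the AgeI/Kpn2I junction encodes Thr-Gly, and the hybrid sequence ACCGGA is no longer a site for either enzyme, so the joined product is not re-cut. BglII leaves a GATC overhang and is not compatible with the other two.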
- restriction sites such as for example Agel, Kpn2 ⁇ , BgHT
- the vector comprises a tag suitable for the later purification of the protein, selected from the group comprising His-tag, BCCP (biotin carboxyl carrier protein)-tag, Myc (c-Myc)-tag, Calmodulin-tag, Calmodulin-binding protein (CBP)-tag, FLAG (FLAG octapeptide)-tag, HA (hemagglutinin)-tag, His (histidine)-tag, Maltose binding protein-tag, Nus-tag, Glutathione-S-transferase (GST)-tag, Green fluorescent protein (GFP)-tag, Thioredoxin-tag, S-tag, Softag 1, Softag 3, Strep-tag, T7-tag, Chitin binding protein-tag, Xylanase 10A-tag and SBP (streptavidin-binding peptide)-tag.
- the multimerization vector comprises the following features: T7-Kozak-tag sequence: T7 promoter, Kozak sequence and His-tag; MCS (multiple cloning site): NcoI, SmaI, BamHI, Kpn2I, XhoI, NheI, BglII, respectively; Spacer R.D.: spacer for ribosome display; and AmpR: ampicillin resistance gene.
- the multimerization vector is the vector ZFC (SEQ ID NO: 74).
Insertion of monomer to specialised vector
- the monomer and the vector are digested with the appropriate restriction enzymes and allowed to ligate.
- the enzymes ⁇ and BglII are used in digestion.
- the library of ZF monomers is then fed directly into selection step against the target sequence, without multimerisation.
- the monomer library is not multimerised.
- the monomers are multimerised without the use of a specialised vector, see figure 1.
- a specialised vector is used. See above for description of the specialised vectors.
- the monomer library is not multimerised by ligation process described. Instead, length adjustable spacers are inserted to bridge the monomers (see above).
- the monomer library is multimerized to generate sub-libraries of ZF multimers. The user may determine the extent of the multimerization process.
- the sub-libraries may comprise from 1 to 100 repeats, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 repeats; for example 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 repeats; for example 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 repeats; or for example 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 repeats.
- the sublibraries may for further example comprise 50, 70, 80, or 90 repeats.
- multimerization does not include a step for transformation of any host cell. Instead, an ex vivo approach is employed whereby multimerization of the ZF monomers or multimers is performed in a chemical environment, without involvement of live host cells. This also contributes to the efficiency of the method, and renders it scalable to high-throughput dimensions.
- This feature of the invention allows generating libraries of superior variety / representation, as well as screening the libraries at superior speeds. Similarly, this feature of the invention allows coupling the (screening of) generated libraries to further high throughput processes.
- multimerization includes a step for transformation of a host cell, such as prokaryote cell, such as E. coli.
- this invention allows generation of multiple more specialized sub-libraries that can be used for spatial design of ZF tandems specific to target macromolecules other than DNA, especially proteins.
- this invention allows building libraries of ZF multimers
- these ZF multimers can be used for screening against any rigidly selected or flexibly selected DNA sequences. Thus, no bias is involved in the multimerization process.
- spatially designed ZF tandems from specialized sub-libraries can potentially be used for adjoining macromolecules of similar or different classes (e.g. DNA-protein, protein-protein etc.).
- scanning end-joining can be used. To be able to use it, non-palindromic restriction sites need to be introduced into the spacer regions of ZF linkers during the synthesis of the ZF library. Scanning end-joining enables
- a step is inserted in a method of the invention wherein said step consists of a screening step for selection of ZF candidates with best binding.
- This step may be inserted before the multimerization step, and/or after defining the spatial factor for the target sequence.
- One or more screening methods may be employed. Examples of suitable screening methods are: ribosome display, two-hybrid system, one-hybrid system, protein-protein interaction-based screening, mRNA display, phage display, yeast display (or yeast surface display), and bacterial display (or bacteria display or bacterial surface display).
- one or more chemical screening methods are used. This contributes to the efficiency of the method, and renders it scalable to high-throughput dimensions.
- ribosome display screening is used (Zahnd et al, 2007, Ribosome display: selecting and evolving proteins in vitro that specifically bind to a target, Nat
- a step is inserted in a method of the invention wherein said step consists of verifying the DNA sequences encoding the ZF candidates with best binding.
- a method suitable for verifying the DNA sequences is sequencing of the DNA sequences. Methods for performing such sequencing are well known in the art.
- This step may be inserted before multimerization step. In one embodiment this step is inserted after defining the spatial factor and before multimerization. In one embodiment this step is ribosome display screening.
- insertion of length-adjustable spacers between the selected best binding ZF tandems may also be performed at this stage.
- two best binding ZF multimers are selected as anchors and a new sub-library randomizing the spacer region between these anchors is generated at DNA level. This new sub-library is screened against the 'gapped' target sequence.
- a chemical linker is used to connect synthetically produced ZF monomers or multimers.
- Such flexible linkers are known to persons of skill in the art.
- polyethylene glycol linkers are available commercially.
- the best binding ZF tandems are expressed and purified.
- the tandems are purified using the His-tag.
- Other suitable purification methods may be used.
- the multimerisation vector may be modified to accommodate this (see above).
- the method of the invention comprises the step of expression-purification of selected zinc finger tandems as zinc finger protein.
- ChIP-Seq (ChIP: chromatin immunoprecipitation)
- ChIP-on-chip
- ChIP-PET (paired-end ditag)
- ChIP-PCR (polymerase chain reaction)
- DNA microarray
- Two-hybrid system, One-hybrid system
- Blotting, for example Southern blot or hybridization.
- ChIP-seq is employed.
- Methods include SPR (surface plasmon resonance) methods, the Scatchard method, any interactant titration (depletion)-based assay, isothermal titration calorimetry (ITC), microcalorimetry, X-ray crystallography, NMR (nuclear magnetic resonance), microwell-based assays (similar to this: Hallikas O, Taipale J., 2006, High-throughput assay for determining specificity and affinity of protein-DNA binding interactions, Nat Protoc. 2006;1(1):215-22), and PCR-based assays (similar to this: Roy M. Pollock, 2001, Determination of Protein-DNA Sequence Specificity by PCR-Assisted Binding Site Selection, Curr Protoc Mol Biol. 2001 May; Chapter 12:Unit 12.11).
- Biacore or EMSA (electrophoretic mobility shift assay)
- the invention further relates to methods comprising the step of addition of an effector domain to the selected zinc finger protein to form a hybrid zinc finger protein.
- the invention also relates to a method comprising the step of transformation of nucleotides encoding a hybrid zinc finger protein into a host.
- Further steps may include the addition of effector domain (activator or repressor) to the selected ZF candidates, and optionally the transformation of the hybrid ZFP into a cell system of interest.
- Any reporter-based gene activity assay may be used, including GFP (green fluorescent protein), LacZ (beta-galactosidase-based system), GUS (beta-glucuronidase), CAT (chloramphenicol acetyltransferase) and chemiluminescence-based assays.
- the activity of the target gene is monitored through luciferase and GFP assays.
- the method of the invention comprises the step of transformation of nucleotides encoding a hybrid zinc finger protein into a host. Examples of such a host are a cell system of interest, a plant, a eukaryotic cell, a prokaryotic cell and an animal.
- the invention also relates to a zinc finger protein or fragments thereof obtained by the methods of the invention.
- the invention relates to methods of producing ZFPs and to ZFPs comprising a ZF domain obtainable or obtained by the method described above.
- the invention relates to the products obtainable or obtained by the methods of the invention.
- This includes, but is not limited to, the generated library of ZF monomers, the library of ZF multimers and to ZFPs generated by the described methods.
- the invention also relates to ZFPs obtainable or obtained by the method, wherein said ZFP has a target selected from the group comprising DNA, RNA and protein.
- the invention also relates to hybrid ZFPs, in which the ZF domains have specificity for different targets selected from the group comprising DNA, RNA and protein.
- the hybrid ZFPs may comprise an effector domain selected from the group consisting of activators and repressors.
- the invention relates to ZFPs which comprise two, three or more ZF monomers, for example 3-meric, 4-meric, 5-meric, 6-meric, 7-meric, 8-meric, 9-meric, 10-meric, 11- meric, 12-meric, 13-meric, 14-meric, 15-meric, 16-meric, 17-meric, 18-meric, 19-meric or 20-meric ZFPs.
- the invention relates further to ZFPs which comprise three, four or more ZF monomers, wherein said ZFPs have a dissociation constant (equilibrium constant), K_D, lower than 10⁻⁶.
- ZFPs have a binding constant of 10⁻⁸ or lower, for example in the range of 10⁻⁹ to 10⁻¹⁵, such as 10⁻⁹, 10⁻¹⁰, 10⁻¹¹, 10⁻¹², 10⁻¹³, 10⁻¹⁴ or 10⁻¹⁵.
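To give a feel for what these dissociation constants imply, the following minimal sketch computes the fraction of target sites occupied at equilibrium. It assumes a simple 1:1 binding model with the ZFP in excess over its target, which the text does not specify; the function name `fraction_bound` and the example concentration are hypothetical.

```python
def fraction_bound(zfp_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of target occupied, assuming simple 1:1 binding.

    Assumption (not from the source): free ZFP concentration is approximately
    the total ZFP concentration, i.e. the ZFP is in excess over its target.
    """
    if zfp_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and K_D must be > 0")
    return zfp_conc_molar / (zfp_conc_molar + kd_molar)


if __name__ == "__main__":
    conc = 1e-9  # hypothetical ZFP concentration of 1 nM
    for kd in (1e-6, 1e-9, 1e-12):
        print(f"K_D = {kd:.0e} M -> fraction bound = {fraction_bound(conc, kd):.4f}")
```

Under this model, half of the target is occupied when the ZFP concentration equals K_D, so a lower K_D means the same occupancy is reached at a lower ZFP concentration.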
- the invention relates to ZFPs generated by the methods as well as nucleic acids encoding for ZF monomers or multimers generated by the methods.
- the invention further relates to libraries of ZF monomer, libraries of ZF multimers and to sub-libraries of linkers and/or spacers which are generated by the method.
- the invention in another aspect relates to the use of the generated ZFPs for the manufacture of medicaments. Synthetic ZFPs are useful as endogenous regulators of malfunctioning genes.
- one embodiment of the invention is the use of the ZFP or fragments thereof in the manufacture of medicaments.
- nucleotides encoding the ZFP or fragments thereof obtained by the methods of the invention are used in the manufacture of a medicament.
- Yet another embodiment of the invention is the transformed host comprising the ZFP or nucleotides encoding the ZFP.
- ZFPs obtainable or obtained by the methods that bind to a particular target gene, and the nucleic acids encoding them, can be used for a variety of applications. These applications include therapeutic methods in which a ZFP or a nucleic acid encoding it is administered to a subject and used to modulate the expression of a target gene within the subject.
- the modulation can be in the form of repression, for example, when the target gene resides in a pathological infecting microorganism, or in an endogenous gene of the patient, such as an oncogene or viral receptor, that is contributing to a disease state.
- the modulation can be in the form of activation when activation of expression or increased expression of an endogenous cellular gene can ameliorate a diseased state.
- ZFPs, or more typically, nucleic acids encoding them are formulated with a pharmaceutically acceptable carrier as a pharmaceutical composition.
- Pharmaceutically acceptable carriers are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition (see, e.g., Remington's Pharmaceutical Sciences, 17th ed. 1985).
- the dose administered to a patient should be sufficient to effect a beneficial therapeutic response in the patient over time.
- the dose is determined by the efficacy and K_D of the particular ZFP employed, the target cell, and the condition of the patient, as well as the body weight or surface area of the patient to be treated.
- the size of the dose also is determined by the existence, nature, and extent of any adverse side-effects that accompany the administration of a particular compound or vector in a particular patient.
- ZFPs are used in diagnostic methods for sequence specific detection of target nucleic acid in a sample.
- ZFPs can be used to detect variant alleles associated with a disease or phenotype in patient samples.
- ZFPs can be used to detect the presence of particular mRNA species or cDNA in a complex mixture of mRNAs or cDNAs.
- ZFPs can be used to quantify copy number of a gene in a sample. For example, detection of loss of one copy of a p53 gene in a clinical sample is an indicator of susceptibility to cancer.
- ZFPs are used to detect the presence of pathological microorganisms in clinical samples.
- a suitable format for performing diagnostic assays employs ZFPs linked to a domain that allows immobilization of the ZFP on an ELISA plate.
- the immobilized ZFP is contacted with a sample suspected of containing a target nucleic acid under conditions in which binding can occur.
- nucleic acids in the sample are labelled (e.g., in the course of PCR amplification).
- unlabelled probes can be detected using a second labelled probe. After washing, bound labelled nucleic acids are detected.
- ZFPs also can be used for assays to determine the phenotype and function of gene expression.
- Current methodologies for determination of gene function rely primarily upon either over- expression or removing (knocking out completely) the gene of interest from its natural biological setting and observing the effects. The phenotypic effects observed indicate the role of the gene in the biological system.
- an advantage of ZFP-mediated regulation of a gene relative to conventional knockout analysis is that expression of the ZFP can be placed under small molecule control.
- By controlling expression levels of the ZFPs, one can in turn control the expression levels of a gene regulated by the ZFP to determine what degree of repression or stimulation of expression is required to achieve a given phenotypic or biochemical effect.
- This approach has particular value for drug development.
- problems of embryonic lethality and developmental compensation can be avoided by switching on the ZFP repressor at a later stage in mouse development and observing the effects in the adult animal.
- Transgenic mice having target genes regulated by a ZFP can be produced by integration of the nucleic acid encoding the ZFP at any site in trans to the target gene. Accordingly, homologous recombination is not required for integration of the nucleic acid. Further, because the ZFP is trans-dominant, only one chromosomal copy is needed and therefore functional knock-out animals can be produced without backcrossing.
- the ZFPs obtainable or obtained by the method are useful as investigative tools in mapping of gene function.
- the present invention relates to a library of ZF monomers generated by the methods of the invention. More specifically, the invention relates to a novel library of DNA sequences which encodes a library of independent zinc-finger peptide monomers, each monomer containing two unique regions:
- the invention also relates to a multimerization vector useful in the method and specifically to the vector described by the SEQ ID NO: 74
- the invention relates to a kit of parts comprising a library, or selected parts or part thereof and a vector.
- the invention relates to the multimerization vector according to claim 1 step c, to the ZF library according to claim 1 step d and to the multimerized ZF library according to claim 1 step g.
- Insertion of length-adjusted spacers is enabled through the use of scanning end-joining during multimerization of ZFs.
- non-palindromic restriction enzymes are used for generation of variable sticky ends that generates a variety of spacers.
- alternative techniques can be used for generation of such variable sticky ends.
- ZF flanks can be treated with single strand nucleases such as RecJ.
- the invention features construction of ZFP libraries with superior complexities, as
- this invention provides a robust platform based on which ZFPs specific to target proteins and peptides of interest can be designed and selected.
- ZFPs are known to specifically bind to proteins and peptides.
- this invention enables screening and selection of ZFPs specific to proteins and peptides of interest.
- ZFPs obtained or selected using this invention can be used for structural and energetic stabilization of proteins, including intrinsically disordered regions. Many proteins in the cell exist in an unfolded/dynamic state, and many of them co-fold together with their partner proteins upon interaction. This invention enables screening and selection of ZFPs that bind specifically to various interfaces on such proteins. These ZFPs can interfere with functions of such unfolded/dynamic proteins.
- ZFPs obtained or selected using this invention can be used as causing agents for mis-folding or un-folding of target proteins and peptides. These ZFPs can be further developed for therapeutic purposes.
- ZFPs obtained or selected using this invention can be used as causing agents for inhibition of interaction of target proteins and peptides with their partner molecules/ligands such as DNA, RNA, proteins, small molecules etc. These ZFPs can be further developed for therapeutic purposes.
- the invention also relates to fragments of ZFPs obtainable or obtained by the method, wherein said fragments of ZFPs carry specificity- determining residues for target molecules, such as DNA, RNA and protein / peptide. These fragments can be combinations of amino acid retrieved from selected ZFPs. For example, these fragments can be portions of the alpha-helix or beta-sheets that comprise ZFs.
- In many cases, specific interaction of ZFPs with their target molecules is enabled by several critical residues, which reside in certain fragments within ZFPs. These fragments of ZFPs can still retain their biochemical integrity and specificity to their molecular targets, despite being dissected from their core components.
- fragments of ZFPs can be further modified to adapt them as required for downstream purposes.
- modifications can be addition of non-natural amino acids, functional groups, stabilizing bridges etc.
- fragments of ZFPs can be used for therapeutic purposes and manufacturing of medicaments.
- Using fragments of ZFPs instead of whole ZFPs may in some cases provide pharmacological advantages, such as lower molecular weight, longer bioavailability, efficient drug delivery and so on.
- the invention also relates to gene activators and repressors generated using ZFPs obtainable or obtained by the method, wherein said ZFPs have a target selected from the group comprising DNA, RNA and protein / peptide.
- ZFPs can be fused to activator (such as p64) or repressor (such as KRAB) functional groups to yield libraries of ZF activators (ZFA) and/or repressors (ZFR), respectively.
- ZFA and ZFR libraries can be used for stimulation of new phenotypes, cell events etc. without any prior knowledge about their targets. Cells displaying the new phenotypes and cell events of interest can then be selected, and the causing ZFAs and ZFRs can be used for therapeutic purposes.
- ZFAs and ZFRs can be similarly used for generation of new cell lines.
- the invention also relates to cells, cell cultures and cell lines generated using ZFPs, ZFAs, ZFRs or fragments of these obtainable or obtained by the method.
- the cells and/or cell lines are primary cultures, ex vivo cultures and cell lines, such as immortalized cell lines.
- the cells may be of prokaryotic origin or eukaryotic origin. Examples include
- ZFPs, ZFAs, ZFRs or fragments thereof can lead to biochemical, proteomic, genomic, epigenetic and other changes within the cell that might lead to its differentiation to other cell lineages. In many cases, these new cell lineages can be regarded as new cell lines with potential therapeutic uses.
- these cell lines can be used for therapeutic purposes, manufacturing of cell therapies and other pharmaceutical needs.
- Cell-based therapies are shown to be effective for short and long-term amelioration of symptoms in many diseases, such as diabetes, leukemias etc.
- Custom cell lines are similarly shown to be useful for screening of drug compounds and their pre-clinical evaluation.
- the invention relates to a transgenic organism comprising the zinc finger protein or nucleotide according to the invention.
- Said organism may be a cell, prokaryotic or eukaryotic, plant or animal.
- the invention further relates to the use of the zinc finger proteins and/or fragments, and/or a nucleotide encoding said ZFP or fragments thereof, and/or a cell comprising said proteins or fragments or nucleotides, in the manufacture of medicaments.
- the multimerization vector is produced by modification of the pUC19 vector, which is commercially available.
- the method of construction comprises the following steps:
- pUC19 vector is digested with SacI and XbaI; vector backbone is recovered from gel.
- The following oligonucleotides (SEQ ID NOs 1 and 2) are synthesized, hybridised and digested with SacI and XbaI; adaptor backbone is recovered from gel.
- Steps 1, 2 and 3 introduced the multiple restriction sites and a necessary coding region.
- pUC19-m1 is digested with KpnI and NcoI; vector backbone is recovered from gel.
- the following oligonucleotides SEQ ID NO:3 and SEQ ID NO:4 are synthesized, hybridised and digested with KpnI and NcoI; adaptor backbone is recovered from gel.
- Steps 4, 5 and 6 introduced T7 promoter, Kozak region and 6xHIS-tag.
- pUC19-m2 is digested with BglII and SalI; vector backbone is recovered from gel.
- PETm-12 vector is digested with BglII and SalI; approx. 282 base pair fragment is recovered from gel. (Map and sequence of PETm-12 vector is available at:
- Steps 7, 8 and 9 introduced a spacer region encoding a random non-helical peptide for ribosome display.
- the targeted sequence can either be 5'-aCTAAGGTGC-3' (SEQ ID NO:6) or
- the targeted sequence can either be 5'-aaCTAAGGTGC-3' (SEQ ID NO:8) (5'-
- the targeted sequence can either be
- a new DNA sub-library can be generated to encode these particular two 3-ZF tandems separated by a randomized spacer region in between, such as TTSILTD-tgekp-QNGTLNE-tgekp-QRAKLER-tge-NNNN-kp-RSEKLVR-tgekp-TKARLER-tgekp-SVDNLTE (SEQ ID NO:23).
- This latter generated sub-library will provide all the possible amino acid
- a new DNA sub-library can be generated to encode these particular two 2-ZF tandems separated by a randomized spacer region in between, such as SRKHLVD-tgekp-QSGQLTE-tge-NNNN-kp-RSEHLNN-tgekp-SKRVLTE (SEQ ID NO:28).
- This latter generated sub-library will provide all the possible amino acid combinations to bridge the gap TT between the two initially targeted 6-basepair target sites. Screening of this sub-library yielded an optimal spacer fragment GSST (SEQ ID NO:29) to bridge the TT gap.
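The spacer sub-library described above can be sketched as follows. This is an illustration only: it assumes that each N in the tge-NNNN-kp notation stands for one randomized residue drawn from all 20 standard amino acids, and the function names `spacer_library_size` and `build_tandem` are hypothetical. The recognition helices and the GSST spacer are those given in the text.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def spacer_library_size(spacer_length: int) -> int:
    """Number of distinct spacers if every position is fully randomized."""
    return len(AMINO_ACIDS) ** spacer_length


def build_tandem(left_helices, right_helices, spacer, linker="tgekp"):
    """Join two anchor ZF tandems with a spacer placed inside the bridging linker.

    The spacer is inserted after the first three linker residues, following
    the tge-NNNN-kp notation used in the text.
    """
    if any(residue not in AMINO_ACIDS for residue in spacer):
        raise ValueError("spacer must contain standard one-letter amino acid codes")
    bridge = f"{linker[:3]}-{spacer}-{linker[3:]}"
    left = f"-{linker}-".join(left_helices)
    right = f"-{linker}-".join(right_helices)
    return f"{left}-{bridge}-{right}"


if __name__ == "__main__":
    print(spacer_library_size(4))
    print(build_tandem(["SRKHLVD", "QSGQLTE"], ["RSEHLNN", "SKRVLTE"], "GSST"))
```

With four fully randomized positions the sub-library holds 20^4 = 160,000 possible spacers, of which GSST is the one reported as optimal for the TT gap.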
- the spatial factor provides the flexibility with which the ZF library screens for best binding ZF tandems.
- ChIP-Seq: chromatin immunoprecipitation coupled with sequencing
- other equivalent assay for verification of binding specificity, i.e. to verify whether the candidate ZFP binds to the gene of interest in the context of the whole genome.
- linker that is, X L and define spacer, that is, regions X s .
- Several preferable linkers are TGE(X)nKP (SEQ ID NO:33), GFR(X)nDP (SEQ ID NO:34), GYR(X)nDP (SEQ ID NO:35), GYE(X)nNP (SEQ ID NO:36), SCD(X)nDV (SEQ ID NO:37), GLK(X)nNP (SEQ ID NO:38), GYF(X)nNP (SEQ ID NO:39), TPG(X)nNP (SEQ ID NO:40), GDS(X)nGP (SEQ ID NO:41) etc.
- a ZF monomer library is generated from the template of ZF2 of Wilms Tumour protein, which is a C2H2-type ZFP.
- the following ZF monomer template is synthesized:
- Randomized residues within the alpha-helical region positions -1, 1, 2, 3, 5 and 6 (total 6 positions) - indicated as X in the above sequence.
- Canonical TGEKP (SEQ ID NO:46) linker is preferred.
- the library was multimerized to the 3-mer state and was used to target the following 9-basepair DNA sequences, which represent various degrees of GC and AT content.
- GGGCTGTTG GGCCCGTCT
- GGGGGACCA GGGGGACCA
- ATTTTGGTA AGGGTTTTT
- SF Spatial Factor
- the ZF 3-mer library is used to target several sequences (underlined) from the human (Homo sapiens) OCT-4 gene (refer to Table 4).
- the sequences are chosen to represent various percentage content of purines and pyrimidines.
- the DNA binding properties of the finally selected best 3-ZF tandems are shown in the column 'BIACORE results'.
- BIACORE is a surface plasmon resonance-based technique that measures interaction kinetics between molecules, in this case, target DNA and candidate 3-mer ZF tandems.
- Table 1 Fragment from the promoter of Human OCT-4 gene (SEQ ID NO:48) AGGGCTGTTGGCTTTGGACAGAATGTCCAAGCAGTCAGGCCTGTCTCAGC -651
- a ZF monomer library is generated from the template of ZF 2 of Wilms Tumour protein, which is a C2H2-type ZFP.
- the following ZF monomer template is synthesized:
- Randomized residues within the alpha-helical region positions -1, 1, 2, 3, 5 and 6 (total 6 positions) - indicated as X in the above sequence.
- Canonical TGEKP SEQ ID NO:46
- the library was multimerized to the 3-mer state.
- the ZF 3-mer library is used to target a single 9-basepair DNA sequence (underlined, bold) from the human (Homo sapiens) OCT-4 gene.
- Various Spatial Factors (SF) are preferred for targeting this DNA sequence.
- the DNA binding properties of the finally selected best 3-ZF tandems are shown in the column 'BIACORE results'. See Table 5.
- SF Spatial Factor
- SF allows for the 'self-defining' ability of selected flexible DNA target sequence to prefer the best-binding 3- mer ZF tandems.
- the selected 3-mer ZF tandems are similar in content to the ones selected in Example 10.
- BIACORE results demonstrate that 3-mer ZF tandems selected against spatially enabled target DNA sequence have affinities in the nanomolar (10⁻⁹ M) range.
- a ZF monomer library is generated from the template of ZF 2 of Wilms Tumour protein, which is a C2H2-type ZFP.
- the following ZF monomer template is synthesized:
- Randomized residues within the alpha-helical region positions -1, 1, 2, 3, 5 and 6 (total 6 positions) - indicated as X in the above sequence.
- Canonical TGEKP SEQ ID NO:46
- the library was multimerized to 2-mer and 3-mer state.
- the ZF 2-mer and 3-mer libraries were used to target the following 6- and 9-basepair DNA sequences (underlined) from the human (Homo sapiens) OCT-4 gene (SEQ ID NO:50).
- the corresponding best binding 2-ZF tandems were used as anchors to create a new library where a randomized spacer region (encoding 4 amino acid residues) is introduced into the TG-EKP (after G) linker to span the gap between 6-basepair and 9-basepair subsites.
- the finally selected ZF tandems comprise two sets of 2-ZF tandems separated by spacers. The DNA binding properties of these tandems are shown in the column 'BIACORE results'. See Table 6 for results.
- GATA1 is a multi-functional protein complex that is involved in regulation of many mammalian genes in the erythropoiesis process. Another established fact is that the four C- terminal ZFs of FOG1 specifically bind to the GATA1 protein in this complex.
- using FOG1 ZFs as templates, a narrow ZF library is generated that is broadly specific to GATA1-like proteins.
- using this library, one generates another library of hybrid ZFPs with two sets of ZF tandems: one set contains the four native FOG1 ZFs (specific to GATA1), and the other set contains engineered ZFs from the generated narrow library (broadly specific to GATA1-like proteins). When introduced into the cell, these hybrid ZFPs recruit many potential transcription factors into the GATA1-like complexes and thus broadly up-regulate erythropoiesis genes.
- Native ZFs (of PHD type) of the protein ING2 specifically bind to the protein H3K4me3, which is critically involved in chromatin regulation.
- Native ZFs (of PHD type) of the protein BPTF also specifically bind to the protein H3K4me3.
- this hybrid ZFP is used in the cell to increase recruitment of the protein H3K4me3, and thus alter / enhance efficiency of chromatin regulation.
- Native ZFs (of RING type) of the protein Rbx1 are involved in specific binding to the protein Cul1.
- Native ZFs (of RanBP type) of the protein Npl4 are involved in specific binding to ubiquitin.
- this hybrid ZFP can be used in the cell to increase recruitment of the ubiquitination proteins and thus alter / enhance efficiency of the ubiquitination process.
- the following examples 16 and 17 illustrate how the current invention can be used for generation of hybrid ZF tandems in which one cluster of ZF is protein-specific, and the other cluster of ZFs is DNA-specific.
- ZFs (especially C-terminal clusters) of transcription factors from the Ikaros or ROAZ family are involved in homo-dimerization, i.e. dimerization with itself.
- the first hybrid ZFP has two spacer-linked ZF clusters such that one ZF cluster contains native homo-dimerizing ZFs, and the other cluster is selected from a library of DNA-binding ZF monomers to specifically bind to a hypothetical DNA sequence 1.
- the second hybrid ZFP has two spacer-linked ZF clusters such that one ZF cluster contains native homo-dimerizing ZFs, and the other cluster is selected from a library of DNA-binding ZF monomers to specifically bind to a hypothetical DNA sequence 2 (which is in close proximity to sequence 1).
- sequence 1 and sequence 2 are identical to each other.
- This property can be used to target a gene with elevated specificity, or to increase occupancy of a gene promoter (e.g. to block its
- GATA1 is a multi-functional protein complex that is involved in regulation of many mammalian genes in the erythropoiesis process. Another established fact is that the four C- terminal ZFs of FOG1 specifically bind to the GATA1 protein in this complex.
- This cluster of four ZFs can be used to construct a hybrid ZFP where the other spacer-linked cluster is selected from a library of DNA-binding ZF monomers to specifically bind to a DNA sequence within a desired gene promoter. When introduced into the cell, this hybrid ZFP can potentially increase occupancy of the promoter and thus alter / enhance activity of the gene of interest through FOG1-mediated recruitment of the GATA1 complex to it.
- A Complexity of ZFP library when using ligation with one-to-one complementarity.
- A1 ZF monomer library with 6 randomized amino acid positions.
- A3 ZF tri-mer library constructed from the library in A1 and A2.
- A4 Theoretical complexity of ZF library, constructed from the above libraries, multimerized m times is (6.4 × 10⁷)ᵐ.
- B Complexity of ZFP library when using scanning end-joining.
- B1 ZF monomer library with 6 randomized amino acid positions, and a scanning linker with a 7-basepair non-palindromic restriction site.
- the 7-basepair non-palindromic restriction site will generate 1 or 2 randomized amino acid residues in the linker region, thus, adding a 400-fold complexity to the library.
- C Complexity when using scanning end-joining in combination with anchoring ZFs (non-randomized, binding target defined).
- C1 Two anchoring ZFs with pre-defined binding properties, and a linker with 7-basepair non-palindromic restriction site.
- C2 Complexity of ZF di-mer library, constructed using ZFs in C1.
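The complexity figures quoted in this legend follow from counting amino acid combinations, as the short sketch below shows. It assumes that all 20 standard amino acids are allowed at every randomized position; the function names are hypothetical and not taken from the source.

```python
# Illustrative only: all 20 standard amino acids are assumed to be allowed
# at every randomized position.
N_AMINO_ACIDS = 20


def monomer_complexity(randomized_positions: int = 6) -> int:
    """Number of distinct ZF monomers for the given count of randomized positions."""
    return N_AMINO_ACIDS ** randomized_positions


def multimer_complexity(m: int, randomized_positions: int = 6) -> int:
    """Theoretical complexity after multimerizing the monomer library m times."""
    return monomer_complexity(randomized_positions) ** m


def linker_factor(randomized_linker_residues: int = 2) -> int:
    """Extra complexity from randomized residues in one scanning linker."""
    return N_AMINO_ACIDS ** randomized_linker_residues


if __name__ == "__main__":
    print(f"monomer library: {monomer_complexity():.1e}")
    print(f"tri-mer library: {multimer_complexity(3):.1e}")
    print(f"factor per scanning linker (2 residues): {linker_factor()}")
```

Six randomized positions give 20^6 = 6.4 × 10^7 monomers, and two randomized linker residues give the 400-fold factor mentioned for the 7-basepair non-palindromic site.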
- RE is a palindromic restriction enzyme that produces only TGAC type sticky ends.
- multimerization using classical one-to-one complementary palindromic sticky ends produces ZF multimers with non-variable linkers.
- randomized residues will exist only within the ZF body, not in the linker region.
- Steps: 1 - Digestion with restriction enzymes RE1 and RE2, which cut at sequences RS1 and RS2 respectively; 2 - Scanning end-joining; 3 - Non-palindromic linkers can scan along each other and end-join at various positions, thus producing a variety of spacers; 4 - End-joining of spacers at various positions.
- RE1 and RE2 are two non-palindromic restriction enzymes that produce a variety of sticky ends. End-joining of ZF monomers with such linkers provides a variety of complementary positions at which the linkers can ligate. Hence, this produces ZF tandems with randomized residues not only within the ZF body, but also within the linker region. See Figure 6.
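
The idea of linkers "scanning along each other" can be illustrated with a toy model. This is only a sketch under stated assumptions: both linker ends are written as top-strand 5'→3' strings, so that annealing of complementary overhangs is modeled as a suffix of the left end matching a prefix of the right end. The sequences and the function name `scanning_joins` are hypothetical and are not the actual restriction sites or linkers of the invention.

```python
# Toy model of scanning end-joining: enumerate every register at which
# two linker ends can overlap and join, and report the resulting spacer.
from typing import List, Tuple


def scanning_joins(left: str, right: str, min_overlap: int = 1) -> List[Tuple[int, str]]:
    """Return (overlap length, joined spacer) for every valid joining register."""
    joins = []
    for k in range(min_overlap, min(len(left), len(right)) + 1):
        # A join is possible when the last k bases of the left end
        # match the first k bases of the right end.
        if left[-k:] == right[:k]:
            joins.append((k, left + right[k:]))
    return joins


if __name__ == "__main__":
    # Repetitive (hypothetical) linker ends can join in several registers,
    # giving spacers of different lengths.
    for overlap, spacer in scanning_joins("GGTGGT", "GGTGGTTCT"):
        print(overlap, spacer, len(spacer))
    # A fixed four-base end, as in classical one-to-one ligation,
    # joins in a single register only.
    print(scanning_joins("ACGTGAC", "TGACCA"))
```

In this model, the repetitive ends join in two registers and give spacers of 12 and 9 bases, whereas the fixed four-base end gives exactly one product, mirroring the contrast between variable and non-variable linkers.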
- A: SF 0 (i.e. the target sequence is chosen rigidly).
- a 9-basepair rigidly selected core target DNA sequence (a) is of interest.
- a library of ZF tri-mers (b) is used for selection of best binding ZFPs. ZF tri-mers bind to the 9 basepair DNA sequence only in one possible orientation. Steps: 1 - Combination of target DNA and ZFP library; 2 - screening process.
- B: SF 1.
- the same DNA sequence as above is chosen with only 1 nucleotide flanking either side of the core target sequence. The extra nucleotides that provide flexibility are indicated in bold italic.
- C: SF Undefined (i.e. a core target sequence is not defined). Indicated with (c) is a target DNA in which core target sequence is not defined. ZF tri-mer library will scan along the whole DNA sequence (with undefined core target) and thus the screening process will select only ZFPs with the most optimal binding properties, which bind anywhere within the provided DNA sequence. See Figure 7.
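
One way to read the spatial factor (SF) is as the number of registers in which a ZF tri-mer may sit on the target. The sketch below enumerates the candidate 9-basepair windows for SF 0, SF 1 and an undefined SF. It is an interpretation, not the method of the source: the target sequence, the function name `candidate_sites` and its parameters are hypothetical.

```python
# Enumerate candidate binding windows for a ZF tri-mer (9 bp footprint)
# under different spatial factors (SF).
from typing import List, Optional


def candidate_sites(target: str, core_len: int = 9,
                    core_start: Optional[int] = None,
                    sf: Optional[int] = None) -> List[str]:
    """Return the windows of length core_len that the library may bind.

    sf = 0       -> only the rigid core window
    sf = n       -> core window shifted by up to n nucleotides either way
    sf undefined -> every window along the whole target (pass sf=None)
    """
    last_start = len(target) - core_len
    if last_start < 0:
        return []
    if core_start is None or sf is None:
        starts = range(0, last_start + 1)
    else:
        starts = range(max(0, core_start - sf),
                       min(last_start, core_start + sf) + 1)
    return [target[s:s + core_len] for s in starts]


if __name__ == "__main__":
    target = "AAGCGTGGGCGTTAC"  # hypothetical 15 bp target
    print(candidate_sites(target, core_start=3, sf=0))
    print(candidate_sites(target, core_start=3, sf=1))
    print(len(candidate_sites(target)))
```

With SF 0 there is a single window, with SF 1 there are three, and with an undefined SF every window along the target is a candidate, so the screen itself decides where the best binders sit.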
- A target protein that carries binding subsites for ZFP1 and ZFP2, (b) and (c), respectively.
- the two binding subsites are interrupted by a gapping space (d).
- B ZFP1 and ZFP2 are used as anchors; scanning end-joining is used to select the optimal linker to bridge the gap between the target binding sites of ZFP1 and ZFP2. Refer to Figure 6 to see how scanning end-joining works. Refer to Figure 8 to see how screening with the use of anchoring ZFs works. Thus, the optimal length-adjusted linker (e) for the ZFP1-ZFP2 tandem is selected to span the gap between their binding subsites. See Figure 10.
- Example 24
- ZFPs of this invention for targeting proteins with the purpose of interfering with protein folding, stabilization and interaction with partner molecules.
- A Stabilizing of a target protein (a), which contains a structurally unstable / unfolded / disordered region (1), as shown by the sharp-end arrow.
- a designed three-finger ZFP (b) is used. Targeting of this protein yields a stable protein-ZFP complex (c), as shown by the blunt-end arrow.
- B Myc-Max complex (d) that is formed in the live cell.
- a designed four-finger ZFP (e)
- the Myc-Max complex is disrupted (f) by the interfering ZFP. Screening process described in Figure 6 and Figure 8 is used to select these ZFPs that can specifically interfere with their protein targets. See Figure 11.
- A ZF monomer library is constructed in the same way as described in Example 10. The library was multimerized to the 2-mer state and was used to target the proteins BCL11A, p53, c-Myc and the NuRD domain. The spatial factor was selected as "undefined". The protein binding properties of the selected 2-ZF tandems are shown in the column "Biacore results". The selected ZFPs display superior specificity and affinity to their targets.
- B 2-ZF tandems selected above in (A) are used as anchors to generate a sub-library of ZFPs where a variable spacer region is inserted using the non-palindromic restriction enzymes BslI and MwoI. Scanning end-joining is used to generate variable spacers in the process of multimerization of the 2-ZFs.
- the length-adjusted spacers span the gaps between the binding subsites of 2-ZF modules.
- the protein binding properties (see "Biacore results") of the designed 4-ZF tandems with length-adjusted spacers are stronger than those of single 2-ZF modules.
- the 2-ZF tandems, which are specific to their respective protein targets, can be further sub-divided into fragments for subsequent use. Self-sufficient fragments that carry the specificity-determining residues can thus be identified through additional experimentation and extracted.
- GQCDFKDCERRFSPEEPLTSHQRRH (SEQ ID NO. 101) are the amino acid sequences of two of the ZF monomers specific for the protein BCL11A, where indicated in bold are those residues that determine specificity to the target.
- RRFSELTELPPHQKR (SEQ ID NO. 102)
- RFSELTELPPHQR (SEQ ID NO. 103)
- FSELTELPPRQ (SEQ ID NO. 104)
- SELTELPPH (SEQ ID NO. 105)
- ELTELPP (SEQ ID NO. 106)
- RRFSPEEPLTSHQRR (SEQ ID NO. 107)
- KFSPEEPLTSHQR (SEQ ID NO. 108)
- FSPEEPLTSHQ (SEQ ID NO. 109)
- SPEEPLTSR (SEQ ID NO. 110)
- PEEPLTS (SEQ ID NO. 111)
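
The fragment series above shortens by one residue at each end, from 15 residues down to a 7-residue core. A minimal sketch of generating such a nested series is shown below. The helper name `nested_fragments` and the placeholder sequence are hypothetical, and the code says nothing about which fragments actually retain binding, which the text leaves to additional experimentation.

```python
# Generate a nested series of fragments by trimming one residue
# from each end until a minimum length is reached.
from typing import List


def nested_fragments(seq: str, min_len: int = 7) -> List[str]:
    """Return seq and its successively end-trimmed fragments (longest first)."""
    fragments = []
    while len(seq) >= min_len:
        fragments.append(seq)
        seq = seq[1:-1]  # drop one residue from each end
    return fragments


if __name__ == "__main__":
    # Placeholder 15-letter string, not a sequence from the source.
    for fragment in nested_fragments("ABCDEFGHIJKLMNO"):
        print(len(fragment), fragment)
```

Each fragment produced this way would then be tested separately for target binding to find the shortest self-sufficient one.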
Abstract
The present invention relates to a method for designing, preparing and selecting zinc finger proteins using a library of randomized monomers, from which multimers belonging to a library of randomized multimers are produced. Optionally, the monomers are linked by spacers belonging to a sub-library of randomized spacer regions. The monomer library is produced by artificial gene synthesis, and multimerization is carried out without cell transformation. The invention also relates to the generation of zinc finger libraries, and to methods of making and uses of said libraries, said zinc finger proteins and/or a nucleic acid encoding said zinc finger proteins.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE1000159-2 | 2010-02-18 | | |
| SE1000159 | 2010-02-18 | | |
| US32286710P | 2010-04-11 | 2010-04-11 | |
| US61/322,867 | 2010-04-11 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2011102796A1 (fr) | 2011-08-25 |
Family
ID=44483193
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/SE2011/050174 (WO2011102796A1, ceased) | 2010-02-18 | 2011-02-17 | Novel synthetic zinc finger proteins and their spatial design |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2011102796A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2021188286A3 (fr) * | 2020-02-28 | 2022-01-20 | The Broad Institute, Inc. | Zinc finger degradation domains |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2003066828A2 (fr) * | 2002-02-07 | 2003-08-14 | The Scripps Research Institute | Zinc finger libraries |
| US20050202498A1 (en) * | 1998-03-02 | 2005-09-15 | Massachusetts Institute Of Technology | Poly-zinc finger proteins with improved linkers |
| US20060223114A1 (en) * | 2001-04-26 | 2006-10-05 | Avidia Research Institute | Protein scaffolds and uses thereof |
| WO2006103106A1 (fr) * | 2005-04-01 | 2006-10-05 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Ligation of synthetic zinc finger peptides to form serially configured binding proteins for the specific targeting of double-stranded DNA regions (zinc finger probes) |
| US20070154989A1 (en) * | 2006-01-03 | 2007-07-05 | The Scripps Research Institute | Zinc finger domains specifically binding agc |
| WO2009146179A1 (fr) * | 2008-04-15 | 2009-12-03 | University Of Iowa Research Foundation | Zinc finger nuclease for the CFTR gene and methods of use thereof |
Non-Patent Citations (4)
| Title |
|---|
| DATABASE PUBMED HANDEL EVA-MARIA ET AL: "Expanding or restricting the target site repertoire of zinc-finger nucleases: the inter-domain linker as a major determinant of target site selectivity", Database accession no. 19002164 * |
| MAEDER M.L. ET AL: "Rapid "open-source" engineering of customized zinc-finger nucleases for highly efficient gene modification", MOLECULAR CELL & SUPPLEMENT, vol. 31, 2008, pages 294 - 301 * |
| MOL. THER., vol. 17, no. 1, January 2009 (2009-01-01), pages 104 - 111 * |
| WRIGHT D.A. ET AL: "Standardized reagents and protocols for engineering zinc finger nucleases by modular assembly", NATURE PROTOCOLS, vol. 1, 2006, pages 1637 - 1652 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11744976; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 25/10/2012) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 11744976; Country of ref document: EP; Kind code of ref document: A1 |