WO2010001189A1 - Crystal structure of I-DmoI in complex with its DNA target, improved chimeric meganucleases and uses thereof
- Publication number
- WO2010001189A1 (application PCT/IB2008/002756)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dmol
- dna
- dna target
- polypeptide
- domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2299/00—Coordinates from 3D structures of peptides, e.g. proteins or enzymes
Definitions
- The present invention relates to the three-dimensional structure of the meganuclease I-DmoI in combination with its DNA target.
- The present invention also relates to I-DmoI enzymes with altered characteristics, such as altered target half-sites or altered catalytic properties, to chimeric meganucleases comprising portions of I-DmoI, and to the use of the three-dimensional structure of the meganuclease I-DmoI in combination with its DNA target in an in silico screening method.
- Meganucleases are sequence-specific enzymes which recognize large (12-45 bp) DNA target sites. These enzymes are often encoded by introns or inteins behaving as mobile genetic elements.
- DSB repair by homologous recombination with an intron- or intein- containing gene results in the insertion of the intron or intein where DSB occurred, in a specific locus in living cells (Thierry and Dujon, 1992).
- meganuclease-induced recombination has long been limited by the repertoire of natural meganucleases.
- meganucleases are essentially represented by homing endonucleases, a family of endonucleases encoded by mobile genetic elements, whose function is to initiate DSB induced recombination events in a process referred to as homing (Chevalier and Stoddard, 2001).
- Several hundred homing endonucleases have been identified in bacteria, eukaryotes, and archaea (Chevalier and Stoddard, 2001); however, the probability of finding a homing endonuclease cleavage site in a chosen gene is extremely low.
- Sequence homology has been used to classify homing endonucleases into four families, the largest one having the conserved LAGLIDADG sequence motif. Homing endonucleases with only one such motif function as homodimers. In contrast, larger homing endonucleases containing two motifs are single chain proteins (Dalgaard et al., 1993; Jacquier and Dujon, 1985).
- Structural information for several members of the LAGLIDADG endonuclease family indicates that these proteins adopt a similar active conformation as homodimers or as monomers with two separate domains (Chevalier et al., 2001; Ichiyanagi et al., 2000; Silva et al., 1999; Spiegel et al., 2006).
- The LAGLIDADG motifs form structurally conserved α-helices tightly packed at the center of the interdomain or intermonomer interface.
- the last acidic residue of the LAGLIDADG motif participates in the DNA cleavage by a metal dependent mechanism of phosphodiester hydrolysis (Chevalier and Stoddard, 2001).
- Homing endonucleases with one LAGLIDADG (L) are around 20 kDa in molecular mass and act as homodimers. Those with two copies (LL) range from 25 kDa (230 amino acids) to 50 kDa (HO, 545 amino acids) with 70 to 150 residues between each motif and act as a monomer. Cleavage of the target sequence occurs inside the recognition site, leaving a 4 nucleotide staggered cut with 3'OH overhangs.
- I-CeuI and I-CreI are homing endonucleases with one LAGLIDADG motif (mono-LAGLIDADG).
- I-DmoI (194 amino acids, SWISSPROT accession number P21505 (SEQ ID NO: 2)), I-SceI, PI-PfuI and PI-SceI are homing endonucleases with two LAGLIDADG motifs.
- residue numbers refer to the amino acid numbering of the I-Dmol sequence SWISSPROT number P21505 (SEQ ID NO: 2).
- LAGLIDADG proteins have been crystallized and they have been shown to exhibit a striking conservation of the core structure that contrasts with a lack of similarity at the primary sequence level (Jurica et al., 1998; Chevalier et al., 2001 ; Chevalier et al., 2003; Moure et al., 2003; Moure et al., 2002; Ichiyanagi et al., 2000; Duan et al., 1997; Bolduc et al., 2003; Silva et al., 1999).
- Whether they cut as dimers like I-CreI or as a monomer like I-DmoI, LAGLIDADG proteins adopt a similar active conformation.
- The LAGLIDADG motifs are central and form two packed α-helices where a 2-fold (pseudo-) symmetry axis separates two monomers or apparent domains.
- The LAGLIDADG motif corresponds to residues 13 to 21 in I-CreI, and to positions 12 to 20 and 109 to 117 in I-DmoI.
- A four-stranded β-sheet provides a DNA binding interface that drives the interaction of the protein with the half-site of the target DNA sequence.
- I-Dmol is similar to I-Crel dimers, except that the A domain (residues 1 to 95) and the B domain
- LAGLIDADG proteins including Pl-Scel (Gimble et al.,
- I-Crel (Seligman et al., 2002; Sussman et al., 2004; Rosen et al., 2006;
- I-Scel Doyon et al., 2006
- I-Msol I-Msol
- Semi-rational design assisted by high-throughput screening methods has allowed the Applicants to derive thousands of novel proteins from I-CreI, a homodimeric protein from the LAGLIDADG family (Smith et al., 2006; Arnould et al., 2006).
- Another strategy is to combine domains from distinct meganucleases. This approach has been illustrated by the creation of new meganucleases by domain swapping between I-Crel and I-Dmol, leading to the generation of a meganuclease cleaving the hybrid sequence corresponding to the fusion of the two half parent target sequences (Epinat et al., 2003; Chevalier et al.,
- I-DmoI is a 22 kDa endonuclease from the hyperthermophilic archaeon Desulfurococcus mobilis. It is a monomeric protein comprising two similar domains, each of which has a LAGLIDADG motif. The structure of the protein alone, without its DNA target, henceforth referred to as D1234 (SEQ ID NO: 7), has been solved (Silva et al., 1999).
- E-DreI (Engineered I-DmoI/I-CreI) consists of the fusion of the first or A domain of I-DmoI to a single subunit of the I-CreI homodimer, linked by a flexible linker to create the initial scaffold for the enzyme.
- Chevalier et al. then made a number of residue modifications based upon the predictions of computational interface algorithms so as to alleviate any potential steric clashes predicted from a 3D model generated by combining elements of previously generated I-Dmol and I-Crel models.
- Residues were identified between the facing surfaces of the two component molecules; in particular residues at positions 47, 51, 55, 108, 193 and 194 were identified as potentially clashing. These residues were replaced with alanine residues but such a modified protein was found to be insoluble.
- Residue numbers refer to the E-Drel open reading frame which comprises 101 residues (beginning at the first methionine) from domain A of I-Dmol fused to the last 156 residues of I-Crel separated by a three amino acid NGN linker which mimics the native I-Dmol linker in length.
- The interface was then optimised through a combination of computational redesign for residues 47, 51, 55, 108, 193 and 194 as well as residues 12, 13, 17, 19, 52, 105, 109 and 113, followed by an in vivo protein folding assay on selected sequences to determine the solubility of E-DreI enzymes modified at these residues.
- A final scaffold was designed with modifications: I19, H51 and H55 of I-DmoI and E8, L11, F16, K96 and L97 of I-CreI (corresponding to E105, L108, F113, K193 and L194).
- The structure of E-DreI (Chevalier et al., 2002) in complex with its chimeric DNA target dre3 (C12D34 (SEQ ID NO: 5) using the applicants' nomenclature) was solved. E-DreI was shown to be able to recognise and cut this hybrid C12D34 (SEQ ID NO: 5) target only. From this structure a number of residues were predicted to make base-specific contacts between E-DreI and its hybrid target site: residues 25, 29, 31, 33, 34, 35, 37, 70, 75, 76, 77, 79 and 81 of I-DmoI, and residues 123, 125, 127, 130, 137, 135, 139, 141, 163, 165, 167 and 172 of I-CreI.
- DmoCre is a chimeric molecule built from the two homing endonucleases I-Dmol and I-Crel. It includes the N-terminal portion from I-Dmol linked to an I-Crel monomer.
- DmoCre could have a tremendous advantage as a scaffold: mutations in the I-DmoI moiety could be combined with mutations in the I-CreI domain, and thousands of such variant I-CreI molecules have already been identified and profiled (Smith J et al., 2006; Arnould S et al., 2006; Arnould S et al., 2007).
- DmoCre is a monomeric protein that corresponds to the A domain of I-DmoI up to residue F109 followed by I-CreI from residue L13. To avoid a steric clash, I107 of the I-DmoI domain was mutated into a leucine residue. In addition, residues 47, 51 and 55 of I-DmoI, which were found to be close to residues 96 and 97 of I-CreI, were mutated to alanine, alanine and aspartic acid respectively.
- DmoCre has been shown to be active in vitro (Epinat et al., 2003) and was able to cleave the hybrid target C12D34 (SEQ ID NO: 5), composed of the left part of C1234 (SEQ ID NO: 4) or C1221 (SEQ ID NO: 6) (the palindromic target derived from C1234) and D1234 (SEQ ID NO: 7).
- I-Dmol and DmoCre variants able to cleave their DNA target sequences more efficiently at 37°C were identified by random mutagenesis and screening in yeast cells (WO 2005/105989; Prieto et al., 2008).
- E-Drel and DmoCre chimeric enzymes therefore have in common the A domain of I-Dmol.
- These E-Drel and DmoCre chimeric enzymes differ significantly in other respects as outlined above.
- the inventors are interested in creating a new generation of chimeric enzymes which recognize a wider set of target sequences. By being able to target new
- the applicants provide the tools to thereby induce a DNA recombination event, a loss of a particular DNA segment or cell death.
- This double-strand break can be used to: repair a specific sequence, modify a specific sequence, restore a functional gene in place of a mutated one, attenuate or activate an endogenous gene of interest, introduce a mutation into a site of interest, introduce an exogenous gene or a part thereof, inactivate or detect an endogenous gene or a part thereof, translocate a chromosomal arm, or leave the DNA unrepaired and degraded.
- modified meganuclease enzymes therefore give a user a wide variety of potential options in the therapeutic, research or other productive use of such modified meganuclease enzymes.
- The inventors have therefore sought to improve chimeric meganuclease enzymes comprising at least one I-DmoI domain by seeking to increase the number of DNA targets these chimeric enzymes can recognize and cut.
- The present invention relates to a polypeptide comprising the sequence of an I-DmoI endonuclease or a chimeric derivative thereof, including at least the I-DmoI domain B, characterized in that it comprises the substitution of at least one of the residues at positions 124, 126, 154, 155 of said I-DmoI domain B, and wherein the polypeptide recognises an I-DmoI DNA target half-site which differs from a wild type I-DmoI DNA target half-site SEQ ID NO: 1 in at least one of positions -2, -3, -5, -6, -7.
- substitution of one amino acid residue for another is well known in the art to cause changes to the structure and activity of a protein.
- substitution of a non-polar amino acid residue for a polar amino acid would be expected to alter the interaction of this residue with the polypeptide in which it is present potentially affecting the three-dimensional structure thereof or conformation of an active/binding site and also to affect the function of the residue if this is linked to the presence of a polar side chain.
- Besides this gross replacement of one type of amino acid with another, more subtle alterations are also possible.
- The polypeptide further comprises the substitution of at least one of the residues in positions 119, 128, 157 of the I-DmoI domain B, by any amino acid, which alters the recognition of said polypeptide for an I-DmoI DNA target half-site which differs from a wild type I-DmoI DNA target half-site SEQ ID NO: 1 in at least one of positions -2, -3.
- The polypeptide further comprises the substitution of at least one of the residues in positions 115, 116, 117, 118, 120, 130, 150, 152, 153, 156, 158, 160, 164, 166, 167, 170 of said I-DmoI domain B, by any amino acid, which alters the recognition of said polypeptide for an I-DmoI DNA target half-site which differs from a wild type I-DmoI DNA target half-site SEQ ID NO: 1 in at least one of positions -1, -2, -3, -4, -5, -6, -7, -8, -9.
- The polypeptide is a chimeric-Dmo endonuclease consisting of the fusion of said I-DmoI domain B to a sequence of a dimeric LAGLIDADG homing endonuclease or to a domain of another monomeric LAGLIDADG homing endonuclease.
- Said I-DmoI domain B is fused to a domain selected from one of the enzymes in the group: I-Sce I, I-Chu I, I-Cre I, I-Csm I, PI-Sce I, PI-Tli I, PI-Mtu I, I-Ceu I, I-Sce II, I-Sce III, HO, PI-Civ I, PI-Ctr I, PI-Aae I, PI-Bsu I, PI-Dha I, PI-Dra I, PI-Mav I, PI-Mch I, PI-Mfu I, PI-Mfl I, PI-Mga I, PI-Mgo I, PI-Min I, PI-Mka I, PI-Mle I, PI-Mma I, PI-Msh I, PI-Msm I, PI-Mth I, PI-Mtu I, PI-Mxe I, PI-Npu
- The polypeptide is characterized in that said I-DmoI domain B is at the NH2-terminus of said chimeric-Dmo endonuclease.
- polypeptide is characterized in that said dimeric LAGLIDADG homing endonuclease is I-Crel.
- The polypeptide is characterized in that it comprises a detectable tag at its NH2 and/or COOH terminus.
- The polypeptide is characterized in that it comprises a Nuclear Localisation Signal (NLS).
- the NLS comprises a peptide sequence selected from the group consisting of: SEQ ID NO: 19, 20, 21, 22, 23.
- a nuclear localizing sequence is an amino acid sequence which acts to target the protein to the cell nucleus through the Nuclear Pore Complex and to direct a newly synthesized protein into the nucleus via its recognition by cytosolic nuclear transport receptors.
- NLSs consist of one or more short sequences of positively charged amino acids such as lysines or arginines.
- The NLS is selected from the NLS sequences of the known proteins SV40 large T antigen -PKKKRKV- (SEQ ID NO: 19), nucleoplasmin -KR[PAATKKAGQA]KKKK- (SEQ ID NO: 20), p54 -RIRKKLR- (SEQ ID NO: 21), SOX9 -PRRRK- (SEQ ID NO: 22), NS5A -PPRKKRTVV- (SEQ ID NO: 23).
- a polynucleotide characterized in that it encodes a polypeptide according to the present invention.
- a vector characterized in that it comprises a polynucleotide according to the present invention.
- a host cell characterized in that it is modified by a polynucleotide or a vector according to the present invention.
- a non-human transgenic animal characterized in that all or part of its cells are modified by a polynucleotide or a vector according to the present invention.
- a transgenic plant characterized in that all or part of its cells are modified by a polynucleotide or a vector according to the present invention.
- According to a seventh aspect of the present invention, there is provided the use of a polypeptide, a polynucleotide, a vector, a cell, a non-human animal or a plant, according to the present invention, for the selection and/or the screening of meganucleases with novel DNA target specificity.
- A method of identifying polypeptides comprising at least one domain of I-DmoI which can recognise and bind to an altered DNA target, comprising at least the steps of: i) applying a 3-dimensional molecular modelling algorithm to at least the set of atomic coordinates set out in Table II and figures 8 and 9 to determine the spatial coordinates of the DNA interacting portions of a candidate polypeptide and its native DNA target, modelled from the set of atomic coordinates and generating a model; ii) modifying at least one residue of the candidate polypeptide and altering the characteristics of the model accordingly; iii) electronically screening the modified candidate polypeptide of step ii) against a stored set of spatial coordinates representing the native DNA target sequence and at least one variant thereof; iv) calculating from said model the interaction energies of the modified candidate polypeptide of step ii) with the stored set of DNA targets; v) converting said interaction energies into a probability score to predict the preference of the modified polypeptide for
- the inventors have developed a new means to cut down the number of in vitro/in vivo experiments that need to be performed when attempting to identify I-Dmol enzymes or chimeric derivatives thereof that bind to altered DNA targets, by modelling these variants in a first screen in silico to identify possible candidate polypeptides for further in vitro/in vivo studies.
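Steps iv) and v) of the method can be illustrated with a minimal sketch. The text does not say how interaction energies are converted into a probability score; the sketch below assumes a Boltzmann weighting over the set of candidate targets. The function name, the kT value and the example energies are hypothetical and are not taken from the application.

```python
import math

# Assumption: energies are in kcal/mol and a lower energy means tighter binding.
KT_KCAL_PER_MOL = 0.593  # approximate kT near 298 K (assumed, not from the source)


def energies_to_probabilities(energies, kt=KT_KCAL_PER_MOL):
    """Convert per-target interaction energies into Boltzmann probabilities."""
    if not energies:
        raise ValueError("at least one target energy is required")
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies.values())
    weights = {t: math.exp(-(e - e_min) / kt) for t, e in energies.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# Hypothetical energies for three targets that differ at a single base.
example = {"target_a": -12.0, "target_c": -10.5, "target_g": -10.5}
scores = energies_to_probabilities(example)
```

Under this assumption the target with the lowest energy receives the highest score, and the scores over all screened targets sum to one, which makes them comparable between candidate polypeptides.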
- the stored set of spatial coordinates of step iii) comprises the native DNA target in which at least one base therein is changed to the three alternate possible bases.
- the stored set of spatial coordinates comprises all possible variants of the native DNA target sequence.
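The set of target variants in which one base is changed to each of the three alternative bases can be enumerated as in the following sketch. It works on sequence strings only; building the corresponding spatial coordinates is outside its scope. The function names, the 0-based indexing and the four-base example string are illustrative assumptions.

```python
BASES = "acgt"


def single_base_variants(target, index):
    """Return the three targets obtained by changing the base at a 0-based index."""
    original = target[index]
    if original not in BASES:
        raise ValueError(f"unexpected base {original!r}")
    return [target[:index] + b + target[index + 1:] for b in BASES if b != original]


def all_single_base_variants(target):
    """Return every single-substitution variant of the target (three per position)."""
    variants = []
    for i in range(len(target)):
        variants.extend(single_base_variants(target, i))
    return variants
```

For a target of length N this yields 3N single-substitution variants; covering all possible variants of the sequence would instead require 4^N sequences.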
- the modified residue of step ii) forms a direct contact between the candidate polypeptide and the native DNA target sequence.
- the modified residue of step ii) forms an indirect contact between the candidate polypeptide and the native DNA target sequence.
- the modified residue of step ii) forms a molecular interaction selected from the group: hydrogen bond, polar contact and van der Waals interactions, between said candidate polypeptide and said native DNA target sequence.
- The candidate polypeptide comprises at least said I-DmoI domain B; and wherein in step ii) at least one of the residues in positions 115, 116, 117, 118, 119, 120, 124, 126, 128, 130, 150, 152, 153, 154, 156, 155, 157, 158, 160, 164, 166, 167, 170 of said I-DmoI domain B is altered; and wherein the at least one altered DNA target differs from a native DNA target consisting of SEQ ID NO: 7 in at least one of positions -1, -2, -3, -4, -5, -6, -7, -8, -9.
- the candidate polypeptide comprises at least the I-Dmol domain A; and wherein in step ii) at least one of residues in positions 15,
- The at least one altered DNA target differs from a native DNA target consisting of SEQ ID NO: 7 in at least one of positions +1, +2, +3, +4, +5, +6, +7, +8, +9, +10, +11, +12, +13.
- The candidate polypeptide consists of an A or B domain of I-DmoI fused to a dimeric LAGLIDADG homing endonuclease or to a domain of another monomeric LAGLIDADG homing endonuclease.
- The candidate polypeptide consists of either said I-DmoI domain A or B fused to a domain selected from one of the enzymes in the group: I-Sce I, I-Chu I, I-Cre I, I-Csm I, PI-Sce I, PI-Tli I, PI-Mtu I, I-Ceu I, I-Sce II, I-Sce III, HO, PI-Civ I, PI-Ctr I, PI-Aae I, PI-Bsu I, PI-Dha I, PI-Dra I, PI-Mav I, PI-Mch I, PI-Mfu I, PI-Mfl I, PI-Mga I, PI-Mgo I, PI-Min I, PI-Mka I, PI-Mle I, PI-Mma I, PI-Msh I, PI-Msm I, PI-Mth I, PI-Mtu I,
- Figure 1. shows the crystal structure of I-Dmol in complex with its target DNA.
- Figure 2a. - shows a detailed view of the I-Dmol active site.
- Figure 2c. - shows two atoms of manganese in the digested DNA structure.
- Figure 2d. - shows a schematic diagram of the hypothetical enzymatic mechanism proposed for I-Dmol.
- Figure 3. shows a scheme of the Protein-DNA contacts in the Ca2+ and Mn2+ bound structures.
- Figure 4. shows the loops involved in DNA binding by I-DmoI.
- The upper part of the figure depicts a ribbon diagram of the I-DmoI/DNA complex.
- the lower part of the figure shows detailed insets of the three loops involved in DNA interactions.
- Figure 5a shows a structural sequence alignment between the archaeal I-Dmol, eukaryotic I-Scel and I-Crel homing endonucleases.
- Figure 5b. - shows a comparison of the location of the protein-base contacts in the I-Dmol, I-Scel and I-Crel protein-DNA structures.
- Figure 5c. - shows a schematic view of the protein-base contacts.
- Figure 6a - shows in silico binding patterns for I-DmoI, E-DreI, I-CreI and I-SceI.
- Figure 6b shows the in silico R-10NNN binding pattern predicted by FoldX.
- Figure 7a shows in vivo cleavage patterns for the I-Dmol recognition site.
- Figure 7b. - shows cleavage activities of I-Dmol wild type and of two mesophilic I-Dmol variants (Dl, D2).
- Figure 8 lists the atomic coordinate data of I-DmoI in combination with its target in the presence of Mn2+ in pdb format.
- Figure 9 lists the atomic coordinate data of I-DmoI in combination with its target in the presence of Ca2+ in pdb format.
- Amino acid residues in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.
- hydrophobic amino acid refers to leucine (L), valine (V), isoleucine (I), alanine (A), methionine (M), phenylalanine (F), tryptophan (W) and tyrosine (Y).
- nucleosides are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine.
- r represents g or a (purine nucleotides)
- k represents g or t
- s represents g or c
- w represents a or t
- m represents a or c
- y represents t or c (pyrimidine nucleotides)
- d represents g, a or t
- v represents g, a or c
- b represents g, t or c
- h represents a, t or c
- n represents g, a, t or c.
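The degenerate nucleotide codes listed above can be expanded mechanically into the concrete sequences they stand for. The following sketch encodes the table exactly as given; the function name `expand` is an illustrative choice.

```python
from itertools import product

# Degenerate nucleotide codes as listed in the definitions above.
DEGENERATE = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ga", "k": "gt", "s": "gc", "w": "at", "m": "ac", "y": "tc",
    "d": "gat", "v": "gac", "b": "gtc", "h": "atc", "n": "gatc",
}


def expand(pattern):
    """Return all concrete sequences matched by a degenerate pattern."""
    choices = [DEGENERATE[ch] for ch in pattern.lower()]
    return ["".join(combo) for combo in product(*choices)]
```

For example, a pattern of three `n` positions expands to the 64 possible trinucleotides.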
- by "parent LAGLIDADG homing endonuclease" is intended a wild-type LAGLIDADG homing endonuclease or a functional variant thereof.
- Said parent LAGLIDADG homing endonuclease may be a monomer or a dimer (homodimer or heterodimer) comprising two LAGLIDADG homing endonuclease core domains which are associated in a functional endonuclease able to cleave a double-stranded DNA target of 22 to 24 bp.
- by "homodimeric LAGLIDADG homing endonuclease" is intended a wild-type homodimeric LAGLIDADG homing endonuclease having a single LAGLIDADG motif and cleaving palindromic DNA target sequences, such as I-CreI or I-MsoI, or a functional variant thereof.
- by "LAGLIDADG homing endonuclease variant" or "variant" is intended a protein obtained by replacing at least one amino acid of a LAGLIDADG homing endonuclease sequence with a different amino acid.
- by "functional variant" is intended a LAGLIDADG homing endonuclease variant which is able to cleave a DNA target, preferably a new DNA target which is not cleaved by a wild type LAGLIDADG homing endonuclease.
- Such variants have amino acid variation at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target.
- by "homing endonuclease variant with novel specificity" is intended a variant having a pattern of cleaved targets (cleavage profile) different from that of the parent homing endonuclease.
- The variants may cleave fewer targets (restricted profile) or more targets than the parent homing endonuclease.
- the variant is able to cleave at least one target that is not cleaved by the parent homing endonuclease.
- novel specificity refers to the specificity of the variant towards the nucleotides of the DNA target sequence.
- by "I-CreI" is intended the wild-type I-CreI having the sequence SWISSPROT P05725 or pdb accession code 1g9y (SEQ ID NO: 8).
- by "I-DmoI" is intended the wild-type I-DmoI having the sequence SWISSPROT number P21505 (SEQ ID NO: 2).
- by "domain" or "core domain" is intended the "LAGLIDADG homing endonuclease core domain", which is the characteristic αββαββα fold of the homing endonucleases of the LAGLIDADG family, corresponding to a sequence of about one hundred amino acid residues. Said domain comprises four beta-strands folded in an antiparallel beta-sheet which interacts with one half of the DNA target. This domain is able to associate with another LAGLIDADG homing endonuclease core domain which interacts with the other half of the DNA target to form a functional endonuclease able to cleave said DNA target.
- the LAGLIDADG homing endonuclease core domain corresponds to the residues 6 to 94.
- Two such domains are found in the sequence of the endonuclease; for example in I-DmoI (194 amino acids), the A domain (residues 7 to 99) and the B domain (residues 104 to 194) are separated by a short linker (residues 100 to 103).
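The domain boundaries stated in this definition can be captured in a small lookup, which is convenient when checking which domain a substituted residue belongs to. This is only a sketch; the names `I_DMOI_REGIONS` and `region_of` are hypothetical.

```python
# Boundaries as stated in the definition above (I-DmoI numbering, 194 residues).
I_DMOI_REGIONS = [
    ("A domain", 7, 99),
    ("linker", 100, 103),
    ("B domain", 104, 194),
]


def region_of(residue):
    """Return the I-DmoI region containing a residue number, or None if outside all regions."""
    if not 1 <= residue <= 194:
        raise ValueError("I-DmoI residues are numbered 1 to 194")
    for name, start, end in I_DMOI_REGIONS:
        if start <= residue <= end:
            return name
    return None
```

With these boundaries, positions 124, 126, 154 and 155 all fall in the B domain, whereas residues 1 to 6 precede the A domain as defined here.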
- by "subdomain" is intended the region of a LAGLIDADG homing endonuclease core domain which interacts with a distinct part of a homing endonuclease DNA target half-site.
- Two different subdomains behave independently or partly independently, and the mutation in one subdomain does not alter the binding and cleavage properties of the other subdomain, or does not alter it in a number of cases. Therefore, two subdomains bind distinct parts of a homing endonuclease DNA target half-site.
- by "beta-hairpin" is intended two consecutive beta-strands of the antiparallel beta-sheet of a LAGLIDADG homing endonuclease core domain which are connected by a loop or a turn.
- by "C1221" it is intended to refer to the first half of the I-CreI target site '12' repeated backwards so as to form a palindrome '1221'.
- the cleavage activity of the variant of the invention may be measured by a direct repeat recombination assay, in yeast or mammalian cells, using a reporter vector, as described in the PCT Application WO 2004/067736; Epinat et al, 2003; Chames et al., 2005 and Arnould et al., 2006.
- the reporter vector comprises two truncated, non-functional copies of a reporter gene (direct repeats) and a chimeric DNA target sequence within the intervening sequence, cloned in a yeast or a mammalian expression vector.
- the DNA target sequence is derived from the parent homing endonuclease cleavage site by replacement of at least one nucleotide by a different nucleotide.
- A panel of palindromic or non-palindromic DNA targets representing the different combinations of the 4 bases (g, a, c, t) at one or more positions of the DNA cleavage site is tested (4^n palindromic targets for n mutated positions).
- Expression of the variant results in a functional endonuclease which is able to cleave the DNA target sequence. This cleavage induces homologous recombination between the direct repeats, resulting in a functional reporter gene, whose expression can be monitored by appropriate assay.
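The size of such a target panel follows directly from the number of mutated positions. The sketch below builds the 4^n palindromic targets for n mutated positions of a half-site by mirroring each mutated half as its reverse complement. The function names and the 0-based indices are illustrative assumptions; the example half-site is the left half of the palindromic I-CreI target given in this document.

```python
from itertools import product

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def palindromic_panel(half_site, mutated_indices):
    """Build all 4**n palindromic targets for n mutated positions of a half-site."""
    panel = []
    for combo in product("acgt", repeat=len(mutated_indices)):
        half = list(half_site)
        for idx, base in zip(mutated_indices, combo):
            half[idx] = base
        left = "".join(half)
        # Mirroring the half-site keeps every target in the panel palindromic.
        panel.append(left + reverse_complement(left))
    return panel


# Three mutated positions (0-based indices 2, 3, 4) give 4**3 = 64 targets.
panel = palindromic_panel("tcaaaacgtcgt", [2, 3, 4])
```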
- by "cleavage site" is intended a 22 to 24 bp double-stranded palindromic, partially palindromic (pseudo-palindromic) or non-palindromic polynucleotide sequence that is recognized and cleaved by a LAGLIDADG homing endonuclease.
- These terms refer to a distinct DNA location, preferably a genomic location, at which a double stranded break (cleavage) is to be induced by the endonuclease.
- the DNA target is defined by the 5' to 3' sequence of one strand of the double-stranded polynucleotide.
- The palindromic DNA target sequence cleaved by wild type I-CreI is defined by the sequence 5'- t-12 c-11 a-10 a-9 a-8 a-7 c-6 g-5 t-4 c-3 g-2 t-1 a+1 c+2 g+3 a+4 c+5 g+6 t+7 t+8 t+9 t+10 g+11 a+12 (SEQ ID NO:4), where each base is followed by its position.
- Cleavage of the DNA target occurs at the nucleotides in positions +2 and -2, respectively for the sense and the antisense strand. Unless otherwise indicated, the position at which cleavage of the DNA target by a meganuclease variant occurs, corresponds to the cleavage site on the sense strand of the DNA target.
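The position numbering used for DNA targets runs from -n to -1 and from +1 to +n, with no position 0. The following sketch labels each base of the 24-bp I-CreI target accordingly; the function name `number_positions` is a hypothetical helper.

```python
def number_positions(target):
    """Label each base of an even-length target -n..-1, +1..+n (there is no position 0)."""
    if len(target) % 2:
        raise ValueError("target length must be even")
    half = len(target) // 2
    labels = list(range(-half, 0)) + list(range(1, half + 1))
    return list(zip(labels, target))


# The 24-bp I-CreI target as numbered in the text, positions -12 to +12.
I_CREI_TARGET = "tcaaaacgtcgtacgacgttttga"
positions = dict(number_positions(I_CREI_TARGET))
```

With this mapping, `positions[2]` and `positions[-2]` give the bases at the two positions named above for cleavage of the sense and antisense strands.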
- by "DNA target half-site", "half cleavage site" or "half-site" is intended the portion of the DNA target which is bound by each LAGLIDADG homing endonuclease core domain.
- by "DC10NNN" (SEQ ID NO: 3) is intended the target sequence of DmoCre with variability in positions +8, +9 and +10 of the sequence; that is, three variable nucleotides counted backwards from position 10.
- DC4NNN (SEQ ID NO: 9) refers to the target sequence of DmoCre with variability in positions +2, +3 and +4 of the sequence
- DC7NNN (SEQ ID NO: 10) refers to the target sequence of DmoCre with variability in positions +5, +6 and +7 of the sequence.
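The naming scheme for these degenerate DmoCre targets encodes the last of three consecutive variable positions. A minimal sketch of that convention follows; the function name and the regular expression are illustrative and only cover labels of the form shown in the text.

```python
import re


def variable_positions(label):
    """Return the three variable positions named by a label such as 'DC10NNN'."""
    match = re.fullmatch(r"DC(\d+)NNN", label)
    if match is None:
        raise ValueError(f"unrecognised target label: {label!r}")
    last = int(match.group(1))
    if last < 3:
        raise ValueError("the named position must be at least 3")
    # The window runs backwards from the named position: k-2, k-1, k.
    return [last - 2, last - 1, last]
```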
- by "chimeric DNA target" or "hybrid DNA target" is intended the fusion of a different half of two parent meganuclease target sequences.
- at least one half of said target may comprise the combination of nucleotides which are bound by separate subdomains (combined DNA target).
- by "mutation" is intended the substitution, the deletion, and/or the addition of one or more nucleotides/amino acids in a nucleic acid/amino acid sequence.
- by "homologous" is intended a sequence with enough identity to another one to lead to a homologous recombination between sequences, more particularly having at least 95 % identity, preferably 97 % identity and more preferably 99 %.
- Identity refers to sequence identity between two nucleic acid molecules or polypeptides. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base, then the molecules are identical at that position. A degree of similarity or identity between nucleic acid or amino acid sequences is a function of the number of identical or matching nucleotides at positions shared by the nucleic acid sequences.
- Various alignment algorithms and/or programs may be used to calculate the identity between two sequences, including FASTA, or BLAST which are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with, e.g., default settings.
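The position-by-position comparison described for identity can be sketched as follows. The sketch assumes the two sequences have already been aligned (for instance by one of the programs mentioned) and are passed as equal-length strings with '-' marking gaps; skipping gapped columns is one common convention and is an assumption here, as is the function name.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

A value returned by such a function could then be compared against the 95 %, 97 % or 99 % thresholds given in the definition of "homologous".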
- mammals as well as other vertebrates (e.g., birds, fish and reptiles).
- Examples of mammalian species include humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice, guinea pigs) and ruminants (e.g., cows, pigs, horses).
- genetic disease refers to any disease, partially or completely, directly or indirectly, due to an abnormality in one or several genes.
- Said abnormality can be a mutation, an insertion or a deletion.
- Said mutation can be a point mutation.
- Said abnormality can affect the coding sequence of the gene or its regulatory sequence.
- Said abnormality can affect the structure of the genomic sequence or the structure or stability of the encoded mRNA. This genetic disease can be recessive or dominant.
- Such genetic diseases include, but are not limited to, cystic fibrosis, Huntington's chorea, familial hypercholesterolemia (LDL receptor defect), hepatoblastoma, Wilson's disease, congenital hepatic porphyrias, inherited disorders of hepatic metabolism, Lesch Nyhan syndrome, sickle cell anemia, thalassaemias, xeroderma pigmentosum, Fanconi's anemia, retinitis pigmentosa, ataxia telangiectasia, Bloom's syndrome, retinoblastoma, Duchenne's muscular dystrophy, and Tay-Sachs disease.
- Vectors which can be used in the present invention include, but are not limited to, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic nucleic acids.
- Preferred vectors are those capable of autonomous replication (episomal vector) and/or expression of nucleic acids to which they are linked (expression vectors). Large numbers of suitable vectors are known to those of skill in the art and commercially available.
- Viral vectors include retrovirus, adenovirus, parvovirus (e. g.
- RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g., measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox).
- viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example.
- retroviruses include: avian leukosis-sarcoma, mammalian C-type, B-type viruses, D-type viruses, HTLV-BLV group, lentivirus, spumavirus (Coffin, J. M., Retroviridae: The viruses and their replication, In Fundamental Virology, Third Edition, B. N. Fields, et al., Eds., Lippincott-Raven Publishers, Philadelphia, 1996).
- vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
- One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication.
- Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked.
- Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors".
- a vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non-chromosomal, semi-synthetic or synthetic DNA.
- expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer generally to circular double stranded DNA loops which, in their vector form are not bound to the chromosome.
- Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.
- said vectors are expression vectors, wherein a sequence encoding a polypeptide of the invention is placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said protein. Therefore, said polynucleotide is comprised in an expression cassette. More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed.
EXAMPLE 1 - Materials and Methods
- E. coli Rosetta(DE3)pLysS cells were transformed with plasmid pET24d(+) containing the I-DmoI ORF with a 6His tag at the C-terminus. His-tagged I-DmoI was overexpressed in LB medium at 24.85°C for 5 h after addition of 0.3 mM IPTG when the OD600 was around 0.6-0.8. Selenomethionine-labelled I-DmoI was expressed using the same strain.
- the cells were collected from a 50 ml overnight culture grown in LB medium containing 30 mg ml⁻¹ kanamycin until OD600 ≈ 1.0; at this point the cells were spun down, washed once with M9 minimal medium and finally resuspended in M9 minimal medium supplemented with thiamine (0.01 mg ml⁻¹), glucose [0.4% (w/v)], CaCl₂ (0.0147 mg ml⁻¹), MgSO₄ (0.246 mg ml⁻¹) and kanamycin (30 mg ml⁻¹).
- the culture was shaken at 36.85°C for 30 min and selenomethionine (50 mg ml⁻¹; Molecular Dimensions) was then added together with lysine hydrochloride, threonine, phenylalanine, leucine, isoleucine and valine as described in Van Duyne et al. (1993). After an additional 15 min of shaking, protein expression was induced for 5 h at 24.85°C by the addition of 0.3 mM IPTG.
- the bacterial pellet was resuspended and the cells were disrupted by sonication in 50 mM sodium phosphate pH 8.0, 300 mM NaCl and 5% glycerol including protease inhibitors (Complete EDTA-free tablets, Roche).
- the lysate was clarified by centrifugation (20 000 g for 1 h).
- the supernatant was applied onto a Co²⁺-loaded HiTrap Chelating HP column (GE Healthcare) and the protein was eluted using an imidazole gradient (0-0.5 M).
- the fractions containing I-Dmo-I were collected and the pH was adjusted to 6.0.
- the sample was loaded onto a 5 ml HiTrap Heparin HP column (GE Healthcare) previously equilibrated with 20 mM sodium phosphate pH 6.0.
- the sample was eluted with a continuous gradient from 0 to 1 M NaCl in 20 mM sodium phosphate pH 6.0 buffer.
- the purified protein was subsequently concentrated using an Amicon Ultra system equipped with a 10 kDa cut-off filter and loaded onto a PD-10 Desalting column (GE Healthcare) pre-equilibrated with 5 mM Tris-HCl pH 8.0 and 150 mM NaCl.
- the protein was concentrated to 16 mg ml⁻¹, flash frozen in liquid nitrogen and stored at -80.15°C.
- the protein concentration was determined from the absorbance at 280 nm.
- the purity of the samples was checked by SDS-PAGE and their homogeneity was evaluated using dynamic light scattering. Finally, the incorporation of selenomethionine was tested by mass spectrometry (data not shown).
- the I-DmoI target DNA was purchased from Proligo and consisted of two strands of sequence 5'-GCCTTGCCGGGTAAGTTCCGGCGCG-3' (SEQ ID NO: 11) and 5'-CGCGCCGGAACTTACCCGGCAAGGC-3' (SEQ ID NO: 12). The construct forms a 25 bp blunt-end duplex.
- the I-DmoI-DNA complex was formed after pre-warming the meganuclease and the oligonucleotide samples to 14.85°C and then mixing them in a 1.5:1 molar ratio (DNA:protein). The mixture was incubated for 50 min and then spun down for 5 min. The supernatant was stored at room temperature to avoid precipitation. To assess the presence of DNA in the complex with I-DmoI, the purified complex was analyzed by running a 15% SDS-PAGE and staining first with Coomassie and subsequently with SYBR Safe. The same protocol was followed in the presence of 2 mM Ca²⁺ or Mn²⁺.
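A blunt-end duplex requires the two strands to be exact reverse complements of equal length. The short sketch below checks this for the two sequences quoted above; the helper name `reverse_complement` and the variable names are illustrative choices for this example only.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


strand_a = "GCCTTGCCGGGTAAGTTCCGGCGCG"  # SEQ ID NO: 11 as quoted in the text
strand_b = "CGCGCCGGAACTTACCCGGCAAGGC"  # SEQ ID NO: 12 as quoted in the text

# Blunt ends: same length and perfectly complementary over the full length.
is_blunt_duplex = (
    len(strand_a) == len(strand_b)
    and reverse_complement(strand_a) == strand_b
)
```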
- Crystallization screening was performed immediately after complex formation using a Cartesian MicroSys robot (Genomic Solutions) and the sitting-drop method (96-well MRC plates) with nanodrops of 0.1 ml protein solution plus 0.1 ml reservoir solution and a reservoir volume of 60 ml.
- the initial screens tested were Crystal Screens I and II, Crystal Screen Cryo and Crystal Screen Lite (Hampton Research), Wizard I and II, Wizard Cryo I and II, Precipitant Synergy Primary, Precipitant Synergy Expanded 67% and Precipitant Synergy Expanded 33% (Emerald BioSystems).
- the final concentration of I-DmoI in the DNA-protein complex solution was 6 mg ml⁻¹.
- Crystals were obtained in the nanodrops under several conditions (Crystal Screen I conditions 15 and 36, Crystal Screen II conditions 22, 35, 37 and 43, Crystal Screen Cryo conditions 15, 20 and 37, Crystal Screen Lite conditions 18, 28 and 41, Wizard I condition 21, Wizard Cryo I conditions 40 and 47, Wizard Cryo II condition 10, Precipitant Synergy Primary conditions 42 and 52 and Precipitant Synergy Expanded 67% condition 51).
- Table I showing data-collection statistics of the native I-DmoI-DNA crystals grown in 2 mM Mn²⁺.
- the non-palindromic twenty-four base pair long target sequence 5'-GCCTTGCCGGGTAAGTTCCGGCGC-3' (SEQ ID NO: 13) is the natural I-DmoI target.
- the inventors divided it in two equal parts L and R.
- the 64 degenerated targets derived from LR sequence were obtained by mutating nucleotides at positions +8, +9, and +10 in the R sequence.
- oligonucleotides (5'-GCCTTGCCGGGTAAGTTCCNNNGC-3' (SEQ ID NO: 14) and reverse complementary sequences) representing the target library LR(10NNN) were ordered from Sigma, annealed, and cloned using the Gateway protocol (Invitrogen) into the yeast pFL39-ADH-LACURAZ containing an I-SceI target site as control (Arnould et al., 2006).
- Yeast reporter vectors were transformed into S. cerevisiae strain FYBL2-7B (MAT a, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).
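The 64-member target library follows directly from substituting every possible triplet for the NNN positions of the degenerate oligonucleotide. A minimal sketch of that enumeration is given below; the function name `degenerate_targets` is hypothetical, and the dictionary keys are the top-strand triplets as written in the template, which is a convention chosen for this example.

```python
from itertools import product

TEMPLATE = "GCCTTGCCGGGTAAGTTCCNNNGC"  # SEQ ID NO: 14 as quoted in the text
NATURAL = "GCCTTGCCGGGTAAGTTCCGGCGC"   # SEQ ID NO: 13 as quoted in the text


def degenerate_targets(template: str = TEMPLATE) -> dict[str, str]:
    """Map each of the 4**3 triplets to the full target carrying it."""
    prefix, suffix = template.split("NNN")
    return {
        "".join(triplet): prefix + "".join(triplet) + suffix
        for triplet in product("ACGT", repeat=3)
    }


targets = degenerate_targets()
```

With this convention the natural target is the library member carrying the triplet GGC.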
- I-DmoI WT wild-type
- D1 and D2 two I-DmoI mesophilic variants reported, D1 and D2
- R-10NNN 64 I-DmoI derived targets
- a specificity logo is a diagrammatic representation of the specificity preference for each of the possible nucleotides in the I-DmoI (non-coding strand) SEQ ID NO: 15, E-DreI SEQ ID NO: 17, I-CreI SEQ ID NO: 15 and I-SceI SEQ ID NO: 18 DNA target sites.
- the height of a given nucleotide is proportional to exp(-ΔG_nt/RT), where ΔG_nt is the difference in interaction energy between the complex with the mutated DNA and the wild type.
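The exponential weighting described above can be sketched as follows. This is only an illustration: the energy unit (kcal/mol), the temperature of 298 K, the normalisation of the weights to sum to one, and the function name `boltzmann_preferences` are all assumptions made for the example, since the text states only that the height is proportional to the exponential term.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); the unit is an assumption


def boltzmann_preferences(
    ddg_by_base: dict[str, float], temperature_k: float = 298.0
) -> dict[str, float]:
    """Convert per-base energy differences into normalised preferences."""
    rt = R_KCAL * temperature_k
    # Weight of each base is exp(-dG/RT); lower energy gives a larger weight.
    weights = {base: math.exp(-ddg / rt) for base, ddg in ddg_by_base.items()}
    total = sum(weights.values())
    return {base: w / total for base, w in weights.items()}


# Hypothetical energy differences for one target position (not FoldX output).
example = boltzmann_preferences({"G": 0.0, "A": 1.5, "C": 2.0, "T": 0.5})
```

Because the weighting is exponential, a difference of about 1.4 kcal/mol at this temperature already corresponds to roughly a tenfold drop in relative preference.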
- Full-length I-Dmol in complex with a 25bp double stranded DNA was crystallized as an enzyme-substrate complex with calcium and as an enzyme- product complex with manganese. Protein expression, purification, protein-DNA complex formation and crystallization were carried out as described in example 1 above. The phase problem was solved using the anomalous signal at the selenium peak wavelength.
- the single anomalous dispersion (SAD) method was applied to obtain initial phases at 2.8 Å resolution in crystals grown with Se-Met protein (see example 1).
- the three selenium atoms were located in I-DmoI using SHELX (Schneider and Sheldrick, 2002) and initial phases were obtained with SHARP (de la Fortelle and Bricogne, 1997).
- the initial model was built in 2.6 Å 2Fo-Fc maps after solvent flattening using SOLOMON (Abrahams and Leslie, 1996).
- the structures were finally refined to 2.0 and 2.1 Å in the same monoclinic space group using REFMAC (Murshudov et al., 1997) (Table II).
- Figure 1 shows the crystal structure of I-Dmol in complex with its target DNA.
- Panel a) shows the protein secondary structure, the complex is shown in two different orientations. The calcium ion is shown.
- the crystallization oligonucleotide construct is shown in panel b).
- the individual bases are named with a subindex strandA (coding strand) or strandB (non-coding strand) indicating the DNA strand where they belong.
- The overall fold of I-DmoI in complex with its DNA target (Fig. 1a) shows a clear pseudo two-fold axis between the two LAGLIDADG helices dividing the protein in two domains, A (residues 5-98) and B (residues 103-195), joined by a four-residue linker. These domains contain the typical αββαββα topology of the LAGLIDADG family. Both domains have a similar size and the β-strands form two antiparallel β-sheets composed of strands β1-4 in domain A and β5-7 in domain B.
- the β-sheets form a concave surface with an inner cylindrical shape where the DNA molecule is accommodated.
- RMSD refers to Root Mean Square Deviation and is the measure of the average distance between the backbones of superimposed proteins.
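For readers who want the RMSD definition in concrete form, a minimal sketch is given below. It assumes that the two coordinate sets are already superimposed and list equivalent backbone atoms in the same order; the function name `rmsd` and the tuple-based coordinate format are choices made for this example.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    # Mean of the squared distances, then square root.
    return math.sqrt(squared_sum / len(coords_a))
```

The optimal superposition itself (for example by a least-squares fit) is a separate step that this sketch does not perform.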
- panel a) there is shown a stereo comparison of the enzyme active site in the substrate and product bound structures.
- the metal sites are labeled (M1) for the shared position between calcium and manganese and (M2) for the second manganese atom.
- Anomalous difference maps illustrate the presence of only one atom of calcium in the DNA bound structure b) and two atoms of manganese in the digested DNA structure c).
- Panel d) shows a schematic diagram of the hypothetical enzymatic mechanism proposed for I-DmoI. Hydrolysis of the phosphodiester bonds would follow a sequential two-metal mechanism. Initially, a single metal ion (site1) is bound in one active site and the water nucleophile is positioned in the central site.
- a second metal ion would then enter the second site (site2), displacing the water molecule previously located in that site to the central one.
- Asp21, Gly20 and Glu117, Ala116 are contributed by the LAGLIDADG motifs of the enzyme.
- Figure 3 shows a scheme of the protein-DNA contacts in the Ca²⁺ and Mn²⁺ bound structures.
- the cleavage sites are indicated by the shaded phosphates.
- the portion of the DNA target which binds to domain A of I-DmoI consists of residues -2 to 13 of the 3' strand and residues 3 to 13 of the 5' strand, with the remaining nucleotides being bound by domain B of I-DmoI.
- Lines indicate polar contacts and van der Waals interactions respectively. Dots represent water molecules involved in the interaction. Amino acids depicted on blackened boxes represent hydrogen bond interactions with the bases and the other residues represent van der Waals interactions with the DNA (bases, riboses or phosphates).
- DNA target provided a preliminary view of regions involved in DNA target recognition inside domain A, it did not yield a complete picture of the recognition mechanism.
- Divalent metal ions play an essential role in the catalysis of endonucleases and other enzymes.
- LAGLIDADG homing endonucleases
- the general mechanism of cleavage of the phosphodiester bonds of DNA requires a nucleophile to attack the electron deficient phosphorus atom, a general base to activate the nucleophile, a general acid to protonate the leaving group, and positively charged groups to stabilize the phosphoanion transition state.
- the presence of cations is dispensable for DNA binding (Fig. 3) (Dalgaard et al., 1994).
- Figure 4 shows the loops involved in DNA binding by I-Dmol.
- the upper part of the figure depicts a ribbon diagram of the I-DmoI-DNA complex. Domain A contains two loops that contact the DNA (L1a and L2a), and domain B only has one loop (L2b) engaged in contacts with the nucleic acid. L2a and L2b are primarily associated with the central bases of the target site, and L1a is associated with bases outside that region, reflecting the asymmetry of the target recognition by I-DmoI.
- the lower part of the figure shows detailed insets of the three loops involved in DNA interactions. The protein-DNA interactions are displayed as dashed lines.
- Figure 5 shows the structural basis of DNA recognition. Panel a) shows a structural sequence alignment between the archaeal I-DmoI and the eukaryotic I-SceI and I-CreI homing endonucleases. Secondary structure elements of the homing endonuclease I-DmoI are shown above the alignment. Conserved residues are boxed with a black background while homologous residues are boxed with a white background. Residues with a gray background are those involved in protein-base contacts in the crystal structures of the complexes. Sequence alignment was carried out with Clustal (Larkin et al., 2007) and the structural alignment with ESPript (Gouet et al., 1999). Panel b) shows a comparison of the location of the protein-base contacts (regions colored in gray) in the I-DmoI, I-SceI and I-CreI protein-DNA structures. Panel c) shows a schematic view of the protein-base contacts.
- the central cation should stabilize the phosphoanion transition state in the hydrolysis of both strands, and facilitate the protonation of the O3' leaving group of each strand.
- the absence of direct protein contacts between I-Crel and the nucleophilic water molecule did not facilitate the identification of a general base. It has been suggested that the extensive network of water molecules surrounding the active site participates in a concerted transfer of hydrogen atoms that activate the nucleophilic water molecule and protonate the leaving group.
- One of the Mn²⁺ cations is coordinated with the side chain of Asp21, the carbonyl of Ala116, the 5' phosphate of -3C strandB, the phosphate of 2A strandA and a water molecule outside the active site, whereas the second Mn²⁺ has similar interactions with the 5' phosphate of 3G strandA, the phosphate of -2C strandB, the main chain carbonyl of Gly20, the side chain of Glu117 in the second LAGLIDADG motif, and another water molecule outside the active center.
- I-Scel and I-Crel contain three metal sites in the active site
- I-Dmol contains only two metal sites.
- the comparison of the I-DmoI Ca²⁺ and Mn²⁺ anomalous maps shows that only one of the metal sites overlaps.
- the other sites, including the central one, were occupied by water molecules, whereas in the case of I-Scel and I-Crel both can be occupied by a metal (Chevalier et al., 2004; Moure et al., 2003). Therefore the structural organization of the I-Dmol active site presents a clear asymmetry in the case of the calcium-bound structure, indicating a sequential mechanism for I-Dmol catalysis that has also been suggested for I-Scel (Moure et al., 2003).
- the non-coding strand would be cleaved before the final reaction takes place on the coding strand.
- the central water could be the nucleophile that would initiate the reaction, after prior activation by the electropositive environment generated by the metal present in the active site (Garcia-Viloca et al., 2004).
- the entry of another catalytic metal in the second site would promote the transfer or regeneration of the central water, leading to the cleavage of the coding strand.
- the Ca²⁺ bound structure would represent a snapshot of the activation state prior to the cleavage of the phosphodiester bond in the non-coding strand, whereas the Mn²⁺ bound structure would depict the organization of the active site after the cleavage of both strands.
- the enzyme would produce a nick in the DNA non-coding strand before the coding strand would be cleaved resulting in the double strand cleavage.
- although this possible mechanism could be discarded after the observation of the cleavage properties of the I-DmoI Asp21Asn and Glu117Gln single mutants (Lykke-Andersen et al., 1997), nicked intermediates were observed in I-SceI (Perrin et al., 1993) and in I-DmoI when the cleavage properties of a homodimeric I-DmoI mutant were studied using a plasmid as substrate (Silva et al., 2006), indicating that a sequential cleavage mechanism is possible for the monomeric members of the LAGLIDADG homing endonuclease family.
- I-Dmol contains two LAGLIDADG helices and it binds the nucleic acid in a monomeric form.
- the protein forces a clear bend in the DNA molecule, forming an angle of approximately 140° between the longitudinal axes of both DNA halves. This angle distorts the minor groove in the middle of the DNA molecule, positioning both strands in the enzyme's active site.
- the crystal structure reveals the asymmetric nature of the I-Dmol DNA binding cavity.
- domain A contains a four β-strand sheet, whereas domain B contains only three (Fig. 1).
- a detailed view of the protein DNA contacts in the loops shows that the L2a and L2b loops contact symmetric regions on the DNA major grooves (Fig. 4).
- L1a interacts with bases (6-10) in the major groove closer to the 5' end in strand B.
- This protein-DNA interaction is absent in the other half of the DNA target.
- the lack of the fourth β-strand in domain B eliminates the presence of a loop similar to L1a in domain B, resulting in the lack of protein-DNA contacts in the major groove closer to the 5' end in strand A. This difference implies that the target half associated with domain A (bases 1-13 in both strands) is recognized by a larger number of residues.
- the protein-DNA contacts in the substrate and product bound structures were analyzed in detail with NUCPLOT (Luscombe et al., 1997) (Fig. 3, 4, 6 and 7).
- a schematic representation of the interaction reveals few differences in the protein DNA interactions between both forms (Fig.3).
- the main contacts in domain B interacting with the nucleotide bases involve Arg124, Arg126, Asp154, Arg157 and Asp155.
- Arg124 is positioned at a proper distance to make polar contacts with the bases of -7G strandA and -6G strandB, whereas Arg126 hydrogen bonds the base of -5G strandB.
- the conformation of the Arg126 side chain is influenced by the interaction with Asp119, which does not contact the nucleotide bases but interacts with the phosphate backbone.
- the rotamer of Asp119 forces a conformation of Arg126, indirectly inducing the recognition of the base at -5G strandB.
- the conformations and contacts of these residues are very similar both in the bound and cleaved DNA structures.
- Loop2a presents Thr76 and Asp75, which are the only amino acids whose interactions provide specific recognition in the central four base pairs of the DNA.
- Thr76 makes a polar contact with the base of 2A strandA.
- Asp75 hydrogen bonds the base of 3C strandB.
- the conformation of this side chain is influenced by the presence of Arg77, which makes a polar contact with the base of 3G strandA (Fig.4, Loop2a).
- Arg77 together with Arg81, Arg37, Tyr29, Arg33, Glu79, Glu35 and Ser34 are the remaining residues in domain A responsible for making direct contacts with the bases of the target DNA (Fig. 4, Loop2a and Loop1a).
- the side chain of Glu79 makes a direct contact with the bases of 5A strandB and 5T strandA, whereas the side chain of Glu35 contacts the base of 8C strandB.
- the conformation of the Glu35 side chain is favored by the interactions of the side chain of Ser83 and the main chain of Tyr36 with a water molecule, which contacts the DNA backbone.
- Tyr29 together with Arg33 and Ser34 form a second group of residues clustered in space that interact with the bases of 6C strandA, 9G strandA and 9C strandB respectively (Fig. 4, Loop1a). All these contacts between I-DmoI and the bases of its target DNA seem to be responsible for DNA target recognition; the rest of the amino acids (Fig. 3) involved in contacts make hydrogen bonds or van der Waals interactions with the DNA backbone.
- I-CreI and I-SceI were two well characterized meganucleases representing the homodimeric and monomeric members of the LAGLIDADG family that bind pseudo- and non-palindromic targets respectively (Fig. 5).
- the alignment illustrates the differences in the primary and secondary structures among these enzymes regarding the location of residues involved in DNA binding (residues with gray background in Fig.5a).
- A structural comparison of the DNA binding residues of these homing endonucleases shows that, despite the similar structural scaffold, the residues responsible for DNA recognition are not topologically conserved (Fig. 5b).
- the schematic comparison of the specific base-protein contacts in the different meganuclease-DNA complexes (Fig. 5c) illustrates how the homodimeric meganuclease accomplishes DNA recognition by generating a similar network of protein-DNA contacts on both sides of the pseudo-palindromic DNA, whereas the monomeric ones display a tendency to maximize the interactions in one half of the target DNA.
- the inventors analyzed the length of the binding site and the number of specific positions for the target DNA sequences of each meganuclease. To perform this study the inventors used the latest version of FoldX (FoldX 2.8) (Schymkowitz et al., 2005). Each base was mutated to the other three possibilities and the resulting interaction energies were converted to a probability to predict the preference for each base at a given position (Fig. 6).
- Figure 6 shows the in silico binding patterns and in particular the in silico binding specificities for I-Dmol, E-Drel, I-Crel and I-Scel.
- the energy-based logos display the different specificities of the meganucleases.
- I-Dmol presents a short binding site with the highest specificity, while I-Scel has a long but quite tolerant binding site.
- Base discrimination predicted by FoldX compares notably well with available experimental results for I-SceI (Doyon et al., 2006) and I-CreI (Argast et al., 1998).
- the reference sequences for the logos are the wild type coding strands, except for I-DmoI where the non-coding strand was used.
- the wild type sequence is shown, b) in silico R-10NNN binding pattern predicted by FoldX.
- the pattern was calculated by using the wild type I-DmoI structure and based on the difference in interaction energy with the WT DNA sequence. The energies were calculated by adding up the change in interaction energy due to each individual mutation in the DNA. Hits are found for R-10GCC, R-10GCG, R-10GCA, and R-10ACC (see coding strand B sequence in Figure 1b or 7). Only the R-10GCG hit was not found experimentally for I-DmoI or the two mutants analyzed. The target triplet is shown inside each cell.
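The additive scheme described above, in which the energy of a multiply mutated target is the sum of the single-mutation energy changes, can be sketched as follows. The per-position values in `toy_matrix` are invented by hand purely for illustration and are not FoldX output; the function names and the threshold are likewise hypothetical.

```python
from itertools import product


def additive_triplet_energies(ddg_matrix):
    """Sum per-position energy changes for every possible triplet.

    ddg_matrix is a list of three dicts, one per target position,
    mapping each base to its energy change relative to the wild type.
    """
    return {
        "".join(triplet): sum(pos[base] for pos, base in zip(ddg_matrix, triplet))
        for triplet in product("ACGT", repeat=3)
    }


def hits(energies, threshold):
    """Triplets whose summed energy change does not exceed the threshold."""
    return sorted(t for t, e in energies.items() if e <= threshold)


# Hand-made toy values (arbitrary units); NOT results from the source.
toy_matrix = [
    {"G": 0.0, "A": 0.4, "C": 2.0, "T": 2.0},
    {"C": 0.0, "A": 2.0, "G": 2.0, "T": 2.0},
    {"C": 0.0, "G": 0.3, "A": 0.5, "T": 2.0},
]
toy_hits = hits(additive_triplet_energies(toy_matrix), threshold=0.5)
```

An additive model ignores any coupling between neighbouring positions, which is one reason a predicted hit may not be confirmed experimentally.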
- the inventors scanned the energy matrices coming from the analysis above along the Saccharomyces cerevisiae and Drosophila melanogaster genomes, using two different energy thresholds with respect to the wild-type interaction energy.
- the results did not yield a hit in yeast with the lower energy threshold (I-Scel site in the yeast strain sequenced is disrupted by the insertion of an intron that contains the meganuclease), and very few were found in Drosophila melanogaster (Table III).
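A genome scan of this kind can be pictured as sliding a position-specific energy matrix along a sequence and reporting every window whose summed energy stays under a threshold. The sketch below is a simplified illustration under several assumptions: it scans a single strand only, the matrix values and thresholds are placeholders, and the function name `scan_sequence` is hypothetical rather than the inventors' actual tool.

```python
def scan_sequence(sequence, energy_matrix, threshold):
    """Return (start, site, energy) for windows scoring at or below threshold.

    energy_matrix is a list of dicts, one per site position, mapping
    each base to its energy change relative to the wild-type target.
    """
    width = len(energy_matrix)
    seq = sequence.upper()
    found = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        try:
            energy = sum(col[base] for col, base in zip(energy_matrix, window))
        except KeyError:
            continue  # skip windows containing ambiguous bases such as N
        if energy <= threshold:
            found.append((start, window, energy))
    return found


# Two-position placeholder matrix and a short made-up sequence.
toy_matrix = [
    {"G": 0.0, "A": 1.0, "C": 2.0, "T": 2.0},
    {"C": 0.0, "A": 1.0, "G": 2.0, "T": 2.0},
]
strict = scan_sequence("AAGCACN", toy_matrix, threshold=0.0)
relaxed = scan_sequence("AAGCACN", toy_matrix, threshold=1.0)
```

Raising the threshold, as in the second call, admits weaker sites in addition to the exact match, which mirrors the effect of using two thresholds.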
- when the energy threshold was increased, weaker hits became apparent (Table III). This could be important in the context of a highly expressed enzyme or one with enhanced activity.
Table III. DNA interaction analysis for I-DmoI, E-DreI, I-CreI and I-SceI.
- the binding event induces a deep kink in the nucleic acid molecule to force both strands in the active site. This kink is more pronounced than in other meganucleases of the LAGLIDADG family.
- the structures of the enzyme with the bound and cleaved DNA molecule suggest a sequential catalytic mechanism.
- I-DmoI exhibits poor activity at 37°C due to its thermophilic origin and therefore is not an appropriate tool for practical in vivo applications.
- I-DmoI D1: Ile52Phe, Leu95Gln
- I-DmoI D2: Ile52Phe, Ala92Thr, Phe101Cys
- I-Dmol is a very specific meganuclease.
- the inventors monitored the cleavage pattern of I-DmoI and of its two mesophilic derivatives with the R-10NNN target collection, which corresponds to all sixty-four possible triplets for the target positions 8G strandA, 9G strandA and 10C strandA (Fig. 7a).
- the R- 10NNN triplet is in contact with domain A (Fig. 1, 3, 4 and 7), in the region having the maximal density of protein/DNA interactions.
- the detailed interaction map of this region includes polar contacts of Arg33 with the 9G strandA base, of the Ser34 main chain carbonyl with 9C strandB, of the Ser34 main chain amide with the 10G strandB base, and of the Glu35 side chain with the 8C strandB pyrimidine ring.
- Figure 7 shows the in vivo cleavage patterns a) I-Dmol recognition site.
- the target sequence has been divided in two halves, L (left) and R (right).
- 64 targets (R-10NNN) were derived from the natural I-DmoI target, differing from it only by three base pairs at positions 8, 9 and 10 on the R half of the target. •, cleavage positions. b) cleavage activities of I-DmoI wild type and of two mesophilic I-DmoI variants (D1, D2).
- the 64 R-10NNN targets are identified in the top left panel by the 5'-NNN-3' bottom strand sequence of the nucleotides 10, 9, and 8.
- the grey box identifies the natural target. Bottom left, profile of I-DmoI. Top right, profile of I-DmoI variant D1. Bottom right, profile of I-DmoI variant D2. Targets cleaved by the samples are boxed in solid black lines.
- the controls: no meganuclease expressed, at positions a1, a4, a7, b2, b5, b8, c3, c6, d1, d4, d7, ...; I-SceI CLS (variant with moderate activity), at positions b1, b4, b7, c2, c5, c8, ...; and I-SceI WT at positions c1, c4, c7, d2, d5, d8, ....
- in silico screening methods are now possible using the information from this structure to predict the effects on the I-DmoI structure of residue changes therein and also of changes in the DNA target.
- Such in silico screening allows large numbers of potential enzymes to be screened against all possible targets and allows the more time consuming later in vitro/in vivo characterisation work to be focussed upon the candidate molecules identified during an initial in silico screen or later more focussed analysis of three-dimensional models of candidate polypeptides.
- the Inventors have previously conducted similar profiling with the I-Crel meganuclease as well as with hundreds of engineered derivatives (Arnould et al., 2006; Smith et al., 2006). Using a statistical approach, the inventors could also infer clues about the role of individual contacts between the protein and the target (Arnould et al., 2006; Smith et al., 2006). In former studies, the inventors have used structural data to engineer the specificity of the homodimeric I-Crel protein. First, the inventors locally engineered sub-domains of the I-Crel DNA binding interface to cleave DNA targets differing from the I-Crel target by a few consecutive base pairs.
- I-Dmol seems to have a very narrow specificity.
- the inventors have shown that the I-Crel Asp75Asn meganuclease mutant had a narrow target specificity, showing strong cleavage for only 3 targets out of two similar collections of 64 targets derived from the wild-type I-Crel target (Arnould et al., 2006; Smith et al., 2006).
- the narrow cleavage pattern of the I-DmoI D1 and D2 variants suggests that I-DmoI is at least as selective as I-CreI.
- the induction of homologous gene targeting by sequence specific endonuclease is seen today as an emerging technology with many applications (Paques and Duchateau, 2007).
Abstract
The present invention relates to the three-dimensional structure of the meganuclease I-DmoI in combination with its DNA target and, from this, to the prediction of residues in the I-DmoI enzyme that affect the binding, catalytic and other properties of this enzyme. The present invention also relates to I-DmoI enzymes with altered characteristics, such as altered target half-sites or altered catalytic properties, as well as to chimeric meganucleases comprising portions of I-DmoI, and to the use of the three-dimensional structure of the meganuclease I-DmoI in combination with its DNA target in an in silico screening method.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2008/002756 WO2010001189A1 (en) | 2008-07-03 | 2008-07-03 | Crystal structure of I-DmoI in complex with its DNA target, improved chimeric meganucleases and uses thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2010001189A1 (fr) | 2010-01-07 |
Family
ID=40352676
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2008/002756 (Ceased) WO2010001189A1 (fr) | 2008-07-03 | 2008-07-03 | Crystal structure of I-DmoI in complex with its DNA target, improved chimeric meganucleases and uses thereof |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2010001189A1 (fr) |
Cited By (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2011064736A1 (fr) | 2009-11-27 | 2011-06-03 | Basf Plant Science Company Gmbh | Optimized endonucleases and uses thereof |
| WO2011064750A1 (fr) | 2009-11-27 | 2011-06-03 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| WO2012149470A1 (fr) | 2011-04-27 | 2012-11-01 | Amyris, Inc. | Methods for genomic modification |
| DE112010004584T5 (de) | 2009-11-27 | 2012-11-29 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| EP2612918A1 (fr) | 2012-01-06 | 2013-07-10 | BASF Plant Science Company GmbH | In planta recombination |
| WO2015095804A1 (fr) | 2013-12-19 | 2015-06-25 | Amyris, Inc. | Methods for genomic integration |
| EP3733847B1 (fr) | 2012-10-23 | 2022-06-01 | Toolgen Incorporated | Composition for cleaving a target DNA comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, and use thereof |
| WO2023081756A1 (fr) | 2021-11-03 | 2023-05-11 | The J. David Gladstone Institutes, A Testamentary Trust Established Under The Will Of J. David Gladstone | Precise genome editing using retrons |
| WO2023141602A2 (fr) | 2022-01-21 | 2023-07-27 | Renagade Therapeutics Management Inc. | Engineered retrons and methods of use |
| EP4219731A2 (fr) | 2016-05-18 | 2023-08-02 | Amyris, Inc. | Compositions and methods for genomic integration of nucleic acids into exogenous landing pads |
| WO2024044723A1 (fr) | 2022-08-25 | 2024-02-29 | Renagade Therapeutics Management Inc. | Engineered retrons and methods of use |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2004031346A2 (fr) * | 2002-09-06 | 2004-04-15 | Fred Hutchinson Cancer Research Center | Methods and compositions based on modified highly specific nucleic acid binding proteins |
- 2008-07-03: WO application PCT/IB2008/002756, published as WO2010001189A1 (fr), status: not active (Ceased)
Non-Patent Citations (8)
| Title |
|---|
| AAGAARD C ET AL: "PROFILE OF THE DNA RECOGNITION SITE OF THE ARCHAEAL HOMING ENDONUCLEASE I-DMOI", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 25, no. 8, 15 April 1997 (1997-04-15), pages 1523 - 1530, XP000942177, ISSN: 0305-1048 * |
| LUCAS PATRICK ET AL: "Rapid evolution of the DNA-binding site in LAGLIDADG homing endonucleases", NUCLEIC ACIDS RESEARCH, vol. 29, no. 4, 15 February 2001 (2001-02-15), pages 960 - 969, XP002516751, ISSN: 0305-1048 * |
| MARCAIDA MARÍA JOSÉ ET AL: "Crystal structure of I-DmoI in complex with its target DNA provides new insights into meganuclease engineering.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA 4 NOV 2008, vol. 105, no. 44, 4 November 2008 (2008-11-04), pages 16888 - 16893, XP002516753, ISSN: 1091-6490 * |
| MOURE CARMEN M ET AL: "Crystal structures of I-SceI complexed to nicked DNA substrates: snapshots of intermediates along the DNA cleavage reaction pathway", NUCLEIC ACIDS RESEARCH, vol. 36, no. 10, June 2008 (2008-06-01), pages 3287 - 3296, XP002516752, ISSN: 0305-1048 * |
| REDONDO PILAR ET AL: "Crystallization and preliminary X-ray diffraction analysis on the homing endonuclease I-Dmo-I in complex with its target DNA", ACTA CRYSTALLOGRAPHICA SECTION F STRUCTURAL BIOLOGY AND CRYSTALLIZATION COMMUNICATIONS, vol. 63, no. Part 12, December 2007 (2007-12-01), pages 1017 - 1020, XP002516750, ISSN: 1744-3091(print) 1744-3091(ele * |
| SILVA G H ET AL: "Analysis of the LAGLIDADG interface of the monomeric homing endonuclease I-DmoI", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 32, no. 10, 1 June 2004 (2004-06-01), pages 3156 - 3168, XP002364698, ISSN: 0305-1048 * |
| SILVA G H ET AL: "Crystal structure of the thermostable archaeal intron-encoded endonuclease I-DmoI", JOURNAL OF MOLECULAR BIOLOGY, LONDON, GB, vol. 286, no. 4, 5 March 1999 (1999-03-05), pages 1123 - 1136, XP004462690, ISSN: 0022-2836 * |
| SMITH JULIANNE ET AL: "A combinatorial approach to create artificial homing endonucleases cleaving chosen sequences", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 34, no. 22, 27 November 2006 (2006-11-27), pages e149 - 1, XP002457876, ISSN: 0305-1048 * |
Cited By (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10316304B2 (en) | 2009-11-27 | 2019-06-11 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| US9404099B2 (en) | 2009-11-27 | 2016-08-02 | Basf Plant Science Company Gmbh | Optimized endonucleases and uses thereof |
| DE112010004583T5 (de) | 2009-11-27 | 2012-10-18 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| WO2011064736A1 (fr) | 2009-11-27 | 2011-06-03 | Basf Plant Science Company Gmbh | Optimized endonucleases and uses thereof |
| DE112010004584T5 (de) | 2009-11-27 | 2012-11-29 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| DE112010004582T5 (de) | 2009-11-27 | 2012-11-29 | Basf Plant Science Company Gmbh | Optimized endonucleases and uses thereof |
| WO2011064750A1 (fr) | 2009-11-27 | 2011-06-03 | Basf Plant Science Company Gmbh | Chimeric endonucleases and uses thereof |
| US8685737B2 (en) | 2011-04-27 | 2014-04-01 | Amyris, Inc. | Methods for genomic modification |
| WO2012149470A1 (fr) | 2011-04-27 | 2012-11-01 | Amyris, Inc. | Methods for genomic modification |
| US9701971B2 (en) | 2011-04-27 | 2017-07-11 | Amyris, Inc. | Methods for genomic modification |
| WO2013102875A1 (fr) | 2012-01-06 | 2013-07-11 | Basf Plant Science Company Gmbh | In planta recombination |
| EP2612918A1 (fr) | 2012-01-06 | 2013-07-10 | BASF Plant Science Company GmbH | In planta recombination |
| EP3733847B1 (fr) | 2012-10-23 | 2022-06-01 | Toolgen Incorporated | Composition for cleaving a target DNA comprising a guide RNA specific for the target DNA and Cas protein-encoding nucleic acid or Cas protein, and use thereof |
| US12473559B2 (en) | 2012-10-23 | 2025-11-18 | Toolgen Incorporated | Cas9/RNA complexes for inducing modifications of target endogenous nucleic acid sequences in nucleuses of eukaryotic cells |
| WO2015095804A1 (fr) | 2013-12-19 | 2015-06-25 | Amyris, Inc. | Methods for genomic integration |
| EP4219731A2 (fr) | 2016-05-18 | 2023-08-02 | Amyris, Inc. | Compositions and methods for genomic integration of nucleic acids into exogenous landing pads |
| WO2023081756A1 (fr) | 2021-11-03 | 2023-05-11 | The J. David Gladstone Institutes, A Testamentary Trust Established Under The Will Of J. David Gladstone | Precise genome editing using retrons |
| WO2023141602A2 (fr) | 2022-01-21 | 2023-07-27 | Renagade Therapeutics Management Inc. | Engineered retrons and methods of use |
| WO2024044723A1 (fr) | 2022-08-25 | 2024-02-29 | Renagade Therapeutics Management Inc. | Engineered retrons and methods of use |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2010001189A1 (fr) | Crystal structure of I-DmoI in complex with its DNA target, improved chimeric meganucleases and uses thereof | |
| CN103608027B (zh) | Methods for generating compact TALE-nucleases and uses thereof | |
| US11192929B2 (en) | Site-specific DNA base editing using modified APOBEC enzymes | |
| JP2024001024A (ja) | Rationally designed meganucleases with altered sequence specificity and DNA-binding affinity | |
| EP2126066B1 (fr) | LAGLIDADG homing endonuclease variants having novel substrate specificity and use thereof | |
| WO2004031346A2 (fr) | Methods and compositions based on modified highly specific nucleic acid binding proteins | |
| JP2019062898A (ja) | Rationally designed single-chain meganucleases with non-palindromic recognition sequences | |
| EP2231697B1 (fr) | Improved chimeric meganuclease enzymes and uses thereof | |
| CN101384712A (zh) | Meganuclease variants cleaving a DNA target sequence from a xeroderma pigmentosum gene and uses thereof | |
| WO2006097854A1 (fr) | Heterodimeric meganucleases and use thereof | |
| JP2011505809A (ja) | Rationally designed meganucleases with recognition sequences found in DNase hypersensitive regions of the human genome | |
| Thyme et al. | Reprogramming homing endonuclease specificity through computational design and directed evolution | |
| Gupta et al. | Restriction endonucleases: natural and directed evolution | |
| Joshi et al. | Evolution of I-SceI homing endonucleases with increased DNA recognition site specificity | |
| Ashworth | Computational physical modeling and design of protein-DNA interactions | |
| Zhao | Characterization of bacterial homing endonuclease I-Ssp6803I | |
| Silva | Structural and biochemical analysis of the thermostable archaeal intron-encoded endonuclease I-DmoI | |
| SG193850A1 (en) | Meganuclease variants cleaving a dna target sequence from a glutamine synthetase gene and uses thereof | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 08874866; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 08874866; Country of ref document: EP; Kind code of ref document: A1 |