WO2019161459A1 - Methods of codon optimization - Google Patents
Methods of codon optimization
- Publication number
- WO2019161459A1 (PCT/AU2019/050160)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- polynucleotide
- codon
- sequence
- polynucleotides
- aav
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
- G16B35/10—Design of libraries
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1027—Mutagenizing nucleic acids by DNA shuffling, e.g. RSR, STEP, RPR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1089—Design, preparation, screening or analysis of libraries using computer algorithms
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B40/08—Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14141—Use of virus, viral particle or viral elements as a vector
- C12N2750/14144—Chimeric viral vector comprising heterologous viral elements for production of another viral vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
Definitions
- the present disclosure relates generally to methods for modifying the sequence of a polynucleotide, or the sequence of a plurality of polynucleotides, so as to produce a modified polynucleotide(s) with altered codon usage.
- the disclosure also relates to methods for expressing the modified polynucleotides and methods for producing chimeric genes from the modified polynucleotides.
- DNA shuffling is a powerful process for generating diversity by recombination, and can be used to produce large libraries of chimeric genes and proteins, from which variants with desired properties can be selected. This process has been used to develop, for example, therapeutic enzymes and proteins (e.g. interferons) with improved properties, as well as viral vectors (e.g. AAV vectors) with improved transduction, tropism or other properties.
- Expression of heterologous genes in transformed host cells to produce recombinant proteins is now commonplace.
- a large number of mammalian genes including, for example, murine and human genes, have been successfully expressed in various host cells, including bacterial, yeast, insect, plant and mammalian host cells.
- significant obstacles remain when expression of a foreign or synthetic gene in a selected host cell is desired.
- expression of a synthetic gene, even when coupled with a strong promoter, often occurs with much lower efficiency or kinetics than would be expected. The same is frequently true of exogenous genes that are foreign to the host cell.
- Codon optimization is a technique that facilitates improved heterologous gene expression in a host cell by virtue of changing individual codons to synonymous codons that are more frequently used in a host or reference species, i.e. the species of the host cell.
- Different methods for optimizing codons have been described. These include methods in which the most common codon for a given amino acid in the reference species is used at all positions in the heterologous gene, i.e. each given amino acid in the optimized sequence is encoded by a single codon, which is the codon most commonly used for that amino acid in the reference species.
- the number of distinct codons used equals the number of distinct amino acids in that sequence.
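As a rough illustration of the "most common codon" approach just described, the following Python sketch replaces every codon with the single most frequent synonymous codon from a usage table. The function name `most_common_codon_optimize` and the `USAGE` table are hypothetical; the proline frequencies are illustrative values only, written in DNA notation (T rather than U), and do not describe any real species.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def most_common_codon_optimize(seq, usage):
    """Replace each codon with the most frequent synonymous codon in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged.
    """
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    best = {}  # amino acid -> preferred codon
    for codon, freq in usage.items():
        aa = CODON_TABLE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(best.get(CODON_TABLE[c], c) for c in codons)


# Hypothetical reference-species usage for proline (illustrative values only).
USAGE = {"CCT": 0.10, "CCC": 0.30, "CCA": 0.10, "CCG": 0.50}
optimized = most_common_codon_optimize("CCTCCACCCATG", USAGE)
```

In this sketch every proline codon is rewritten to CCG, while the methionine codon is untouched because the hypothetical table has no entry for it.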
- the codons in the heterologous gene are modified so that the frequency of codon usage across all codons for each amino acid in the heterologous gene is the same as that in the reference species. For example, if proline (Pro) is encoded by codons at the following frequencies in a reference species: CCU (10%), CCC (30%), CCA (10%) and CCG (50%); but at different frequencies in the species from which the heterologous gene is derived, then the sequence of the heterologous gene is modified so that the codons encoding Pro are present at the same frequencies as the reference species, i.e.
- DNA shuffling including DNA family shuffling
- AAV libraries based on AAV capsid libraries
- the present disclosure is predicated in part on the determination that a method of codon optimization that is more localized and targeted than those codon optimization methods previously described provides unexpected benefits. For example, this localized codon optimization can result in improved gene expression. Furthermore, when the localized codon optimization is applied to multiple related genes, thereby increasing homology between those genes, more efficient recombination between all of the genes results when the genes are subjected to homologous recombination procedures such as DNA shuffling. As demonstrated herein, this in turn can lead to the development of highly diverse and highly functional libraries.
- the present disclosure provides a method for modifying a target polynucleotide, comprising: a) aligning the sequence of a target polynucleotide and the sequence of a reference polynucleotide, wherein the target polynucleotide and the reference polynucleotide encode related polypeptides; b) identifying codons in the target polynucleotide and corresponding codons in the reference polynucleotide; c) identifying a target codon in the target polynucleotide, wherein the target codon and the corresponding codon in the reference polynucleotide are synonymous; and d) modifying the sequence of the target codon to have the same sequence as the corresponding codon in the reference polynucleotide, to thereby produce a modified target polynucleotide.
- Step b) may be performed by, for example, performing in silico modification of the sequence of the target polynucle
- the method may further comprise making one or more additional modifications to the sequence of the target polynucleotide.
- the one or more additional modifications do not alter the sequence of the encoded polypeptide.
- the one or more additional modifications alter the sequence of the encoded polypeptide.
- the reference polynucleotide may be selected from among an animal, fungal, plant or bacterial polynucleotide.
- the reference polynucleotide is a mammalian polynucleotide, such as a human, mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret or hamster polynucleotide.
- the target polynucleotide may be selected from among an animal, fungal, plant, bacterial or viral polynucleotide.
- the target polynucleotide is a mammalian polynucleotide, such as a human, mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret or hamster polynucleotide.
- the target polynucleotide is a human polynucleotide and the reference polynucleotide is selected from among a mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret and hamster polynucleotide.
- the target polynucleotide and the reference polynucleotide have at least or about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% sequence identity.
- the polypeptide encoded by the target polynucleotide and the polypeptide encoded by the reference polynucleotide have at least or about 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity.
- the present disclosure also provides a method for expressing a polynucleotide in a host cell, comprising: a) modifying a target polynucleotide according to the method described above and herein to produce a modified target polynucleotide; and b) introducing the modified target polynucleotide into a host cell to thereby facilitate expression of the modified target polynucleotide from the host cell, wherein the host cell is derived from the same organism as the reference polynucleotide.
- the level of expression of the modified target nucleic acid molecule is increased compared to the level of expression of the unmodified target nucleic acid molecule.
- the present disclosure provides a method for modifying the sequence of one or more polynucleotides, comprising: a) aligning the sequences of a plurality of polynucleotides, wherein the polynucleotides encode related polypeptides; b) identifying codons in each of the polynucleotides of the plurality and identifying the corresponding codons to which they align; c) identifying a set of aligned codons in which each codon in the set encodes the same amino acid, wherein the set of aligned codons comprises two or more synonymous codons; d) identifying the most frequently-occurring synonymous codon among the set of aligned codons; and e) modifying the sequence of any of the synonymous codons in the set of aligned codons that do not have the same sequence as the most frequently-occurring synonymous codon to have the same sequence as the most frequently-occurring synonymous codon to thereby produce one or
- step a) comprises translating the sequences of the plurality of polynucleotides to produce a plurality of amino acid sequences, aligning the plurality of amino acid sequences and converting the plurality of amino acid sequences back to the sequences of the plurality of polynucleotides;
- step d) comprises creating a codon-usage table to determine the frequency of occurrence of each synonymous codon; and/or step e) comprises in silico modification of the sequence of the synonymous codon and de novo synthesis of the one or more modified polynucleotides.
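Steps b) to e) of the multi-polynucleotide method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed algorithm itself: the sequences are assumed to be already codon-aligned (equal length, with whole-codon gaps written as `---`), the name `localized_codon_optimize` is hypothetical, and the source does not say how ties between equally frequent codons are resolved (here the first codon encountered wins).

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
GAP = "---"  # assumed representation of a whole-codon alignment gap


def localized_codon_optimize(aligned):
    """Unify synonymous codons column by column across codon-aligned sequences."""
    rows = [[s[i:i + 3] for i in range(0, len(s), 3)] for s in aligned]
    for col in range(len(rows[0])):
        # The set of aligned codons ignores gapped positions.
        codons = [row[col] for row in rows if row[col] != GAP]
        if len(set(codons)) < 2:
            continue  # nothing to unify at this position
        if len({CODON_TABLE[c] for c in codons}) != 1:
            continue  # codons encode different amino acids: leave unchanged
        # Per-position codon-usage count; ties fall to the first codon seen.
        consensus = Counter(codons).most_common(1)[0][0]
        for row in rows:
            if row[col] != GAP:
                row[col] = consensus
    return ["".join(row) for row in rows]


aligned = [
    "ATGGCTCCAAGC---",
    "ATGGCCCCGTCTAAG",
    "ATGGCCCCGACCAAG",
]
modified = localized_codon_optimize(aligned)
```

In this example the second and third codon positions are unified (to GCC and CCG), the fourth is left alone because it mixes serine and threonine codons, and the gapped fifth position is unchanged.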
- the described method for modifying the sequence of one or more polynucleotides may further comprise making one or more additional modifications to the sequence of one or more of the plurality of polynucleotides.
- the one or more additional modifications do not alter the sequence of the encoded polypeptide(s).
- the one or more additional modifications alter the sequence of the encoded polypeptide(s).
- the plurality of polynucleotides have at least or about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% sequence identity; and/or the polypeptides encoded by the plurality of polynucleotides have at least or about 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity.
- the polynucleotides can encode any protein, although in exemplary embodiments they encode enzymes or capsid proteins, such as AAV capsid proteins (e.g. AAV capsid proteins of two or more AAV serotypes).
- the present disclosure further provides a method for producing a chimeric gene, comprising: a) selecting a plurality of polynucleotides, wherein each of the polynucleotides in the plurality comprises a related gene encoding a related polypeptide; b) modifying the sequence of one or more of the polynucleotides in the plurality according to the method described above and herein; and c) subjecting the plurality of polynucleotides from b) to a homologous recombination procedure, to thereby produce a chimeric gene.
- the homologous recombination procedure may be selected from DNA family shuffling, staggered extension process (StEP), random chimeragenesis on transient templates (RACHITT), and nucleotide exchange and excision technology (NExT).
- the homologous recombination procedure is a DNA family shuffling procedure and comprises: i) digesting the polynucleotides from b) into fragments; and ii) reassembling the fragments using PCR to form a chimeric gene.
- the chimeric gene produced by the methods of the disclosure may be any type of gene although in exemplary embodiments it is a chimeric viral capsid gene, such as a chimeric AAV capsid gene.
- the methods for producing a chimeric gene may further comprise inserting the chimeric gene into a vector (e.g. a plasmid or an AAV vector) or virus (e.g., AAV).
- a library of vectors can be produced.
- a library of AAV can be produced.
- the present disclosure therefore also provides a chimeric gene, a vector, an AAV or a library produced by the methods described above and herein.
- Figure 1 is a schematic representation of parental sequence contribution of sample clones isolated from AAVLib 256. Size of vertical bars represents probability that individual residue was contributed by given parental vector. Black line represents the most probable composition of individual shuffled clones based on the longest sequence of identity to parental variants in a 5’ to 3’ direction.
- Figure 2 is a graphical representation of rAAV yield using helper plasmids encoding wild-type cap genes (wtAAV2, wtAAV5, wtAAV6 and wtAAV8) and corresponding hcoAAVs in the absence (A) or presence (B) of serotype-specific AAP provided in trans.
- Figure 3 is a schematic representation of localized codon-optimization (LCO). The upper panel shows a schematic of the wtAAV donors and a sample alignment of a fragment of the cap sequence from AAV2, AAV5 and AAV6, showing DNA and amino acid sequences. Vertical lines indicate positions with the same residues, while * represents mismatches at the DNA level. DNA residues modified using the LCO algorithm are shown in larger font in the lower panel.
- Figure 4 provides the results of a functional evaluation of AAVs modified using the localized codon-optimization (LCO) algorithm.
- HuH7 cells were transduced at the same multiplicity of infection (MOI) with vectors encoding eGFP under the control of a liver-specific promoter (LSP). FACS analysis of eGFP was performed 72 hrs after transduction.
- Figure 5 provides an alignment of wtAAV2 and lcoAAV2 cap gene region coding AAP.
- Figure 6 shows the results of a Western blot analysis of VP and Rep expression from wtAAV2, hcoAAV2 and lcoAAV2.
- Figure 7 represents the results of a study to produce an AAV shuffled library based on lcoAAV2, lcoAAV5 and lcoAAV6.
- HuH-R6Cl2 isolated from AAVLib lco256 library during selection screen on HuH7 cells.
- (E) Functional analysis of HuH-R6Cl2 clone.
- HuH7 cells were transduced with iodixanol preparations of rAAV-HuH-R6Cl2, control wt vectors (rAAV2, rAAV5, and rAAV6) and lcoAAV vectors (lcoAAV2, lcoAAV5 and lcoAAV6).
- FACS analysis of eGFP was performed 72 hrs after transduction. **** p < 0.0001.
- Figure 8 provides the results of a functional evaluation of AAV1 through 12 optimized using the localized codon-optimization method.
- Figure 9 presents the results of a study to produce rAAV vectors following codon optimization and shuffling.
- a polypeptide includes a single polypeptide, as well as two or more polypeptides.
- a "codon position" refers to the locus in a polynucleotide at which a codon occurs.
- codon positions are denoted in numerical order from 5' to 3' along the polynucleotide.
- a codon position in a polynucleotide will be the same as the amino acid position in the encoded polypeptide of the amino acid encoded by the codon at the codon position.
- a codon at codon position 1 of a polynucleotide will encode the amino acid at amino acid position 1 of the polypeptide encoded by the polynucleotide.
- a codon at codon position 150 of a polynucleotide will encode the amino acid at amino acid position 150 of the polypeptide encoded by the polynucleotide.
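The correspondence between codon positions and amino acid positions can be illustrated with a short Python sketch. The helper names `codon_at` and `translate` are hypothetical, and the example sequence is invented for illustration.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_at(seq, position):
    """Return the codon at a 1-based codon position, counted 5' to 3'."""
    start = 3 * (position - 1)
    if position < 1 or start + 3 > len(seq):
        raise IndexError("codon position out of range")
    return seq[start:start + 3]


def translate(seq):
    """Translate an in-frame coding sequence, codon by codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


seq = "ATGGCTCCA"          # invented example: Met-Ala-Pro
protein = translate(seq)
codon_2 = codon_at(seq, 2)  # codon position 2 encodes amino acid position 2
```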
- "corresponding nucleotides", "corresponding amino acids" and "corresponding codons" refer to nucleotides, amino acids or codons that occur at aligned loci or positions.
- sequences of related or variant polynucleotides or polypeptides are aligned by any method known to those of skill in the art. Such methods typically maximize matches (e.g. identical nucleotides or amino acids at positions), and include manual alignment and the use of the numerous alignment programs available (for example, BLASTN, BLASTP, ClustalW, ClustalW2, EMBOSS, LALIGN, Kalign, etc.) and others known to those of skill in the art.
- By aligning the sequences of polynucleotides, one skilled in the art can identify corresponding nucleotides and/or codons (if the polynucleotide encodes a polypeptide). For example, by aligning two or more different capsid genes, such as a capsid gene from AAV1 and a capsid gene from AAV2, AAV3, AAV4 etc., one of skill in the art can identify nucleotides within the different capsid genes that correspond to each other (i.e. that occur at aligned loci) and also codons that correspond to each other (i.e. that occur at aligned loci or "codon positions" and that encode amino acids that are at corresponding positions in the encoded capsid proteins).
- aligned codons refers to codons in two or more separate polynucleotides that align with each other when the sequences of the two or more polynucleotides are aligned.
- aligned codons is analogous to "corresponding codons”.
- a "set of aligned codons" simply refers to all of the aligned codons at a particular locus or position, e.g., when five polynucleotides are aligned, a set of aligned codons will typically consist of five codons. In instances where there is a gap in the alignment (e.g.
- the set of aligned codons is considered to consist of only those codons that do not include a gap, i.e. those aligned codons containing a gap are essentially ignored.
- the phrase "each codon in the set encodes the same amino acid” means that each of the aligned codons that do not contain a gap encode the same amino acid.
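The definitions of a "set of aligned codons" and of "each codon in the set encodes the same amino acid" can be expressed as a small Python sketch. The function names and the example alignment are hypothetical, and gaps are assumed to be written with `-` characters.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def aligned_codon_set(aligned, position):
    """Collect the codons at a 1-based codon position, ignoring gapped codons."""
    start = 3 * (position - 1)
    codons = [s[start:start + 3] for s in aligned]
    return [c for c in codons if "-" not in c]


def encodes_same_amino_acid(codon_set):
    """True when every (non-gap) codon in the set encodes one amino acid."""
    return len({CODON_TABLE[c] for c in codon_set}) == 1


aligned = ["ATGGCT---", "ATGGCCAAG", "ATGAGCAAA"]  # invented alignment
set_2 = aligned_codon_set(aligned, 2)  # Ala, Ala, Ser
set_3 = aligned_codon_set(aligned, 3)  # gap ignored; Lys, Lys
```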
- the term "host cell” refers to a cell, such as a mammalian cell, that has introduced into it exogenous DNA, such as a vector or other polynucleotide.
- the term includes the progeny of the original cell into which the exogenous DNA has been introduced.
- a "host cell” as used herein generally refers to a cell that has been transfected or transduced with exogenous DNA.
- related polynucleotides and “related genes” are used interchangeably and refer to polynucleotides or genes that have a similar sequence, structure, and/or function, and which encode polypeptides that have a similar sequence, structure, and/or function.
- Related polynucleotides and related genes include orthologs and paralogs that evolved through speciation or gene duplication, as well as other homologs that have been developed by, for example, recombinant techniques, including gene diversification techniques.
- sequence identity between two related polynucleotides is at least or about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%.
- related polypeptides refers to the polypeptides encoded by related polynucleotides or related genes.
- sequence identity between two related polypeptides is at least or about 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.
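For orientation, percent sequence identity over a pairwise alignment can be computed as in the Python sketch below. The source does not specify how identity is calculated; this sketch assumes identity is the number of matching columns divided by the number of aligned columns, skipping columns where both sequences have a gap. Alignment tools may use other denominators, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(a, b):
    """Percent identity of two aligned sequences of equal length.

    Assumption: columns where both sequences are gaps are skipped; a gap
    opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


identity_nt = percent_identity("ATGGCT", "ATGGCC")         # 5 of 6 columns match
identity_gap = percent_identity("ATG---GCT", "ATGAAAGCT")  # 6 of 9 columns match
```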
- a "vector" includes reference to both polynucleotide vectors and viral vectors, each of which is capable of delivering a transgene contained within the vector into a host cell.
- Vectors can be episomal, i.e., do not integrate into the genome of a host cell, or can integrate into the host cell genome.
- the vectors may also be replication competent or replication-deficient.
- Exemplary polynucleotide vectors include, but are not limited to, plasmids, cosmids and transposons.
- Exemplary viral vectors include, for example, AAV, lentiviral, retroviral, adenoviral, herpes viral and hepatitis viral vectors.
- Adeno-associated viral vector refers to a vector derived from an adeno-associated virus, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12 or AAV13, or using synthetic or modified AAV capsid proteins, including chimeric capsid proteins.
- An AAV vector may also be referred to herein as "recombinant AAV”, “rAAV”, “recombinant AAV virion”, and “rAAV virion,” terms which are used interchangeably and refer to a replication-defective virus that includes an AAV capsid shell encapsidating an AAV genome.
- the AAV genome (also referred to as vector genome, recombinant AAV genome or rAAV genome) comprises a transgene flanked on both sides by functional AAV ITRs.
- Typically, one or more of the wild-type AAV genes have been deleted from the genome in whole or part, preferably the rep and/or cap genes.
- Functional ITR sequences are necessary for the rescue, replication and packaging of the vector genome into the rAAV virion.
- the present disclosure is predicated in part on the determination that a method of codon optimization that is more localized and targeted than those codon optimization methods previously described can result in improved gene expression. Moreover, as described herein, when this localized codon optimization is applied to multiple related genes, thereby increasing homology between those genes, more efficient recombination between all of the genes can result when the genes are subjected to homologous recombination procedures such as DNA shuffling. As demonstrated herein, this in turn can lead to the development of highly diverse and highly functional shuffled libraries. Accordingly, the methods of the present disclosure have particular applications in gene expression, homologous recombination procedures (e.g. DNA shuffling), the production of diverse and functional gene libraries, and the identification of variant proteins and viral vectors with improved properties.
- Previously-described methods of codon optimization essentially target all of the codons in a gene.
- the gene is modified so that all of the codons in the gene represent the most frequently used codon in a reference polynucleotide for a given amino acid (e.g. a human gene is modified so that all of the codons for a particular amino acid residue are the same as the codon that is most frequently used in a mouse polynucleotide to encode that particular amino acid).
- the gene is modified so that the codon usage for each amino acid (i.e. the frequency with which each codon for each amino acid is used) is the same as the frequency of codon usage in a reference polynucleotide (e.g. a human gene is modified so that the frequency with which each codon for each amino acid is used is the same as that found in a mouse polynucleotide).
- codon optimization is limited to, or is predominantly limited to, codons in a polynucleotide that encode the same amino acid as, but have a different sequence from, the corresponding codon in a related polynucleotide. It is these synonymous codons (i.e. codons having a different sequence, but which encode the same amino acid) that are modified. Codons that are the same as the corresponding codon in the related polynucleotide are, either entirely or predominantly, left unchanged. Codons that encode a different amino acid than the amino acid encoded by the corresponding codon in the related polynucleotide are also, either entirely or predominantly, left unchanged.
- the methods of the present disclosure therefore require a localized and targeted assessment of each codon in a polynucleotide, so as to identify codons that encode the same amino acid as a corresponding codon in a related polynucleotide but with a different codon sequence (i.e. synonymous codons). These codons are then modified to alter their sequence so that they reflect the sequence of the corresponding codon in the related polynucleotide.
- the methods of the present disclosure provide a process for modifying the sequence of a single polynucleotide that encodes a polypeptide so as to produce a single modified polynucleotide. Such modification can facilitate improved expression.
- the methods involve aligning the sequence of a target polynucleotide for which codon optimization is desired, and the sequence of a reference polynucleotide.
- the target polynucleotide and the reference polynucleotide are related and encode related polypeptides.
- the sequence identity between the two polynucleotides is at least or about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%.
- sequence identity between the polypeptide encoded by the target polynucleotide and the polypeptide encoded by the reference polynucleotide is typically at least or about 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.
- the reference polynucleotide is generally derived from a species that is different to that from which the target polynucleotide is derived, and is selected on the basis that it is derived from the same species as an intended host cell for subsequent expression of the target polynucleotide.
- the target polynucleotide may comprise a human gene encoding a human protein, and is intended to be introduced into a mouse cell for expression.
- the reference polynucleotide would be a mouse polynucleotide.
- the target and reference polynucleotides are derived from the same species but a different type of tissue. In such an instance, the reference polynucleotide is selected on the basis that it is derived from the same type of tissue as the intended host cell for subsequent expression of the target polynucleotide.
- the target polynucleotide may be, for example, an animal, fungal, plant, bacterial or viral polynucleotide.
- the reference polynucleotide may be an animal (including mammalian), fungal, plant, or bacterial polynucleotide.
- the reference polynucleotide and the target polynucleotide are mammalian polynucleotides, such as, for example, human, mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret, or hamster polynucleotides.
- the target and reference polynucleotides can be derived from any tissue type, including epithelial, connective, muscle, and nervous tissue. Non limiting examples of specific tissue from which the polynucleotides can be derived include tissue from the heart, lung, liver, spleen, kidney, pancreas, skin, brain, spinal cord, intestine and eye.
- the sequence of the target polynucleotide and the sequence of the reference polynucleotide are aligned. Alignment can be performed using any suitable method known in the art, of which there are many. Such methods typically maximize matches (e.g. identical nucleotides at positions), and include manual alignment and the use of the numerous alignment programs available (for example, BLASTN, BLASTP, ClustalW, ClustalW2, EMBOSS, LALIGN, Kalign, etc.) and others known to those of skill in the art.
- the sequences of the target and reference polynucleotides can be directly aligned or may be indirectly aligned.
- the polynucleotide sequences are first translated to produce amino acid sequences (i.e. the sequences of the polypeptides encoded by the target polynucleotide and the reference polynucleotide), and it is these amino acid sequences that are aligned.
- the aligned amino acid sequences are then converted back to nucleotide sequences, generating an alignment of the two polynucleotide sequences.
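The indirect alignment route described above (translate, align the amino acid sequences, convert back) can be sketched in Python as follows. The amino acid alignment itself would come from an external alignment program; here the gapped protein strings are supplied by hand as hypothetical aligner output, and the helper name `thread_codons` is likewise an assumption.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame coding sequence, codon by codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def thread_codons(gapped_protein, nucleotide):
    """Map a gapped protein alignment row back onto its coding sequence.

    Each residue is replaced by its original codon and each gap by '---'.
    """
    if translate(nucleotide) != gapped_protein.replace("-", ""):
        raise ValueError("protein row does not match the coding sequence")
    codons = iter(nucleotide[i:i + 3] for i in range(0, len(nucleotide), 3))
    return "".join("---" if aa == "-" else next(codons) for aa in gapped_protein)


target_nt = "ATGGCTAGC"        # translates to MAS
reference_nt = "ATGGCCAAAAGC"  # translates to MAKS
# Hypothetical output of a protein aligner for the two translations:
target_row, reference_row = "MA-S", "MAKS"
target_aln = thread_codons(target_row, target_nt)
reference_aln = thread_codons(reference_row, reference_nt)
```

Because gaps are introduced as whole codons, the resulting nucleotide alignment keeps every codon intact and in frame.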
- Target codons are those that encode the same amino acid as the corresponding codon in the reference polynucleotide, but use a different sequence, i.e. the target codon and the corresponding codon in the reference polynucleotide are synonymous.
- For example, if corresponding codons in the two polynucleotides encode the same amino acid, but the codon in the target polynucleotide is TCT while the codon in the reference polynucleotide is TCA (both of which encode serine), then the codon in the target polynucleotide is identified as a target codon.
- One or more target codons identified as described above are then modified so that their sequence is the same as that of the corresponding codon in the reference polynucleotide.
- For example, if the codons at codon position 1 of each polynucleotide encode an isoleucine, but the codon in the target polynucleotide is ATT while the corresponding codon in the reference polynucleotide is ATC, then the codon in the target polynucleotide is identified as a target codon and its sequence is modified from ATT to ATC.
- Similarly, where the codon in the target polynucleotide is TCT and the corresponding codon in the reference polynucleotide is TCA, the codon in the target polynucleotide is identified as a target codon and its sequence is modified from TCT to TCA. Any one or more of the identified target codons in a target polynucleotide can be modified in this manner. In some examples, all of the identified target codons in a target polynucleotide are modified.
- up to or at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of the identified target codons are modified.
- codons that are not identified as target codons in the target polynucleotide are left unmodified.
- These non-target codons include those that encode an amino acid that is different to the amino acid encoded by the corresponding codon in the reference polynucleotide and those that have the same sequence as the corresponding codon in the reference polynucleotide.
- codons in the target polynucleotide that encode an amino acid that is different to the amino acid encoded by the corresponding codon in the reference polynucleotide are left unmodified, and/or all or at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of the codons in the target polynucleotide that have the same sequence as the corresponding codon in the reference polynucleotide are left unmodified.
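The single-reference procedure above (identify synonymous codons that differ from the reference, change them to the reference codon, and leave all other codons alone) can be sketched as follows. This is a minimal illustration, not the implementation used in the disclosure. It assumes the two sequences are already codon-aligned, gap-free and of equal length. The function name `localized_optimize` and the table-building constants are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an illustrative helper, not from the source).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def localized_optimize(target, reference):
    """Return (modified_target, changed_codon_positions).

    Assumes both sequences are codon-aligned, gap-free and of equal length.
    Codon positions are 1-based, counted 5' to 3'.
    """
    if len(target) != len(reference) or len(target) % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    out, changed = [], []
    for i in range(0, len(target), 3):
        t, r = target[i:i + 3], reference[i:i + 3]
        # Target codon: same amino acid as the reference codon, different sequence.
        if t != r and CODON_TABLE[t] == CODON_TABLE[r]:
            out.append(r)
            changed.append(i // 3 + 1)
        else:
            # Non-target codon: different amino acid, or already identical.
            out.append(t)
    return "".join(out), changed


# Position 1: ATT -> ATC (Ile); position 2: TCT -> TCA (Ser);
# position 3 encodes Asp vs Glu, so it is left unmodified.
print(localized_optimize("ATTTCTGAT", "ATCTCAGAA"))
```

Modifying only a chosen fraction of the identified target codons, as contemplated above, would amount to filtering `changed` before applying the substitutions.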
- the modification of the sequence of the target polynucleotide to produce a modified target polynucleotide can be performed using any method known in the art, including recombinant and synthetic methods, performed (either in part or in whole) in silico and/or in vitro.
- the modification of the sequence is performed in silico, followed by de novo synthesis of a modified target polynucleotide having the modified sequence (e.g. by gene synthesis methods such as those involving the chemical synthesis of overlapping oligonucleotides followed by gene assembly).
- the in silico modification of the sequence may be performed with the assistance of an algorithm, which may optionally also function to identify one or more target codons for codon optimization, as described above.
Localized codon optimization of multiple polynucleotides
- localized codon optimization is particularly useful when applied to multiple related genes prior to use in, for example, homologous recombination techniques for the purpose of protein engineering.
- the efficacy of techniques that are based on homologous recombination is dependent, at least in part, on the homology between the related genes.
- the main limitation of DNA family shuffling of AAV capsid genes is the efficiency with which parental capsid sequences can be shuffled, a critical step in library preparation that relies on homology between individual input capsid sequences at the DNA level. Because individual AAV serotypes differ to various degrees from one another, this has the potential to limit library complexity and cause unintentional bias in library composition. The outcome of such bias can be underrepresentation of less homologous variants and overrepresentation of sequences from parental variants with higher homologies in the final library (see e.g. Example 5).
- performing localized codon optimization on multiple related genes serves to increase DNA sequence homology between the related genes. Without being bound by theory, it is believed that the increase in homology between the genes facilitates more efficient recombination between each of the related genes, which in turn can lead to increased library diversification and diversification at the level of each clone (i.e. more parental donors contributing to each clone). Importantly, as demonstrated in the Examples, the altered codon usage resulting from localized codon optimization does not adversely affect gene and/or protein function.
- the methods of the present disclosure include methods for modifying the sequence of one or more polynucleotides encoding polypeptides using localized codon optimization, and methods for increasing the sequence homology between two or more polynucleotides encoding polypeptides using localized codon optimization.
- the sequence homology i.e. sequence identity
- the sequence homology may be increased by at least or about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25% or more across the length of the polynucleotide, or across a region of the polynucleotide.
- sequences of a plurality of polynucleotides are aligned.
- Each of the polynucleotides is related to the others and encodes a related polypeptide.
- sequence identity between any two polynucleotides in the plurality is at least or about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%.
- sequence identity between two polypeptides encoded by any two polynucleotides is typically at least or about 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.
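The identity percentages quoted above can be computed from a pairwise alignment. The sketch below is one common convention, assumed here for illustration: identical columns divided by the number of columns in which neither sequence has a gap. Other denominators (alignment length, shorter sequence length) are also in use and give different values. The function name `percent_identity` is hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped, which is an
    assumption of this sketch rather than a definition taken from the source.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


before = percent_identity("ATTTCT", "ATCTCA")  # 4 of 6 positions match
after = percent_identity("ATCTCA", "ATCTCA")   # identical after codon changes
print(round(before, 1), round(after, 1))
```

Comparing the value before and after localized codon optimization gives the increase in homology referred to in the text.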
- sequences of the polynucleotides can be aligned using any suitable method known in the art, of which there are many. Such methods typically maximize matches (e.g. identical nucleotides at positions) and include those performed manually and those performed using computer software. Numerous alignment programs are available and known to those of skill in the art (e.g. BLASTN, BLASTP, ClustalW, ClustalW2, EMBOSS, LALIGN, Kalign, etc.). Following alignment, codons in each of the polynucleotides are identified, as are their corresponding codons in the other polynucleotides, i.e. the codons to which they align.
- each of the codons in a set of aligned codons encodes an amino acid at corresponding positions of each of the encoded polypeptides.
- a set of aligned codons may consist of the first codon in each of the polynucleotides, and each of these codons encodes the first amino acid in the encoded polypeptides.
- codon positions can be denoted in numerical order from 5' to 3' along the polynucleotide.
- a codon position in a polynucleotide will be the same as the amino acid position in the encoded polypeptide of the amino acid encoded by the codon at the codon position.
- a codon at codon position 1 of a polynucleotide will encode the amino acid at amino acid position 1 of the polypeptide encoded by the polynucleotide.
- a codon at codon position 150 of a polynucleotide will encode the amino acid at amino acid position 150 of the polypeptide encoded by the polynucleotide.
- sequences of the polynucleotides can be directly aligned or may be aligned in an indirect manner by first translating the polynucleotides to produce amino acid sequences (i.e. the sequences of the polypeptides encoded by the polynucleotides). These amino acid sequences are then aligned before each sequence is converted back to its nucleotide sequence, thereby generating the alignment of the plurality of polynucleotide sequences. During this process, positional references are typically noted. In this way, each codon in each polynucleotide and its locus or codon position can be readily identified, more easily facilitating the identification of corresponding (or aligned) codons in each of the other polynucleotide sequences.
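The indirect alignment route described above (translate, align the amino acid sequences, then convert back to nucleotides using saved positional references) can be illustrated with the back-conversion step alone. The sketch assumes the gapped amino acid sequence has already been produced by some alignment program. The function name `thread_codons` and the `---` gap token are hypothetical choices.

```python
def thread_codons(aligned_protein, nucleotide):
    """Map a gapped amino acid sequence back onto its coding sequence.

    Returns a list of aligned codons, with '---' wherever the protein
    alignment has a gap. The running index `pos` plays the role of the
    saved positional reference.
    """
    if len(nucleotide) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [nucleotide[i:i + 3] for i in range(0, len(nucleotide), 3)]
    residues = len(aligned_protein.replace("-", ""))
    if residues != len(codons):
        raise ValueError("protein and coding sequence lengths disagree")
    out, pos = [], 0
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")
        else:
            out.append(codons[pos])
            pos += 1
    return out


# 'M-S' is a protein alignment row with one gap; the codons are threaded around it.
print(thread_codons("M-S", "ATGTCT"))
```

Applying this to every row of the amino acid alignment yields columns of aligned codons, so corresponding codons in each polynucleotide share the same list index.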
- each codon in the set encodes the same amino acid, but wherein the set contains two or more synonymous codons, i.e. the codons in the set do not all have the same sequence but all encode the same amino acid.
- each codon in the set may encode a serine but using the sequences TCT, TCT, TCC and TCA respectively.
- Those synonymous codons in the set of aligned codons that are not the most frequently used are then modified so that they are the same as the most frequently used codon, i.e. their sequence is modified so that it is the same as the sequence of the most frequently used codon. This results in all of the codons in the set (or at the codon position) being the same following modification, i.e. each codon in the set of aligned codons in the resulting modified polynucleotides is the same as the most frequently occurring codon prior to modification (for schematic representation, see Figure 2).
- the frequency of occurrence for TCT is 50%
- the frequency of occurrence for TCC is 25%
- the frequency of occurrence for TCA is 25%.
- TCT being the most frequently-occurring codon
- the TCC and TCA codons are modified so that they are also TCT, resulting in each of the polynucleotides containing a TCT at that codon position following modification.
- two synonymous codons may each have a frequency of occurrence of 40% and one synonymous codon may have a frequency of occurrence of 20%.
- one of the equally most prevalent synonymous codons is selected as being the "most frequently-occurring codon", and the other codons are therefore modified to be the same (i.e. have the same sequence) as this selected most frequently-occurring codon.
- the codon selected as the most frequently-occurring codon may be selected on any basis. In some instances this may be arbitrary while in other instances selection may be on the basis of rational design or other deliberate reasoning.
- the codon selected as the most frequently-occurring codon may be the one that is present in a preferred or reference polynucleotide in the plurality, such as a polynucleotide that has known and desirable properties.
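The frequency calculation and tie-break for a single set of aligned codons can be sketched as below. It is an illustration only. The tie-break shown (prefer the codon carried by a designated reference polynucleotide, otherwise the first tied codon encountered) is just one of the selection bases the text permits. The names `codon_frequencies`, `most_frequent_codon` and `preferred_index` are hypothetical.

```python
from collections import Counter


def codon_frequencies(aligned_codons):
    """Frequency of occurrence of each codon in one set of aligned codons."""
    counts = Counter(aligned_codons)
    return {codon: n / len(aligned_codons) for codon, n in counts.items()}


def most_frequent_codon(aligned_codons, preferred_index=0):
    """Pick the most frequently-occurring codon in the set.

    On a tie, the codon of the polynucleotide at `preferred_index` is used
    if it is among the tied codons; otherwise the first tied codon wins.
    This tie-break is an assumption of the sketch.
    """
    counts = Counter(aligned_codons)
    top = max(counts.values())
    tied = [codon for codon in counts if counts[codon] == top]
    preferred = aligned_codons[preferred_index]
    return preferred if preferred in tied else tied[0]


serine_set = ["TCT", "TCT", "TCC", "TCA"]
print(codon_frequencies(serine_set))    # TCT 0.5, TCC 0.25, TCA 0.25
print(most_frequent_codon(serine_set))  # TCT

# 40% / 40% / 20% tie: the codon of the second sequence is preferred here.
tied_set = ["TCC", "TCA", "TCC", "TCA", "TCT"]
print(most_frequent_codon(tied_set, preferred_index=1))  # TCA
```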
- the process of identifying a set of aligned codons in which each codon encodes the same amino acid, wherein the set of aligned codons comprises two or more synonymous codons, and subsequently modifying those codons that are not the most frequently-occurring codon, is typically repeated across the length of the alignment so that each of the polynucleotides is subjected to the localized codon-optimization across their full length.
- one or more regions of each of the polynucleotides may be left unaltered even if they contain positions at which the same amino acid is encoded by each of the codons at the position but where the codons include two or more synonymous codons.
- the modification of the sequence of the polynucleotide(s) to produce modified polynucleotides can be performed using any method known in the art, including recombinant and synthetic methods, performed (either in part or in whole) in silico and/or in vitro.
- the modification of the sequence is performed in silico, followed by de novo synthesis of a modified polynucleotide having the modified sequence (e.g. by gene synthesis methods such as those involving the chemical synthesis of overlapping oligonucleotides followed by gene assembly).
- improved function e.g. improved binding, specificity, activity etc.
- the polynucleotides in the plurality that are aligned and subjected to localized codon optimization may be derived from any source or organism, including an animal, plant, fungus, bacteria or virus.
- the polynucleotides are mammalian polynucleotides, including, without limitation, those derived from human, mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret, and/or hamster.
- the polynucleotides are plant, fungal, bacterial or viral polynucleotides.
- the methods of the present disclosure may be applied to polynucleotides encoding any type of polypeptide having any activity or function, where it is desirable to alter that function or activity.
- Non-limiting examples include enzymes, antibodies or antigen-binding fragments thereof, and bacterial or viral surface proteins.
- the methods of the disclosure are particularly suited to the diversification of viral capsid genes and proteins, including AAV capsid genes and proteins.
- the AAV capsid gene encodes three capsid proteins: VP1, VP2 and VP3.
- the three capsid proteins typically assemble in a ratio of 1:1:10 to form the AAV capsid, although AAV capsids containing only VP3, or VP1 and VP3, or VP2 and VP3, have been produced.
- the AAV capsid gene also encodes the assembly activating protein (AAP) from an alternative open reading frame.
- AAP promotes capsid assembly, acting to target the capsid proteins to the nucleolus and promote capsid formation.
- Many AAV capsid genes are known and described in the art and more are being identified on a regular basis.
- the polynucleotides in the plurality that are subjected to localized codon optimization comprise AAV capsid genes and thus encode AAV capsid proteins.
- Localized codon optimization when applied to a single polynucleotide following alignment with a reference polynucleotide as described above, can facilitate improved expression of the gene in the polynucleotide when it is introduced into a host cell.
- the modified target polynucleotides produced using the methods described herein can exhibit increased expression when introduced into a host cell compared to the target polynucleotide prior to modification.
- methods for expressing a polynucleotide in a host cell comprise modifying a target polynucleotide using localized codon optimization as described above and herein to produce a modified target polynucleotide, and then introducing the modified target polynucleotide into a host cell to facilitate expression of the modified target polynucleotide from the host cell.
- the host cell is of the same species as that from which the reference polynucleotide is derived, and may optionally also be from the same tissue type.
- the host cell may be an animal (including mammalian), fungal, plant or bacterial cell.
- the host cell is a mammalian cell, such as, for example, a human, mouse, rat, rabbit, horse, pig, cow, dog, cat, ferret, or hamster cell.
- the host cell may be of any appropriate tissue type, including epithelial, connective, muscle, and nervous tissue.
- Non-limiting examples of specific tissue from which the host cell can be derived include tissue from the heart, lung, liver, spleen, kidney, pancreas, skin, brain, spinal cord, intestine and eye.
- Expression of the modified target polynucleotide may be increased by at least or about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 100%, 150%, 200%, 250%, 300%, 400%, 500% or more compared to that of the unmodified target polynucleotide.
- the modified target polynucleotide is introduced into a suitable vector and operably linked to a promoter to drive expression.
- the vectors can be episomal vectors (i.e., that do not integrate into the genome of a host cell), or can be vectors that integrate into the host cell genome.
- Exemplary vectors include, but are not limited to, plasmids, cosmids, and viral vectors, such as AAV, lentiviral, retroviral, adenoviral, herpes viral, and hepatitis viral vectors.
- the vectors are plasmids.
- the vectors are AAV vectors.
- the choice and design of an appropriate vector and expression system, including appropriate host cell, is well within the ability and discretion of one of ordinary skill in the art.
- the methods of the present disclosure include methods for producing a chimeric gene by first selecting a plurality of polynucleotides, wherein each of the polynucleotides in the plurality comprises a related gene encoding a related polypeptide, modifying the sequence of one or more of the polynucleotides in the plurality using localized codon optimization as described above, and then subjecting the modified plurality of polynucleotides to a homologous recombination procedure, to thereby produce a chimeric gene.
- a library of chimeric genes is produced.
- DNA family shuffling involves in vitro recombination of related genes. Briefly, these methods typically involve enzymatically digesting the polynucleotides containing the related genes, such as with DNase I, to produce fragments; and reassembling the fragments into chimeric genes, which produces a library of chimeric genes. Reassembly of the gene fragments can be performed by PCR. Because of the related nature of the different genes, the gene fragments have overlapping regions of homology that allow the fragments to self prime in the absence of additional primer in the PCR. Thus, non-primer driven PCR can be used to assemble the fragments into chimeric genes that contain regions from multiple parental genes.
- primer-driven PCR is then also used to further amplify the chimeric genes.
- DNA family shuffling has been used to produce many chimeric genes, including chimeric capsid genes (see e.g. Grimm et al. (2008) J. Virol. 82:5887-5911, Koerber et al. (2008) Mol Ther. 16:1703-1709, Li et al. (2008) Mol Ther. 16:1252-1260, and US Patent Nos. 7588772 and 9169299).
- StEP is a process first described by Zhao et al. (Nat Biotechnol. (1998) 16(3):258-261) to produce thermostabilized subtilisin E variants. It is a PCR-based technique that involves first priming two or more related polynucleotides (parental template sequences) followed by repeated cycles of denaturation and abbreviated annealing/extension (Zhao (2004) Methods in Enzymology 388:42-49). In each cycle the growing fragments anneal to different templates based on sequence complementarity and extend further. This is repeated until full-length polynucleotides form. Due to template switching, most of the resulting full-length polynucleotides contain sequences from different parental templates.
- RACHITT is an alternative homologous recombination technique first described by Coco et al. (Nat Biotech (2001) 19:354-359) that does not involve PCR.
- the technique involves aligning fragments from the top strands of multiple parental genes onto the bottom strand of a uracil containing template.
- the 5' and 3' overhang flaps are cleaved and gaps are filled by the exonuclease and endonuclease activities of DNA polymerases.
- the uracil-containing template is then removed from the heteroduplex by treatment with a uracil DNA glycosylase, followed by further amplification using PCR (see Coco W.M. (2003) RACHITT. In: Arnold F.H., Georgiou G. (eds) Directed Evolution Library Creation. Methods in Molecular Biology, vol 231. Humana Press).
- the chimeric gene(s) can then be inserted into a vector(s). This can result in the generation of a vector library.
- the vectors may be, for example, plasmids that facilitate subsequent cloning, amplification, replication and/or expression.
- the vectors are viral vectors, such as AAV vectors.
- the polynucleotides in the plurality comprise AAV capsid genes and the methods described herein are employed in the context of capsid diversification. In such embodiments, a library of chimeric capsid genes is produced. These can be introduced into vectors to produce a vector library.
- the vectors may be, for example, basic plasmids that facilitate subsequent cloning, amplification, replication and/or expression.
- the vectors are AAV vectors (or rAAV) comprising chimeric capsid proteins, which are encoded by the chimeric capsid gene encapsidated therein.
- AAV libraries (i.e. viral libraries, not vector libraries) are produced.
- AAV vector libraries (replication-deficient) and AAV libraries (replication-competent) are well known in the art.
- For AAV libraries, chimeric capsid genes in a capsid gene library are typically cloned into a shuttle plasmid based on wild-type AAV to produce a construct or plasmid library. Plasmids and methods for producing these construct libraries are well known to those skilled in the art.
- the construct library is subsequently packaged to produce an AAV library.
- the replication competent AAV libraries contain AAV with all the same elements as wild-type virus, i.e.
- these libraries are viral libraries and not vector libraries.
- the AAV in the library can then be titrated and used in in vitro or in vivo models to select for chimeric capsids with desirable properties, which can then be vectorized.
- Methods for vectorizing a chimeric capsid protein are also well known in the art and typically involve introducing into a packaging cell line the chimeric capsid gene flanked by AAV ITRs, a rep gene, and helper functions for generating a productive AAV infection, and recovering a recombinant AAV from the supernatant of the packaging cell line.
- Various types of cells can be used as the packaging cell line.
- packaging cell lines that can be used include, but are not limited to, HEK 293 cells, HeLa cells, and Vero cells, for example as disclosed in US20110201088.
- the helper functions may be provided by one or more helper plasmids or helper viruses comprising adenoviral helper genes.
- Non-limiting examples of the adenoviral helper genes include E1A, E1B, E2A, E4 and VA, which can provide helper functions to AAV packaging.
- Helper viruses of AAV are known in the art and include, for example, viruses from the family Adenoviridae and the family Herpesviridae.
- Examples of helper viruses of AAV include, but are not limited to, SAdV-13 helper virus and SAdV-13-like helper virus described in US20110201088, and helper vectors pHELP (Applied Viromics).
- any helper virus or helper plasmid of AAV that can provide adequate helper function to AAV can be used herein.
- rAAV virions are produced using a cell line that stably expresses some of the necessary components for AAV virion production.
- a plasmid (or multiple plasmids) comprising the nucleic acid encoding a capsid polypeptide of the present invention and a rep gene, and a selectable marker, such as a neomycin resistance gene, can be integrated into the genome of a cell (the packaging cells).
- the packaging cell line can then be transfected with an AAV vector and a helper plasmid or transfected with an AAV vector and co-infected with a helper virus (e.g., adenovirus providing the helper functions).
- the cells are selectable and are suitable for large-scale production of the recombinant AAV.
- adenovirus or baculovirus rather than plasmids can be used to introduce the nucleic acid encoding the capsid polypeptide, and optionally the rep gene, into packaging cells.
- the AAV vector is also stably integrated into the DNA of producer cells, and the helper functions can be provided by a wild-type adenovirus to produce the recombinant AAV.
- any method suitable for purifying AAV can be used in the embodiments described herein to purify the recombinant AAV, and such methods are well known in the art.
- the recombinant AAV can be isolated and purified from packaging cells and/or the supernatant of the packaging cells.
- the AAV is purified by a separation method using a CsCl gradient.
- AAV is purified as described in US20020136710 using a solid support that includes a matrix to which an artificial receptor or receptor-like molecule that mediates AAV attachment is immobilized.
- LCO Localized Codon-Optimization
- the LCO algorithm was written in Java as a native Geneious® (version 9.1.5) (www.geneious.com/) (PMID: 22543367) plugin (available for free download online).
- the resulting enhanced homology sequences are exportable to a FASTA format.
- the algorithm performs a specific multiple sequence alignment (MSA) on the target sequences using ClustalW2.
- MSA multiple sequence alignment
- the algorithm uses translation MSA where nucleotide sequences are translated to amino acid sequences while saving positional references.
- the resulting amino acid sequences are then aligned and re-converted back to nucleotide sequences by using the saved positional references.
- the algorithm identifies individual codons, translates them and identifies positions with 100% amino acid conservation. For all those positions, the algorithm creates a local codon-usage table and uses the most common codon in all the variants in the alignment. When two codons are used with the same frequency (50:50) to encode the same amino acid, the position is assigned the codon of the first capsid used in the alignment, in this case AAV1. In regions with indels, the algorithm ignores the sequences that have a gap and performs local codon optimization for all other sequences following the same method as described above.
- the algorithm allows selection of input variant(s) that will not be included in the calculation of the most common codon, but will undergo codon optimization. This feature is very important when including unverified or incomplete parental variant sequences. Furthermore, this feature allows users to perform codon optimization of novel parental sequences at a later time point, without affecting all the previously optimized variants.
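The behaviour described for the LCO algorithm (conserved positions only, most common codon per position, ties resolved in favour of the first sequence in the alignment, gapped sequences ignored, and optional variants that are optimized without contributing to the codon count) can be sketched as follows. This is not the Geneious plugin code, which was written in Java. The function name `lco`, the dict-of-codon-lists input format and the `excluded` parameter are hypothetical, and the handling of ties beyond the 50:50 case is an assumption.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
GAP = "---"


def lco(alignment, excluded=()):
    """Localized codon optimization over a codon alignment.

    alignment: dict mapping name -> list of aligned codons ('---' for gaps),
               in alignment order (the first entry wins ties).
    excluded:  names that are optimized but do not vote on the common codon.
    """
    names = list(alignment)
    result = {name: list(alignment[name]) for name in names}
    for col in range(len(alignment[names[0]])):
        # Sequences with a gap at this position are ignored.
        present = [n for n in names if alignment[n][col] != GAP]
        amino_acids = {CODON_TABLE[alignment[n][col]] for n in present}
        voters = [alignment[n][col] for n in present if n not in excluded]
        # Only positions with 100% amino acid conservation are optimized.
        if len(amino_acids) != 1 or not voters:
            continue
        counts = Counter(voters)
        top = max(counts.values())
        # First voting sequence (alignment order) carrying a most common codon.
        chosen = next(c for c in voters if counts[c] == top)
        for n in present:
            result[n][col] = chosen
    return result


example = {
    "AAV1": ["ATG", "TCA", "GAT", "CCT"],
    "AAV2": ["ATG", "TCT", "GAA", "CCC"],
    "new":  ["ATG", "TCC", "GAT", "---"],
}
for name, codons in lco(example, excluded={"new"}).items():
    print(name, " ".join(codons))
```

In this toy input, the serine and proline positions are tied between AAV1 and AAV2, so the AAV1 codon is used. The third position encodes different amino acids and is left unchanged. The `new` sequence adopts the chosen codons without affecting the counts, which mirrors the use case of adding a parental sequence later without changing previously optimized variants.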
- SOD Sequence Origin Depiction
- Xover tool http://qpmf.rx.umaryland.edu/xover.html
- the SOD allows the user to zoom in on the output sequence to perform detailed analysis at the nucleotide level, as well as customize the color scheme of the output.
- the tool also displays the crossover number, number of point mutations, Levenshtein distances to all parental sequences, effective mutation, as well as mean, minimum and maximum size of contributing fragments.
- the SOD graph can be exported in a number of file formats.
- SOD output is a graph that depicts levels corresponding to parental sequences augmented with bars representing proportional likelihood of a nucleotide coming from a given sequence and a polygonal line depicting the most likely provenience for specific sequences in the resulting sequence.
- Clustal2 is used to align parental sequences and sequence being analyzed, each represented as a horizontal line, with a bar at each position where the nucleotide on the corresponding donor sequence(s) matches residue(s) in the analyzed sequence.
- the height of the bar is proportional to the percentage of likelihood the given residue contributes to the novel sequence.
- the contribution line is calculated as the longest sequence of identity in a 5' to 3' direction.
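One way to read the contribution-line rule above is as a greedy walk: starting at the 5' end, assign the stretch to whichever parental sequence gives the longest uninterrupted run of identity, then continue from where that run ends. The sketch below implements that reading only. It is an assumption about the rule, not the SOD plugin's code. The names `contribution_line` and `count_crossovers` are hypothetical, and all sequences are assumed to be pre-aligned to the same length.

```python
def contribution_line(query, parents):
    """Greedy 5'-to-3' assignment of query stretches to parental sequences.

    query:   aligned sequence being analyzed
    parents: dict name -> aligned parental sequence of the same length
    Returns a list of (parent_name, start, end) segments with end exclusive.
    parent_name is None for positions that match no parent.
    Ties go to the parent listed first.
    """
    segments, pos = [], 0
    while pos < len(query):
        best_name, best_len = None, 0
        for name, seq in parents.items():
            run = 0
            while pos + run < len(query) and seq[pos + run] == query[pos + run]:
                run += 1
            if run > best_len:
                best_name, best_len = name, run
        step = max(best_len, 1)  # advance by one if nothing matches
        segments.append((best_name, pos, pos + step))
        pos += step
    return segments


def count_crossovers(segments):
    """Number of switches between different parents along the line."""
    donors = [name for name, _, _ in segments if name is not None]
    return sum(a != b for a, b in zip(donors, donors[1:]))


segs = contribution_line("AAAACCCC", {"P1": "AAAAGGGG", "P2": "TTTTCCCC"})
print(segs, count_crossovers(segs))
```

Fragment sizes (mean, minimum, maximum) follow directly from the `end - start` values of the returned segments.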
- the implementation was done in Java as a Geneious Plugin. This offers the opportunity to choose alternate MSAs or perform manual alignment adjustments.
- AAV libraries were generated as described previously (Lisowski et al. (2014) Nature 506, 382-386) with minor modifications.
- AAV capsid genes were synthesized de novo (Genewiz Inc) and cloned into the same system (pRV1-12).
- Capsid genes (Serotypes 1-12, AAV-EVE and mAAV for Lib_1-12 and Lib_1-12EM and Serotypes 2, 5 and 6 for Libraries 2/5/6) were excised using SwaI and NsiI (NEB), mixed at 1:1 molar ratios and digested with 1:10 prediluted DNaseI (NEB Cat#M030S) for 2 to 5 min.
- the pool of fragments was separated on 1% (w/v) agarose gel and fragments ranging from 200bp to 1000bp (for AAVLib 1-12+EM, AAVLib 1-12, AAVLib lco1-12+EM, AAVLib lco1-12) and from 200bp to 500bp (for AAVLib 256 and AAVLib lco256) were recovered with Zymoclean Gel DNA Recovery Kit (Zymogen Cat#D4001T).
- pRV based libraries were then digested overnight with SwaI and NsiI and 1.4 mg of insert was ligated at 16°C with T4 ligase (NEB Cat#M0202) for 16 hours into 1 µg of a replication competent AAV2 based plasmid platform (p-Replication-Competent, p-RC) containing ITR-2 and rep2, and unique SwaI and NsiI flanking a 1kb randomized stuffer (ITR2-rep2-(SwaI)-stuffer-(NsiI)-ITR2). Ligation reaction was concentrated by ethanol precipitation, electroporated into SS320 electro-competent cells and grown as described before. Total pRC library plasmids were purified with an EndoFree Maxiprep Kit (QIAGEN Cat#12362).
- rAAR1-12 plasmids (a generous gift from Prof Hiroyuki Nakai, OHSU, Portland, OR, USA) were co-transfected to provide AAP in trans.
- Cells were harvested 72 hours post transfection and centrifuged for 20 min at 3500g. Media was either discarded or used for qPCR titration following DNaseI treatment to remove free plasmids.
- Cell pellet was resuspended in 1 mL of Benzonase Buffer (50 mM Tris, pH 8.5 with 2 mM MgCl2) and underwent three freeze-thaw cycles.
- Genomic and free plasmid DNA was removed by incubating with Benzonase (EMD Chemicals Inc, Merck KGaA, Cat#1.101695.0002) 200U/mL final at 37°C for one hour. Cell debris was removed by centrifugation for 30 min at 3500g. Supernatant was further cleared by adding 1M CaCl2 to final concentration of 25mM and incubation on ice for 1 hr, followed by centrifugation at 3500g for 30 min at 4°C. Supernatant was transferred into a sterile cryotube and stored at -80°C.
Production of Replication Competent AAV libraries.
- Recombinant AAV capsid libraries were transfected following the same protocol as described above, with the exception that only 2 plasmids were used: pAd5 helper plasmid and the pRC Libraries containing ITR-rep2-Cap library-ITR at 1:1 molar ratio.
- Cell lysates were purified using iodixanol-based density gradient purification as previously described (PMID:26222983). 100K Amicon Ultra-4 Centrifuge Filter (EMD Millipore, Cat#UFC810024) were used to perform buffer exchange (PBS, 50 mM NaCl, 0.001% Pluronic F68 (v/v) (LifeTech, Cat#24040-032)) and concentration step.
- the variability of the library is estimated to be at maximum 2.72x10 variants for AAVLib lco256; 2.4x10^6 for AAVLib 1-12; 3.9x10^6 for AAVLib lco1-12; 9.8x10^6 for AAVLib 1-12+EM; and 6.9x10^6 for AAVLib lco1-12+EM.
- Vector preparations were titrated by qPCR as previously described (Wang et al (2014) Hum Gene Ther Methods 25, 261-268) using GFP-specific forward and reverse oligonucleotides for vectors encoding the LSP-GFP cassette, and rep2-specific forward and reverse oligonucleotides for replication competent library preparations. Serial dilutions of linearized plasmid were used as standard curve.
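Titration against a standard curve of serially diluted plasmid is conventionally done by fitting Ct against log10 copy number and interpolating unknown samples. The sketch below shows that general calculation only. It is not taken from the cited protocol, and the copy numbers and Ct values are invented for illustration. The function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of linearized plasmid (copies per reaction).
standards = [1e3, 1e4, 1e5, 1e6]
cts = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, cts)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope
print(round(slope, 2), round(intercept, 2), round(efficiency, 3))
print(copies_from_ct(25.0, slope, intercept))
```

The interpolated copies per reaction would then be scaled by the sample dilution and input volume to give a titer. Those factors depend on the specific protocol and are not modelled here.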
- Human hepatoma HuH7 cells were maintained as monolayer cultures in Dulbecco’s Modified Eagle’s Medium (Sigma, Cat#D579) supplemented with 10% (v/v) Fetal Bovine Serum (Sigma, Cat#F8192), 100 mg/mL penicillin and 100 mg/mL streptomycin.
- 1x10^5 cells were seeded per well in two 24-well tissue culture dishes 16 hours prior to infection with AAV library.
- Four 10-fold dilutions of the AAV library were added to the media in duplicate plates. 24 hrs after infection cells were washed with 1 x PBS and fresh media added.
- wild-type human Adenovirus 5 (hAd5) (ATCC Cat#VR-1516) was added at a multiplicity of infection (MOI) of 0.42 (based on 7 day TCID50) to one of the plates.
- MOI multiplicity of infection
- AAV capsids were recovered from the media by PCR using primers flanking the capsid region.
- PCR amplified cap genes were cloned by Gibson Assembly in-frame downstream of the rep2 gene in a recipient pHelper packaging plasmid opened by PCR amplification and DpnI treated. Twenty individual clones were sequenced to track progress of the selection process.
- Total protein (10 µg) was separated by polyacrylamide gel electrophoresis using 4-12% NuPAGE BisTris gels (Life Technologies, Carlsbad, CA, Cat# NP0322) followed by transfer to nitrocellulose membranes and blocking in PBS (-/-), 5% (weight/volume) milk, 0.1% (volume/volume) Tween-20.
- Detection of VP1+VP2+VP3 proteins was performed using rabbit polyclonal primary antibody (1:300, American Research Products, Waltham, MA, Cat# 03-61084) and secondary antibody (goat anti-rabbit IgG-HRP, Santa Cruz Biotechnology, Dallas, TX, sc-2004) while detection of AAV Rep proteins was performed using anti-AAV Replicase antibody (mouse monoclonal anti-Rep, 303.9, 1:100, American Research Products, Waltham, MA, Cat# 03-61069) and secondary antibody (goat anti-mouse IgG-HRP, Agilent Dako, Santa Clara, CA, Cat# P0447).
- Vinculin primary antibody: mouse monoclonal hVIN-1, SigmaAldrich, Saint Louis, MO, Cat# V9131
- matching secondary antibody goat anti-mouse IgG-HRP, Agilent Dako, Santa Clara, CA, Cat# P0447
- Signal detection was done using SuperSignal West Pico Chemiluminescent Substrate (ThermoFisher, Rockford, IL, Cat# 34080) with FujiFilm Luminescent Image Analyzer system (LAS-4000).
- the efficiency of parental cap gene shuffling during the generation of AAV shuffled capsid libraries relies on homology between the input sequences at the nucleotide level.
- the level of homology of the capsid genes between twelve natural AAV isolates used for library generation varied significantly, with AAV1 and AAV6 having the highest homology (97.1%) and AAV5 and AAV12 the lowest (55.1%). It was hypothesized that such wide-ranging homologies at the DNA level would directly influence the efficiency of shuffling of the individual input sequences and bias the final library, leading to overrepresentation of sequences contributed by closely related donor AAVs.
- the three hcoAAVs cap genes were used to generate pAAV packaging constructs containing the AAV2 rep gene.
- This Localized Codon-Optimization (LCO) strategy increases homology between parental sequences by performing localized optimization at each codon independently of the rest of the gene sequence. Specifically, by generating a codon usage frequency table for each of the amino acid positions and then applying this local codon usage table to optimize individual sequences, the LCO algorithm minimizes the arbitrary changes that conventional codon optimization approaches introduce to the sequence (see Materials and Methods section and Figure 3).
- the novel highly shuffled library was cloned into a replication-competent recipient construct containing ITRs and the Rep gene (ITR-Rep-Cap library-ITR).
- the library was efficiently packaged using a standard transfection protocol and yielded 2x10 vector particles per packaging cell (2x10 total particles per five 15 cm dishes).
- the HuH-R6Cl2 cap gene was recovered using PCR, cloned into a standard AAV-helper construct containing the rep2 gene and used to package the AAV-LSP-GFP construct.
- Packaging constructs expressing wild-type and lcoAAVs: pAAV2, pAAV5, pAAV6 and pAAV-lcoAAV2, pAAV-lcoAAV5, pAAV-lcoAAV6
- The AAV-HuH-R6ClO vector transduced HuH7 cells with high efficiency (Figure 7E), validating the shuffling of lcoAAVs as a new addition to the AAV engineering toolbox.
- lcoAAV1 through 12 were used to perform AAV capsid shuffling and library generation.
- mAAV Australian marsupials
- AAV-EVE1 ultra-ancient AAV-derived endogenous viral element found within the genome of multiple marsupial species
- the LCO algorithm's built-in option was utilized to perform localized codon-optimization of AAV-EVE1 and mAAV using sequence input from AAV1-12 only, without using AAV-EVE1 and mAAV as input.
- localized codon-optimization increased homology between individual variants by up to 11.5% when compared to wild-type sequences (Figure 9B), with the homology range between variants increasing from 55-75% to 75-85% (Figure 9C).
- SI shuffling index
- the likely explanation is that as a consequence of lower homology with AAV2 and AAV6-derived DNA fragments present during the PCR reassembly step, AAV5-derived fragments annealed to each other more efficiently than to DNA fragments from AAV2 and AAV6, leading to preferential reassembly of the AAV5 capsid gene or substantial proportions thereof.
- This effect can be reduced by introducing additional sequences into the reaction mixture, as in the case of AAVLib1-12 or AAVLib1-12+EM, generated from 12 and 14 parental donors, respectively, in which full-length AAV5 cap was not detected.
- This is attributable to the fact that in a library based on a larger number of parental donors, the chance of individual DNA fragments from the same parental donor interacting with one another is reduced, decreasing the probability of full-length capsid reassembly.
- homologies between individual fragments within the PCR mixture are more normally distributed in larger libraries, increasing the probability that less homologous fragments will encounter another fragment with homology high enough to allow annealing.
- the novel enhancer/promoter element recently identified in the 5' UTR of AAV2 (Logan et al. (2017) Nat Genet 49, 1267-1273) further supports the possibility that, in order to fully utilize the limited genome size, AAV could use all six DNA reading frames and thus contain additional coding or functional regions that remain to be identified. Should additional data become available in support of this hypothesis, the known level of genomic complexity of this "simple" virus would increase significantly and many of the commonly accepted assumptions related to AAV biology and vectorology would need to be re-evaluated.
- the new localized codon-optimization (LCO) algorithm presented here allows homology between individual input AAV variants to be increased (Figure 9B) while minimizing the risk of loss of vector function (Figures 6 and 8).
- the increased homology between individual input parental AAV cap genes led to more efficient shuffling as measured by decreased average size and increased number of individual fragments contributing to fully reassembled cap variants within shuffled AAVLib lco256 (Figures 7B and 7C).
- AAV capsid shuffling based on lcoAAVs can be used to enhance currently utilized AAV shuffling strategies, thereby providing a powerful addition to the AAV engineering toolbox. Furthermore, based on the functional data obtained with lcoAAV capsids, the localized codon-optimization algorithm could be applied more generically for DNA shuffling of other genes to increase the homology of input parental sequences with reduced likelihood of impairing function.
- this new technology can enhance other techniques based on homologous recombination, such as staggered extension process (StEP), random chimeragenesis on transient templates (RACHITT), and nucleotide exchange and excision technology (NExT), making localized codon-optimization a powerful new tool with potential applicability and significant impact on the broader field of bioengineering.
- StEP staggered extension process
- RACHITT random chimeragenesis on transient templates
- NExT nucleotide exchange and excision technology
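
The excerpts above describe LCO as building a codon usage frequency table for each amino acid position and then applying that local table to each input sequence. The following is a minimal Python sketch of that idea, not the patent's implementation. It assumes the inputs are uppercase, codon-aligned coding sequences of equal length with no gaps. The function names, the `reference` parameter (loosely modelled on the option of building the tables from a subset of inputs, as described for AAV-EVE1 and mAAV), and the alphabetical tie-break are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations, product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Split a coding sequence into consecutive 3-nucleotide codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(seq))


def local_codon_optimize(seqs, reference=None):
    """Recode each sequence with the locally most frequent synonymous codon.

    `reference` (hypothetical parameter) is the set of sequences used to
    build the per-position tables; it defaults to `seqs` itself.
    """
    ref = list(reference) if reference is not None else list(seqs)
    length = len(ref[0])
    if length % 3 or any(len(s) != length for s in ref + list(seqs)):
        raise ValueError("sequences must be codon-aligned and of equal length")

    # One usage table per codon position: {amino acid: Counter({codon: n})}.
    tables = [{} for _ in range(length // 3)]
    for s in ref:
        for pos, codon in enumerate(split_codons(s)):
            tables[pos].setdefault(CODON_TABLE[codon], Counter())[codon] += 1

    optimized = []
    for s in seqs:
        new_codons = []
        for pos, codon in enumerate(split_codons(s)):
            counts = tables[pos].get(CODON_TABLE[codon])
            if counts is None:
                # Amino acid not seen at this position in the reference set.
                new_codons.append(codon)
            else:
                # Most frequent codon; ties broken alphabetically (assumption).
                new_codons.append(min(counts, key=lambda c: (-counts[c], c)))
        optimized.append("".join(new_codons))
    return optimized


def mean_pairwise_identity(seqs):
    """Average fraction of identical nucleotides over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    return sum(
        sum(a == b for a, b in zip(x, y)) / len(x) for x, y in pairs
    ) / len(pairs)


# Toy input, not real AAV cap sequences.
toy = ["ATGGCTGCCGAT", "ATGGCAGCCGAC", "ATGGCTTCAGAT"]
recoded = local_codon_optimize(toy)
print(mean_pairwise_identity(toy), mean_pairwise_identity(recoded))
```

Because only synonymous codons at the same position are swapped, each recoded sequence translates to the same protein as its input, while positions where the inputs share an amino acid converge on a single codon. Real capsid genes differ in length, so an actual pipeline would need a codon-aware alignment step that this sketch leaves out.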
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Plant Pathology (AREA)
- Microbiology (AREA)
- General Chemical & Material Sciences (AREA)
- Medicinal Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Bioinformatics & Computational Biology (AREA)
- Library & Information Science (AREA)
- Analytical Chemistry (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Abstract
The present invention relates generally to methods for modifying the sequence of a polynucleotide, or the sequence of a plurality of polynucleotides, so as to produce modified polynucleotide(s) having altered codon usage. The invention also relates to methods of expressing the modified polynucleotides and to methods of producing chimeric genes from the modified polynucleotides.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2018900609 | 2018-02-26 | ||
| AU2018900609A AU2018900609A0 (en) | 2018-02-26 | | Methods for codon optimization |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019161459A1 (fr) | 2019-08-29 |
Family
ID=67686665
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/AU2019/050160 Ceased WO2019161459A1 (fr) | Methods for codon optimization | 2018-02-26 | 2019-02-26 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2019161459A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022198099A1 (fr) * | 2021-03-19 | 2022-09-22 | Recode Therapeutics, Inc. | Polynucleotide compositions, related formulations, and methods of use thereof |
| US11642421B2 (en) | 2016-05-27 | 2023-05-09 | Transcriptx, Inc. | Treatment of primary ciliary dyskinesia with synthetic messenger RNA |
2019
- 2019-02-26 WO PCT/AU2019/050160 (WO2019161459A1), not active, Ceased
Non-Patent Citations (6)
| Title |
|---|
| CABANES-CREUS, M. ET AL.: "Codon-Optimization of Wild-Type Adeno-Associated Virus Capsid Sequences Enhances DNA Family Shuffling while Conserving Functionality", MOLECULAR THERAPY: METHODS & CLINICAL DEVELOPMENT, vol. 12, March 2019 (2019-03-01), pages 71 - 84, XP055633629 * |
| GRIMM, D. ET AL.: "E Pluribus Unum: 50 Years of Research, Millions of Viruses, and One Goal-Tailored Acceleration of AAV Evolution", MOLECULAR THERAPY, vol. 23, 2015, pages 1819 - 1831, XP055372394 * |
| HE, L. ET AL.: "Algorithms for optimizing cross-overs in DNA shuffling", BMC BIOINFORMATICS, vol. 13, no. 3, 2012, pages 1 - 14, XP021117734 * |
| MILLIGAN, J.N. ET AL.: "Shuffle Optimizer: A Program to Optimize DNA Shuffling for Protein Engineering", METHODS IN MOLECULAR BIOLOGY, vol. 1472, 2017, pages 35 - 45 * |
| MOORE, G.L. ET AL.: "eCodonOpt: a systematic computational framework for optimizing codon usage in directed evolution experiments", NUCLEIC ACIDS RESEARCH, vol. 30, 2002, pages 2407 - 2416, XP002959332 * |
| VILLALOBOS, A. ET AL.: "Gene Designer: a synthetic biology tool for constructing artificial DNA segments", BMC BIOINFORMATICS, vol. 7, 2006, pages 1 - 8, XP002509762 * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11642421B2 (en) | 2016-05-27 | 2023-05-09 | Transcriptx, Inc. | Treatment of primary ciliary dyskinesia with synthetic messenger RNA |
| US11786610B2 (en) | 2016-05-27 | 2023-10-17 | Transcriptx, Inc. | Treatment of primary ciliary dyskinesia with synthetic messenger RNA |
| WO2022198099A1 (fr) * | 2021-03-19 | 2022-09-22 | Recode Therapeutics, Inc. | Polynucleotide compositions, related formulations, and methods of use thereof |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11667931B2 (en) | | AAV capsid production in insect cells |
| JP7438981B2 (ja) | | Liver-specific tropism of adeno-associated viruses |
| AU2018200657B2 (en) | | Methods of predicting ancestral virus sequences and uses thereof |
| US10882886B2 (en) | | Adeno-associated virus polynucleotides, polypeptides and virions |
| US7220577B2 (en) | | Modified AAV |
| Cabanes-Creus et al. | | Codon-optimization of wild-type adeno-associated virus capsid sequences enhances DNA family shuffling while conserving functionality |
| JP2022547197A (ja) | | Methods and compositions for modulating the interaction between adeno-associated virus (AAV) and the AAV receptor (AAVR) for altering AAV biodistribution |
| CN113518824A (zh) | | Novel AAV variants |
| WO2017192750A1 (fr) | | Recombinant adeno-associated viral vectors |
| JP2022530192A (ja) | | Plasmid system |
| WO2015196179A1 (fr) | | Methods of packaging multiple adeno-associated virus vectors |
| JP2021515572A (ja) | | Increased tissue-specific gene delivery by capsid modification |
| CN114981442A (zh) | | Novel compositions and methods for producing recombinant AAV |
| AU2003272672A1 (en) | | High titer recombinant aav production |
| EP4522729A1 (fr) | | Compositions and methods for recombinant parvovirus production |
| US20220177529A1 (en) | | Fusion protein for enhancing gene editing and use thereof |
| JP2022166181A (ja) | | Adeno-associated virus virions for gene transfer to human liver |
| KR20240025645A (ko) | | Adeno-associated virus packaging system |
| Estevez et al. | | Sequence analysis, viral rescue from infectious clones and generation of recombinant virions of the avian adeno-associated virus |
| WO2019161459A1 (fr) | | Methods for codon optimization |
| CN111718420A (zh) | | Fusion protein for gene therapy and use thereof |
| JP2024525121A (ja) | | System for high-level rAAV production |
| US20250320524A1 (en) | | Effectively packaging high-quality raav vectors by minicircle dual transfection |
| AU2019264991B2 (en) | | Altering tissue tropism of adeno-associated viruses |
| Kligman | | Establishing a stable cell-line for producing Adeno-Associated Virus using CRISPR-Cas9 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19756595; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 19756595; Country of ref document: EP; Kind code of ref document: A1 |