WO2025010306A2 - Vectors and methods for building polycistronic and/or multigenic transgenes - Google Patents

Vectors and methods for building polycistronic and/or multigenic transgenes

Info

Publication number
WO2025010306A2
Authority
WO
WIPO (PCT)
Prior art keywords
site
codon
restriction site
sequence
restriction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/036635
Other languages
English (en)
Other versions
WO2025010306A3 (fr
Inventor
Thomas Reed
Stephen SCHAUER
Austin PEPPEL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Biosolution Designs
Original Assignee
Biosolution Designs
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Biosolution Designs
Publication of WO2025010306A2
Publication of WO2025010306A3
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease

Definitions

  • Plasmids are naturally occurring small circular pieces of DNA that can be stably maintained in an organism such as a bacterium. Plasmids have been engineered as useful tools into which a foreign DNA fragment can be inserted for cloning purposes.
  • Commercial vectors contain features that allow for the convenient insertion of a DNA fragment into the vector or its removal from the vector, such as restriction enzyme recognition sites.
  • The vector and a fragment of exogenous DNA may be treated with a restriction enzyme that cuts the DNA into fragments with either blunt ends or overhangs known as sticky ends.
  • Vector DNA and DNA insert(s) with compatible ends can then be joined by molecular ligation.
  • a series of unique restriction sites are clustered within a multiple cloning site or polylinker. After a DNA fragment has been ligated into a cloning vector, it may be further subcloned into another vector designed for a more specific use.
  • Cloning vectors in molecular biology have key features necessary for their function, such as a selectable marker. Others may have additional features specific to their use. For reasons of ease and convenience, cloning is often performed using E. coli. Thus, the cloning vectors used often have elements necessary for their propagation and maintenance in E. coli, such as a functional origin of replication (ori). The ColE1 origin of replication is found in many plasmids. Some vectors also include elements that allow them to be maintained in another organism in addition to E. coli.
  • An aspect of the invention relates to a method for building polycistronic or multigenic transgenes, comprising providing a set of interrelated plasmid DNA vectors genetically engineered to have one or more unique acceptor sites and a plurality of compliant genetically engineered nucleotide fragments wherein each compliant genetically engineered fragment has 5’ and 3’ ends that are compatible with one of the one or more unique acceptor sites.
  • a first plasmid DNA vector is selected and linearized at a unique restriction site.
  • the first acceptor site comprises a first restriction site with a 5’ end and a 3’ end, wherein a first nucleotide fragment has a 5’ end and a 3’ end that are complementary to the 5’ end and the 3’ end of the first restriction site.
  • the first nucleotide fragment is inserted into the first acceptor site, and ligating the ends of the first nucleotide fragment to the acceptor site ends regenerates the 3’ end of the first restriction site and destroys the 5’ end of the first restriction site while generating a further 5’ end of the first restriction site.
  • the linearizing, inserting and ligating steps are repeated with one or more additional nucleotide fragments, wherein each of the one or more additional nucleotide fragments is different from the first nucleotide fragment.
  • This iterative or sequential process produces a multigenic and/or polycistronic transgene comprising the first insertion segment and the one or more additional DNA segments linked end to end in a linear chain.
  • the various plasmid DNA vectors are suitable for various types of applications, including transient transfection, in vitro transcription, gene delivery, and shuttling into a viral or non-viral delivery vector.
  • a method of transferring an effector element from at least one of the plasmid DNA cloning vectors to the in vitro transcription vector to express proteins or polypeptides of interest in eukaryotic cells is provided.
  • the multigenic and/or polycistronic transgene is constructed in a first plasmid DNA vector, excised and inserted into a second plasmid DNA vector.
  • plasmid DNA cloning vectors comprise one or more genetically engineered unique acceptor sites, wherein the vector is configured for iteratively or sequentially producing a multigenic and/or polycistronic construct.
  • the plasmid DNA cloning vector has a nucleotide sequence identity selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 and SEQ ID NO:7.
  • a combination of nucleotide molecules is provided for assembly of multigenic and/or polycistronic constructs, comprising a plasmid DNA cloning vector genetically engineered to have one or more unique acceptor sites and one or more compliant genetically engineered fragments which are insertable into at least one of the one or more unique acceptor sites, wherein each compliant genetically engineered fragment has 5’ and 3’ ends that are compatible with one of the one or more unique acceptor sites.
  • the plasmid DNA cloning vector and each of the one or more compliant genetically engineered fragments are designed such that ligation of one of the one or more genetically engineered fragments into the one of the one or more unique acceptor sites destroys a 5’ end of the acceptor site and regenerates a 3’ end of the acceptor site.
  • a method of using the integrated system of vectors relies on three types of acceptor vector/insert ligation events:
  • a single restriction enzyme site creates an “acceptor site” into which a 5’-SalI-Fragment-XhoI-3’ fragment can be inserted. This is a unique and purposeful single instance in the entire BOP structure. In this instance, one or the other of the insert-bounded overhangs is created or destroyed depending on the orientation of the insert.
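
The orientation dependence of the single-site acceptor can be illustrated with a small string model. This is a minimal sketch, not the patent's procedure: it assumes the standard cut positions of XhoI (C^TCGAG) and SalI (G^TCGAC), tracks only the top strand, and uses a toy vector and toy insert body that are purely hypothetical.

```python
# Hypothetical string model of the single-site XhoI acceptor; top strands only.
XHOI = "CTCGAG"  # XhoI cuts C^TCGAG
SALI = "GTCGAC"  # SalI cuts G^TCGAC and leaves the same 5'-TCGA overhang

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ligate_into_xhoi(vector: str, payload: str, flipped: bool = False) -> str:
    """Top strand after ligating a SalI-payload-XhoI fragment into the XhoI site."""
    if vector.count(XHOI) != 1:
        raise ValueError("acceptor vector must carry exactly one XhoI site")
    left, right = vector.split(XHOI)
    fragment = SALI + payload + XHOI
    if flipped:
        fragment = revcomp(fragment)  # the fragment enters in the other orientation
    # After digestion the top strand lacks its first base and its last five bases.
    digested = fragment[1:-5]
    return left + "C" + digested + "TCGAG" + right


vector = "AAAA" + XHOI + "TTTT"  # toy acceptor vector (assumption)
payload = "ATGGCCAAGTAA"         # toy insert body (assumption)

forward = ligate_into_xhoi(vector, payload)
reverse = ligate_into_xhoi(vector, payload, flipped=True)
for name, product in (("forward", forward), ("reverse", reverse)):
    print(name, product, "XhoI:", product.count(XHOI), "SalI:", product.count(SALI))
```

In this model each product carries exactly one XhoI site and no SalI site. The XhoI site sits 3’ of the insert in the forward orientation and 5’ of it in the reverse orientation, which is why a single-site acceptor normally requires an orientation check after ligation.
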
  • the one or more unique acceptor sites comprises a plurality of acceptor sites
  • the plasmid DNA cloning vector further comprises a plurality of spacers with a spacer between each of the acceptor sites of the plurality of acceptor sites, with the caveat that each spacer is different from every other spacer in the plasmid DNA cloning vector.
  • the plurality of spacers is selected from the group consisting of CAGAGTCCC, GGGAGGTTT, ACCTCAAGG, GCAGAAGTC, AGCCAACCT, TGCCGAGTC, CCAGCCGCC, GAAGAGGT, CACTTCCTG, CTCTGAGCC, AGCCTCAGT and ATATCACGC.
  • each of the one or more compliant genetically engineered fragments to be inserted into a vector is genetically engineered so as not to encode SalI, PvuI, KasI, AgeI, XhoI, NotI, and AseI restriction enzyme recognition sites.
  • the fragments are further genetically engineered so as not to encode ScaI restriction enzyme recognition sites.
  • the fragments are further genetically engineered so as not to encode BsgI restriction enzyme recognition sites.
  • a method for removing and replacing at least one restriction site or splice element from a nucleotide sequence for insertion into a cloning vector, comprising the steps of identifying at least one restriction site or splice element to be removed; determining a desired replacement to be made by a user according to a plurality of rules used by an algorithm; generating an output DNA sequence; comparing an amino acid sequence translated from the output DNA sequence with an amino acid sequence translated from the original DNA sequence; verifying the output DNA sequence, wherein the output DNA sequence passes this verifying step if the two amino acid sequences are identical; and adding nucleotides to the 5’ and 3’ ends of the output DNA sequence to enable de novo synthesis, site-directed mutagenesis and/or vector construction.
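
The identify, replace, verify, and flank steps of this method can be sketched as follows. This is not the BOPizer algorithm itself, whose rules are not reproduced here. It is an illustrative greedy approach that assumes an in-frame ORF, swaps one synonymous codon at a time, and uses the standard recognition sequences of the listed enzymes. The function names, the toy ORF, and the 3’ flank are assumptions; the 5’ flank (GAGAGAGA followed by the PvuI site CGATCG) follows the text.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Recognition sequences; all are palindromic, so scanning one strand suffices.
SITES = {
    "SalI": "GTCGAC", "PvuI": "CGATCG", "KasI": "GGCGCC", "AgeI": "ACCGGT",
    "XhoI": "CTCGAG", "NotI": "GCGGCCGC", "AseI": "ATTAAT",
}


def translate(orf: str) -> str:
    """Translate an in-frame ORF whose length is a multiple of three."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def find_sites(seq: str) -> list:
    """Return (enzyme, position) for every forbidden site in seq."""
    return [(name, i) for name, site in SITES.items()
            for i in range(len(seq)) if seq.startswith(site, i)]


def remove_sites(orf: str) -> str:
    """Greedily remove forbidden sites using synonymous codon swaps."""
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of three")
    seq = orf
    while True:
        hits = find_sites(seq)
        if not hits:
            return seq
        name, pos = hits[0]
        end = min(pos + len(SITES[name]), len(seq))
        for start in range(pos - pos % 3, end, 3):  # codons overlapping the site
            codon = seq[start:start + 3]
            swaps = [alt for alt, aa in CODON_TABLE.items()
                     if aa == CODON_TABLE[codon] and alt != codon]
            candidates = [seq[:start] + alt + seq[start + 3:] for alt in swaps]
            better = [c for c in candidates if len(find_sites(c)) < len(hits)]
            if better:
                seq = better[0]
                break
        else:
            raise ValueError(f"no synonymous fix found for {name} at {pos}")


def add_flanks(seq: str, five_prime: str, three_prime: str) -> str:
    """Add the synthesis/cloning flanks after verification."""
    return five_prime + seq + three_prime


original = "ATGGTCGACTAA"  # toy ORF (M V D stop) containing a SalI site
cleaned = remove_sites(original)
# Verify step: the protein must be unchanged.
assert translate(cleaned) == translate(original)
# 5' flank from the text; the 3' flank (a ScaI site) is an assumption.
flanked = add_flanks(cleaned, "GAGAGAGA" + "CGATCG", "AGTACT")
print(original, "->", cleaned, "->", flanked)
```

A greedy single-codon swap keeps the sketch short. It ignores codon usage bias and splice elements, both of which a production tool would need to weigh.
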
  • a spacer sequence of 8 nucleotides in length (GAGAGAGA) is added, followed by a recognition site for the PvuI restriction endonuclease (CGATCG).
  • CGATCG: a recognition site for the PvuI restriction endonuclease
  • AGTACT: a recognition site for the ScaI restriction endonuclease
  • a combination of nucleotide molecules and nucleotide molecule segments for assembly of multigenic and/or polycistronic constructs comprise a first nucleotide molecule genetically engineered to have one or more unique acceptor sites; and one or more compliant genetically engineered nucleotide molecule segments which are insertable into at least one of the one or more unique acceptor sites.
  • Each compliant genetically engineered nucleotide molecule segment has 5’ and 3’ ends which are compatible with a single one of the one or more unique acceptor sites.
  • the first nucleotide molecule and each of the one or more compliant genetically engineered nucleotide molecule segments are designed such that ligation of one of the one or more genetically engineered nucleotide molecule segments into the single one of the one or more unique acceptor sites destroys a 5’ end of the acceptor site and regenerates a 3’ end of the acceptor site.
  • a kit for building polycistronic or multigenic transgenes comprises a set of interrelated plasmid DNA vectors genetically engineered to have one or more unique acceptor sites, wherein each of the interrelated plasmid DNA vectors is engineered to permit insertion and ligation of a plurality of compliant genetically engineered nucleotide fragments wherein each compliant genetically engineered fragment has 5’ and 3’ ends that are compatible with one of the one or more unique acceptor sites of one of the interrelated plasmid vectors in the set, wherein said first plasmid DNA vector is one vector in the set, and wherein one or more of the interrelated plasmid DNA vectors is insertable into at least one other plasmid DNA vector of the set of interrelated plasmid DNA vectors.
  • Figure 1 shows the central dogma of molecular biology (DNA is transcribed into RNA and RNA is translated into protein) as it relates to the vectors of the invention.
  • the figure provides greater detail into the modular structure of eukaryotic genes at the level of a) double-stranded DNA; b) a primary RNA transcript showing the positioning of exons (rectangles) and introns (single lines between rectangles) and candidate splicing domains (dashed lines); c) a post-splicing messenger RNA (mRNA) with exons concatenated (linked rectangles), a 5’ untranslated region (5’UTR) defined by left-most rectangles with lighter shading, an open reading frame (ORF) defined by a darker shaded region, a 3’UTR defined by right-most rectangles with lighter shading, and a polyA tail; and d) a translated protein (rectangle).
  • the figure further illustrates how a defined region of DNA contains distinct modules that exert their biological function either through DNA, RNA, or protein-encoded units (DNA-encoded modules).
  • DNA-encoded modules can be arbitrarily abstracted into three distinct domains that can be encoded in a plasmid DNA-based vector (vector-defined genetic modules).
  • vector-defined genetic modules consist of a transcription regulatory (R) domain or element, a biological effector (E) domain or element, and a transcription processing (P) domain or element.
  • Figures 2A-B show an overview of the pBOP-30 vector.
  • Figure 2A shows pBOP-30 at its highest level, with the order and identity of the restriction enzyme recognition sites and spacers of the polyMOD framework (top line), and architecture of modules within the circular vector diagramed below the polyMOD framework.
  • the modular architecture of the pBOP-30 vector may be used for engineering multigenic and/or polycistronic constructs using a Sal/Xho into Xho or Sal/Not strategy.
  • Figure 2B shows the polyMOD framework introduced within the E element, which is an ORF, and is used to construct the substructure of the ORF.
  • Figure 3 shows an example of modular elements that have been inserted into the pBOP-30 vector to form a complete transcription unit (TU) within the vector.
  • Figure 4 shows a diagram of an exemplary method for using the pBOP-30 vector architecture to build a multigenic construct by subcloning transcription units (TUs) into the XhoI site.
  • the single XhoI restriction enzyme site creates an “acceptor site” into which a 5’-SalI-Fragment-XhoI-3’ fragment may be inserted. This is a unique and purposeful single instance in the entire BOP structure, wherein one or the other of the insert-bounded overhangs is created or destroyed depending on the orientation of the insert.
  • Figure 5 shows a diagram of another exemplary method for using the pBOP-30 vector architecture to build multigenes by directionally subcloning TUs into the transcription unit linker (TUL).
  • Figure 6 shows a method for building larger multimeric pDNA vectors in three steps through the maintenance of the TUL at the 3’-most TU.
  • Figure 7 shows the polyMOD framework, above, and an arrangement of modules within a multi-effector/polycistronic pBOP-30-IRES vector, below.
  • the pBOP-30-IRES vector is useful for building a 2-ORF polycistron in two steps, as well as larger polycistronic constructs.
  • Figures 8A-B show the multi-effector/polycistronic IRES shuttle system of pBOP-30-IRES, which may be used for building a 3-ORF polycistron in two steps, as well as larger polycistronic constructs.
  • Figure 8A shows the steps for building a bi-ORF construct.
  • Figure 8B shows the steps for building a tri-ORF construct.
  • Figures 9A-E show the architecture for vectors useful for incorporating bioactive RNA species (bioRNA) within the effector domain of a transcription unit and how to use these in conjunction with either the pBOP-30 or pBOP-30-IRES vectors to build multi-effector E domains with one or more bioRNAs or in conjunction with ORFs.
  • Figure 9A illustrates the modular structure of the BOP-compliant, intron-encoded bioactive RNA (bioRNA) elements.
  • the upper diagram illustrates how one or more bioRNAs can be positioned within a synthetic splicing unit.
  • the lower diagram illustrates how one or more bioRNAs can be positioned within a wild type splicing unit.
  • Figure 9B shows the modular structure of the pBOP-30-bioRNA-1 vector, which is designed either to position a single bioRNA within a mono-effector domain or to position the bioRNA upstream of a multi-effector containing one or more ORFs downstream of the bioRNA.
  • Figure 9C introduces the modular structure of the pBOP-30-bioRNA-2 vector which is designed to position a bioRNA downstream of one or more ORFs.
  • Figure 9D illustrates the steps whereby a bioRNA can be positioned within an EL domain.
  • Figure 9E illustrates how a bioRNA element can be positioned downstream of one ORF and upstream of a second ORF controlled by an IRES.
  • Figures 10A-C show various aspects of a pBOP-40 vector.
  • Figure 10A shows the polyMOD framework, above, and the modular architecture of the pBOP-40 vector, below.
  • Figure 10B shows an example of an assembled construct built within the pBOP-40 vector.
  • Figure 10C shows a three-step method for building a multigene construct in a pBOP-40 vector in a sequential process.
  • Figure 11A shows the modular architecture of a genomic integration negative selector backbone (GINSB).
  • Figure 11B shows the polyMOD framework and modular architecture of the pBOP-40-NS vector.
  • Figure 12 shows the polyMOD framework and modular architecture of a pBOP-30-IVT vector, comprising a reverse complement BsgI site to cut a poly A site within the poly A sequence.
  • Figure 13 shows a flow chart of the algorithm to make protein encoding effector sequences compatible with vectors of the BOP system split into three steps: Sanitize Sequence Input, BOPize Sequence, and Verify Sequence.
  • Figure 14 shows a flow chart of the step of Sanitize Sequence Input.
  • Figure 15 shows a flow chart of the step of BOPize Sequence.
  • Figure 16 shows a flow chart of the step of Verify Sequence.

DETAILED DESCRIPTION OF THE INVENTION

  • the invention creates an object-linking architecture designed to build multigenic therapies that can be delivered either as synthesized RNA or by employing viral or non-viral DNA delivery modalities to deliver to ex vivo cells or in vivo.
  • The significance of an object-linking system is that it can enable rapid swapping of one or more pre-validated, standardized genetic elements within or between existing DNA vectors. In the absence of a standardized object-linking platform, scientists are restricted either to cobbling together DNA pieces derived from multiple techniques or to depending upon contemporary de novo synthesis technologies to build an entire DNA molecule from scratch.
  • the invention creates an architecture amenable to the low-cost and rapid construction of complex, therapeutically relevant DNA molecules by enabling the linking of defined genetic elements that can be generated using diverse molecular biology techniques, but rapidly linked using well-established genetic engineering techniques.
  • the end-product and/or intermediate produced using the vectors and methods of the invention enables rapid and low-cost production of vector variant libraries amenable to rapid screening for desired therapeutic functionality without requiring the de novo synthesis of distinct species.
  • DNA vectors and methods for their use are provided to build DNA molecules with one or more transcription units.
  • the vectors leverage the ability of scientists to employ de novo synthesis (DNS).
  • the method of use guides generation of DNA sequences wherein certain restriction sites are removed and/or replaced by changing or deleting nucleotides to allow defined genetic functions to be bounded by specific restriction sites.
  • a two-tiered series of interlinked plasmids useful for the iterative or sequential assembly of a plurality of DNA segments is provided.
  • the basic vector structure contains two unique acceptor domains that independently enable the iterative or sequential addition of biological effectors or of entire transcription units. These two acceptor domains have a common molecular architecture wherein the 5’-most region is defined by complementary restriction site products that will be destroyed when ligated together and the 3’-most region is defined by a shared site that is regenerated when ligated. Insertion of a first DNA segment into an acceptor domain destroys the complementary restriction site at the 5’ end of the first DNA segment, while the 3’ restriction site is regenerated at the insertion point to form a circular vector with a first insert.
  • a subsequent cleavage at a 3’ end of the first insert creates an insertion point for a second DNA segment, and the iterative or sequential assembly of additional DNA segments proceeds with each subsequent cleavage at a 3’ end of a growing linear chain of a desired number of inserts.
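
The repeated cut-and-ligate cycle can be sketched as a loop over a string model. This is an illustration under stated assumptions: the acceptor is modeled as a single XhoI site, every insert is assumed to be a SalI-payload-XhoI fragment ligated in the forward orientation, and the starting vector and payloads are toy sequences chosen for the example.

```python
XHOI = "CTCGAG"  # acceptor site, regenerated 3' of every insert
SCAR = "CTCGAC"  # XhoI/SalI junction left 5' of every insert; cut by neither enzyme


def add_segment(construct: str, payload: str) -> str:
    """Open the single XhoI site and ligate one SalI-payload-XhoI segment into it."""
    if construct.count(XHOI) != 1:
        raise ValueError("expected exactly one XhoI acceptor site")
    left, right = construct.split(XHOI)
    return left + SCAR + payload + XHOI + right


construct = "GGGG" + XHOI + "CCCC"                  # toy starting vector (assumption)
segments = ["ATGAAATAA", "ATGCCCTAA", "ATGGGGTAA"]  # toy inserts (assumption)

for round_number, segment in enumerate(segments, start=1):
    construct = add_segment(construct, segment)
    # After every round there is still exactly one acceptor site to cut next.
    print(round_number, "XhoI sites:", construct.count(XHOI),
          "scars:", construct.count(SCAR))
```

Because each round leaves one acceptor site at the 3’ end of the growing chain, the same enzyme can be reused for every addition, and the inserts end up linked in the order in which they were added.
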
  • a method of using the vectors includes simple guidelines to ensure that de novo synthesis products, nucleotide fragments from genomic DNA, cDNA and/or clones from other plasmid vectors are compatible with the vectors of the invention, i.e., that the nucleotide sequences to be inserted into one of the vectors do not encode certain restriction enzyme sites.
  • a user may need to make one or more mutations to an insert sequence so that SalI, PvuI, KasI, AgeI, XhoI, NotI, and AseI sites do not occur in the insert sequence.
  • any ScaI sites must also be mutated so that components will be compatible with a separate polycistronic internal ribosome entry site (IRES) shuttle system or a bioactive RNA shuttle system.
  • any BsgI sites are removed from all sequences that will be contained within an insert into an in vitro transcription vector.
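
These tiered compatibility rules lend themselves to a simple pre-flight check on a candidate insert. The sketch below is one possible way to do it, not a tool described in the source. It assumes the standard recognition sequences of the named enzymes, and the function name, flag names, and toy insert are hypothetical. Both strands are scanned because BsgI (GTGCAG) is not palindromic.

```python
BASE_SITES = {
    "SalI": "GTCGAC", "PvuI": "CGATCG", "KasI": "GGCGCC", "AgeI": "ACCGGT",
    "XhoI": "CTCGAG", "NotI": "GCGGCCGC", "AseI": "ATTAAT",
}
SHUTTLE_SITES = {"ScaI": "AGTACT"}  # extra rule for the IRES / bioRNA shuttle systems
IVT_SITES = {"BsgI": "GTGCAG"}      # extra rule for in vitro transcription vectors

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def forbidden_hits(insert: str, shuttle: bool = False, ivt: bool = False) -> list:
    """List (enzyme, position) for every forbidden site on either strand."""
    sites = dict(BASE_SITES)
    if shuttle:
        sites.update(SHUTTLE_SITES)
    if ivt:
        sites.update(IVT_SITES)
    hits = []
    for name, site in sites.items():
        # A set collapses palindromic sites to a single motif.
        for motif in {site, revcomp(site)}:
            start = insert.find(motif)
            while start != -1:
                hits.append((name, start))
                start = insert.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


insert = "ATGAGTACTGGCCTGCACTAA"  # toy insert: ScaI site plus a reverse-strand BsgI site
print(forbidden_hits(insert))
print(forbidden_hits(insert, shuttle=True))
print(forbidden_hits(insert, shuttle=True, ivt=True))
```

The toy insert passes the base rule set but is flagged once the shuttle or in vitro transcription rules are switched on, which mirrors how the additional requirements apply only to those vector families.
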
  • the terms “Bird of Prey™”, “BOP”, and “BoP” are used interchangeably to refer to the vectors of the invention, which form an interconnected system of vectors for various uses in biological systems.
  • the pBOP-30 vector is foundational to the BOP system and was used to build all subsequent vectors disclosed herein.
  • The terms “BOPization” and “BOPizer” are used to refer to the process and/or algorithm used to prepare a nucleotide sequence for insertion into a BOP vector of the invention.
  • The term “mutate” refers to any process used to alter the nucleotide identity of a DNA sequence of interest.
  • Various techniques are known in the art and can be used to change or eliminate specific nucleotides.
  • Site-directed mutagenesis of DNA sequences is also known in the art as gene editing.
  • de novo synthesis may also be referred to as producing a mutated nucleotide sequence designed using an original sequence of interest as a template.
  • mutations or edits are made to DNA sequences of interest to eliminate or introduce one or more restriction enzyme recognition sites.
  • a mutation is understood to be conservative, i.e., the mutated nucleotide sequence is translated into a protein that does not differ from one translated from the unmutated sequence.
  • the desired mutations to be made are conservative substitutions.
  • TU transcription unit
  • R transcription regulator
  • E biological effector
  • P transcription processing
  • IRES internal ribosome entry site
  • ORF open reading frame
  • UTR untranslated region
  • EL effector linker
  • CCEs compatible cohesive ends
  • TUV transcription unit vector
  • TUL transcription unit linker
  • CIP calf intestinal phosphatase
  • CMV cytomegalovirus
  • GIC gene integration controller
  • CMD chromatin modification domain
  • PSE positive selector element
  • BE bioindicator element
  • TUAD transcription unit acceptor domain
  • GFP green fluorescent protein
  • RFP red fluorescent protein
  • BDNF brain-derived neurotrophic factor
  • EMCV encephalomyocarditis virus
  • IVT in vitro transcription
  • IVTP in vitro transcription promoter
  • GINSB genomic integration negative selector backbones
  • SD synthetic splice donor
  • SL5 portion synthetic intron
  • The term “transcription unit” is used to describe the genetic material required to control the start and stop of transcription, the intervening sequences associated with the regulation of post-transcriptional modifications and control of protein translation, and the protein products and/or bioactive RNA species.
  • a vector may comprise a single transcription unit consisting of defined “domains” for the positioning of specific biological “elements”. Nucleotide sequences appropriate for each domain are classified as “elements” and are inserted into an appropriate domain.
  • a single TU consists of three functional elements: a) transcription regulatory (R) element; b) a biological effector (E) element; and c) a transcription processing (P) element, each of which is inserted into an R domain, an E domain and a P domain (see Figure 2).
  • the invention may be used to build vectors with multiple effectors in the E Domain as well as multiple TUs within the context of the vector’s entirety.
  • the genetic sequence of each TU element (R, E, and P) must be sufficiently different from each other.
  • The term “multigenic” refers to a gene construct that encodes multiple distinct transcription units.
  • a multigenic construct may also be referred to as being bi-genic, tri-genic, quad-genic, etc.
  • The term “polycistronic” is used interchangeably with the term “multi-effector transcript” and refers to an RNA transcript that encodes multiple open reading frames (ORFs) and/or an ORF and one or more bioactive RNA (bioRNA) species.
  • a polycistronic construct may also be referred to as being bi-cistronic, tri-cistronic, quad-cistronic, etc.
  • Bioactive RNAs are non-coding RNAs that include but are not limited to microRNA (miRNA), short hairpin RNA (shRNA), miRNA antagomirs (anti-miRs), and long non-coding RNA (lncRNA).
  • the R element may consist of DNA, RNA or a combination of DNA and RNA sequences.
  • the DNA sequences can promote transcription
  • the RNA sequences can provide post-transcriptional and pre-translational control.
  • 5’ untranslated sequences (5’UTR) can affect RNA stability, subcellular localization, and/or translation efficiency.
  • R-D refers to a subdomain of the transcription regulatory domain mainly encoding DNA-based controllers.
  • R-R refers to a subdomain of the transcription regulatory domain mainly encoding RNA-based controllers.
  • the E element may encode bioactive RNA species (e.g., microRNA or miRNA) and/or polyaminoacids (peptides and/or proteins).
  • The term “biological effector” refers to a unit of DNA that can be transcribed to produce a bioactive RNA molecule, a long non-coding RNA (lncRNA) or an ORF that can be translated into a protein.
  • the effector domain may include a plurality of ORFs and/or bioactive RNA species.
  • E-N refers to a subdomain of ORF-encoded effectors that positions protein sub-elements at the amino (N)-terminus.
  • E-I refers to a subdomain of ORF-encoded effectors that positions protein sub-elements internally, being flanked by amino-terminal and carboxyl-terminal elements.
  • E-C refers to a subdomain of ORF-encoded effectors that positions protein sub-elements at the carboxy (C)-terminus.
  • the P element may encode RNA-based 3’UTR sequences and polyadenylation signals as well as hybrid RNA-/DNA-encoded sequences that control transcription termination as well as DNA-encoded elements that can either promote upstream transcription, regulate downstream transcription, prevent read-through of transcripts from downstream genes encoded on the complementary strand, or prevent gene silencing (e.g., enhancers, repressors, insulators, and/or chromatin remodeling sequences).
  • The term “polyMOD framework” refers to a nucleotide sequence encoding a series of restriction enzyme recognition sites interspersed with spacer sequences.
  • the polyMOD framework is analogous to a multiple cloning site found in cloning vectors; however, the selection and order of the restriction enzyme sites are carefully designed to allow assembly of inserts or elements (i.e., R element, E element, P element, etc.) into a multigenic or polycistronic construct in a controlled manner.
  • the polyMOD framework allows ordering of elements that recapitulates the ordering of genomic elements, such as a promoter, intron/exon coding sequences, polyA, and other regulatory sequences, even while allowing the various genetically engineered inserts to be selected from any organism or to be de novo synthesized with the desired nucleotide sequence.
  • the polyMOD framework also differs from traditional multiple cloning sites by interspersing spacer sequences between the restriction enzyme recognition sites to allow sufficient space for each enzyme to bind to its cognate site and cut the DNA sequence.
  • a traditional multiple cloning site does not require spacers since a user will typically select only one or two sites within the multiple cloning site to cut, while nearly every site, if not all sites, within the polyMOD framework will be utilized to assemble a functional transcription unit.
  • the term “rapid assembly” refers to the ability to iteratively or sequentially add fragments or inserts to a DNA construct directly into a growing linear chain of inserts without needing to use a different restriction enzyme to cleave the polyMOD framework or other regions of the cloning vector.
  • the assembly steps must be carried out sequentially, such as when individual modules or elements are joined together to form a transcription unit.
  • the assembly steps are performed iteratively, as when combining transcription units into a multigene or polycistron construct.
  • the vectors and methods of the invention enable the construction of an IRES library, wherein individual IRES modules may have preferential activity in defined cell types and/or cell-states. Additionally, because the function of an IRES can be influenced by the tertiary structure of upstream and downstream RNA, the IRES library enables users the ability to build and test which IRES modules work best with their defined ORFs in their defined cell types.
  • Ori is a term of art used to identify DNA sequences encoding an origin of replication in a plasmid vector.
  • An Ori in a vector of the invention may include but is not limited to any used in widely available plasmids such as pUC, pBR322, pET, pGEX, pColE1, R6K*, pACYC, pSC101, pBluescript, pGEM and/or PCDF; or an Ori identified as pMB1, ColE1, p15A, pAC101, F1, CloDF13, and/or CDF.
  • Bacterial replication systems can also include elements in COSMIDS, FOSMIDS, and bacterial artificial chromosomes (BACs).
  • The term “spacers” refers to nucleotide sequences that are interspersed between restriction enzyme sites encoded in the vectors. Any elements that are inserted into the vectors will replace the spacers. For example, a spacer between the PvuI and ScaI sites in the vector shown in Figure 2 will be replaced when an E element is successfully inserted into the vector. In other words, the E element spacer is a “placeholder” at the location where the spacer will be removed and an E element module will be inserted into the vector. Other spacers include but are not limited to positive selector element spacers, E element spacers, TUL element spacers, P element spacers, and 5’ chromatin modification domain spacers.
  • nucleotide sequences encoding polyMOD frameworks of the vectors may be represented as strings of bracketed restriction enzyme sites flanking the spacers, such as the following example: [(PstI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(E element spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(XhoI)+(TUL element spacer)+(NotI)+(spacer)+(XbaI)+(pBOP-30 vector backbone (VB))+(PstI)].
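
The bracketed notation maps naturally onto an ordered list, from which the pair of restriction sites bounding each element spacer can be read off. The sketch below is only an illustrative data structure; the variable and function names are hypothetical, and the token order is copied from the example string above.

```python
# Token order copied from the example polyMOD framework string.
POLYMOD = [
    "PstI", "spacer", "SalI", "R element spacer", "PvuI", "E element spacer",
    "ScaI", "EL element spacer", "KasI", "P element spacer", "AgeI", "XhoI",
    "TUL element spacer", "NotI", "spacer", "XbaI",
    "pBOP-30 vector backbone (VB)", "PstI",
]


def flanking_sites(framework: list) -> dict:
    """Map each element spacer to the (5' site, 3' site) pair that bounds it."""
    flanks = {}
    for index, token in enumerate(framework):
        if token.endswith("element spacer"):
            flanks[token] = (framework[index - 1], framework[index + 1])
    return flanks


flanks = flanking_sites(POLYMOD)
for spacer, (five_prime, three_prime) in flanks.items():
    print(f"{spacer}: replaced by an element cut with {five_prime} and {three_prime}")
```

Read this way, the E element spacer is bounded by PvuI and ScaI, which matches the placeholder described for Figure 2.
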
  • Spacers are preferably no less than six (6) and no more than fifteen (15) base pairs and occur as trimers so as to avoid translational frameshifts.
  • Ideal spacers will not, either within their sequence or when linked by upstream or downstream sequences, encode for a) restriction enzyme binding sites, b) transcription factor binding sites, c) Kozak sequence, d) RNA splice donor or splice acceptor sequences, e) a translational start (ATG) codon, or f) a polyadenylation signal (AATAAA).
  • each spacer may be used only one time per vector to avoid homologous recombination events.
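  • The spacer rules above lend themselves to a simple automated check. The following is a minimal Python sketch, not part of the disclosure: the function names, the example inputs, and the choice to test only length, reading frame, the ATG start codon, the AATAAA polyadenylation signal, and caller-supplied restriction sites are all assumptions made for illustration. Transcription factor binding sites, Kozak sequences, splice signals, and motifs created at the junction with neighboring sequence are not checked here.

```python
# Hypothetical helper: checks a candidate spacer against a subset of the
# spacer rules (length 6-15 bp, trimer length, no ATG, no AATAAA,
# no caller-supplied restriction sites).
FORBIDDEN_MOTIFS = {
    "translational start codon": "ATG",
    "polyadenylation signal": "AATAAA",
}


def spacer_problems(spacer, restriction_sites=()):
    """Return a list of rule violations for one spacer (empty if none)."""
    seq = spacer.upper()
    problems = []
    if not 6 <= len(seq) <= 15:
        problems.append("length outside 6-15 bp")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    for name, motif in FORBIDDEN_MOTIFS.items():
        if motif in seq:
            problems.append(f"contains {name} ({motif})")
    for site in restriction_sites:
        if site.upper() in seq:
            problems.append(f"contains restriction site {site}")
    return problems


def spacers_unique(spacers):
    """True if no spacer sequence is used more than once in a vector."""
    upper = [s.upper() for s in spacers]
    return len(upper) == len(set(upper))
```

  • A call such as spacer_problems("CCATGG") would flag the embedded ATG, while spacers_unique would reject a vector design that reuses the same spacer twice.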
  • Candidate spacer sequences include but are not limited to the exemplary nucleotide sequences shown in Table 1.
  • Table 2 provides two examples of how the spacer sequences listed in Table 1 may be used in a vector. Each spacer is used only once per vector but may be distributed in any order within a vector. The notation of N/A indicates an element that is not part of that vector. Table 2.
  • a positive and/or negative selector is used.
  • an antibiotic resistance gene is used for selection of cells or bacteria harboring successfully cloned inserts.
  • Antibiotics useful for this purpose include but are not limited to kanamycin, spectinomycin, streptomycin, ampicillin, carbenicillin, bleomycin, erythromycin, polymyxin B, tetracycline and chloramphenicol.
  • Figure 1 shows the modular architecture of eukaryotic genes and tracks the molecular components associated with pre-transcription (ex: chromatin modification domains, silencers, etc.), transcription (ex: promoters, enhancers, repressors, etc.), post-transcription (ex: RNA splicing, RNA stabilizing/de-stabilizing elements, RNA subcellular transport, etc.), pre-translation (ex: subcellular localization-specific translation initiation, cell-/state-specific translation initiation, etc.), translation (ex: translation stuttering/stalling for controlled protein folding), and post-translation (ex: phosphorylation, glycosylation, etc.).
  • Figure 1 provides a purposeful abstraction of genetic control elements for the purpose of developing a modular DNA vector manufacturing system that can be used to produce DNA molecules encoding one or more transcription units with each of these transcription units encoding one or more biological effectors.
  • the BOP transcription unit comprises four major elements (TR, ED, EL, and TP; also identified as R, E, EL, and P), each of which is purposefully designed to have a defined substructure that allows for the efficient module swapping of defined functional elements. These subdomains are flanked by one or more restriction enzyme sites unique to the defined element or module class.
  • Figure 2 introduces how the four major elements of the transcription unit are defined by five rare restriction enzyme sites, wherein the junction of each functional element is defined by a shared restriction site according to: 1) a transcription regulatory element or domain (R) of interest is bounded by SalI and PvuI, 2) a biological effector element or domain (E) is bounded by PvuI and KasI, 3) an effector linker element or domain (EL) is defined by ScaI and KasI, and 4) a transcription processing element or domain (P) is bounded by KasI and AgeI. Furthermore, Figure 2 illustrates that the R domain may be subdivided into two sub-elements (R-D and R-R) by the inclusion of an AfeI site.
  • ORF-based effectors may be subdivided into three distinct subdomains (E-N, E-I, and E-C) by the fixed positioning of an XmaI site and an MfeI site.
  • the P domain may be subdivided into two sub-elements (P-5 and P-3) by the inclusion of an FspI site.
  • the R substructure allows for the efficient partitioning of DNA-encoded functional elements (R-D) from RNA-encoded functional elements (R-R).
  • the R-D subdomain is designed to include, but is not limited to, DNA-based elements encoding for promoters, enhancers, repressors, chromatin modification domains, and insulator sequences.
  • the R-R subdomain is designed to include, but is not limited to, RNA-based elements encoding splicing units and 5’ untranslated regions (5’ UTRs).
  • Designating R sub-elements as DNA-based or RNA-based does not, however, exclude the purposeful positioning of DNA-based motifs within the R-R domain (ex: important DNA-encoded transcription factor binding sites are frequently found downstream of the transcription initiation site).
  • ORF-based effectors frequently consist of defined protein coding sequences with no synthetic sub-architecture
  • a subset of BOP-compliant ORFs can consist of one or more defined subdomains.
  • a user may desire to add one or more functional motifs either to the amino-terminus or to the carboxy-terminus of their proteins of interest.
  • These functional fusion protein motifs frequently confer one or more of the following properties, but are not limited to these examples: 1) subcellular localization signals (ex: nuclear localization signals, endoplasmic reticulum localization signals, etc.), 2) visualization tags (ex: fluorescent proteins, antibody binding elements, etc.), and 3) protein purification elements (ex: His Tag, GST Tag, etc.).
  • the BOP ORF substructure allows for distinct motifs to be positioned at the amino-terminus of a protein (E-N) or at the carboxy-terminus (E-C).
  • the E-I domain is internal to the N-terminal and the C-terminal subdomains.
  • the P substructure allows for the partitioning of RNA-encoded and DNA-encoded motifs. Unlike the R substructure wherein the functional elements are either DNA-encoded or RNA-encoded, the P substructure consists of a mixture of DNA-encoded and RNA-encoded elements. Because of this, the P substructure nomenclature uses the positional terminology of 5’ or 3’ of the FspI partitioning site.
  • the P-5 domain is designed to include RNA-based splicing elements, 3’ UTR sequences, and polyadenylation signals.
  • the P-3 domain is designed to contain the hybrid RNA/DNA motifs responsible for transcription termination as well as DNA-based motifs that can encode for enhancers, repressors, chromatin modification domains, and insulator elements.
  • the insert DNA to be cloned must be modified to remove certain nucleotide sequences. Mutations to sequences of interest are made to edit out any restriction sites that would interfere with the engineering and use of a polycistronic and/or multigenic construct.
  • the term “compliant” refers to a nucleotide sequence that does not contain any restriction sites that would interfere with engineering or use of a construct and is therefore ready for insertion into a vector of the invention.
  • restriction enzyme sites may be used in a vector but must also be edited out of any DNA sequences that are de novo synthesized for insertion into the cloning sites of the vector.
  • Alternative methods for producing mutations in sequences include but are not limited to site-directed mutagenesis of cDNA, genomic clones, or directly from genomic DNA or reverse-transcribed mRNA.
  • the restriction enzyme sites shown in Table 3 are present in the vector domains indicated and must be edited out of any DNA sequences of an element to be inserted into its respective domain. New sites useful for subdomains of TUs are in bold text.
  • Because R and P elements may contain functional domains (influenced by transcription factor binding sites and/or hairpin structures) that cannot be mutated without adversely affecting their function, only the key architectural sites are mutated within the R and P domains.
  • the key sites for the transcription unit multigenic vector are SalI, PvuI, ScaI, KasI, AgeI, XhoI, NotI, and AscI. Since several different codons are available to choose from when encoding an amino acid within an ORF, there is greater latitude in mutating restriction sites in an ORF.
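  • The compliance check and the codon-level latitude described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the method of the invention: the function names and the 12-nucleotide example ORF are hypothetical, the ORF is assumed to start in frame at position 0, only a single occurrence of the site is handled, and the search does not verify that a swap avoids creating a different key site. The recognition sequences are the standard ones for the eight key enzymes, all of which are palindromic, so one strand is scanned.

```python
from itertools import product

# Standard recognition sequences of the eight key architectural sites.
KEY_SITES = {
    "SalI": "GTCGAC", "PvuI": "CGATCG", "ScaI": "AGTACT", "KasI": "GGCGCC",
    "AgeI": "ACCGGT", "XhoI": "CTCGAG", "NotI": "GCGGCCGC", "AscI": "GGCGCGCC",
}

# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_sites(sequence, sites=KEY_SITES):
    """Map enzyme name to 0-based positions of its site; empty if compliant."""
    seq = sequence.upper()
    return {
        name: [i for i in range(len(seq)) if seq.startswith(motif, i)]
        for name, motif in sites.items()
        if motif in seq
    }


def translate(orf):
    """Translate complete codons of an in-frame ORF."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def remove_site_silently(orf, motif):
    """Try one synonymous codon swap that removes a single `motif` occurrence.

    Returns the edited ORF, the unchanged ORF if the motif is absent,
    or None if no single synonymous swap removes it.
    """
    orf = orf.upper()
    pos = orf.find(motif)
    if pos == -1:
        return orf
    first, last = pos // 3, (pos + len(motif) - 1) // 3
    for index in range(first, last + 1):
        start = index * 3
        codon = orf[start:start + 3]
        if codon not in CODON_TABLE:
            continue  # incomplete trailing codon
        for alt, aa in CODON_TABLE.items():
            if alt != codon and aa == CODON_TABLE[codon]:
                candidate = orf[:start] + alt + orf[start + 3:]
                if motif not in candidate:
                    return candidate
    return None
```

  • For the hypothetical ORF ATGGTCGACTAA, find_sites reports a SalI site at position 3, and remove_site_silently swaps the valine codon GTC for a synonymous codon so that the encoded peptide is unchanged while the SalI site is gone.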
  • restriction sites are chosen for specific functional purposes to denote pivots for “sub-ORF” components (e.g., XmaI and MfeI can be used to establish amino-terminal and carboxy-terminal modules, respectively, to flank a defined effector ORF).
  • Table 4 provides a name and sequence listing number for each plasmid DNA vector.
  • the table also provides the recognition sites for restriction enzymes that are in the polyMOD framework for each vector.
  • the pBOP-30 vector comprises a nucleotide sequence encoding [(PstI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(E element spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(XhoI)+(TUL element spacer)+(NotI)+(spacer)+(XbaI)+((pBOP-30 vector backbone (VB))] (note: a spacer is not required between AgeI and XhoI).
  • the pBOP-IRES vector [(PstI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(Reporter ORF-1 element (ex: red fluorescent protein (RFP)) spacer)+(EcoRV)+(IRES element spacer)+(PacI)+(Reporter ORF-2 element (ex: green fluorescent protein (GFP)) spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(XhoI)+(TUL element spacer)+(NotI)+(spacer)+(XbaI)+((pBOP-30 vector backbone (VB))]
  • the pBOP-30-bioRNA-l vector [(PstI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(EcoRV)+(bioRNA element spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(XhoI)+(TUL element spacer)+(NotI)+(spacer)+(XbaI)+((pBOP-30 vector backbone (VB))]
  • the pBOP-30-bioRNA-2 vector [(PstI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(E element spacer)+(EcoRV)+(bioRNA element spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(
  • the pBOP-40 vector comprises a nucleotide sequence encoding [(PstI)+(GIC element spacer)+(AscI)+(5’ chromatin modification domain (CMD) element spacer)+(SphI)+(positive selector element (PSE) element spacer)+(SpeI)+(bioindicator element (BE) spacer)+(PspOMI)+(spacer)+(SalI)+(R element spacer)+(PvuI)+(E element spacer)+(ScaI)+(EL element spacer)+(KasI)+(P element spacer)+(AgeI)+(XhoI)+(spacer)+(NsiI)+(3’ chromatin modification domain (CMD) element spacer)+(NotI)+(3’ GIC element spacer)+(pBOP-30 vector backbone (VB)].
  • the pBOP-40-NS vector is a variant of pBOP-40 that also contains one or two negative selectors flanking the GIC elements.
  • the structure is defined as [(FspI)+(negative selector-1 (NegSel-1) element spacer)+(PstI)+(5’ GIC element spacer)+(AscI)+(spacer)+(NotI)+(3’ GIC element spacer)+(XbaI)+(negative selector-2 (NegSel-2) element spacer)+(AfeI)+(pBOP-30 vector backbone (VB)].
  • the entire NegSel-1 polyMOD may be isolated with (FspI/PstI) and the NegSel-2 polyMOD may be isolated with (XbaI/AfeI).
  • the pBOP-30-IVT vector comprises a nucleotide sequence encoding [(PstI)+(spacer)+(SalI)+(IVTP element)+(AfeI)+(R-R ACC)+(PvuI)+(E element)+(KasI)+(P-5 ACC)+(FspI)+(poly A element)+(reverse BsgI)+(AgeI)+(spacer)+(XbaI)+((pBOP-30 vector backbone (VB))+(PstI)].
  • for DNA sequences to be used for synthesis of mRNA products by in vitro transcription, BsgI sites in any 5’ regulatory motifs, ORFs, and 3’ regulatory motifs must be edited out, as BsgI is used to linearize the poly A sequence in the in vitro transcription step of the invention.
  • Each vector may be described as comprising a nucleotide sequence encoding a polyMOD framework having a string of restriction enzyme sites interspersed with spacer sequences. Each of the vectors, though interrelated, is designed for a different application.
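  • The idea of a polyMOD framework as a string of restriction sites interspersed with replaceable spacers can be sketched as a small data structure. The Python below is only an illustration: the list representation, the function names, and the use of "EGFP ORF" as the inserted module are assumptions, and the framework shown is the pBOP-30 string with the vector backbone omitted.

```python
# pBOP-30 polyMOD framework modeled as ordered (kind, name) pairs.
# The vector backbone is left out of this sketch.
PBOP30_FRAMEWORK = [
    ("site", "PstI"), ("spacer", "spacer"),
    ("site", "SalI"), ("spacer", "R element spacer"),
    ("site", "PvuI"), ("spacer", "E element spacer"),
    ("site", "ScaI"), ("spacer", "EL element spacer"),
    ("site", "KasI"), ("spacer", "P element spacer"),
    ("site", "AgeI"),
    ("site", "XhoI"), ("spacer", "TUL element spacer"),
    ("site", "NotI"), ("spacer", "spacer"),
    ("site", "XbaI"),
]


def insert_module(framework, left_site, right_site, module_name):
    """Return a new framework with everything between two sites replaced."""
    left = framework.index(("site", left_site))
    right = framework.index(("site", right_site))
    if left >= right:
        raise ValueError("left_site must precede right_site in the framework")
    return framework[: left + 1] + [("module", module_name)] + framework[right:]


def render(framework):
    """Render the framework in the bracketed notation used in the text."""
    return "[" + "+".join(f"({name})" for _, name in framework) + "]"
```

  • Calling insert_module(PBOP30_FRAMEWORK, "PvuI", "ScaI", "EGFP ORF") replaces only the E element spacer, while passing "PvuI" and "KasI" would replace the E spacer, the ScaI site, and the EL spacer together, mirroring an effector prepared with PvuI and KasI.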
  • the pBOP-30 vector is designed for transient transfections leading to non-persistent episomal delivery of pDNA.
  • the pBOP-30-IRES vector is designed for building a multi-effector/polycistronic transcription unit wherein the effector domain can encode two or more ORFs and/or two or more bioRNAs.
  • the pBOP-30-IVT vector is designed for in vitro transcription.
  • the pBOP-40 and pBOP-40-NS vectors have an architecture that allows incorporation of a single effector domain, a single transcription unit, or a multigene into a viral manufacturing shuttle vector or a non-viral genomic integration shuttle vector.
  • the pBOP-30-bioRNA vectors are designed to encode a bioactive RNA species (bioRNA) either as a single effector, or to be positioned upstream or downstream of a single ORF, or to be positioned within a mixed ORF/bioRNA multi-effector Transcription Unit employing the pBOP-30-IRES vector.
  • These seven vectors are interrelated because they are designed so that a completed multigene or polycistron built in one vector may be transferred to another vector to be used for a different application.
  • a multigenic construct built in the pBOP-30-IVT vector may be excised as a whole and inserted into a PB40-based vector for gene delivery or transfer to a viral delivery system.
  • the interrelationship of these vectors allows for rapid construction testing and subsequent application for therapeutics.
  • a selection of components may be put into a single-pot reaction. While this will not hold true across the pBOP-30-based and pBOP-40-based classes, there is a subset of elements within each class that is amenable to dynamic vector assembly.
  • the overall pBOP-40 vector architecture contains restriction enzyme site-bounded genetic modules encoding biological functions key to controlling and assaying the successful delivery and stable integration of exogenous DNA into a eukaryotic genome.
  • the pBOP-40 vector architecture contains genetic modules that allow for the inclusion of: a) genome integration controllers (GIC), b) chromatin modification domains (CMD), c) positive selector element (PSE), and d) bioindicator element (BE).
  • the components of the pBOP-40 vector need only have restriction sites mutated that allow for a) the ordered positioning of GICs, CMDs, PSEs, and BEs; and b) staged insertion of either a single effector into the transcription unit acceptor domain 1 (TUAD-1), or a single TU or multigenic TUs into TUAD-1, TUAD-2, or TUAD-3.
  • the purposeful selection of restriction sites enables the staged addition of components into the TUADs.
  • the TUAD-1 position is always employed first and can be used either to position a single effector (prepared with PvuI and KasI) into the TUAD-1 at PvuI and KasI, or to position a single transcription unit without its accompanying TUL (prepared with SalI and AgeI) into the TUAD-1 at SalI and AgeI.
  • a single TU or multigenic TUs along with their respective TUL (prepared with SalI and NotI) can be positioned into the TUAD-2 position at XhoI and NotI.
  • a single TU or multigenic TUs along with their respective TULs (prepared with SalI and NotI) can be inserted in reverse orientation into the TUAD-3 position at SalI and PspOMI (which is a compatible cohesive end for NotI).
  • the GIC modules flank the 5’ and 3’ ends of the pBOP-40 genomic integration destination acceptor vector and are respectively defined by the 5’ ordering of restriction sites: 5’ GIC: PstI and AscI; 3’ GIC: NotI and XbaI. Depending upon the biochemical activities of GICs, either one or both of the GIC modules may be occupied.
  • GIC sequences can encode for a) transposase integration donor sites, such as the Sleeping Beauty transposon system or the piggyBac™ transposon; b) small tyrosine recombinase sites, such as cre-associated lox sites or Flp-associated Frt sites; c) large serine recombinases, such as SF370, PhiC31, or Bxb1; or d) homologous recombination arms useful with or without site-specific nucleases, such as the ROSA26 or HPRT locus.
  • Chromatin modification domains consist of DNA sequences that can influence pre-transcriptional and transcriptional mechanisms. These include, but are not limited to, sequences that influence the epigenetic structure of a gene as well as sequences that can enhance or inhibit transcription. Examples include long-range cis-regulatory elements known as insulators (ex: CTCF insulator and beta-globin locus). Insulators display two functions: a) enhancer-blocking insulators prevent distal enhancers from acting on the promoter of neighboring genes, and b) barrier insulators prevent silencing of euchromatin by the spread of neighboring heterochromatin.
  • the positive selector element encodes for genes that provide a cell protection from toxins.
  • the PSE module is designed to encode a TU with an architecture different from that of pBOP-30.
  • the PSE module is defined by the following order: SphI-spacer-HindIII-spacer-EcoRI-spacer-SpeI, and the entirety of the PSE module is bounded by the 5’ SphI and 3’ SpeI sites.
  • the PSE module is positioned in the antisense orientation in order to prevent any potential transcriptional readthrough into downstream modules.
  • the PSE module can either be built within the context of pBOP-40, or in a separate shuttle vector wherein the R domain is bounded by SpeI and EcoRI, the E domain is bounded by EcoRI and HindIII, and the P domain is bounded by HindIII and SphI.
  • PSEs include but are not limited to genes that encode for enzymes that metabolize zeocin, kanamycin, neomycin, puromycin, and hygromycin.
  • the bioindicator element (BE) module encodes for genes that can provide a useful bioassay signal, such as a reporter gene.
  • the BE module is designed to encode a TU with an architecture different from that of pBOP-30.
  • the BE module is defined by the following order: SpeI-spacer-HindIII-spacer-EcoRI-spacer-PspOMI, and the entirety of the BE module is bounded by the 5’ SpeI and 3’ PspOMI sites.
  • the BE module is positioned in the antisense orientation in order to prevent any potential transcriptional readthrough into downstream modules.
  • the BE module can either be built within the context of pBOP-40, or in a separate shuttle vector wherein the R domain is bounded by PspOMI and EcoRI, the E domain is bounded by EcoRI and HindIII, and the P domain is bounded by HindIII and SpeI.
  • BEs include but are not limited to genes that encode a) fluorescent proteins (ex: green fluorescent protein or red fluorescent protein), b) reporter gene enzymes that when acting on defined substrates can generate photonic or colorimetric signals (ex: luciferase, beta-galactosidase, alkaline phosphatase, or horseradish peroxidase), and c) reporter gene channels or pumps that allow uptake of imaging agents into cells and tissues (ex: sodium iodide symporter or transferrin receptor).
  • The vector identified as pBOP-30 is shown in Figure 2, and has the nucleotide sequence of SEQ ID NO:1.
  • pBOP-30 comprises mutations that were made to the vector backbone, which are required to generate a vector backbone that a) maintains the function of the Ori, b) maintains the function of the kanamycin resistance gene, and c) removes restriction sites that are in the polyMOD framework, since these must be unique to the polyMOD framework where they are needed for inserting genetic modules.
  • the polyMOD framework of pBOP-30 is shown in the line drawing above the circular plasmid diagram and provides the order and identity of restriction enzyme recognition sites interspersed with spacer sequences that was inserted into the pBOP-30 backbone at the PstI site.
  • the combination of the polyMOD framework and the backbone represent the “empty” vector and have the nucleotide sequence identity of SEQ ID NO: 1.
  • the circular plasmid diagram of pBOP-30 illustrates the locations of various domains and sub-domains wherein genetic elements, i.e., nucleotide segments, are inserted into the vector. As with all the vectors disclosed herein, these domains represent locations where the indicated elements will be placed within the vector by excising and replacing the spacers shown in the diagram of the polyMOD framework.
  • the pBOP-30 vector was used to build a pBOP-30-based construct identified as pBOP-38-CMV promoter-chiron splice unit-EGFP-bGHpA (pBOP38-CMEGBH), shown in Figure 3.
  • the pBOP-38-CMEGBH monogenic construct is designed for transient transfection in eukaryotic cells.
  • the pBOP38-CMEGBH vector enables the “swapping” of defined regulatory and/or biological effector-coding sequences.
  • a key feature of pBOP38-CMEGBH is that it can be employed for iterative addition of TUs by leveraging the property of having compatible cohesive ends (CCEs) when cut with the restriction enzymes SalI and XhoI.
  • a TU from one compliant vector can be inserted at the 3’ end of a different TU vector (TU-2) by subcloning the TU-1 SalI/XhoI-bounded fragment into the XhoI site of the TU-2 vector.
  • the single XhoI restriction enzyme site creates an “acceptor site” into which a 5’-SalI-Fragment-XhoI-3’ fragment may be inserted. This is a unique and purposeful single instance in the entire BOP structure, wherein one or the other insert-bounded overhangs are created or destroyed based upon the orientation of the insert.
  • a TU linking protocol #2 enables the user to control the sense strand orientation from left-to-right (5’-to-3’).
  • TU linking protocol #2 takes advantage of the transcription unit linker (TUL) defined by the ordered positioning of an XhoI site, a spacer sequence designed for efficient binding of restriction enzymes, and a NotI site. Because each TU has a TUL at its 3’ terminus, any transcription unit vector (TUV) can serve either as a) a TU donor or b) a TU vector acceptor.
  • a TU-1 vector can serve as a vector acceptor for a transcription unit in a TU-2 vector.
  • a phosphatase e.g., CIP
  • the TU-1 vector becomes a TU acceptor vector.
  • a TU-2 donor sequence is prepared. The resulting restriction digested TU-1 acceptor vector and the TU-2 donor vector are then run out on an agarose gel and the appropriate fragments are gel-purified.
  • TU donor fragments may be similar in size to the TU donor vector backbone, which precludes efficient resolution on an agarose gel.
  • the TU donor vector can be cut with SalI, NotI, and AseI.
  • the AseI serves to cut the vector backbone into two smaller pieces which can be resolved away from the TU donor fragment.
  • a two TU multigenic vector can be produced by ligating the TU-2 donor fragment with the TU-1 acceptor vector fragment.
  • the SalI and XhoI compatible cohesive ends are destroyed, yielding a multigene that has a SalI site at the 5’ end of the TU-1 module and a complete TUL at the 3’ end of the TU-2 module, thus enabling the addition of subsequent TU modules.
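  • Why SalI and XhoI ends can be ligated, and why the resulting junction is cut by neither enzyme, can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the recognition sequences and top-strand cut offsets are the standard published ones (SalI G^TCGAC, XhoI C^TCGAG, NotI GC^GGCCGC, PspOMI G^GGCCC), and every site is treated as a palindrome so the bottom-strand cut mirrors the top-strand cut.

```python
# (recognition sequence, top-strand cut offset); all four sites are palindromic.
ENZYMES = {
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
    "NotI": ("GCGGCCGC", 2),
    "PspOMI": ("GGGCCC", 1),
}


def overhang(enzyme):
    """Return (type, sequence) of the single-stranded overhang left by a cut."""
    site, cut = ENZYMES[enzyme]
    bottom = len(site) - cut  # bottom-strand cut in top-strand coordinates
    if cut == bottom:
        return ("blunt", "")
    kind = "5'" if cut < bottom else "3'"
    return (kind, site[min(cut, bottom):max(cut, bottom)])


def compatible(a, b):
    """True if two enzymes leave identical, non-blunt overhangs."""
    return overhang(a)[0] != "blunt" and overhang(a) == overhang(b)


def junction(left, right):
    """Top-strand sequence formed when a `left`-cut end meets a `right`-cut end."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + right_site[right_cut:]
```

  • Here junction("XhoI", "SalI") gives a hybrid hexamer that matches neither recognition sequence, which is the sense in which the compatible cohesive ends are destroyed, whereas junction("NotI", "NotI") regenerates the NotI site. The same overhang comparison shows NotI and PspOMI to be compatible.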
  • This is illustrated in Figure 5 wherein two dual effector multigenic vectors are first built, and then these are subsequently linked together using the transcription unit linker module. By repeating these steps, additional TUs may be added in an iterative manner.
  • TU-1 acceptor vector is generated by digesting with XhoI and NotI
  • one or more TU donor fragments are generated by digesting with SalI and XhoI
  • a last TU donor fragment is generated by digesting with SalI and NotI
  • Figure 6 shows the method for building larger multimeric, i.e., multigenic, constructs through the maintenance of the TUL at the 3’-most TU.
  • four separate TUs (TU-1, TU-2, TU-3, and TU-4) are linked from left-to-right (5’-to-3’) in the following order [(TU-1)+(TU-2)+(TU-3)+(TU-4)].
  • in step 1, the following two multigenic vectors are built: [(TU-1)+(TU-2)] and [(TU-3)+(TU-4)].
  • in step 2, the [(TU-3)+(TU-4)] multigene is released from the vector with SalI/NotI and subcloned into the [(TU-1)+(TU-2)] multigene at XhoI/NotI to yield the quad multigene defined by [(TU-1)+(TU-2)+(TU-3)+(TU-4)].
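  • The iterative linking logic can be modeled abstractly, without any sequence data. In the Python sketch below, a construct is a list of labels; the function names, the label strings, and the "XhoI/SalI scar" token are assumptions introduced for illustration. The model encodes only what the text states: the acceptor is opened at XhoI/NotI, the donor is released with SalI/NotI, the XhoI/SalI junction is destroyed, and a single complete TUL remains at the 3’ end.

```python
# Transcription unit linker: XhoI site, spacer, NotI site.
TUL = ["XhoI", "TUL spacer", "NotI"]


def make_tu(name):
    """A single transcription unit with its 5' SalI site and 3' TUL."""
    return ["SalI", name] + TUL


def link(acceptor, donor):
    """Append `donor` 3' of `acceptor`, as in TU linking protocol #2.

    The acceptor loses its TUL (XhoI/NotI digest), the donor loses its
    5' SalI site to a non-cleavable scar, and the donor's TUL is kept.
    """
    if acceptor[-3:] != TUL:
        raise ValueError("acceptor must end with a complete TUL")
    if donor[0] != "SalI" or donor[-3:] != TUL:
        raise ValueError("donor must be SalI-bounded at 5' and end with a TUL")
    return acceptor[:-3] + ["XhoI/SalI scar"] + donor[1:]


def tu_order(construct):
    """List the transcription units in 5'-to-3' order."""
    return [label for label in construct if label.startswith("TU-")]
```

  • Building the quad multigene then reads link(link(make_tu("TU-1"), make_tu("TU-2")), link(make_tu("TU-3"), make_tu("TU-4"))). Because the product of every link call is itself a valid acceptor and donor, the step can be repeated to add further TUs.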
  • a multigenic construct useful for vaccine development is built within the TU architecture of the pBOP-30 vector (not shown).
  • the genetic sequence of each TU element R, E, and P
  • Three different transcription units (TU-1, TU-2, and TU-3) will be linked according to the molecular structure [(TU-1)+(TU-2)+(TU-3)].
  • Each TU comprises sequences that encode either an antigen or an immune-activating factor as the E element.
  • TU-1 is defined by the following structure: [(EF1 alpha promoter)+(COVID-19 spike protein)+(bovine growth hormone poly A)].
  • TU-2 is defined by the following structure: [(CMV promoter)+(COVID-19 envelope protein)+(SV40 polyA)].
  • the TU-3 is defined by the following structure: [(PGK promoter)+(IL12 multi-effector)+(HSVTK polyA)].
  • a SalI/NotI-bounded TU-2 fragment is subcloned into TU-1 at XhoI/NotI, yielding the [(TU-1)+(TU-2)] multigenic.
  • a SalI/NotI-bounded TU-3 fragment is subcloned into the [(TU-1)+(TU-2)] multigenic at XhoI/NotI, yielding the [(TU-1)+(TU-2)+(TU-3)] multigenic.
  • the resulting vector (data not shown) can be used as a pDNA-based vaccine that can be delivered either via electroporation or a non-viral transfection reagent.
  • Figure 7A shows pBOP-30-IRES, a multi-effector/polycistronic IRES vector system useful for building a 2-ORF polycistron.
  • the polyMOD framework of pBOP-30-IRES is shown in the line drawing above the circular plasmid diagram and provides the order and identity of restriction enzyme recognition sites interspersed with spacer sequences that was inserted into the pBOP-30 backbone at the PstI and XbaI sites.
  • the base IRES vector is defined by the following ordering of genetic modules: [(PvuI)+(ORF-1 element spacer)+(EcoRV)+(IRES element spacer)+(PacI)+(ORF-2 element spacer)+(ScaI)+(effector linker (EL))+(KasI)].
  • the pBOP-30-IRES has the nucleotide sequence of SEQ ID NO:2.
  • a useful embodiment pre-positions red fluorescent protein (RFP) in the ORF-1 element acceptor and green fluorescent protein (GFP) in the ORF-2 element acceptor. This serves as an IRES functional assay system. IRES function can be tested by confirming the presence of both RFP and GFP expression in transfected mammalian cells. A non-functional IRES will only reveal RFP expression.
  • Figure 8A shows a method whereby a multi-effector bi-cistronic TU is built in pBOP-30-IRES following a two-step process:
  • Step 1: subclone ORF-1 into the RFP module. To do this, cut the IRES shuttle with PvuI (sticky end) and EcoRV (blunt end) and dephosphorylate with calf intestinal phosphatase (CIP). Next, isolate ORF-1 by cutting with PvuI (sticky) and ScaI (blunt). Finally, ligate ORF-1 into the IRES shuttle. When ligated together, the two blunt sites (EcoRV and ScaI) are destroyed and the sticky PvuI site is regenerated. The resulting polycistron is generated: [(PvuI)+(ORF-1)+(IRES-1)+(PacI)+(GFP)+(ScaI)+(EL)+(KasI)].
  • Step 2: subclone ORF-2 into the GFP module of the polycistron built in Step 1. First, cut the [(ORF-1)(IRES-1)(GFP)] polycistron with PacI (sticky) and KasI (sticky) and dephosphorylate.
  • Figure 8B shows a multi-effector/polycistronic IRES shuttle system useful for building a 3-ORF polycistron or tri-cistron construct.
  • the final product comprises a triple effector polycistron.
  • Step 1: subclone ORF-3 into the GFP module of the IRES-2 shuttle. To do this, cut the IRES-2 shuttle with PacI (sticky end) and KasI (sticky) and dephosphorylate with CIP. Next, isolate ORF-3 by cutting with PvuI (sticky) and KasI (sticky). Finally, ligate ORF-3 into the IRES-2 shuttle. PvuI and PacI have compatible cohesive ends which are respectively destroyed when ligated together, and the sticky KasI site is regenerated. The resulting polycistron is generated: [(PvuI)+(RFP)+(EcoRV)+(IRES-2)+(PacI)+(ORF-3)+(ScaI)+(EL)+(KasI)].
  • Step 2: subclone [(IRES-2)+(ORF-3)] into the IRES linker (IL) module of the polycistron of pBOP-30-IRES.
  • First, cut the [(ORF-1)(IRES-1)(ORF-2)] polycistron with ScaI (blunt) and KasI (sticky) and dephosphorylate.
  • the resulting triple effector [(ORF-1)+(IRES-1)+(ORF-2)+(IRES-2)+(ORF-3)] can then be used either a) as a substrate for building a larger polycistron, or b) as an effector module for subcloning into any pBOP-30-based TU at PvuI/KasI.
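  • The fate of each junction in the IRES subcloning steps (regenerated or destroyed) can be worked out mechanically. The Python sketch below is illustrative only: the function names are hypothetical, the cut positions are the standard published ones (PvuI CGAT^CG, PacI TTAAT^TAA, EcoRV GAT^ATC, ScaI AGT^ACT, KasI G^GCGCC), and the caller is assumed to pass only pairs of ends that can actually be ligated (blunt with blunt, or matching overhangs). Flanking sequence, which could in principle create a site by chance, is ignored.

```python
# (recognition sequence, top-strand cut offset) for the enzymes used in the
# IRES shuttle steps; all sites are palindromic.
ENZYMES = {
    "PvuI": ("CGATCG", 4),
    "PacI": ("TTAATTAA", 5),
    "EcoRV": ("GATATC", 3),
    "ScaI": ("AGTACT", 3),
    "KasI": ("GGCGCC", 1),
}


def junction(left, right):
    """Top-strand sequence where a `left`-cut end is joined to a `right`-cut end."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + right_site[right_cut:]


def junction_outcome(left, right):
    """Return the junction sequence and the enzymes (if any) that still cut it."""
    seq = junction(left, right)
    still_cut = [name for name, (site, _) in ENZYMES.items() if site in seq]
    return seq, still_cut
```

  • Under these assumptions, the PvuI/PvuI and KasI/KasI junctions report the original enzyme as still cutting, while the ScaI/EcoRV blunt junction and the PacI/PvuI junction report an empty list, in agreement with the sites described as regenerated or destroyed in the steps above.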
  • a polycistronic construct (not shown) is built within the architecture of the pBOP-30-IRES cloning vector.
  • This cloning vector has a multi-effector polycistron defined by [(PvuI)+(RFP ORF)+(EcoRV)+(IRES-1)+(PacI)+(GFP ORF)+(ScaI)+(effector linker (EL))+(KasI)].
  • Two separate effectors (ORF-1 and ORF-2) are each defined by the following molecular structure: [(PvuI)+(ORF-X)+(ScaI)+(EL)+(KasI)].
  • Both ORF-1 and ORF-2 can be generated either by 1) de novo synthesis (DNS), 2) performing a preparative PCR reaction using a pDNA vector encoding the compliant ORFs as a template, or 3) by isolating the ORF fragments from a compliant transcription unit.
  • Independent of the method for acquiring ORF-1 and ORF-2, it is essential first to define the desired ordering of ORF-1 and ORF-2 in the desired multi-effector polycistron.
  • the multi-effector transcription unit is constructed with the following end structure: [(ORF-1)+(IRES-1)+(ORF-2)].
  • ORF-1 codes for brain-derived neurotrophic factor (BDNF-1)
  • ORF-2 codes for apolipoprotein E-epsilon-2 (APOE ε2)
  • IRES-1 codes for the encephalomyocarditis virus internal ribosome entry site (EMCV IRES).
  • the method of construction comprises the following two steps to build the [(BDNF-1)+(EMCV IRES)+(APOE ε2)] multi-effector element.
  • Step 1: subclone ORF-1 into the GFP module. To do this, cut the IRES shuttle with PvuI (sticky end) and EcoRV (blunt end) and dephosphorylate with calf intestinal phosphatase (CIP). Next, isolate ORF-1 by cutting with PvuI (sticky) and ScaI (blunt). Finally, ligate ORF-1 into the IRES shuttle. When ligated together, the two blunt sites (EcoRV and ScaI) are destroyed and the sticky PvuI site is regenerated.
  • the resulting polycistron is generated: [(PvuI)+(ORF-1)+(IRES-1)+(PacI)+(RFP)+(ScaI)+(EL)+(KasI)].
  • Step 2: subclone ORF-2 into the RFP module of the polycistron built in Step 1. First, cut the [(ORF-1)(IRES-1)(RFP)] polycistron with PacI (sticky) and KasI (sticky) and dephosphorylate. Next, cut ORF-2 with PvuI (sticky) and KasI (sticky). PvuI and PacI have compatible cohesive ends which are respectively destroyed when ligated together.
  • the resulting dual effector [(ORF-1)+(IRES-1)+(ORF-2)] can then be used either A) as a substrate for building a larger polycistron, or B) as an Effector Module for subcloning into any pBOP-30-based Transcription Unit at PvuI/KasI.
  • a construct containing both a protein encoded effector and an intron-encoded bioRNA is built within the architecture of the cloning vector identified as pBOP-30-bioRNA-2.
  • the pBOP-30-bioRNA-2 vector has the nucleotide sequence of SEQ ID NO:4.
  • This cloning vector has a multi-effector polycistron defined by [(PvuI)+(ORF)+(EcoRV)+(bioRNA-1)+(ScaI)+(Effector Linker (EL))+(KasI)].
  • The protein encoded effector (ORF-1) is defined by the following molecular structure: [(PvuI)+(ORF-X)+(ScaI)+(EL)+(KasI)].
  • Both ORF-1 and bioRNA-1 can be generated either by 1) de novo synthesis (DNS), 2) performing a preparative PCR reaction using a pDNA vector encoding the compliant ORFs as a template, or 3) by isolating the ORF fragments from a compliant Transcription Unit.
  • Independent of the method for acquiring ORF-1 and bioRNA-1, it is essential first to define the desired ordering of ORF-1 and bioRNA-1 in the desired multi-effector construct.
  • A multi-effector transcription unit is constructed with the following end structure: [(ORF-1)+(bioRNA-1)].
  • ORF-1 codes for brain-derived neurotrophic factor (BDNF-1).
  • bioRNA-1 codes for intron 16 of the human heterogeneous nuclear ribonucleoprotein K (HNRNPK) gene, which contains a copy of the human microRNA miR7-1.
  • The method of construction comprises the following steps to build the [(BDNF-1)+(HNRNPK Intron 16)] multi-effector element.
  • The resulting construct is generated: [(PvuI)+(ORF-1)+(bioRNA-1)+(ScaI)+(EL)+(KasI)].
  • The resulting dual effector [(ORF-1)+(bioRNA-1)] can then either A) be used as a substrate for building a larger polycistron, or B) serve as an effector module for subcloning into any pBOP-30-based transcription unit at PvuI/KasI.
  • Figure 10A shows the pBOP-40 vector, which can be used for gene delivery, viral manufacturing, and non-viral genomic integration.
  • The pBOP-40 vector has the nucleotide sequence of SEQ ID NO:5.
  • The polyMOD framework of pBOP-40 is shown in the line drawing above the circular plasmid diagram and provides the order and identity of the restriction enzyme recognition sites, interspersed with spacer sequences, that were inserted into the pBOP-40 backbone at the PstI and XbaI sites.
  • The detailed structure of pBOP-40 when all elements are assembled is defined as: [(PstI)+(5’ genome integration controller (GIC))+(AscI)+(5’ chromatin modification domain)+(SphI)+(positive selector element (PSE))+(SpeI)+(BE)+(PspOMI)+(spacer)+(SalI)+(R element)+(PvuI)+(E element)+(ScaI)+(EL element)+(KasI)+(P element)+(AgeI)+(XhoI)+(TUL element)+(NsiI)+(3’ CMD)+(NotI)+(3’ GIC)+(vector backbone (VB))+(PstI)].
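One way to make the ordered framework above easier to query is to hold it as a list and look up the sites that bound each element. This is a minimal sketch only: `PBOP40_FRAMEWORK`, `flanking_sites`, and `repeated_sites` are hypothetical names, the element labels are abbreviated, and the restriction sites are written in their standard spellings.

```python
from collections import Counter

# Ordered pBOP-40 framework as ("site" | "element", name) pairs.
PBOP40_FRAMEWORK = [
    ("site", "PstI"), ("element", "5' GIC"),
    ("site", "AscI"), ("element", "5' CMD"),
    ("site", "SphI"), ("element", "PSE"),
    ("site", "SpeI"), ("element", "BE"),
    ("site", "PspOMI"), ("element", "spacer"),
    ("site", "SalI"), ("element", "R"),
    ("site", "PvuI"), ("element", "E"),
    ("site", "ScaI"), ("element", "EL"),
    ("site", "KasI"), ("element", "P"),
    ("site", "AgeI"),
    ("site", "XhoI"), ("element", "TUL"),
    ("site", "NsiI"), ("element", "3' CMD"),
    ("site", "NotI"), ("element", "3' GIC"), ("element", "VB"),
    ("site", "PstI"),
]


def flanking_sites(element, framework=PBOP40_FRAMEWORK):
    """Return the nearest upstream and downstream restriction sites of an element."""
    index = framework.index(("element", element))
    upstream = next(n for k, n in reversed(framework[:index]) if k == "site")
    downstream = next(n for k, n in framework[index + 1:] if k == "site")
    return upstream, downstream


def repeated_sites(framework=PBOP40_FRAMEWORK):
    """List sites that occur more than once in the framework."""
    counts = Counter(n for k, n in framework if k == "site")
    return sorted(n for n, c in counts.items() if c > 1)


print(flanking_sites("E"))   # sites used to insert an effector
print(repeated_sites())      # sites that are not unique in the framework
```

In this representation the effector element is bounded by PvuI and ScaI, and PstI is the only site that appears twice, at the two ends of the framework.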
  • FIG. 11A further illustrates the structure of two transcription unit acceptor domains (TUAD-1 and TUAD-2).
  • A single pBOP-30-based TU can be positioned into TUAD-1 at SalI/AgeI.
  • A SalI/XhoI-flanked pBOP-30-based multigenic element can be positioned into pBOP-40 either at SalI/XhoI or at XhoI; these are thermodynamically challenging events because the SalI and XhoI sticky ends are compatible with each other and will tend to self-anneal.
  • A SalI/NotI-flanked pBOP-30-based multigene can be subcloned into pBOP-40 at XhoI/NotI.
  • The pBOP-40 TUAD includes an NsiI/NotI-bounded CMD element. If it is necessary to have a 3’ CMD in a pBOP-40-based multigenic vector, then it may be necessary to build a TUAD-2-based transcription unit.
  • Figure 10B is an example of a construct built in the pBOP-40 vector. Note that the various elements of the construct now indicate the identity of the DNA modules that are assembled within the pBOP-40 vector, rather than the generic destinations for each element that are shown in Figure 10A.
  • Figure 10C illustrates exemplary steps for building a construct within the pBOP-40 vector.
  • The TUAD-1 position is always employed first and can be used either to position a single effector (prepared with PvuI and KasI) into the TUAD-1 transcription unit effector domain at PvuI and KasI, or to position a single transcription unit without its accompanying TUL (prepared with SalI and AgeI) into TUAD-1 at SalI and AgeI.
  • A single TU or multigenic TUs, along with their respective TULs, can be inserted in reverse orientation into the TUAD-3 position at SalI and PspOMI (which has a cohesive end compatible with NotI).
  • This example provides a method for constructing a vector to be integrated into a CHO cell line at a well-characterized genomic locus containing a large serine recombinase cassette exchange acceptor domain (not shown).
  • The final product encodes the constitutive expression of a single-chain antibody-Fc fusion protein along with a tetracycline-inducible glycosyltransferase.
  • The construct is assembled within a pBOP-40 vector, which is designed to facilitate genomic integration by means of recombinase donor sites within the 5’ and 3’ genomic integration controller (GIC) domains that match the appropriate recombinase acceptor sites already pre-positioned in a favorable CHO genomic locus.
  • A transcription unit encoding a single-chain antibody and Fc domain fusion protein (TU-1) is assembled in a pBOP-30 vector, and the assembled TU element is subcloned into the pBOP-40 TUAD-1 acceptor domain at SalI/AgeI, yielding pBOP-40-TU1.
  • A pBOP-30-based multigene encoding a tetracycline-inducible glycosyltransferase is also constructed with the following architecture: [(TU-2)+(TU-3)], wherein TU-2 encodes a tetracycline-inducible promoter (R element) driving expression of a glycosyltransferase (E element) and TU-3 encodes a ubiquitous, constitutive promoter (R element) driving expression of the tetracycline regulator protein (E element).
  • The SalI/NotI-flanked [(TU-2)+(TU-3)] multigenic element is subcloned into the pBOP-40-TU1 vector at the TUAD-2 acceptor domain at XhoI/NotI, yielding the final vector pBOP-40[(TU-1)+(TU-2)+(TU-3)].
  • This vector is then inserted into the large serine recombinase acceptor domain of CHO cells via a recombinase-mediated cassette exchange reaction (RMCE).
  • Figure 11A shows the modular structure of a GINSB, and Figure 11B shows a map of the pBOP-40-based vector, pBOP-40-NS, comprising the GINSB.
  • The nucleotide sequence of pBOP-40-NS is that of SEQ ID NO:6.
  • The polyMOD framework of pBOP-40 is shown as a line drawing above the circular plasmid diagram and provides the order and identity of restriction enzyme recognition sites interspersed with spacer sequences.
  • This polyMOD was inserted into the pBOP-30 backbone using NsiI at the 5’ end (NsiI and PstI have compatible cohesive ends) and NheI at the 3’ end (NheI and XbaI have compatible cohesive ends) so that the PstI and XbaI sites are maintained for subsequent positioning of the CID domains, thereby creating pBOP-40.
  • the pBOP-40-NS vector has the nucleotide sequence of SEQ ID NO:6.
  • The functionality of a GINSB does not require both negative selector (NegSel) domains to be occupied, but the overall selectivity is improved if two different negative selector genes are positioned in the NegSel-1 and NegSel-2 positions.
  • the vectors of the invention allow a user to integrate a foreign gene into a genome either using integrases or by employing homologous recombination.
  • The efficiency of these processes can be influenced by numerous factors, including but not limited to genomic site selection enhancement (e.g., use of a nuclease to induce a site-specific genomic cleavage that enhances the probability of a homologous recombination-mediated genomic repair event).
  • Positive selectors include, but are not limited to, NeoR, PuroR, HygroR, and ZeoR genes. To select against random genomic integration events, researchers have historically flanked their desired “genome integration controllers” with negative selector genes. Historical examples of negative selectors employed in the generation of knockout or knock-in transgenic animals include, but are not limited to, chemical selector genes (e.g., HSVTK, CDA, etc.) and/or optical selector genes (e.g., fluorescent proteins, chromophores, etc.).
  • The GINSB makes use of a short-lived red fluorescent protein in the NegSel-1 position and nothing in the NegSel-2 position.
  • This example illustrates a GINSB in which a strong promoter (CMV) drives expression of a degron-flanked red fluorescent protein (degRFP) and the GIDs encode Sleeping Beauty transposon donor sites.
  • the GINSB will yield short-term expression of red fluorescent protein (RFP) until the transposase donor vector has either been degraded or diluted by successive cell divisions.
  • The use of degrons ensures that any RFP signal reflects ongoing expression driven by the strong CMV promoter rather than the stability of the RFP protein. Persistence of RFP expression after several cell divisions indicates that an aberrant, random genome integration has occurred.
  • the use of a GINSB is particularly useful when pursuing site-specific genome integration via Homologous Recombination.
  • Figure 12 shows the pBOP-30-IVT in vitro transcription vector, comprising a reverse-complement BsgI site positioned to cut at a site within the polyA sequence.
  • The pBOP-30-IVT vector can be used to position a reverse-complement BsgI site so that the vector is linearized at a site within the polyA sequence.
  • The polyMOD framework of pBOP-30-IVT is shown as the line drawing above the circular plasmid diagram and provides the order and identity of the restriction enzyme recognition sites, interspersed with spacer sequences, that were inserted into the pBOP-30 backbone at the PstI and XbaI sites.
  • The nucleotide sequence of pBOP-30-IVT is that of SEQ ID NO:7.
  • The molecular structure of pBOP-30-IVT is defined as [(PstI)+(SalI)+(in vitro transcription promoter (IVTP))+(AfeI)+(R-R acceptor domain)+(PvuI)+(ORF element)+(KasI)+(P-5 acceptor domain)+(FspI)+(polyA element)+(reverse BsgI)+(AgeI)+(XbaI)+(pBOP-30-based vector backbone (VB))].
  • This example provides an exemplary method of constructing a multi-effector construct encoding BDNF and APOE ε2 effector proteins in the pBOP-30-IVT vector (not shown).
  • the steps provide a pBOP-30-based in vitro transcription vector driven by the T7 in vitro transcription promoter (IVTP) to transcribe a multi-effector polycistronic mRNA.
  • The pBOP-30-IVT vector is cut with PvuI and KasI and dephosphorylated, and then a PvuI/KasI-bounded multi-effector [(BDNF-1)+(EMCV IRES)+(APOE ε2)] polycistron is subcloned into the pBOP-30-IVT vector’s PvuI/KasI site, yielding the final vector having sequences encoding: [(T7 promoter)+[(BDNF-1)+(EMCV IRES)+(APOE ε2)]+(polyA)].
  • The resulting vector is then amplified and purified using standard plasmid prep purification protocols. This DNA is then linearized with BsgI and agarose gel-purified. The final linear DNA is used as a template for RNA synthesis using T7 RNA polymerase.
  • This example provides exemplary “BOPization Rules” for preparing a nucleotide sequence for insertion into a BoP vector and provides a high-level description of the ORF-focused BOPization process.
  • The transcriptional unit (TU) is the functional unit of BoP, otherwise known as a “gene”; it consists of all elements from the transcription regulator domain (TR) to the transcription processing domain (TP) and is flanked by an upstream SalI (GTCGAC) site and a downstream AgeI (ACCGGT) site.
  • the TU is broken down into four domains: transcription regulator domain (TR); effector (E) domain; effector linker (EL) domain; and transcription processing domain (TP).
  • The TR and TP domains should not come from the same gene/genomic region.
  • The TR is flanked by an upstream SalI (GTCGAC) site and a downstream PvuI (CGATCG) site and has the upstream regulator elements necessary for the correct temporal and spatial expression of the gene.
  • the TR domain can contain enhancers, chromatin modification domains, promoters, 5’ UTRs, and splice elements, all of which can modulate the time, place, and duration of transcription as well as post-transcriptional and pre-translational events.
  • The TR domain can be broken down into two subdomains: the R-D subdomain (which is flanked by SalI (GTCGAC) and AfeI (AGCGCT) restriction sites); and the R-R subdomain (which is flanked by AfeI (AGCGCT) and PvuI (CGATCG) restriction sites).
  • The general practice is to have DNA-encoded regulator elements (e.g., enhancers, promoters, etc.) and the transcription start site (TSS, with up to 50 bp downstream of the TSS) present in the R-D subdomain and to have RNA-encoded regulatory elements (e.g., 5’ UTR, introns, etc.) in the R-R subdomain. Additionally, bioRNAs in the form of intron-encoded microRNAs can be placed in the R-R subdomain. When removing restriction sites from a native DNA sequence that is not compatible with the TR domain, the general practice is to introduce nucleic acid changes into the DNA sequence that do not change predicted transcription factor binding sites.
  • The E domain is flanked by an upstream PvuI (CGATCG) site and a downstream ScaI (AGTACT) site and contains the “biological effect” elements. Effectors can be either protein-encoding ORFs or non-coding, biologically active RNAs (non-coding RNA, long non-coding RNA, microRNA, and more).
  • The E domain can be broken down into three subdomains: the E-N subdomain (which is flanked by PvuI (CGATCG) and XmaI (CCCGGG) restriction sites); the E-I subdomain (which is flanked by XmaI (CCCGGG) and MfeI (CAATTG) restriction sites); and the E-C subdomain (which is flanked by MfeI (CAATTG) and ScaI (AGTACT) sites).
  • For protein-encoding sequences, the general practice is to introduce nucleic acid changes into the DNA sequence by changing the codons used to encode the protein (as described in the BOPizer Algorithm).
  • For RNA-encoding DNA sequences (e.g., microRNAs, lncRNA, etc.), the general practice is to introduce nucleic acid changes into the DNA sequence that do not impact the predicted folding of the corresponding RNA sequence.
  • The effector linker (EL) domain is flanked by an upstream ScaI (AGTACT) site and a downstream KasI (GGCGCC) site and maintains the ability to add additional effectors within the transcriptional unit using an IRES. Additionally, bioRNAs in the form of intron-encoded microRNAs can be placed in the EL domain. The EL domain is not broken down into subdomains. When removing restriction sites from a native DNA sequence that is not compatible with the EL domain, the general practice is to introduce nucleic acid changes into the DNA sequence that do not change the predicted folding of the corresponding RNA sequence.
  • The transcription processing (TP) domain is flanked by an upstream KasI (GGCGCC) site and a downstream AgeI (ACCGGT) site and can include splicing elements, 3’ UTRs, intersections of DNA/RNA transcription termination factors, polyadenylation (poly-A) sites (RNA-based), DNA-based elements, enhancers, and chromatin modification domains, all of which can modulate the time, place, and duration of transcription as well as post-transcriptional and pre-translational events.
  • The TP domain can be broken down into two subdomains: the P-5 subdomain (which is flanked by KasI (GGCGCC) and FspI (TGCGCA) restriction sites); and the P-3 subdomain (which is flanked by FspI (TGCGCA) and AgeI (ACCGGT) restriction sites).
  • RNA encoded regulator elements e.g., 3’UTR, polyadenylation signals, etc.
  • DNA encoded regulatory elements e.g., enhancers, silencers, insulators, etc.
  • The combination of elements encoded in the TR and TP domains controls where the TU’s effector is expressed.
  • TR domain upstream
  • TP domain downstream
  • DNA/chromatin 3-dimensional structure
  • The transcription unit linker (TUL) is a preserved domain that enables the addition of more transcription units within the same vector backbone. It is important to note that the added transcription units should have unique TR, E, and TP domains to prevent intra-vector recombination.
  • The TUL is flanked by an upstream XhoI (CTCGAG) site and a downstream NotI (GCGGCCGC) site.
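The domain boundaries described above can be expressed as a small check that splits a candidate transcription unit at its flanking sites. This is a sketch under stated assumptions, not the BoP software: `split_transcription_unit` and `find_all` are hypothetical helpers, and the requirement that each boundary site occur exactly once is an assumption of this example.

```python
# Boundary sites of a transcription unit, in 5'->3' order, as given in the text.
TU_BOUNDARIES = [
    ("SalI", "GTCGAC"),
    ("PvuI", "CGATCG"),
    ("ScaI", "AGTACT"),
    ("KasI", "GGCGCC"),
    ("AgeI", "ACCGGT"),
]
DOMAINS = ["TR", "E", "EL", "TP"]  # TR: SalI-PvuI, E: PvuI-ScaI, EL: ScaI-KasI, TP: KasI-AgeI


def find_all(seq, motif):
    """Start positions of every occurrence of motif, including overlapping ones."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def split_transcription_unit(seq):
    """Split a TU into its four domains; raise ValueError if the layout is wrong."""
    seq = seq.upper()
    starts = {}
    for name, site in TU_BOUNDARIES:
        hits = find_all(seq, site)
        if len(hits) != 1:
            raise ValueError(f"{name} ({site}) occurs {len(hits)} times; expected 1")
        starts[name] = hits[0]
    order = [starts[name] for name, _ in TU_BOUNDARIES]
    if order != sorted(order):
        raise ValueError("boundary sites are not in SalI-PvuI-ScaI-KasI-AgeI order")
    domains = {}
    for domain, (up, up_site), (down, _) in zip(DOMAINS, TU_BOUNDARIES, TU_BOUNDARIES[1:]):
        domains[domain] = seq[starts[up] + len(up_site):starts[down]]
    return domains


# Toy sequence with placeholder domain contents between the boundary sites.
toy_tu = "GTCGAC" + "AAAA" + "CGATCG" + "ATGTAA" + "AGTACT" + "CC" + "GGCGCC" + "TTTT" + "ACCGGT"
print(split_transcription_unit(toy_tu))
```

A sequence that still carries an internal copy of one of the boundary sites fails the check, which is the situation the site-removal rules in this example are meant to prevent.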
  • the algorithm to make protein encoding effector sequences compatible with the BoP system is split into three steps: Sanitize Sequence Input, BoPize Sequence, and Verify Sequence, as shown in Figure 14.
  • Sanitize Sequence Input shown in Figure 15
  • The input sequence is quality-control checked and standardized.
  • the algorithm checks to ensure that the protein encoding sequence is of a length that is divisible by three to ensure that the correct protein reading frame is used. Then, any white spaces are removed, and the text case is normalized.
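The sanitizing step can be sketched as follows. The function name `sanitize_cds` is hypothetical, and two details are assumptions of this sketch rather than statements from the description: whitespace is stripped before the length check so that line breaks do not affect the divisible-by-three test, and characters other than A, C, G, and T are rejected.

```python
def sanitize_cds(raw):
    """Return an upper-case, whitespace-free coding sequence or raise ValueError."""
    # Remove all whitespace (spaces, tabs, newlines) and normalize the case.
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    # Assumption of this sketch: only unambiguous DNA bases are accepted.
    unexpected = sorted(set(seq) - set("ACGT"))
    if unexpected:
        raise ValueError(f"unexpected characters: {unexpected}")
    # The coding sequence must be a whole number of codons.
    if len(seq) % 3 != 0:
        raise ValueError(f"length {len(seq)} is not divisible by three")
    return seq


print(sanitize_cds("atg gcc\nTAA"))
```

Stripping whitespace first is a design choice: a FASTA-style input wrapped across lines would otherwise fail the length test for reasons unrelated to the reading frame.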
  • The algorithm scans the sequence to find any of the following restriction sites: AarI (either GCAGGTG or CACCTGC), AfeI (AGCGCT), AgeI (ACCGGT), AscI (GGCGCGCC), AseI (ATTAAT), AsiSI (GCGATCGC), BsgI (either CTGCAC or GTGCAG), EcoRI (GAATTC), EcoRV (GATATC), FspI (TGCGCA), HindIII (AAGCTT), I-CeuI (CGTAACTATAACGGTCCTAAGGTAGCGAA; SEQ ID NO:8), I-SceI (TAGGGATAACAGGGTAAT; SEQ ID NO:9), KasI (GGCGCC), MfeI (CAATTG), MluI (ACGCGT), NotI (GCGGCCGC), NsiI (ATGCAT), PacI (TTAATTAA), PmeI (GTTTAA
  • the algorithm scans the sequence to find any of the following elements required for efficient splicing of RNA transcripts: Splice Donor site (AGGT), Branch point site (either CTAAC, CTAAT, CTGAC or CTGAT), and Splice Acceptor site (CAGG).
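The scanning step can be illustrated with a short motif search that also reports which codons each hit overlaps. This is a minimal sketch: `FORBIDDEN_MOTIFS` holds only a subset of the restriction sites named in this description plus the three splice elements, and `scan_cds` is a hypothetical helper, not the algorithm's actual implementation.

```python
# Subset of the motifs named in the text; each name maps to one or more sequences.
FORBIDDEN_MOTIFS = {
    "AfeI": ["AGCGCT"],
    "AgeI": ["ACCGGT"],
    "BsgI": ["CTGCAC", "GTGCAG"],
    "EcoRI": ["GAATTC"],
    "KasI": ["GGCGCC"],
    "splice donor": ["AGGT"],
    "branch point": ["CTAAC", "CTAAT", "CTGAC", "CTGAT"],
    "splice acceptor": ["CAGG"],
}


def scan_cds(cds, motifs=FORBIDDEN_MOTIFS):
    """Return sorted (start, name, pattern, codon_indices) tuples for every hit."""
    hits = []
    for name, patterns in motifs.items():
        for pattern in patterns:
            start = cds.find(pattern)
            while start != -1:
                end = start + len(pattern)
                # Zero-based indices of the codons that the motif overlaps.
                codons = list(range(start // 3, (end - 1) // 3 + 1))
                hits.append((start, name, pattern, codons))
                start = cds.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


for hit in scan_cds("ATGGAATTCAGGTAA"):
    print(hit)
```

For the toy sequence the scan reports an EcoRI site at position 3 (codons 1 and 2), a splice acceptor motif at position 8, and a splice donor motif at position 9; the codon indices are what a later substitution step would need.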
  • the algorithm uses the fact that most amino acids are encoded by multiple sequences (e.g., the amino acid leucine is encoded by the following six codons: CTA, CTC, CTG, CTT, TTA, or TTG). Replacing the original codon with another codon that encodes the same amino acid has no impact on the amino acid sequence but can dramatically change the DNA sequence of the CDS. However, as the frequency that each codon is used in genes is not constant, the algorithm systematically makes changes starting from the most commonly used codon (for Leucine, the most commonly used codon is CTG) to the less commonly used codon, with rare codon instances not used at all.
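The ordered synonymous-codon substitution can be sketched as below. The replacement orders in `CODON_RULES` are those this description gives for the amino acids it lists; everything else is an assumption of the sketch: `remove_site` is a hypothetical helper, it tries each codon overlapping the site in 5' to 3' order, it leaves codons for amino acids not in the table unchanged, it does not implement the broader-window search, and it does not check whether a swap creates a different forbidden motif.

```python
# amino acid -> (all synonymous codons, replacement order tried)
CODON_RULES = {
    "A": ("GCA GCC GCG GCT", "GCC GCT GCA"),
    "C": ("TGC TGT", "TGC TGT"),
    "D": ("GAC GAT", "GAC GAT"),
    "E": ("GAA GAG", "GAG GAA"),
    "F": ("TTC TTT", "TTC TTT"),
    "G": ("GGA GGC GGG GGT", "GGC GGG GGA"),
    "H": ("CAC CAT", "CAC CAT"),
    "I": ("ATA ATC ATT", "ATC ATT"),
    "K": ("AAA AAG", "AAG AAA"),
    "L": ("CTA CTC CTG CTT TTA TTG", "CTG CTC TTG"),
    "N": ("AAC AAT", "AAC AAT"),
    "P": ("CCA CCC CCG CCT", "CCC CCT CCA"),
    "Q": ("CAA CAG", "CAG CAA"),
    "R": ("AGA AGG CGA CGC CGG CGT", "AGA AGG CGG"),
    "S": ("AGC AGT TCA TCC TCG TCT", "AGC TCC TCT"),
    "T": ("ACA ACC ACG ACT", "ACC ACA ACT"),
    "V": ("GTA GTC GTG GTT", "GTG GTC GTT"),
}
CODON_TO_AA = {
    codon: aa
    for aa, (all_codons, _) in CODON_RULES.items()
    for codon in all_codons.split()
}


def remove_site(cds, site):
    """Remove every occurrence of `site` from an in-frame CDS by synonymous swaps."""
    cds, site = cds.upper(), site.upper()
    span = len(site)
    start = cds.find(site)
    while start != -1:
        replaced = False
        first_codon = start - start % 3
        for codon_start in range(first_codon, start + span, 3):
            original = cds[codon_start:codon_start + 3]
            amino_acid = CODON_TO_AA.get(original)
            if amino_acid is None:
                continue  # no rule in this sketch (e.g. ATG, TGG, stop codons)
            for candidate in CODON_RULES[amino_acid][1].split():
                if candidate == original:
                    continue
                trial = cds[:codon_start] + candidate + cds[codon_start + 3:]
                # Window covering every site occurrence that overlaps this codon.
                window = trial[max(0, codon_start - span + 1):codon_start + span + 2]
                if site not in window:
                    cds, replaced = trial, True
                    break
            if replaced:
                break
        if not replaced:
            raise ValueError(f"could not remove {site} at position {start}")
        start = cds.find(site)
    return cds


print(remove_site("ATGGAATTCCTGTAA", "GAATTC"))  # EcoRI site spanning Glu-Phe
```

In the usage line the GAA codon is swapped for GAG, which removes the EcoRI site while still encoding glutamic acid; a site that overlaps only codons without a rule raises an error, mirroring the error message described in the text.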
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid alanine (abbreviated as A or Ala and encoded by the codons GCA, GCC, GCG, or GCT), the original codon is replaced with GCC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with GCT and then, if the site or element still remains, with GCA; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid cysteine (abbreviated as C or Cys and encoded by the codons TGC or TGT), the original codon is replaced with TGC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with TGT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid aspartic acid (abbreviated as D or Asp and encoded by the codons GAC or GAT), the original codon is replaced with GAC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with GAT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid glutamic acid (abbreviated as E or Glu and encoded by the codons GAA or GAG), the original codon is replaced with GAG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with GAA, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid phenylalanine (abbreviated as F or Phe and encoded by the codons TTC or TTT), the original codon is replaced with TTC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with TTT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid glycine (abbreviated as G or Gly and encoded by the codons GGA, GGC, GGG, or GGT), the original codon is replaced with GGC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with GGG and then, if the site or element still remains, with GGA; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid histidine (abbreviated as H or His and encoded by the codons CAC or CAT), the original codon is replaced with CAC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with CAT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid isoleucine (abbreviated as I or Ile and encoded by the codons ATA, ATC, or ATT), the original codon is replaced with ATC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with ATT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid lysine (abbreviated as K or Lys and encoded by the codons AAA or AAG), the original codon is replaced with AAG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with AAA, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid leucine (abbreviated as L or Leu and encoded by the codons CTA, CTC, CTG, CTT, TTA, or TTG), the original codon is replaced with CTG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with CTC and then, if the site or element still remains, with TTG; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid asparagine (abbreviated as N or Asn and encoded by the codons AAC or AAT), the original codon is replaced with AAC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with AAT, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid proline (abbreviated as P or Pro and encoded by the codons CCA, CCC, CCG, or CCT), the original codon is replaced with CCC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with CCT and then, if the site or element still remains, with CCA; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid glutamine (abbreviated as Q or Gln and encoded by the codons CAA or CAG), the original codon is replaced with CAG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with CAA, and the algorithm again moves to the next restriction site if this change removes the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid arginine (abbreviated as R or Arg and encoded by the codons AGA, AGG, CGA, CGC, CGG, or CGT), the original codon is replaced with AGA, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with AGG and then, if the site or element still remains, with CGG; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • If the restriction enzyme site or splice element overlaps a codon encoding the amino acid serine (abbreviated as S or Ser and encoded by the codons AGC, AGT, TCA, TCC, TCG, or TCT), the original codon is replaced with AGC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the codon is replaced with TCC and then, if the site or element still remains, with TCT; after each replacement the algorithm moves to the next restriction site if the change removed the site. If the site or element remains after all these changes, a broader window is examined by looking at flanking amino acid-encoding sequences. If the restriction site or splice element still cannot be removed, the algorithm provides the user with an error message.
  • the restriction enzyme site or splice element overlaps a codon encoding the amino acid threonine (abbreviated as T or Thr and encoded by the following codon sequences: ACC, ACA, ACG, or ACT), the sequence of the original codon is replaced with ACC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the sequence of the original codon is replaced with ACA, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the sequence of the original codon is replaced with ACT, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains after all these changes, then a broader window is examined by looking at flanking amino acid encoding sequences. If the restriction site or splice element cannot be removed, then the algorithm provides the user with an error message.
  • the restriction enzyme site or splice element overlaps a codon encoding the amino acid valine (abbreviated as V or Val and encoded by the following codon sequences: GTA, GTC, GTG, or GTT), the sequence of the original codon is replaced with GTG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the sequence of the original codon is replaced with GTC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the sequence of the original codon is replaced with GTT, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains after all these changes, then a broader window is examined by looking at flanking amino acid encoding sequences. If the restriction site or splice element cannot be removed, then the algorithm provides the user with an error message.
  • the algorithm then examines a broader window by looking at flanking amino acid encoding sequences to remove the site. If the restriction site or splice element cannot be removed, then the algorithm provides the user with an error message.
  • the restriction enzyme site or splice element overlaps a codon encoding the amino acid tyrosine (abbreviated as Y or Tyr and encoded by the following codon sequences: TAC or TAT), the sequence of the original codon is replaced with TAC, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains, the sequence of the original codon is replaced with TAT, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains after all these changes, then a broader window is examined by looking at flanking amino acid encoding sequences. If the restriction site or splice element cannot be removed, then the algorithm provides the user with an error message.
  • the restriction enzyme site or splice element overlaps a stop codon (abbreviated as Ochre or Och, Amber or Amb, and Opal or Opa, and encoded by the codon sequences TAA, TAG, and TGA, respectively), the sequence of the original stop codon is replaced with TAG, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site remains, the sequence of the original stop codon is replaced with TAA, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site remains, the sequence of the original stop codon is replaced with TGA, and the algorithm moves to the next restriction site if this codon change removes the restriction site. If the site or element remains after all these changes, then a broader window is examined by looking at flanking amino acid encoding sequences. If the restriction site or splice element cannot be removed, then the algorithm provides the user with an error message.
  • the most important part of the last post-processing step is verification of the sequence.
  • the initial sequence is translated from DNA into an amino acid sequence and compared with the BOPized sequence after it has likewise been translated from DNA into an amino acid sequence, as shown in Figure 17.
  • sequences are added to both the 5’ and 3’ ends of the CDS to enable de novo synthesis as well as vector construction: upstream of the translation start site (ATG) at the 5’ end, a spacer sequence of 8 nucleotides in length is placed (GAGAGAGA) followed by a recognition site for the PvuI restriction endonuclease (CGATCG); while downstream of the translation stop site (TAA, TAG, or TGA) at the 3’ end, a recognition site for the ScaI restriction endonuclease (AGTACT) is placed, followed by a spacer sequence of 8 nucleotides in length (GAGAGAGA).
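
The steps listed above (ordered synonymous codon replacement, verification by translation, and addition of the 5’ and 3’ flanking sequences) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names (`translate`, `remove_site`, `add_flanks`) are hypothetical, the input is assumed to be an uppercase, in-frame CDS, and the preference table contains only the residues listed in this excerpt. The "broader window" step is not specified in detail here, so the sketch only tries each codon that overlaps the site and raises an error if no single-codon swap works.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Replacement order per residue, copied from the bullets above.
# Residues not listed in this excerpt are deliberately left out.
PREFERRED = {
    "R": ["AGA", "AGG", "CGG"],
    "S": ["AGC", "TCC", "TCT"],
    "T": ["ACC", "ACA", "ACT"],
    "V": ["GTG", "GTC", "GTT"],
    "Y": ["TAC", "TAT"],
    "*": ["TAG", "TAA", "TGA"],
}

SPACER = "GAGAGAGA"
PVUI_SITE = "CGATCG"
SCAI_SITE = "AGTACT"


def translate(cds):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def site_positions(seq, site):
    """Start index of every (possibly overlapping) occurrence of site."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def _fix_first_hit(cds, site, hits):
    """Try the preferred codons on each codon overlapping the first hit."""
    start = hits[0]
    first = start // 3
    last = (start + len(site) - 1) // 3
    for idx in range(first, last + 1):
        codon = cds[3 * idx:3 * idx + 3]
        for alt in PREFERRED.get(CODON_TABLE[codon], []):
            candidate = cds[:3 * idx] + alt + cds[3 * idx + 3:]
            # Accept only if the total number of sites strictly drops,
            # so a swap that creates a new site elsewhere is rejected.
            if len(site_positions(candidate, site)) < len(hits):
                return candidate
    return None


def remove_site(cds, site):
    """Remove every occurrence of site using synonymous swaps only."""
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of 3")
    original_protein = translate(cds)
    hits = site_positions(cds, site)
    while hits:
        fixed = _fix_first_hit(cds, site, hits)
        if fixed is None:
            raise ValueError(f"could not remove {site} at position {hits[0]}")
        cds = fixed
        hits = site_positions(cds, site)
    # Verification step: the encoded protein must be unchanged.
    if translate(cds) != original_protein:
        raise ValueError("protein sequence changed during site removal")
    return cds


def add_flanks(cds):
    """Add spacer + PvuI site upstream and ScaI site + spacer downstream."""
    return SPACER + PVUI_SITE + cds + SCAI_SITE + SPACER


if __name__ == "__main__":
    cleaned = remove_site("ATGAGTACTTAA", SCAI_SITE)
    print(cleaned, translate(cleaned))
    print(add_flanks(cleaned))
```

In the demo, the internal ScaI site spans a serine codon (AGT) and a threonine codon (ACT). The first preferred serine codon, AGC, already removes the site, so the threonine codon is left untouched. Requiring the site count to drop with each accepted swap is a design choice of this sketch: it guarantees the loop terminates.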

Landscapes

  • Genetics & Genomics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Cloning vectors and methods useful for building multigenic and/or polycistronic constructs for delivery to cells and/or organisms are disclosed. The vectors are genetically engineered to comprise unique acceptor domains that independently allow sequential or iterative addition of compliant DNA and/or RNA inserts. Insertion of a first nucleotide insert into an acceptor domain destroys a complementary restriction site at a 3' end of the first insert segment, while the 3' restriction site is regenerated at the insertion point of the cloning vector to form a circular vector with a first insert. Subsequent cleavage at a 3' end of the first insert creates an insertion point for a second insert segment, and iterative assembly of additional insert segments allows each subsequent cleavage to be made at a 3' end of a growing linear chain of a desired number of inserts.
PCT/US2024/036635 2023-07-03 2024-07-03 Vectors and methods for constructing polycistronic and/or multigenic transgenes Pending WO2025010306A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363511725P 2023-07-03 2023-07-03
US63/511,725 2023-07-03

Publications (2)

Publication Number Publication Date
WO2025010306A2 true WO2025010306A2 (fr) 2025-01-09
WO2025010306A3 WO2025010306A3 (fr) 2025-05-08

Family

ID=94172269

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/036635 Pending WO2025010306A2 (fr) 2023-07-03 2024-07-03 Vectors and methods for constructing polycistronic and/or multigenic transgenes

Country Status (1)

Country Link
WO (1) WO2025010306A2 (fr)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080050808A1 (en) * 2002-10-09 2008-02-28 Reed Thomas D DNA modular cloning vector plasmids and methods for their use
US20110008831A1 (en) * 2005-05-26 2011-01-13 Cytos Biotechnology Ag Scalable fermentation process
WO2021081353A1 (fr) * 2019-10-23 2021-04-29 Checkmate Pharmaceuticals, Inc. Synthetic RIG-I-like receptor agonists

Also Published As

Publication number Publication date
WO2025010306A3 (fr) 2025-05-08

Similar Documents

Publication Publication Date Title
US20230279391A1 (en) Systems, methods, and compositions for site-specific genetic engineering using programmable addition via site-specific targeting elements (paste)
AU2019327449A1 (en) Methods and compositions for modulating a genome
US20190038780A1 (en) Vectors and system for modulating gene expression
AU2005248371B2 (en) Methods for dynamic vector assembly of DNA cloning vector plasmids
Carninci et al. Balanced-size and long-size cloning of full-length, cap-trapped cDNAs into vectors of the novel λ-FLC family allows enhanced gene discovery rate and functional analysis
WO1998012339A9 (fr) Viral vectors and their uses
AU762274B2 (en) Expression vectors containing hot spot for increased recombinant protein expression in transfected cells
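CN109929839B (zh) Split-type single-base gene editing system and application thereof
WO2002008408A2 (fr) Modular vector systems
CN113913405A (zh) System and method for editing nucleic acids
CN101511994A (zh) Protein production using eukaryotic cell lines
CN114072510A (zh) Recombinant transfer vectors for protein expression in insect cells and mammalian cells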
US12275942B2 (en) Method for producing DNA vectors from molecular bricks containing sequences of interest
Zhao et al. Efficient and reproducible multigene expression after single-step transfection using improved bac transgenesis and engineering toolkit
JP5791508B2 (ja) Artificial chromosome vector
WO2025010306A2 (fr) Vectors and methods for constructing polycistronic and/or multigenic transgenes
WO2002044415A1 (fr) Method for screening DNA libraries and producing recombinant DNA constructs
EP3853361A1 (fr) Methods and compositions for intron-based universal cloning
US20170096679A1 (en) Eukaryotic expression vectors resistant to transgene silencing
Venken et al. Synthetic assembly DNA cloning to build plasmids for multiplexed transgenic selection, counterselection or any other genetic strategies using Drosophila melanogaster
JP2024514961A (ja) Stable production systems for lentiviral vector production
CN118374547A (zh) Antibiotic-free miniplasmid, preparation method and application thereof
EP2716759A1 (fr) Gene targeting vector, method for manufacturing same, and method for using same
EP2758531B1 (fr) Introduction of modular vector elements during production of lentivirus
US20240417728A1 (en) Folding oligonucleotides

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24836561

Country of ref document: EP

Kind code of ref document: A2