WO2025038955A1 - Systems and methods for improving safety of split intein aav mediated gene therapy - Google Patents

Systems and methods for improving safety of split intein AAV mediated gene therapy

Info

Publication number
WO2025038955A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotide sequence
intein
seq
scn1a
sodium channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/042744
Other languages
French (fr)
Other versions
WO2025038955A9 (en)
Inventor
Franck KALUME
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seattle Childrens Hospital
Original Assignee
Seattle Childrens Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seattle Childrens Hospital
Publication of WO2025038955A1
Publication of WO2025038955A9
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A61K48/0058 Nucleic acids adapted for tissue specific expression, e.g. having tissue specific promoters as part of a contruct
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/95 Fusion polypeptide containing a motif/fusion for degradation (ubiquitin fusions, PEST sequence)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/40 Systems of functionally co-operating vectors

Definitions

  • This invention relates to split intein mediated gene expression systems incorporating a degradation signal for removal of gene expression byproducts.
  • AAV adeno-associated virus
  • PTS split intein-mediated protein trans-splicing
  • split-inteins are expressed as two independent polypeptides (N-intein and C-intein) at the extremities of the host polypeptides (N-polypeptide and C-polypeptide) and remain inactive until encountering their complementary partner.
  • the reconstituted intein excises itself from the host protein while mediating ligation of the N- and C-polypeptides via a peptide bond, in a traceless manner.
  • AAV intein vectors encode components
  • OWOPT e.g., excised inteins
  • AAV intein vectors serve as the delivery vehicle for expression and reconstitution of large protein molecules, to supplement or salvage the function of protein molecules that are endogenously deficient or mutated due to one or more diseases or disorders.
  • Epilepsy is a neurological disorder that occurs when the brain presents an enduring predisposition to generate two or more epileptic seizures.
  • An epileptic seizure is a temporary disruption of brain function due to abnormal excessive or synchronous neuronal activity. Its manifestation may include periods of unusual behavior, sensations and sometimes loss of consciousness.
  • Dravet Syndrome (DS) particularly is a rare and catastrophic form of intractable epilepsy that begins in infancy. Initially, the patient experiences prolonged seizures.
  • Sodium channels are made up of large alpha subunits that may associate with accessory proteins, such as beta subunits.
  • An alpha subunit forms the core of the channel and is functional on its own.
  • When the alpha subunit protein is expressed by a cell, it is able to form a pore in the cell membrane that conducts Na+ in a voltage-dependent way, even if beta subunits or other known modulating proteins are not expressed.
  • When accessory proteins assemble with α subunits, the resulting complex can display altered voltage dependence and cellular localization.
  • a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; and (c) a third expression construct comprising a polynucleotide sequence of a gene encoding the sodium channel alpha subunit; and
  • a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; wherein the first expression construct, the second expression construct, or both further comprises a polynucleotide sequence en
  • a system includes the first expression construct and the second expression construct, either one or both further comprising the polynucleotide sequence encoding the degron, and a third expression construct coding for a same or different degron.
  • a system for expressing one or more coding sequences includes (a) a first expression construct coding for a first fusion protein comprising a first segment of a sodium channel alpha subunit and an N-fragment of a split intein, wherein the first segment of the sodium channel alpha subunit is at the N-terminus relative to the N-fragment of the split intein; (b) a second expression construct coding for a second fusion protein comprising a C-fragment of the split intein and a second segment of the sodium channel alpha subunit, wherein the second segment of the sodium channel alpha subunit is at the C-terminus relative to the C-fragment of the split intein; and (c) a polynucleotide sequence encoding a degron, wherein the polynucleotide sequence encoding the degron is in a third expression construct different from the first or the second expression constructs OR is within the first and/or the second expression constructs.
  • the first fusion protein and the second fusion protein are spliced together, thereby joining the first segment and the second segment of the sodium channel alpha subunit and joining the N-fragment and the C-fragment of the split intein.
  • the system comprising the first, second, and third expression constructs is co-transduced to a cell for expression of the first fusion protein, the second fusion protein, and the degron in the cell.
  • the split intein is one having a fast splicing rate (e.g., a half-life within 5 min), such as consensus fast intein.
  • the degron is a polypeptide having no more than 30 amino-acid residues in length.
  • the split intein in the system is consensus fast intein, and the degron in the system is one having 9-26 or no more than 30 amino-acid residues in length.
  • the splicing that joins the first and second segments of the sodium channel alpha subunit takes place faster than degron-mediated degradation, resulting in the joined segments (preferably a full sodium channel alpha subunit) locating in a cell membrane, whereas the joined intein is degraded via the degron.
  • the first and/or the second expression construct further comprises an enhancer sequence, configured for targeted expression within a targeted cell type.
  • the expression construct(s) further comprises a promoter sequence.
  • the first and/or the second expression construct further comprises an intron having a polynucleotide sequence of SEQ ID NO: 107.
  • the first and/or the second expression construct further comprises the enhancer sequence, the promoter, and the intron.
  • the gene encoding the sodium channel alpha subunit is selected from the group consisting of SCN1A, SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, and SCN7A.
  • the sodium channel alpha subunit comprises sodium channel protein type 1 subunit alpha
  • the gene encoding the sodium channel alpha subunit comprises SCN1A.
  • the sodium channel alpha subunit comprises human sodium channel protein type 1 subunit alpha isoform 2.
  • the sodium channel alpha subunit comprises a variant of human sodium channel protein type 1 subunit alpha isoform 2 having an amino acid substitution of A1056T, wherein amino acid residue numbering is according to NCBI accession number NP_001340878.1.
  • the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’ : the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the N-intein - the polynucleotide sequence encoding the degron.
  • the degron has an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO: 92).
  • the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the degron - the polynucleotide sequence encoding the N-intein.
  • the second expression construct comprises the polynucleotide sequence encoding the degron, and when expressed, the degron is two or more amino acid residues at the N-terminus relative to a protein product encoded by the second portion of the polynucleotide sequence of the SCN1A.
  • the degron has an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93).
  • the second expression construct comprises the polynucleotide sequence encoding the degron, and the second expression construct comprises from 5’ to 3’: a first portion of the polynucleotide sequence encoding the C-intein - the polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence encoding the C-intein - the second portion of the polynucleotide sequence of the SCN1A, wherein, when expressed, a protein product of the first portion of the polynucleotide sequence encoding the C-intein and a protein product of the second portion of the polynucleotide sequence encoding the C-intein together form the C-intein.
  • a protein product of the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit and a protein product of the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the sodium channel alpha subunit.
  • the degron has an amino acid sequence selected from the group consisting of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO: 92), and GSLIIFIIL (SEQ ID NO:93).
  • the breakpoint of the first and second segment of the sodium channel alpha subunit is at a place wherein the first residue of the C-terminus segment (fragment) is Cys, Ser, or Thr.
  • the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-1049 and residues 1050-1998 of sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
  • the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-956 and residues 957-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
  • the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-947 and residues 948-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60).
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62).
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).
  • the split intein comprises consensus fast intein (Cfa);
  • the degron is a polypeptide 5-30 amino-acid residues in length, or preferably 9-26 amino-acid residues in length; and a polypeptide product encoded by the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit starts with a cysteine, serine, or threonine residue.
  • the first segment and the second segment of the sodium channel alpha subunits are even or about even sized; or the lengths of segments are not more than 20%, 30%, 40%, or 50% different.
  • the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58).
  • the first expression construct and the second expression construct independently further comprise the promoter sequence selected from a minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3, an hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 52, or a CMV promoter having a polynucleotide sequence of SEQ ID NO:53; optionally a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54.
  • the first expression construct and the second expression construct independently further comprise the minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3.
  • the enhancer sequence is configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a targeted central nervous system cell type.
  • the targeted central nervous system cell type is GABAergic neuron, glutamatergic neuron, or both cell types.
  • the targeted central nervous system cell type is GABAergic interneuron.
  • the enhancer sequence is set forth in SEQ ID NO: 2 (DLX2.0).
  • the enhancer sequence has a concatemerized core having a polynucleotide sequence of SEQ ID NO: 1.
  • the enhancer sequence is a concatemerized repeat (2, 3, 4, 5, 6, 7, 8, 9, 10, or more contiguous repeats) of a polynucleotide sequence of SEQ ID NO: 1.
  • an enhancer sequence is set forth in SEQ ID NO: 55 (eHGT_078h), or the targeted central nervous system cell type comprises a glutamatergic neuron.
  • the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a selected central nervous system cell type.
  • the miRNA binding site sequence is set forth in SEQ ID NO: 56 (4x2C miRNA binding site) or SEQ ID NO:87 (8x2C miRNA binding site), or the selected central nervous system cell type comprises a pan-GABAergic neuron.
  • An artificial expression construct is also provided, which includes the first, the second and/or the third expression construct of a system disclosed herein, wherein each expression construct is associated with a capsid that crosses the blood brain barrier.
  • a capsid that crosses the blood brain barrier comprises PHP.eB.
  • a capsid that crosses the blood brain barrier comprises AAV-BR1.
  • a capsid that crosses the blood brain barrier comprises AAV-PHP.S.
  • a capsid that crosses the blood brain barrier comprises AAV-PHP.B.
  • a capsid that crosses the blood brain barrier comprises AAV-PPS.
  • an administrable composition which includes one or more artificial expression constructs disclosed herein, preferably in association with a capsid that crosses blood brain barrier; and a pharmaceutically acceptable excipient.
  • a transgenic cell comprising a system of one or more expression constructs disclosed herein.
  • the transgenic cell comprises a GABAergic neuron, or more specifically a GABAergic interneuron.
  • the transgenic cell comprises a glutamatergic neuron.
  • Methods are also provided for rescuing voltage-gated sodium channel function within a targeted population of cells, the method comprising: co-administering a therapeutically effective amount of the two or more expression constructs of a system disclosed herein to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
  • Methods are also provided for administering a system of expression constructs to a subject in need thereof, the method comprising administering a therapeutically effective amount of a system disclosed herein, or co-administering two or more expression constructs of a system disclosed herein, to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
  • the subject has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
  • the subject is a pediatric patient having Dravet syndrome.
  • FIG. 1A A functional split-intein design of SCN1A that reconstitutes functional Nav1.1 activity.
  • FIG. 1A Design of split-intein fusion protein halves of SCN1A. We inserted the breakpoint and Cfa-N and Cfa-C split-intein peptides just before the native Cys1050 with no additional amino acids. After joining, the Cfa-N and Cfa-C intein fragments self-excise and yield scarless reconstituted SCN1A. HA and FLAG epitopes are inserted at the N- and C-termini of the N- and C-terminal halves for detection. (FIG.
  • FIG. 1C Reconstitution of full-length SCN1A after co-transfection of split-intein fusion protein halves into HEK-293 cells. We analyzed whole cell protein preparations by western blot for HA epitope tag after transfection.
  • Anti-tubulin is a loading control. (FIG.
  • Exemplary currents evoked in HEK293 cells transfected with full-length SCN1A SCN1A-FL, green
  • a combination of SCN1A-Ntm and SCN1A-Ctm SCN1A-N+C, orange
  • SCN1A-Ntm only SCN1A-N, blue
  • SCN1A-Ctm only SCN1A-C, blue
  • Co-transfection with SCN1A-Ntm and SCN1A-Ctm produced functional Nav1.1 currents comparable in size to full-length SCN1A.
  • Scale bar 1 nA, 10 msec.
  • N and C are not significantly different from GFP and pi/2 controls (p > 0.05, ns).
  • Statistical comparisons performed by pairwise Mann-Whitney U tests. All HEK293 transfections performed in the background of a separate SCN1B/2B expression plasmid, at a ratio of 10:1. (FIG.
  • SCN1A-FL full-length human SCN1A
  • SCN1A-N N-terminal SCN1A-Cfa intein
  • SCN1A-C Cfa intein-C-terminal SCN1A
  • FIG. 2A Cell class-specific delivery of SCN1A to telencephalic GABAergic interneurons using optimized enhancer DLX2.0.
  • FIG. 2A Recombinant AAV2/PHP.eB vectors for delivery of DLX2.0-split-intein-SCN1A.
  • FIG. 2B Efficient SCN1A reconstitution in mouse brain with DLX2.0-split-intein-SCN1A vectors.
  • In panels 2B-2F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector.
  • FIG. 2C Specific detection of HA- and FLAG- expressing GABA+ cells in cortex. HA- and FLAG-expressing cells can be observed when either half is delivered alone or together. Layer 2/3 of VISp is shown at P20.
  • FIG. 2D Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in scattered telencephalic neurons. Expression is pseudo-colored black.
  • FIG. 2E Representative stitched fluorescence image of HA and FLAG epitopes, and Gad67+ neurons in VISp. In panels D-E we show expression at P47.
  • FIG. 2F High specificity and completeness of expression within Gad67+ neurons in multiple telencephalic regions across multiple animals. We counted cells that express both HA and FLAG epitopes in visual VISp, MO, and HPF. Layer 1 was excluded from VISp and MO analysis due to DLX2.0-PHP.eB vectors inefficiently targeting that layer. Each point represents one mouse, bars represent the means, and error bars represent standard error of the mean. Mice span ages P47-P139, mean age P85. Abbreviations: CTX cerebral cortex, OLF olfactory areas, HPF hippocampal formation, OT olfactory tubercle, STR striatum, VISp primary visual cortex, MO motor cortex.
  • FIG. 3A Experimental timeline to rescue of epileptic symptoms in DS model mice.
  • Some animals were tested for seizure susceptibility by thermal challenge between P25-P35.
  • FIG. 3B Mortality protection in DS model mice after treatment with DLX2.0-SCN1A AAVs.
  • the untreated control groups in FIGs. 3B-3E represent the same set of untreated animals as that shown in FIGs. 7B and 7D-7F.
  • FIG. 3C-3E Protection from heat-induced seizures in DS model mice after treatment with DLX2.0-SCN1A AAVs.
  • FIG. 4A-4F Recovery of spontaneous epileptic symptoms in DS model mice with DLX2.0-SCN1A AAVs.
  • FIG. 4A-4B Interictal spike reduction in DS model mice. We implanted Scn1afl/+; Meox2-Cre DS model mice with ECoG/EMG electrodes, which revealed frequent interictal spikes generalized across the brain in both left (L) and right (R) channels as in our previous work (refs. 45, 50).
  • FIG. 4A Example interictal spikes are shown in untreated mice.
  • FIG. 4E-4F Spontaneous GTCs in DS model mice.
  • FIG. 4E Example spontaneous GTC seizure observed in an untreated Scn1afl/+; Meox2-Cre DS model mouse.
  • FIGs 5A-5D Recovery of severe epileptic phenotypes in mice lacking SCN1A in telencephalic GABAergic interneurons using DLX2.0-SCN1A AAVs.
  • FIG. 5A Genetic cross resulting in 100% Scn1a+/fl; Dlx5/6-Cre animals.
  • FIG. 6A Nonselective delivery of SCN1A to neurons using hSyn1 promoter.
  • FIG. 6A Recombinant AAV2/PHP.eB vectors for delivery of hSyn1-split-intein-SCN1A.
  • FIG. 6B Efficient SCN1A reconstitution in mouse brain with hSyn1-split-intein-SCN1A vectors.
  • In panels 6B-6F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector (6e10 total gc in the N+C animals).
  • FIG. 6C HA- and FLAG-expressing NeuN+ and Gad67+ cells in cortex after co-injection with N- and C-terminal vectors.
  • White arrows indicate NeuN+ and Gad67+ cells that express FLAG but not HA.
  • Cyan arrows indicate NeuN+ and Gad67+ cells that express both FLAG and HA.
  • Layer 2/3 of VISp is shown at P84.
  • FIG. 6D Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in neurons after co-injection with N- and C-terminal vectors. Expression is pseudo-colored black. Expression shown at P84.
  • FIG. 6E Representative stitched fluorescence image of HA and FLAG epitopes and NeuN+ neurons throughout the layers of VISp. Expression shown at P84.
  • FIG. 6F Quantification of specificity and completeness of expression within Gad67+ and NeuN+ neurons in multiple telencephalic regions.
  • FIG. 7A Early pre-weaning toxicity and weak protection from epileptic symptoms with nonselective SCN1A vectors.
  • FIG. 7A Experimental timeline to rescue of epileptic symptoms in DS model mice with nonselective hSyn1 promoter-driven SCN1A vectors.
  • FIG. 7B-7C Preweaning mortality in DS model mice after treatment with nonselective SCN1A AAVs.
  • FIG. 7C From analysis of recovered genotypes at P21 we inferred that both DS and littermate control animals were similarly affected by nonselective SCN1A AAV lethality. FD: found dead.
  • FIG. 7D-7F Protection from heat-induced seizures in DS model mice after treatment with high-dose nonselective SCN1A AAVs.
  • FIG. 8A-8E Isoform usage and allele prevalence of human SCN1A.
  • SCN1A isoform usage across cortical cell type subclasses in mice (8A) and humans (8B).
  • Mouse VISp cortical cell type-specific RNA-seq profiles are from Tasic, et al., Nature 563, 72 (2016) and human middle temporal gyrus (MTG) cortical cell type-specific profiles are from Hodge, et al., Nature 573, 61-68 (2019).
  • Genome-aligned reads are aggregated according to cell type subclasses, and visualized as pileups on UCSC genome browser alongside the positions of the exon whose splice donor usage determines whether the 2009-, 1998-, or 1981-amino acid isoform of SCN1A is expressed. Regions shown: mm10 chr2:66324527-66325003, hg38 chr2:166043623-166044099 (reverse complement reference sequences for legibility). Full vertical scale represents 0.65 (mouse) or 0.4 (human) read counts per million. (FIG. 8C) Alignment of mammalian SCN1A protein sequences.
  • the alanine residue at position 1056 in the NCBI RefSeq human SCN1A sequence is orthologous to a conserved threonine residue in most other mammalian species. Additionally, Origene commercial clones of human SCN1A contain threonine residue at this position, but agree at all other positions.
  • DNA sequences represent unique (non-PCR duplicate) reads from assay for transposase-accessible chromatin with sequencing (ATAC-seq) from three independent human patient brain samples (H17.26.001, H17.26.003, H18.03.001).
  • FIG. 8E Population allele frequencies in human SCN1A via gnomAD database.
  • gnomAD v3 and v2 populations represent partially overlapping healthy patient populations subject to genome and/or exome sequencing.
  • the threonine-encoding allele is represented by a T on the + strand (corresponding to A on the - strand) in 73-74% of the population.
  • Figures 9A-9C Cell class specificity observed with an independent marker of telencephalic GABAergic interneurons.
  • AAV vector was produced at PackGene, and expression was analyzed at P30-P35 with both sagittal and coronal sections (n = 4 mice analyzed).
  • FIG. 10A Biodistribution and rescue of epileptic symptoms from independently produced batches of DLX2.0-split intein SCN1A vectors. Specific expression of HA-tagged N-terminal and FLAG-tagged C-terminal SCN1A half-channels in Gad67+ neurons with independent packaging of DLX2.0-SCN1A AAVs. Animals were dosed at low dose (1e10 gc each vector) or high dose (3e10 gc each vector) of DLX2.0-SCN1A AAVs by BL-ICV at P0-P3.
  • FIG. 10A Telencephalic GABAergic interneuron specificity is maintained while completeness increases with greater dose of AAV.
  • FIG. 10B-10E Protection from mortality and heat-induced seizures in Scn1afl/+; Meox2-Cre mice with independently packaged DLX2.0-SCN1A AAV vectors.
  • FIGS. 11A-11E Rescue of mortality and epileptic symptoms in a second independent mouse model of DS.
  • FIG. 11A Testing DLX2.0-split-intein-SCN1A in an independent mouse model of DS.
  • FIG. 11B Mortality monitoring after DLX2.0-split-intein-SCN1A administration.
  • N+C DLX2.0-SCN1A provides highly significant complete rescue from mortality to beyond 365 days.
  • FIG. 11C Seizure monitoring and ECoG in DS model mice. After P70, we implanted headmounts and monitored animals for seizures and epileptic activity by paired ECoG with channels in somatosensory (SS) and parietal (P) cortex, EMG, and video monitoring.
  • SS somatosensory
  • P parietal cortex
  • Example #1 untreated Scn1a+/R613X mouse displays several non-uniformly distributed GTC seizures over 228 hours of recording (one GTC seizure shown), as well as interictal spikes marked by red dots (zoom in on one example spike).
  • Example #2 N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse displays no GTCs and few spikes.
  • Example #3 N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse shows no GTCs but frequent spikes with aberrant spike morphology, likely an outlier. (FIG. 11D) Protection from GTCs with treatment in DS model mice.
  • “Intein” refers to a polypeptide sequence capable of catalyzing a protein splicing reaction that excises the intein sequence from its host protein and joins the flanking sequences (N- and C-exteins) with a peptide bond. Intein excision is a posttranslational process that does not require auxiliary enzymes or cofactors. This self-excision process is called “protein splicing,” by analogy to the splicing of RNA introns from pre-mRNA (Perler F et al, Nucl Acids Res. 22:1125-1127 (1994)).
  • the segments are called “intein” for internal protein sequence, and “extein” for external protein sequence, with upstream exteins termed “N-exteins” and downstream exteins called “C-exteins.”
  • the products of the protein splicing process are two stable proteins: the mature protein and the intein.
  • Inteins are typically 150-550 amino acids in size and may also contain a homing endonuclease domain.
  • a list of known inteins, and exemplary mutually orthogonal split inteins, are shown at www.inteins.com, described by Pinto et al. in Nature Communications 2020, 11:1529, and provided in for example US2023/0116688, which is incorporated by reference herein.
  • inteins share a low degree of sequence similarity, with conserved residues only at the N- and C-termini. Most inteins begin with Ser or Cys and end in His-Asn or in His-Gln. In various embodiments, the first amino acid of the C-extein is an invariant Ser, Thr, or Cys, but the residue preceding the intein at the N-extein is not conserved.
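The terminal-residue pattern described above can be written as a simple check. The following is a minimal sketch, not a validated intein predictor: the function name and the toy sequences are hypothetical, and it encodes only the rules stated in the text (first intein residue Ser or Cys, last two intein residues His-Asn or His-Gln, first C-extein residue Ser, Thr, or Cys).

```python
def has_typical_intein_termini(intein: str, c_extein_first: str) -> bool:
    """Check the terminal-residue pattern described in the text (one-letter codes)."""
    intein = intein.upper()
    starts_ok = intein[:1] in {"S", "C"}        # intein begins with Ser or Cys
    ends_ok = intein[-2:] in {"HN", "HQ"}       # intein ends in His-Asn or His-Gln
    plus_one_ok = c_extein_first.upper() in {"S", "T", "C"}  # first C-extein residue
    return starts_ok and ends_ok and plus_one_ok


# Toy sequences for illustration only; they are not real inteins.
print(has_typical_intein_termini("CAAAAHN", "C"))  # True
print(has_typical_intein_termini("AAAAAHN", "C"))  # False
```

Because the text says "most" inteins follow this pattern, a negative result would not by itself rule out a functional intein.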
  • split intein refers to any intein in which the N-terminal and C-terminal amino acid sequences are not directly linked via a peptide bond, such that the N-terminal and C-terminal sequences become separate fragments that can non-covalently re-associate, or reconstitute, into an intein that is functional for trans-splicing reactions.
  • a split intein involves two complementary half inteins, termed the N-intein and C-intein, that associate selectively and extremely tightly to form an active intein enzyme (Shah N.H., et al, J. Amer. Chem. Soc. 135:18673-18681; Dassa B., et al, Nucl. Acids Res., 37:2560-2573 (2009)).
  • the two fragments of the split intein are encoded by two separately transcribed and translated genes. These so-called split inteins self-associate and catalyze protein-splicing activity in trans.
  • split intein N-fragment refers to any intein sequence that comprises an N-terminal amino acid sequence that is functional for trans-splicing reactions, that is, one that is capable of associating with a functional split intein C-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split intein C-fragment catalyzes the “N-terminal cleavage”, that is, the nucleophilic attack of the peptide bond between the extein and the N-terminus of the split intein N-fragment resulting in the breaking of said peptide bond.
  • An IntN thus also comprises a sequence that is spliced out when trans-splicing occurs.
  • An IntN can comprise a sequence that is a modification of the N-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntN non-functional in trans-splicing.
  • the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntN.
  • N-intein is fused to the N-terminal fragment of a protein to be reconstituted, wherein the N-intein is at the C-terminus relative to the N-fragment of the protein to be reconstituted.
  • split intein C-fragment refers to any intein sequence that comprises a C-terminal amino acid sequence that is functional for trans-splicing reactions, that is, one that is capable of associating with a functional split intein N-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split N-intein catalyze
  • An IntC thus also comprises a sequence that is spliced out when trans-splicing occurs.
  • An IntC can comprise a sequence that is a modification of the C-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntC non-functional in trans-splicing.
  • the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntC.
  • the C-intein is fused to the C-terminal fragment of a protein to be reconstituted, wherein the C-intein is at the N-terminus relative to the C-terminal fragment of the protein to be reconstituted.
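The arrangement of the two halves described above (N-extein followed by IntN, and IntC followed by C-extein) can be illustrated with a toy model. This is a sketch under stated assumptions: the class and function names are hypothetical, the strings are placeholders rather than real sequences, and the model only tracks which segments end up in which product, not any chemistry.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NHalf:
    """N-terminal precursor, written N- to C-terminus: N-extein then IntN."""
    n_extein: str
    int_n: str


@dataclass(frozen=True)
class CHalf:
    """C-terminal precursor, written N- to C-terminus: IntC then C-extein."""
    int_c: str
    c_extein: str


def trans_splice(n_half: NHalf, c_half: CHalf) -> Tuple[str, List[str]]:
    """Return (reconstituted protein, spliced-out intein fragments)."""
    # The exteins are joined by a peptide bond; both intein halves are spliced out.
    reconstituted = n_half.n_extein + c_half.c_extein
    spliced_out = [n_half.int_n, c_half.int_c]
    return reconstituted, spliced_out


protein, intein_parts = trans_splice(
    NHalf(n_extein="[N-fragment]", int_n="[IntN]"),
    CHalf(int_c="[IntC]", c_extein="[C-fragment]"),
)
print(protein)       # [N-fragment][C-fragment]
print(intein_parts)  # ['[IntN]', '[IntC]']
```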
  • “Degrons,” “degradation signal,” or “destabilizing domain” refers to a naturally-occurring or artificially-constructed polypeptide sequence which, when recombinantly fused to another polypeptide, accelerates degradation of that polypeptide via the proteasomal degradation pathway, or any other cellular degradation mechanism.
  • Enhancer or “enhancer element” refers to a cis-acting sequence that increases the level of transcription associated with a promoter and can function in either orientation relative to the promoter and the coding sequence that is to be transcribed and can be located upstream or downstream relative to the promoter or the coding sequence to be transcribed.
  • an “enhancer” is a DNA regulatory element that confers cell type specificity of gene expression.
  • a targeted central nervous system cell type enhancer is an enhancer that is uniquely or predominantly utilized by the targeted central nervous system cell type; and a targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system cell type, but does not substantially direct expression of genes in other non-targeted cell types, thus having neural specific transcriptional activity.
  • enhancers especially interneuron-specific enhancers, are provided in US2021/0348195 and US2018/0078658, which are incorporated by reference.
  • Neurons found in the mammalian (e.g., human) nervous system can be divided into three classes based on their roles: sensory neurons, motor neurons, and interneurons.
  • Targeted cell types can be identified based on transcriptional profiles, such as those described in Tasic et al., 2018 Nature.
  • GABAergic interneurons express GABA synthesis genes Gad1/GAD1 and/or Gad2/GAD2; whereas glutamatergic neurons express glutamate transporters Slc17a6 and/or Slc17a7.
  • Ion transporters are transmembrane proteins that mediate transport of ions across cell membranes.
  • ion transporters include voltage gated sodium channels, potassium channels, and calcium channels.
  • Mammalian voltage-gated sodium (Nav) channels are composed of a highly glycosylated ~260 kDa α subunit, the pore-forming protein, linked via disulfide bonds to β2/β4 subunits and non-covalently with β1/β3 subunits.
  • SCN1A-SCN9A Nav1 α subunit genes
  • the Nav channel α subunit is a complex of transmembrane helices surrounding a central ion-conducting pore, usually capable of producing functional channels in a heterologous expression system. Approximately 2000 amino acid residues are arranged in 4 homologous domains, each consisting of 6 transmembrane segments, and a hairpin loop that lines the pore and includes the selectivity filter.
  • An additional family of accessory β subunits also exists, split into 2 groups discriminated by their mechanism of interaction with the α subunit: disulphide-linked β2 and β4; and non-covalently associated β1 (including splicing variant) and β3; wherein the extracellular immunoglobulin-like domain of the β subunit is important for surface expression and modulation of α subunit gating, while the transmembrane domain influences Nav voltage-dependence.
  • the SCN1A gene codes for the alpha subunit of the Nav1.1 channel.
  • the Nav1.1 channel is mainly responsible for the generation and propagation of neuronal action potentials. Different mutations in this gene are associated with epilepsy and febrile seizures.
  • SCN1A is part of a cluster of voltage-gated sodium channel genes that is home to SCN2A, SCN3A, SCN7A, as well as SCN9A, which encode Nav1.2, Nav1.3, Nax, and Nav1.7, respectively.
  • the Nav1.1 open-reading frame is believed to be organized into 26 exons and encodes a protein of between 1976 and 2009 amino acids. Generally the variance in length stems from alternative splice junctions at the end of exon 11 that produce a full-length isoform or shortened versions thereof.
  • Coding sequences encoding molecules e.g., RNA, proteins
  • Coding sequences can be readily obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule.
  • the term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
  • the term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions.
  • the term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.
  • the sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
  • vectors refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct.
  • vectors include plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.
  • Adenovirus vectors refer to those constructs containing adenovirus sequences sufficient to support packaging of an expression construct and to express a coding sequence that has been cloned therein in a sense or antisense orientation.
  • a recombinant Adenovirus vector includes a genetically engineered form of an adenovirus. Both ends of the viral genome contain 100-200 base pair inverted repeats (ITRs), which are cis elements necessary for viral DNA replication and packaging.
  • ITRs inverted repeats
  • the early (E) and late (L) regions of the genome contain different transcription units that are divided by the onset of viral DNA replication.
  • the E1 region (E1A and E1B) encodes proteins responsible for the regulation of transcription of the viral genome and a few cellular genes.
  • Expression of E2A and E2B results in the synthesis of the proteins for viral DNA replication. These proteins are involved in DNA replication, late gene expression, and host cell shut-off.
  • the typical vector is replication defective and will not have an adenovirus E1 region.
  • vectors e.g., AAV with capsids that cross the blood-brain barrier (BBB) are selected.
  • vectors are modified to include capsids that cross the BBB.
  • AAV with viral capsids that cross the blood brain barrier include AAV9, AAVrh.10, AAV1R6, AAV1R7, rAAVrh.8, AAV-BR1, AAV-PHP.S, AAV-PHP.B, and AAV-PPS.
  • the PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO:46) are changed to S-DGTLAVPFK-A (SEQ ID NO:47). Additional description regarding capsids that cross the blood brain barrier is provided by Chan et al., Nat. Neurosci. 2017 August; 20(8):1172-1179.
  • a degradation signal in a system, wherein the degradation signal is a peptide sequence, known as degron, which mediates rapid ubiquitination and subsequent proteasomal degradation of a nearby protein or a protein that the degron is embedded in.
  • the degradation signal, or degron, is included with the N-terminal segment (split) of an intein, i.e., N-intein.
  • the degron is included with the C-terminal segment (split) of the intein, i.e., C-intein.
  • the degron is included with both the N-intein and the C-intein.
  • the degron is in an individual expression construct or vector, separate from the expression construct(s) or vector(s) that contain C-intein or N-intein.
  • a degron is included with (or attached to) just one of the C-intein or the N-intein, and not with both the C-intein and the N-intein.
  • the degron when a degron is attached to the N-intein, the degron is attached to the C-terminal end of the N-intein, and the configuration from N- to C-end of this half of the split intein fusion is: N- terminal fragment of a sodium channel protein (e.g., N-terminal fragment of a sodium channel alpha subunit) - N-intein - degron.
  • N- terminal fragment of a sodium channel protein e.g., N-terminal fragment of a sodium channel alpha subunit
  • the degron when a degron is included with (or attached to) the C-intein, the degron is inserted into the C-intein and a few residues (e.g., 2, 3, 4, 5, 6, 7, 8, 9.
  • the configuration from N to C is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal end containing fragment of the sodium channel protein.
  • the first portion of C-intein and the second portion of the C-intein when operably connected, form a C-intein.
  • the degradation signal, or degron, is integrated at the 5’ end of the N-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the N-terminal of the excised intein.
  • the degradation signal, or degron, is integrated at the 3’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the C-terminal of the excised intein.
  • the degradation signal, or degron, is integrated at the 3’ end of the N-intein and/or the 5’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be in the middle of the excised intein.
  • the degron encoded in the system is one having no more than 30 amino acid residues. In some embodiments, the degron encoded in the system is one being 5-27 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 5-10 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 11-20 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 21-26 amino-acid residues in length.
  • the system includes polynucleotides encoding a short degron (e.g., no more than 30 amino-acid residues in length) and an intein having fast splicing rates (e.g., half-lives below 5 min) such as consensus fast intein (Cfa); and the system does not include polynucleotides encoding degron that is more than 30 amino-acid residues in length or polynucleotides encoding segments that form an intein with a splicing rate slower than Cfa or half-lives greater than 5 min.
  • a short degron e.g., no more than 30 amino-acid residues in length
  • an intein having fast splicing rates e.g., half-lives below 5 min
  • Cfa consensus fast intein
  • the system is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at a faster speed or higher efficiency than degron-mediated degradation of intein, N-intein and/or C-intein.
  • a target protein e.g., sodium channel alpha subunit
  • the system having a polynucleotide sequence encoding the degron is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at an amount or yield that is at least 100%, 95%, 90%, 85%, or 80% compared to that reconstituted in a system without a polynucleotide sequence encoding the degron.
  • the system having a polynucleotide sequence encoding the degron is effective for reducing amount of intein (e.g., free intein after target protein splicing) by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%.
  • intein e.g., free intein after target protein splicing
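The idea that reconstitution should outpace degron-mediated degradation can be illustrated with a simple competing-rates calculation. This is a sketch under assumptions that are not stated in the source: both processes are treated as independent first-order reactions acting on the same precursor, and the 60-minute degradation half-life is a hypothetical value chosen only for illustration. The 5-minute splicing half-life is the upper bound mentioned above for fast inteins.

```python
import math


def rate_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) from a half-life in minutes."""
    return math.log(2) / half_life_min


def fraction_spliced(splice_half_life_min: float, degradation_half_life_min: float) -> float:
    """Fraction of precursor that splices before it is degraded.

    Assumes two competing, independent first-order processes (an assumption
    made for this illustration, not a model given in the source).
    """
    k_splice = rate_constant(splice_half_life_min)
    k_degrade = rate_constant(degradation_half_life_min)
    return k_splice / (k_splice + k_degrade)


# 5 min is the splicing half-life bound from the text; 60 min is hypothetical.
print(round(fraction_spliced(5.0, 60.0), 3))  # 0.923
```

Under this simplified model, the faster the intein splices relative to degradation, the closer the reconstituted yield is to that of a degron-free system.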
  • a degron is from the class II trans-activator (CIITA). In some embodiments, the CIITA degron is a 26 amino acid-long peptide, having an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91). In some embodiments, the CIITA degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:91. In some embodiments, the CIITA degron is an N-terminal degron.
  • CIITA class II trans-activator
  • when the CIITA degron is attached to the N-intein, the CIITA degron is at the C-terminal of the N-intein, and the configuration from N to C is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CIITA degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit.
  • the CIITA degron is inserted within the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
  • a degron is derived from the ornithine decarboxylase 1 (ODC1).
  • ODC1 degron is a 23 amino acid-long peptide, having an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88).
  • the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:88.
  • the ODC1 degron is a 37 amino acid-long peptide, having an amino acid sequence of FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89).
  • the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:89.
  • the ODC1 degron is a 7 amino acid-long peptide, having an amino acid sequence of MSCAQES (SEQ ID NO:90).
  • the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:90.
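For short peptides, a percent-identity threshold translates into a very small number of permitted substitutions. The sketch below makes that arithmetic explicit. It assumes an ungapped, full-length comparison of equal-length peptides, which is a simplification; the source does not specify how sequence identity is computed, and the function names are hypothetical.

```python
def min_identical_positions(length: int, percent: int) -> int:
    """Smallest count of identical positions giving at least `percent` identity."""
    # Integer ceiling of length * percent / 100, avoiding floating-point rounding.
    return -(-length * percent // 100)


def max_substitutions(length: int, percent: int) -> int:
    """Largest number of substituted positions still meeting the threshold."""
    return length - min_identical_positions(length, percent)


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Peptide lengths taken from the degrons described in the text.
for length in (7, 9, 16, 23, 26, 37):
    print(length, max_substitutions(length, 90), max_substitutions(length, 95))
```

Under this assumption, a 7-residue degron such as MSCAQES tolerates no substitutions at either threshold, whereas a 26-residue degron tolerates two at 90% and one at 95%.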
  • an active (ligase-binding) fragment of the ODC1 degron consists of an amino acid sequence of MSCAQES.
  • one or more ODC1 degrons are each a C-terminal degron.
  • the ODC1 degron is inserted into the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
  • a degron is the peptide CL1 or a variant thereof.
  • the CL1 degron is a 16 amino acid-long peptide, having an amino acid sequence of ACKNWFSSLSHFVIHL (SEQ ID NO:92).
  • the CL1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:92.
  • the CL1 degron or a variant thereof (“CL degron”) is attached to the N-intein and at the C-terminal of the N-intein, i.e., as a C-terminal tail of N-intein.
  • a configuration from N- to C-terminus is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CL degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit.
  • Description of the CL degron is further provided by Gilon et al., the EMBO Journal, Vol.17 No.10 pp.2759-2766, 1998.
  • Variants of peptide CL1 include CL2, CL6, CL9, CL10, CL11, CL12, CL15, CL16, and SL17, whose sequences are summarized below, as well as those having at least 95% or 90% sequence identity thereto:
  • a degron is a short DEG1 with a hydrophobic end.
  • the DEG1 degron is a 9 amino acid-long peptide, having an amino acid sequence of GSLIIFIIL (SEQ ID NO:93).
  • the DEG1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:93.
  • the DEG1 degron is a C-terminal degron.
  • the DEG1 degron is inserted in the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
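The degron sequences listed in this document can be tabulated against the length ranges recited for the encoded degron. The following sketch uses only sequences and ranges that appear in the text; the dictionary layout and function name are illustrative choices, not part of the source.

```python
# Degron sequences as recited in the document, keyed by SEQ ID NO.
DEGRONS = {
    "SEQ ID NO:88": "MSCAQESITSLYKKAGSENLYFQ",                # ODC1
    "SEQ ID NO:89": "FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV",  # ODC1
    "SEQ ID NO:90": "MSCAQES",                                # ODC1
    "SEQ ID NO:91": "RPGSTSPFAPSATDLPSMPEPALTSR",             # CIITA
    "SEQ ID NO:92": "ACKNWFSSLSHFVIHL",                       # CL1
    "SEQ ID NO:93": "GSLIIFIIL",                              # DEG1
}

# Length ranges recited for the encoded degron (inclusive bounds).
LENGTH_RANGES = {
    "<=30": (1, 30),
    "5-27": (5, 27),
    "5-10": (5, 10),
    "11-20": (11, 20),
    "21-26": (21, 26),
}


def matching_ranges(sequence: str) -> list:
    """Return labels of every recited length range that contains len(sequence)."""
    n = len(sequence)
    return [label for label, (low, high) in LENGTH_RANGES.items() if low <= n <= high]


for seq_id, seq in DEGRONS.items():
    print(seq_id, len(seq), matching_ranges(seq))
```

Run this way, the 37-residue ODC1 sequence (SEQ ID NO:89) falls outside every recited range, while the remaining five sequences are all 30 residues or shorter. The same table also shows that the 7-residue fragment MSCAQES occurs within both longer ODC1 sequences.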
  • Various embodiments provide systems to express a coding sequence of a sodium channel alpha subunit (e.g., Nav1.1 α subunit, or short as Nav1.1 unless otherwise noted) for reconstitution of the sodium channel alpha subunit.
  • the system expresses a coding sequence of Nav1.1, which is gene SCN1A.
  • the systems express the coding sequence of a sodium channel alpha subunit selected from Nav1.1 through Nav1.9, and correspondingly encoded by genes SCN1A through SCN11A.
  • the systems express SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, or SCN7A, for reconstitution of Nav1.2, Nav1.3, Nav1.4, Nav1.5, Nav1.6, Nav1.7, Nav1.8, Nav1.9, or Nax, respectively.
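Because the gene numbering and the channel numbering do not line up one-to-one (for example, SCN8A encodes Nav1.6), the gene-to-channel pairing recited above is easiest to read as a lookup table. The sketch below only restates that pairing; the dictionary and function names are illustrative.

```python
# Gene-to-channel pairing as recited in the text.
SODIUM_CHANNEL_GENES = {
    "SCN1A": "Nav1.1",
    "SCN2A": "Nav1.2",
    "SCN3A": "Nav1.3",
    "SCN4A": "Nav1.4",
    "SCN5A": "Nav1.5",
    "SCN8A": "Nav1.6",
    "SCN9A": "Nav1.7",
    "SCN10A": "Nav1.8",
    "SCN11A": "Nav1.9",
    "SCN7A": "Nax",
}


def channel_for(gene: str) -> str:
    """Return the alpha subunit name encoded by a gene symbol (case-insensitive)."""
    return SODIUM_CHANNEL_GENES[gene.upper()]


print(channel_for("SCN1A"))  # Nav1.1
print(channel_for("scn8a"))  # Nav1.6
```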
  • the coding sequence of the Nav1.1 alpha subunit is or comprises the polynucleotide sequence of SCN1A
  • the system includes: a first expression construct including a first portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein) at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A; and a second expression construct including a second portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A; wherein protein products of the first and the second portions of the polynucleotide sequence of the SCN1A are linked, via a peptide bond between the C-terminus of the first portion’s protein product
  • first and/or second expression construct further comprises a polynucleotide sequence encoding a degron, located if within the first expression construct at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A, and if within the second expression construct at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A, OR the system includes a third expression construct encoding the degron.
  • a composition or a system for reconstitution of Nav1.1 alpha subunit, which comprises: a. a first polynucleotide encoding a polypeptide comprising an N-fragment of a split intein, wherein the N-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the N-terminal fragment of Nav1.1 alpha subunit; b.
  • a second polynucleotide encoding a polypeptide comprising a C-fragment of the split intein, wherein the C-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the C-terminal fragment of the Nav1.1 alpha subunit; and c.
  • a third polynucleotide encoding a degron; wherein the polynucleotides of the composition may be packed together in a single formulation or separately in different formulations, wherein the first and the second polynucleotides encode the N- and the C-terminal fragments of the Nav1.1 alpha subunit, respectively, so that when both fragments are spliced together, the N-terminal fragment is linked to the C-terminal fragment, generating whole Nav1.1 alpha subunit.
  • the composition is further characterized in that: the split intein N-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein N-fragment via the C-terminus of the intein, with or without a linker between the intein N-fragment and the degron, and wherein the N-terminus of the split intein N-fragment is directly linked via a peptide bond to the N-terminal fragment of the Nav1.1 alpha subunit; and/or the split intein C-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein C-fragment via the N-terminus of the intein, with or without a linker between the intein C-fragment and the degron, and wherein the C-terminus of the split intein C-fragment is directly linked via a peptide bond to the C-terminal fragment of the
  • a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein - a polynucleotide sequence encoding the degron; and wherein the second expression construct includes from 5’ to 3’ end: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A.
  • the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein - a polynucleotide sequence encoding the degron;
  • the polynucleotide sequence encoding the degron is one that encodes RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO:92) or a variant having a sequence identity of at least 90% or 95% thereto.
  • a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’ end: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A.
  • the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the N-intein;
  • a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the C-intein - a second portion of the polynucleotide sequence of the SCN1A.
  • the polynucleotide sequence encoding the degron is one encoding MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93), or a sequence having at least 90% or 95% sequence identity thereto.
  • a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the C-intein - a polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence of the SCN1A.
  • the degron has an amino acid sequence selected from the group consisting of: MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO:92), and GSLIIFIIL (SEQ ID NO:93), and variants having at least 90% or 95% sequence identity to any of the above.
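The two-construct layouts recited above differ only in where the degron-encoding sequence sits relative to the intein halves. A compact way to compare them is to write each construct as an ordered 5’-to-3’ list of elements. This is an illustrative sketch: the element labels, layout names, and the check function are hypothetical, and the check encodes only the stated placement rule (3’ of the first SCN1A portion in the first construct, 5’ of the second SCN1A portion in the second construct).

```python
# Each layout is (first construct, second construct), elements listed 5' to 3'.
LAYOUTS = {
    "degron after N-intein": (["SCN1A_N", "IntN", "degron"], ["IntC", "SCN1A_C"]),
    "degron before N-intein": (["SCN1A_N", "degron", "IntN"], ["IntC", "SCN1A_C"]),
    "degron before C-intein": (["SCN1A_N", "IntN"], ["degron", "IntC", "SCN1A_C"]),
    "degron after C-intein": (["SCN1A_N", "IntN"], ["IntC", "degron", "SCN1A_C"]),
}


def degron_placement_ok(first: list, second: list) -> bool:
    """Check that any degron lies 3' of SCN1A_N and/or 5' of SCN1A_C."""
    ok = True
    if "degron" in first:
        ok = ok and first.index("degron") > first.index("SCN1A_N")
    if "degron" in second:
        ok = ok and second.index("degron") < second.index("SCN1A_C")
    return ok


for name, (first, second) in LAYOUTS.items():
    print(name, " - ".join(first), "|", " - ".join(second), degron_placement_ok(first, second))
```

In every layout the degron stays on the intein side of the junction, so it is spliced out with the intein rather than retained in the reconstituted channel.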
  • Alternative sites are provided for splitting human SCN1A to make AAV-sized halves.
  • the protein is split at (right before, i.e., at the N-terminus end before) breakpoint Cys1050, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1.
  • an endogenous cysteine residue is required to make half joining scarless and reconstitute a full-length unmutated protein.
  • Alternative split intein breakpoints are at either Cys957 or Cys948, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1.
  • breakpoints would permit a better AAV size and packaging efficiency of the N-terminal half for better expression than that seen with hSCN1A-CO-Nterm1049 when using the Cys1050 breakpoint.
  • they would place the intein junctions in the extracellular/lumenal space.
  • the N-terminal junction sequence at the front of an intein is termed as the -1 position; and the +1 position after the intein sequence usually has a Cys, Ser, or Thr residue.
  • the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-1049 of the Nav1.1
  • the second portion of the polynucleotide sequence of the SCN1A encodes residues 1050-1998 of the Nav1.1, wherein the amino acid position is based on numberings in NP_001340878.1
  • the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminal fragment that, when sequence-aligned, corresponds to residues 1-1049 of the Nav1.1 having NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminal fragment that, when sequence-aligned, corresponds to residues 1050-1998 of the Nav1.1 having NCBI reference no. NP_001340878.1.
  • the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-956 of the Nav1.1
  • the second portion of the polynucleotide sequence of the SCN1A encodes residues 957-1998, wherein the amino acid position is based on numberings in NP_001340878.1
  • the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminal fragment that, when sequence-aligned, corresponds to residues 1-956 of the Nav1.1 having NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminal fragment that, when sequence-aligned, corresponds to residues 957-1998 of the Nav1.1 having NCBI reference no. NP_001340878.1.
  • the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-947 of the Nav1.1
  • the second portion of the polynucleotide sequence of the SCN1A encodes residues 948-1998, wherein the amino acid position is based on numberings in NP_001340878.1
  • the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminal fragment that, when sequence-aligned, corresponds to residues 1-947 of the Nav1.1 having NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminal fragment that, when sequence-aligned, corresponds to residues 948-1998 of the Nav1.1 having NCBI reference no. NP_001340878.1.
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60).
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62).
  • the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).
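The relationship between the three breakpoints and the fragment names above can be checked with simple arithmetic. The following is a minimal sketch, assuming a full-length protein of 1998 residues (the last residue number given in the text) and assuming that the residue at the breakpoint becomes the first residue of the C-terminal fragment. The function names are hypothetical, and the nucleotide counts cover only the coding residues of each fragment, not inteins, stop codons, or regulatory elements.

```python
# Illustrative only: helper names are hypothetical, not taken from the disclosure.
FULL_LENGTH = 1998  # last Nav1.1 residue number used in the text (NP_001340878.1)

def split_fragments(breakpoint, full_length=FULL_LENGTH):
    """Return ((n_start, n_end), (c_start, c_end)) as 1-based inclusive ranges.

    The breakpoint residue is assumed to be the first residue of the
    C-terminal fragment (the +1 position after the intein).
    """
    if not 1 < breakpoint <= full_length:
        raise ValueError("breakpoint must lie inside the protein")
    return (1, breakpoint - 1), (breakpoint, full_length)

def fragment_lengths(breakpoint, full_length=FULL_LENGTH):
    """Return (N-terminal length, C-terminal length) in residues."""
    (n_start, n_end), (c_start, c_end) = split_fragments(breakpoint, full_length)
    return n_end - n_start + 1, c_end - c_start + 1

if __name__ == "__main__":
    for bp in (1050, 957, 948):
        n_len, c_len = fragment_lengths(bp)
        # 3 nucleotides per residue; extein coding sequence only.
        print(f"Cys{bp}: Nterm{n_len} ({3 * n_len} nt), Cterm{c_len} ({3 * c_len} nt)")
```

Under these assumptions the computed lengths match the fragment names: Nterm1049/Cterm949, Nterm956/Cterm1042, and Nterm947/Cterm1051.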
  • the intein includes a Cfa intein, an Ssp intein, a gp41-1 intein, an IMPDH-1 intein, an NrdJ-1 intein, a gp41-8 intein, or an Npu intein.
  • the split intein comprises consensus fast intein (Cfa). We conceive that with the Cfa, the split intein trans-splicing reaction will occur first, before the binding of ubiquitin ligase to the degron and subsequent degradation.
  • the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO: 57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO: 58).
  • the intein is functionally similar to a Cfa intein.
  • functionally similar to a Cfa intein means that the expression construct includes a variant of a Cfa intein, yet still results in construction of a functional protein (e.g., voltage-gated sodium channel).
  • a mature protein Nav1.1 is expressed by splitting the coding sequence into three fragments: putting the N-terminal portion of the coding sequence with a first N-intein into a first artificial expression construct, putting the middle portion of the coding sequence with a first C-intein and a second N-intein into a second artificial expression construct, and putting the C-terminal portion of the coding sequence with a second C-intein into a third artificial expression construct, wherein the first N-intein and first C-intein specifically splice together to form an intein, and wherein the second N-intein and second C-intein specifically splice together to form an intein.
  • At least one, two, or all three fragments of the coding sequence include a degron.
  • the first intein and the second intein are different, so that the first N-intein and the second C-intein do not splice together, and the second N-intein and the first C-intein do not splice together. That is, preferably, the first N-intein and first C-intein, and the second N-intein and second C-intein, are two mutually orthogonal split inteins. Exemplary mutually orthogonal split inteins are described at least by Pinto et al. in Nature Communications (2020) 11:1529. A method of using this system includes administering the first, second, and third artificial expression constructs to a cell. Similarly, mature proteins can be formed from several fragments using the appropriate number of inteins.
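The three-fragment scheme and the need for mutually orthogonal inteins can be illustrated with a toy model. This is a sketch under stated assumptions, not a model of splicing chemistry: each fragment is reduced to a text label, an N-intein and a C-intein are treated as compatible only when they carry the same name, and the class and function names (`Fragment`, `trans_splice`) and the intein names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Fragment:
    extein: str                      # toy label for the protein segment
    c_intein: Optional[str] = None   # C-intein fused at the fragment's N-terminus
    n_intein: Optional[str] = None   # N-intein fused at the fragment's C-terminus

def trans_splice(fragments: List[Fragment]) -> str:
    """Join fragments in order by pairing each N-intein with its unique C-intein."""
    starts = [f for f in fragments if f.c_intein is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without a C-intein")
    current = starts[0]
    product = current.extein
    remaining = [f for f in fragments if f is not current]
    while current.n_intein is not None:
        partners = [f for f in remaining if f.c_intein == current.n_intein]
        if len(partners) != 1:
            # Zero partners: nothing to splice to. Several partners: the
            # inteins are not orthogonal, so the product would be ambiguous.
            raise ValueError(f"no unique partner for intein {current.n_intein!r}")
        current = partners[0]
        remaining.remove(current)
        product += current.extein
    if remaining:
        raise ValueError("unspliced fragments remain")
    return product

if __name__ == "__main__":
    parts = [
        Fragment("MID", c_intein="inteinA", n_intein="inteinB"),
        Fragment("NTERM", n_intein="inteinA"),
        Fragment("CTERM", c_intein="inteinB"),
    ]
    print(trans_splice(parts))
```

If both junctions used the same intein name in this toy model, the first N-intein would have two candidate partners and the function would raise an error, which mirrors why the text prefers two mutually orthogonal split inteins.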
  • the first expression construct and the second expression construct independently further comprise a promoter sequence, selected from a minBglobin promoter, an hSyn1 promoter, or a CMV promoter; optionally the hSyn1 promoter comprising a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54.
  • the first expression construct, the second expression construct, or both independently further comprise an enhancer sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a targeted central nervous system cell type.
  • the enhancer sequence comprises a polynucleotide sequence of DLX2.0 (SEQ ID NO:2).
  • the enhancer sequence has a concatemerized core of an I56i enhancer, optionally as set forth in SEQ ID NO: 1.
  • the targeted central nervous system cell type comprises a GABAergic neuron, preferably a GABAergic interneuron, or more preferably a telencephalic GABAergic interneuron.
  • the enhancer sequence is a 527 bp enhancer sequence (referred to as mI56i or mDlx) from the intergenic interval between the distal-less homeobox 5 and 6 genes (Dlx5/6), which are naturally expressed by forebrain GABAergic interneurons during embryonic development. Further description of enhancer sequences, such as those for selectively modulating gene expression in interneurons, is provided in US20210348195, which is incorporated by reference.
  • the enhancer sequence comprises a polynucleotide sequence of eHGT_078h (SEQ ID NO:55).
  • the targeted central nervous system cell type comprises a forebrain glutamatergic neuron.
  • the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a selected central nervous system cell type.
  • the miRNA binding site sequence comprises a polynucleotide sequence of 4x2C miRNA binding site (SEQ ID NO:56), and the selected central nervous system cell type comprises a pan-GABAergic neuron.
  • an enhancer is used to drive gene expression in a targeted central nervous system cell population.
  • the artificial expression constructs utilize the following enhancers to drive gene expression within targeted central nervous system cell populations as follows (enhancer / targeted cell population): DLX2.0 / forebrain GABAergic; hSyn1 with 4x2C or 8x2C miR binding site / pan-GABAergic neurons; eHGT_078h / forebrain glutamatergic neurons.
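The enhancer-to-cell-population pairings listed above can be held in a small lookup table. This is a minimal sketch: the pairings are transcribed from the text, while the dictionary layout, the key wording, and the helper name `elements_for` are assumptions made for illustration.

```python
from typing import List

# Regulatory element -> targeted cell population, transcribed from the text.
# Key wording and structure are illustrative, not part of the disclosure.
ENHANCER_TARGETS = {
    "DLX2.0": "forebrain GABAergic",
    "hSyn1 with 4x2C miR binding site": "pan-GABAergic neurons",
    "hSyn1 with 8x2C miR binding site": "pan-GABAergic neurons",
    "eHGT_078h": "forebrain glutamatergic neurons",
}

def elements_for(target: str) -> List[str]:
    """Return the listed elements whose target population mentions `target`."""
    wanted = target.lower()
    return sorted(
        element
        for element, population in ENHANCER_TARGETS.items()
        if wanted in population.lower()
    )

if __name__ == "__main__":
    print(elements_for("glutamatergic"))
    print(elements_for("pan-GABAergic"))
```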
  • the artificial expression construct can include a shortened promoter or a minimal promoter.
  • the shortened promoter includes the hSyn1 promoter (shortened).
  • the minimal promoter includes minBglobin.
  • vectors described herein including vectors: CN3252 and CN3254, CN3683 and CN3684, CN3251 and CN3253, CN3677 and CN3678, CN4541 and CN4542, CN4217 and CN4218, or CN4642 and CN4643, as described in Tables below.
  • Various embodiments further provide an artificial expression construct containing a first expression construct as disclosed herein. Further embodiments also provide an artificial expression construct containing a second expression construct as disclosed herein. Preferably, the artificial expression construct is within an adeno-associated viral (AAV) vector.
  • the artificial expression construct can also include other regulatory elements if necessary or beneficial. Examples of regulatory elements utilized within artificial expression constructs disclosed herein include DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and eHGT_078h.
  • the artificial expression constructs are expressed in all neurons.
  • the artificial expression constructs include an hSyn1 promoter and are expressed in neurons. In particular embodiments, the artificial expression constructs are expressed in all cell lines. In particular embodiments, the artificial expression constructs include a CMV promoter and are expressed in cell lines.
  • Various embodiments provide an administrable composition, which includes any one or more systems disclosed herein to express the coding sequence of and reconstitute Nav1.1. Additional embodiments provide an administrable composition, which includes either one or both of an artificial expression construct containing the first expression construct, and an artificial expression construct containing the second expression construct.
  • a transgenic cell is also provided, comprising any one or more systems disclosed herein. In some aspects, the transgenic cell is a GABAergic neuron or a glutamatergic neuron, or a cell line of GABAergic neurons or glutamatergic neurons.
  • Additional embodiments provide a method of rescuing voltage-gated sodium channel function in a population of cells, and the method includes co-administering a therapeutically effective amount of a first expression construct and a therapeutically effective amount of a second expression construct, as disclosed herein, to a sample or subject comprising the population of cells, and inducing expression of the first expression construct and the second expression construct to reconstitute Nav1.1, thereby rescuing voltage-gated sodium channel function in the population of cells.
  • the method is for rescuing voltage-gated sodium channel function in a targeted population of cells.
  • the methods involve a targeted central nervous system cell type enhancer, which is uniquely or predominantly utilized by the targeted central nervous system cell type.
  • a targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system.
  • a targeted central nervous system cell type enhancer is also a targeted central nervous system type enhancer that enhances expression of a gene in the targeted central nervous system and does not substantially direct expression of genes in other non-targeted cell types, thus having cell type specific transcriptional activity.
  • the subject has an SCN1A-related seizure disorder comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
  • the subject is a pediatric patient having Dravet syndrome.
  • the subject is a pediatric human.
  • the subject is an infant (1 year old or younger).
  • the subject is a young child, e.g., between 1 and 10 years old.
  • the subject is a teenager.
  • the human subject is age 1 day through 5 months, 6 months through 4 years, 5 years through 11 years, or 12 years through 17 years.
  • artificial expression constructs can deliver SCN1A as several fragments of SCN1A delivered by several artificial expression constructs.
  • SCN1A can be delivered in a first artificial expression construct including a first portion of the SCN1A coding sequence and second artificial expression construct including a second portion of the SCN1A coding sequence.
  • the first portion of the SCN1A coding sequence is the N-terminal portion of the coding sequence and the second portion of the SCN1A coding sequence is the C-terminal portion of the coding sequence.
  • the sodium channel alpha subunit coding sequence can be split into an N-terminal portion and C-terminal portion at any point, or preferably at a breakpoint wherein the first amino acid residue encoded downstream of the breakpoint is Cys, Ser, or Thr, such that upon intein fusion, a functional sodium channel alpha subunit molecule is expressed.
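The preferred breakpoint rule (first residue downstream of the breakpoint is Cys, Ser, or Thr) can be expressed as a simple scan over a protein sequence. The sketch below uses a made-up toy sequence, not Nav1.1, and a hypothetical function name. It only lists positions that satisfy the residue rule and says nothing about whether a given split would yield a functional channel.

```python
from typing import List

# One-letter codes for the residues the text prefers at the +1 position.
ALLOWED_PLUS_ONE = frozenset("CST")

def candidate_breakpoints(protein: str, allowed=ALLOWED_PLUS_ONE) -> List[int]:
    """Return 1-based positions that could start the C-terminal fragment.

    Position 1 is skipped because the N-terminal fragment must be non-empty.
    """
    return [
        pos
        for pos, residue in enumerate(protein.upper(), start=1)
        if pos > 1 and residue in allowed
    ]

if __name__ == "__main__":
    toy = "MACDSTKC"  # hypothetical sequence for illustration only
    for pos in candidate_breakpoints(toy):
        print(f"breakpoint at {toy[pos - 1]}{pos}: N-term 1-{pos - 1}, C-term {pos}-{len(toy)}")
```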
  • an N-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), hSCN1A-CO-Nterm956 (SEQ ID NO: 61), or hSCN1A-CO-Nterm947 (SEQ ID NO: 63).
  • a C-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Cterm949 (SEQ ID NO: 60), hSCN1A-CO-Cterm1042 (SEQ ID NO: 62), or hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).
  • Exemplary reporter genes/proteins include those expressed by Addgene ID#s 83894 (pAAV-hDlx-Flex-dTomato-Fishell_7), 83895 (pAAV-hDlx-Flex-GFP-Fishell_6), 83896 (pAAV-hDlx-GiDREADD-dTomato-Fishell-5), 83898 (pAAV-mDlx-ChR2-mCherry-Fishell-3), 83899 (pAAV-mDlx-GCaMP6f-Fishell-2), 83900 (pAAV-mDlx-GFP-Fishell-1), and 89897 (pcDNA3-FLAG-mTET2 (N500)).
  • Exemplary reporter genes particularly can include those which encode an expressible fluorescent protein, or expressible biotin; blue fluorescent proteins (e.g. eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire); cyan fluorescent proteins (e.g. eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan, mTurquoise); green fluorescent proteins (e.g.
  • artificial expression constructs can include DNA and RNA editing tools such as CRISPR/Cas (e.g., guide RNA and a nuclease, such as Cas, Cas9 or Cpf1).
  • Functional molecules can also include engineered Cpf1s such as those described in US 2018/0030425, US 2016/0208243, WO/2017/184768 and Zetsche et al. (2015) Cell 163: 759-771; single gRNA (see e.g., Jinek et al. (2012) Science 337:816-821; Jinek et al. (2013) eLife 2:e00471; Segal (2013) eLife 2:e00563) or editase, guide RNA molecules, microRNA, or homologous recombination donor cassettes.
  • artificial expression constructs can include tag cassettes.
  • a tag cassette includes His tag (HHHHHH; SEQ ID NO: 34), Flag tag (DYKDDDDK; SEQ ID NO: 35), Xpress tag (DLYDDDDK; SEQ ID NO: 36), Avi tag (GLNDIFEAQKIEWHE; SEQ ID NO: 37), Calmodulin tag (KRRWKKNFIAVSAANRFKKISSSGAL; SEQ ID NO: 38), Polyglutamate tag, HA tag (YPYDVPDYA; SEQ ID NO: 39), Myc tag (EQKLISEEDL; SEQ ID NO: 40), Strep tag (which refers to the original STREP® tag (WRHPQFGG; SEQ ID NO: 41) or STREP® tag II (WSHPQFEK; SEQ ID NO: 42) (IBA Institut für Bioanalytik, Germany); see, e.g., US 7,981,632), Softag 1 (SLAELLNAGLGGS; SEQ ID NO: 43),
  • the artificial expression constructs include an internal ribosome entry site (IRES) sequence. See, for example, Figure 1B.
  • IRES sequences allow ribosomes to initiate translation at a second internal site on an mRNA molecule, leading to production of two proteins from one mRNA.
  • IRES includes IRES2.
  • IRES2 allows for a second protein open reading frame (ORF) to be translated from the same transcript. This is unlike the 2A sequence, which allows for a single ORF to be cleaved into two proteins, with similar efficiencies of production.
  • Coding sequences encoding molecules (e.g., RNA, proteins)
  • Coding sequences can be obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule.
  • the term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
  • the term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, insulators, and/or post-regulatory elements, such as termination regions.
  • the term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.
  • the sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
  • Promoters can include general promoters, tissue-specific promoters, cell-specific promoters, and/or promoters specific for the cytoplasm.
  • Promoters may include strong promoters, weak promoters, constitutive expression promoters, and/or inducible promoters. Inducible promoters direct expression in response to certain conditions, signals or cellular events.
  • the promoter may be an inducible promoter that requires a particular ligand, small molecule, transcription factor or hormone protein in order to effect transcription from the promoter.
  • promoters include minBglobin (also referred to as minBGprom), CMV promoter, hSyn1 promoter, hSyn1 promoter (shortened), minCMV, minCMV* (minCMV* is minCMV with a SacI restriction site removed), minRho, minRho* (minRho* is minRho with a SacI restriction site removed), SV40 immediate early promoter, the Hsp68 minimal promoter (proHSP68), and the Rous Sarcoma Virus (RSV) long-terminal repeat (LTR) promoter.
  • Minimal promoters have no activity to drive gene expression on their own but can be activated to drive gene expression when linked to a proximal enhancer element.
  • expression constructs are provided within vectors.
  • the term vector refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct.
  • the transferred nucleic acid is generally linked to, e.g., inserted into, the vector nucleic acid molecule.
  • a vector may include sequences that direct autonomous replication in a cell or may include sequences that permit integration into host cell DNA.
  • Useful vectors include, for example, plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.
  • Adeno-Associated Virus (AAV) is a parvovirus, discovered as a contaminant of adenoviral stocks. It is a ubiquitous virus (antibodies are present in 85% of the US human population) that has not been linked to any disease. It is also classified as a dependovirus, because its replication is dependent on the presence of a helper virus, such as adenovirus. Various serotypes have been isolated, of which AAV-2 is the best characterized. AAV has a single-stranded linear DNA that is encapsidated into capsid proteins VP1, VP2 and VP3 to form an icosahedral virion of 20 to 24 nm in diameter.
  • the AAV DNA is 4.7 kilobases long. It contains two open reading frames and is flanked by two ITRs. There are two major genes in the AAV genome: rep and cap. The rep gene codes for proteins responsible for viral replication, whereas cap codes for capsid proteins VP1-3. Each ITR forms a T-shaped hairpin structure. These terminal repeats are the only essential cis components of the AAV for chromosomal integration. Therefore, the AAV can be used as a vector with all viral coding sequences removed and replaced by the cassette of genes for delivery. Three AAV viral promoters have been identified and named p5, p19, and p40, according to their map position. Transcription from p5 and p19 results in production of rep proteins, and transcription from p40 produces the capsid proteins.
  • AAVs stand out for use within the current disclosure because of their superb safety profile and because their capsids and genomes can be tailored to allow expression in targeted cell populations.
  • scAAV refers to a self-complementary AAV.
  • pAAV refers to a plasmid adeno-associated virus.
  • rAAV refers to a recombinant adeno-associated virus.
  • pSMART-HCKan is a high copy number vector with a kanamycin resistance marker for efficient blunt cloning of unstable sequences.
  • viral vectors may also be employed.
  • vectors derived from viruses such as vaccinia virus, polioviruses and herpes viruses may be employed. They offer several attractive features for various mammalian cells.
  • Elements directing the efficient termination and polyadenylation of a heterologous nucleic acid transcript can increase heterologous gene expression.
  • Transcription termination signals are generally found downstream of the polyadenylation signal.
  • vectors include a polyadenylation signal 3' of a polynucleotide encoding a molecule (e.g., protein) to be expressed.
  • the term "poly(A) site" or "poly(A) sequence" denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript by RNA polymerase II.
  • Polyadenylation sequences can promote mRNA stability by addition of a poly(A) tail to the 3' end of the coding sequence and thus, contribute to increased translational efficiency.
  • Particular embodiments may utilize BGHpA, hGHpA, SV40pA, or shortPolyA.
  • a preferred embodiment of an expression construct includes a terminator element. These elements can serve to enhance transcript levels and to minimize read through from the construct into other plasmid sequences.
  • vectors include:
  • Subcomponent sequences within the larger vector sequences can be readily identified by one of ordinary skill in the art and based on the contents of the current disclosure. Nucleotides between identifiable and enumerated subcomponents reflect restriction enzyme recognition sites used in assembly (cloning) of the constructs, and in some cases, additional nucleotides do not convey any identifiable function. These segments of complete vector sequences can be adjusted based on use of different cloning strategies and/or vectors. In general, short 6-nucleotide palindromic sequences reflect vector construction artifacts that are not important to vector function.
  • In particular embodiments, vectors (e.g., AAV) with capsids that cross the blood-brain barrier (BBB) are selected.
  • vectors are modified to include capsids that cross the BBB.
  • AAVs with viral capsids that cross the blood-brain barrier include AAV9 (Gombash et al., Front Mol Neurosci. 2014; 7:81), AAVrh.10 (Yang, et al., Mol Ther. 2014; 22(7): 1299-1309), AAV1R6, AAV1R7 (Albright et al., Mol Ther. 2018; 26(2): 510), rAAVrh.8 (Yang, et al., supra), AAV-BR1 (Marchio et al., EMBO Mol Med.
  • the PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO: 46) are changed to S-DGTLAVPFK-A (SEQ ID NO: 47).
  • PHP.eB refers to SEQ ID NO: 30. Further description of capsids that cross the BBB is provided in US20210348195, which is incorporated by reference.
  • AAV9 is a naturally occurring AAV serotype that, unlike many other naturally occurring serotypes, can cross the BBB following intravenous injection. It transduces large sections of the central nervous system (CNS), thus permitting minimally invasive treatments (Naso et al., BioDrugs. 2017; 31(4): 317), for example, as described in relation to clinical trials for the treatment of spinal muscular atrophy (SMA) syndrome by AveXis (AVXS-101, NCT03505099) and the treatment of CLN3 gene-Related Neuronal Ceroid-Lipofuscinosis (NCT03770572).
  • AAVrh.10 was originally isolated from rhesus macaques and shows low seropositivity in humans when compared with other common serotypes used for gene delivery applications (Selot et al., Front Pharmacol. 2017; 8: 441) and has been evaluated in clinical trials LYS-SAF302, LYSOGENE, and NCT03612869.
  • AAV1R6 and AAV1R7, two variants isolated from a library of chimeric AAV vectors (AAV1 capsid domains swapped into AAVrh.10), retain the ability to cross the BBB and transduce the CNS while showing significantly reduced hepatic and vascular endothelial transduction.
  • rAAVrh.8, also isolated from rhesus macaques, shows a global transduction of glial and neuronal cell types in regions of clinical importance following peripheral administration and also displays reduced peripheral tissue tropism compared to other vectors.
  • AAV-BR1 is an AAV2 variant displaying the NRGTEWD (SEQ ID NO:48) epitope that was isolated during in vivo screening of a random AAV display peptide library. It shows high specificity accompanied by high transgene expression in the brain with minimal off-target affinity (including for the liver) (Korbelin et al., EMBO Mol Med. 2016; 8(6): 609).
  • AAV-PHP.S (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence QAVRTSL (SEQ ID NO: 49), transduces neurons in the enteric nervous system, and strongly transduces peripheral sensory afferents entering the spinal cord and brain stem.
  • AAV-PHP.B (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence TLAVPFK (SEQ ID NO:50). It transfers genes throughout the CNS with higher efficiency than AAV9 and transduces the majority of astrocytes and neurons across multiple CNS regions.
  • AAV-PPS, an AAV2 variant created by insertion of the DSPAHPS (SEQ ID NO: 51) epitope into the capsid of AAV2, shows a dramatically improved brain tropism relative to AAV2.
  • a capsid that results in transduction of targeted cell types in a primate following administration is chosen.
  • a capsid that results in widespread transduction of tissue and cell types impacted by the loss of Scn1a following administration is chosen.
  • targeted cell types are neurons.
  • neurons include GABAergic neurons or glutamatergic neurons.
  • GABAergic neurons include pan-GABAergic neurons, forebrain GABAergic neurons, hippocampal GABAergic neurons, or cortical GABAergic neurons.
  • glutamatergic neurons include forebrain glutamatergic neurons.
  • physiologically active components can be formulated with a carrier or more than one carrier that is suitable for administration to a cell, tissue slice, animal (e.g., mouse, non-human primate), or human.
  • Physiologically active components within compositions described herein can be prepared in neutral forms, as freebases, or as pharmacologically acceptable salts.
  • Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino groups of the protein) and which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like.
  • Carriers of physiologically active components can include solvents, dispersion media, vehicles, coatings, diluents, isotonic and absorption delaying agents, buffers, solutions, suspensions, colloids, and the like.
  • the use of such carriers for physiologically active components is well known in the art. Except insofar as any conventional media or agent is incompatible with the physiologically active components, it can be used with compositions as described herein.
  • pharmaceutically-acceptable carriers refer to carriers that do not produce an allergic or similar untoward reaction when administered to a human, and in particular embodiments, when administered intravenously.
  • compositions can be formulated for intravenous, intraparenchymal, intraocular, intravitreal, parenteral, subcutaneous, intracerebro-ventricular, intramuscular, intrathecal, intraspinal, intraperitoneal, oral or nasal inhalation, or by direct injection in or application to one or more cells, tissues, or organs.
  • compositions may include liposomes, lipids, lipid complexes, microspheres, microparticles, nanospheres, and/or nanoparticles.
  • liposomes are generally known to those of skill in the art. Liposomes have been developed with improved serum stability and circulation half-times (see, for instance, U.S. Pat. No. 5,741,516). Further, various methods of liposome and liposome like preparations as potential drug carriers have been described (see, for instance U.S. Pat. Nos. 5,567,434; 5,552,157; 5,565,213; 5,738,868; and 5,795,587).
  • Nanocapsules can generally entrap compounds in a stable and reproducible way (Quintanar-Guerrero et al., Drug Dev Ind Pharm 24(12): 1113-1128, 1998; Quintanar-Guerrero et al., Pharm Res. 15(7): 1056-1062, 1998; Quintanar-Guerrero et al., J. Microencapsul. 15(1): 107-119, 1998; Douglas et al., Crit Rev Ther Drug Carrier Syst 3(3): 233-261, 1987).
  • ultrafine particles can be designed using polymers able to be degraded in vivo.
  • Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use in the present disclosure.
  • Such particles can be easily made, as described in Couvreur et al., J Pharm Sci 69(2): 199-202, 1980; Couvreur et al., Crit Rev Ther Drug Carrier Syst. 5(1): 1-20, 1988; zur Muhlen et al., Eur J Pharm Biopharm, 45(2): 149-155, 1998; Zambaux et al., J Control Release 50(1-3): 31-40, 1998; and U.S. Pat. No. 5,145,684.
  • Injectable compositions can include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (U.S. Pat. No. 5,466,468).
  • the form is sterile and fluid to the extent that it can be delivered by syringe.
  • it is stable under the conditions of manufacture and storage, and optionally contains one or more preservative compounds against the contaminating action of microorganisms, such as bacteria and fungi.
  • the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils.
  • the preparation will include an isotonic agent(s), for example, sugar(s) or sodium chloride.
  • Prolonged absorption of the injectable compositions can be accomplished by including in the compositions agents that delay absorption, for example, aluminum monostearate and gelatin.
  • injectable compositions can be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose.
  • Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. As indicated, under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms.
  • Sterile compositions can be prepared by incorporating the physiologically active component in an appropriate amount of a solvent with other optional ingredients (e.g., as enumerated above), followed by filtered sterilization.
  • dispersions are prepared by incorporating the various sterilized physiologically active components into a sterile vehicle that contains the basic dispersion medium and the required other ingredients (e.g., from those enumerated above).
  • preferred methods of preparation can be vacuum-drying and freeze-drying techniques which yield a powder of the physiologically active components plus any additional desired ingredient from a previously sterile-filtered solution thereof.
  • Oral compositions may be in liquid form, for example, as solutions, syrups or suspensions, or may be presented as a drug product for reconstitution with water or other suitable vehicle before use.
  • Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid).
  • compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinized maize starch, polyvinyl pyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). Tablets may be coated by methods well-known in the art.
  • Inhalable compositions can be delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas.
  • the dosage unit may be determined by providing a valve to deliver a metered amount.
  • Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.
  • Compositions can also include microchip devices (U.S. Pat. No. 5,797,898), ophthalmic formulations (Bourlais et al., Prog Retin Eye Res, 17(1):33-58, 1998), transdermal matrices (U.S. Pat. No. 5,770,219 and U.S. Pat. No. 5,783,208) and feedback-controlled delivery (U.S. Pat. No. 5,697,899).
  • Supplementary active ingredients can also be incorporated into the compositions.
  • compositions can include at least 0.1% of the physiologically active components or more, although the percentage of the physiologically active components may, of course, be varied and may conveniently be between 1 or 2% and 70% or 80% or more or 0.5-99% of the weight or volume of the total composition.
  • the amount of physiologically active components in each physiologically-useful composition may be prepared in such a way that a suitable dosage will be obtained in any given unit dose of the compound. Factors such as solubility, bioavailability, biological half-life, route of administration, product shelf life, as well as other pharmacological considerations will be contemplated by one skilled in the art of preparing such pharmaceutical formulations, and as such, a variety of compositions and dosages may be desirable.
  • FDA United States Food and Drug Administration
  • the present disclosure includes cells including an artificial expression construct described herein.
  • a cell that has been transformed with an artificial expression construct can be used for many purposes, including in neuroanatomical studies, assessments of functioning and/or non-functioning proteins, and drug screens that assess the regulatory properties of enhancers.
  • the cell is a mammalian cell.
  • the artificial expression construct includes a regulatory element and/or a vector sequence of DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site and/or eHGT_078h and/or CN3252, CN3254, CN3683, CN3684, CN3251, CN3253, CN3677, CN3678, CN4541, CN4542, CN4217, CN4218, CN4642, or CN4643, and the cell line is a human, primate, or murine cell.
  • Cell lines which can be utilized for transgenesis in the present disclosure also include primary cell lines derived from living tissue such as rat or mouse brains and organotypic cell cultures, including brain slices from animals such as
  • WO 91/13150 describes a variety of cell lines, including neuronal cell lines, and methods of producing them.
  • WO 97/39117 describes a neuronal cell line and methods of producing such cell lines.
  • the neuronal cell lines disclosed in these patent applications are applicable for use in the present disclosure.
  • neuronal describes something that is of, related to, or includes, neuronal cells. Neuronal cells are defined by the presence of an axon and dendrites.
  • neuronal-specific refers to something that is found, or an activity that occurs, in neuronal cells or cells derived from neuronal cells, but that is not found or does not occur, or is not substantially found or does not substantially occur, in non-neuronal cells or cells not derived from neuronal cells, for example glial cells such as astrocytes or oligodendrocytes.
  • non-neuronal cell lines may be used, including mouse embryonic stem cells.
  • Cultured mouse embryonic stem cells can be used to analyze expression of genetic constructs using transient transfection with plasmid constructs.
  • Mouse embryonic stem cells are pluripotent and undifferentiated. These cells can be maintained in this undifferentiated state by Leukemia Inhibitory Factor (LIF). Withdrawal of LIF induces differentiation of the embryonic stem cells.
  • the stem cells form a variety of differentiated cell types. Differentiation is caused by the expression of tissue specific transcription factors, allowing the function of an enhancer sequence to be evaluated. (See for example Fiskerstrand et al., FEBS Lett 458: 171-174, 1999).
  • Methods to differentiate stem cells into neuronal cells include replacing a stem cell culture media with a media including basic fibroblast growth factor (bFGF), heparin, an N2 supplement (e.g., transferrin, insulin, progesterone, putrescine, and selenite), laminin, and polyornithine.
  • a process to produce myelinating oligodendrocytes from stem cells is described in Hu et al., 2009, Nat. Protoc. 4: 1614-22.
  • Bibel et al., 2007, Nat. Protoc. 2: 1034-43 describes a protocol to produce glutamatergic neurons from stem cells while Chatzi et al., 2009, Exp. Neurol. 217:407-16 describes a procedure to produce GABAe
  • U.S. Publication No. 2012/0329714 describes use of prolactin to increase neural stem cell numbers while U.S. Publication No. 2012/0308530 describes a culture surface with amino groups that promotes neuronal differentiation into neurons, astrocytes and oligodendrocytes.
  • the fate of neural stem cells can be controlled by a variety of extracellular factors. Commonly used factors include brain-derived neurotrophic factor (BDNF; Shetty and Turner, 1998, J. Neurobiol. 35:395-425); fibroblast growth factor (bFGF; U.S. Pat.
  • neurotrophins; somatostatin; cyclic adenosine monophosphate; epidermal growth factor (EGF); dexamethasone (glucocorticoid hormone); forskolin; GDNF family receptor ligands; potassium; retinoic acid (U.S. Patent No. 6,395,546); tetanus toxin; and transforming growth factor-α and TGF-β (U.S. Pat. Nos. 5,851,832 and 5,753,506).
  • yeast one-hybrid systems may also be used to identify compounds that inhibit specific protein/DNA interactions, such as transcription factors for DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and/or eHGT_078h.
  • Methods are also provided for administering a system of expression constructs to a subject in need thereof, which include administering a therapeutically effective amount of a system disclosed herein to a sample or subject comprising a targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
  • the subject in need thereof has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
  • the system of expression constructs is administered to a human subject or mammalian subject. In some embodiments, the system of expression constructs is administered to cells or tissue cultures obtained from, or derived from, a human subject or a mammalian subject.
  • Dravet syndrome (DS) is a devastating developmental epileptic encephalopathy marked by treatment-resistant seizures, developmental delay, intellectual disability, motor deficits, and a 10-20% rate of premature death. Most DS patients harbor loss-of-function mutations in one copy of SCN1A, which has been associated with inhibitory neuron dysfunction.
  • an interneuron-targeting AAV human SCN1A gene replacement therapy using cell class-specific enhancers. We generated a split-intein fusion form of SCN1A to circumvent AAV packaging limitations and deliver SCN1A via a dual vector approach using cell class-specific enhancers. These constructs produced full-length NaV1.1 protein and functional sodium channels in HEK293 cells and in brain cells in vivo. After packaging these vectors into enhancer-AAVs and administering them to mice, immunohistochemical analyses showed telencephalic GABAergic interneuron-specific and dose-dependent transgene biodistribution.
  • Dravet syndrome is a severe early-onset epileptic encephalopathy marked by spontaneous and febrile seizures, motor disabilities, cognitive dysfunction, developmental delay, and heightened risk of premature death by sudden unexpected death in epilepsy (SUDEP).
  • DS afflicts approximately 1 in 16,000 births, usually manifests in the first year of life, and produces profound symptoms that require life-long care.
  • Most first-line anti-epileptic drugs are ineffective or contraindicated for DS, although several recently approved drugs now partially ameliorate DS symptoms.
  • no FDA-approved long-term disease-modifying treatments currently exist for DS despite extensive efforts. As a result, a treatment for DS is a pressing unmet need for patients and their caregivers.
  • AAVs enhancer-adeno-associated viruses
  • DLX2.0-driven intein-SCN1A vectors specifically transduce telencephalic GABAergic interneurons.
  • Neonatal mice were injected at postnatal day 2 (P2) with 5 µL total volume of these vectors delivered bilaterally via the intracerebroventricular (ICV) route at a dose of 3e10 genome copies (gc) of each half.
  • After twenty days, we analyzed mouse motor cortex membrane protein content by western blot to assess the efficiency of half joining by intein-mediated fusion.
  • In a separate cohort of Dlx5/6-Cre; AH4 mice, used to provide an independent label for telencephalic GABAergic interneurons, we observed similar high levels of specificity and moderate completeness of AAV transduction (Fig. 9A-9C).
  • Dual DLX2.0 split-intein SCN1A vectors protect against Sudden Unexpected Death in Epilepsy (SUDEP) in DS mouse models.
  • Dual DLX2.0 split-intein SCN1A vectors protect against thermal and spontaneous myoclonic and generalized tonic-clonic seizures in DS mice.
  • DS model mice are sensitive to thermally induced seizures, similar to patients with DS.
  • MC myoclonic
  • GTC generalized tonic-clonic
  • DLX2.0-SCN1A AAV vectors conferred strong and reproducible protection from mortality, induced seizures, and spontaneous seizure burden in two independent genetic mouse models of DS, although this protective effect of the AAV treatment is dose- and AAV quality-dependent.
  • Dual DLX2.0 vectors completely protect against SUDEP and seizures in mice with telencephalic GABAergic interneuron-specific Scn1a deletion.
  • mice were injected with dual DLX2.0-SCN1A AAVs via the BL-ICV route at P0-3, and these mice and untreated controls were monitored for premature death up to P70.
  • telencephalic GABAergic interneuron-selective targeting is beneficial for gene replacement therapy in DS
  • As a comparator, we produced and tested split-intein vectors driven by hSyn1, which expresses in most brain neuronal populations, including excitatory and inhibitory neurons (Fig. 6A). Constructs for these vectors were built and packaged in the same way as the DLX2.0 ones, except that the hSyn1 promoter was used in this case. Delivered by BL-ICV at P2, these hSyn1 split-intein vectors led to reconstitution of full-length NaV1.1 in mouse brain (Fig.
  • mice treated with dual nonselective AAVs exhibited a surfeit of unexpected deaths.
  • GFAP astrogliosis
  • Iba1 microgliosis
  • Two example mice spanning the range of astrogliosis observed are shown.
  • Nonselective neuronal expression of SCN1A exacerbated astrogliosis in DS model mice but did not cause overt changes in microglial appearance.
  • Dual hSyn1 AAV vectors confer partial protection against thermal MC and GTC seizures.
  • DLX2.0 enhancer used in this study targets both PVALB and SST and also other telencephalic interneuron populations, in both mouse and human tissue, which may explain the strong anti-epileptic effect of DLX2.0-SCN1A.
  • HC-AdVs high-capacity adenoviral vectors
  • helper-dependent or “gutless” high-capacity adenoviral vectors
  • Our AAV-mediated and telencephalic GABAergic interneuron-selective gene replacement strategy is unique in several ways. First, our vectors express a new copy of the SCN1A gene at moderate levels, and do not upregulate the endogenous SCN1A allele, a strategy that could be ineffective or harmful for certain disease alleles.
  • telencephalic GABAergic interneurons the most essential cell type, whereas antisense oligonucleotides lack targeting ability. Since SCN1A is expressed in both excitatory and inhibitory neurons, activation of SCN1A in excitatory neurons could attenuate the corrective effect of SCN1A upregulation.
  • With AAV delivery using the PHP.eB capsid and neonatal ICV injection, widespread viral transduction of telencephalic GABAergic interneurons is achieved.
  • telencephalic interneurons provides a unique and highly effective treatment for DS.
  • the split-intein fused SCN1A halves are delivered in advanced BBB-penetrant AAV capsids.
  • the AAV capsids comprise AAV-PHP.eB, which efficiently transduces the central nervous system.
  • the AAV capsids comprise AAV-PHP.S, which efficiently transduces the peripheral nervous system.
  • the split-intein fused SCN1A halves delivered in AAV capsids are administered locally with intraparenchymal delivery.
  • regulatory elements known to be capable of restricting expression to interneurons include the distal-less homeobox 5 and 6 (Dlx5/6) genes, which are specifically expressed by all forebrain GABAergic interneurons during embryonic development.
  • genes have an inverted orientation relative to one another and share a 400 bp (mI56i or mDlx) and a 300 bp (mI56ii) enhancer sequence in the 10 kb non-coding intergenic region 3' to each of them.
  • mDlx enhancer can be used to target reporter genes in a pattern very similar to the normal patterns of Dlx5/6 expression during embryonic development, e.g., selectively expressed within GABAergic interneurons in a wide variety of vertebrate species.
  • mice at SCRI were maintained in standard cages for laboratory mice, on a 12h:12h light-dark cycle, with ad-libitum access to food and water, at 23 degrees C.
  • Mouse models of DS and their littermates used in these studies were generated using Cre-Lox technology.
  • DS mice carrying a whole-body heterozygous knock-out of Scn1a were obtained by breeding floxed Scn1a mice with Meox2-Cre mice (Strain #: 003755; Jackson Laboratories).
  • mice carrying an Scn1a KO allele restricted to forebrain GABAergic neurons were generated by breeding floxed Scn1a mice with Dlx5/6-Cre mice (Strain #: 008199; Jackson Laboratories). All breeder mice were maintained on a C57BL/6J background for at least 10 generations.
  • Mouse core body temperature was monitored and controlled using a rectal temperature probe (RET4) and a heat lamp, both connected to a temperature controller in a feedback loop (Physitemp Instruments Inc.). Baseline body temperature was measured and, subsequently, body temperature was raised gradually by 0.5°C every 2 minutes until a seizure occurred or a temperature of 42°C was attained. Then, the mouse was immediately cooled down using a small fan. Mouse behavior during the whole test was recorded using a digital video camera and reviewed for MC and GTC seizure scoring.
  • Neonatal P0-3 mice were cryo-anesthetized on a small aluminum plate placed on ice.
  • Single AAVs or dual AAVs were injected bilaterally into the lateral ventricles using a 33-gauge needle attached to a Hamilton microliter syringe.
  • 2.5 µL of the AAV solution was injected into each ventricle for a total of 5 µL per mouse, containing a total of 1e10 or 3e10 gc of each viral vector.
  • mice were put back into their nest and placed on a warming pad until their body temperature returned to normal. Subsequently, they were returned to the cage with the mother.
  • Electrocorticography (ECoG) electrode implantation surgery
  • ECoG electrodes were inserted into a small cranial burr hole above the somatosensory cortex in each hemisphere. Similarly, a reference electrode microscrew was placed in a burr hole above the cerebellum. EMG electrodes were inserted and secured into the neck muscles.
  • Electrodes were attached to an interface connector and the assembly was affixed to the skull with dental cement (Lang Dental Manufacturing Co., Inc., Wheeling, IL, United States). The incision around the electrode implant was closed using sutures. Mice were allowed to recover from surgery for 1-3 days before recording.
  • Interictal spikes were frequently followed by a slow wave, but they were not associated with increased EMG activity or movement on video. Conversely, GTC seizure events were marked at their onset by bursts of generalized spikes and waves of increasing amplitude and decreasing frequency on ECoG. They coincided with increased activity on EMG and video and were followed by a distinct period of post-ictal ECoG suppression.
  • mice were handled under appropriate institutional protocols and guidelines. Procedures were approved by the Allen Institute Institutional Animal Care and Use Committee under protocols 2002 and 2301. We housed animals in a 14:10 light:dark cycle in ventilated racks with ad libitum access to food (LabDiet 5001) and water, as well as enrichment items consisting of plastic shelters and nesting materials. Young animals were weaned promptly at 21 days of age. We obtained 129S1/SvImJ-Scn f J (here Scn1a +/R613X) mice from Jackson Laboratory (strain # 034129) and maintained breeders on a 129S1/SvImJ genetic background.
  • AIBS Allen Institute for Brain Science
  • mice P56-90
  • ECoG/EMG headmount. For stereotaxic surgical procedures, we induced anesthesia in mice first with 5% isoflurane in oxygen, and then maintained anesthesia with 1.5-2.5% isoflurane.
  • Electrode leads were soldered onto the 8-pin headmount (#8431-SM, Pinnacle Technology Inc.).
  • the headmount contains two insulated EMG wire electrodes that are pre-soldered, and these EMG electrodes were inserted into the neck muscles. All wires, pins and the headmount were embedded in light curable dental composite resin (Prime-Dent, Prime Dental Manufacturing Inc., Chicago, IL, USA). Mice were singly housed post-surgery and recovered for at least 7 days prior to recording. Recordings were thus acquired between ages P74 and P123. For recordings at AIBS, mice were singly housed in 10-inch clear acrylic chambers (#8228, Pinnacle Technology Inc.) under a 14-hr on, 10-hr off light/dark cycle.
  • mice were tethered with the pre-amplifier through a commutator to the data acquisition system (#8401-HR, Pinnacle Technology Inc.). All ECoG/EMG data were recorded with a 500 Hz sampling rate, 10X gain, a low pass (ECoG: 0.5 Hz; EMG: 1 Hz) filter, and a high pass (500 Hz) filter. Videos were recorded synchronously at a frame rate of 10 frames/s with a resolution of 640x480 pixels. We implanted a total of 18 non-injected Scn1a +/R613X mice. Of these 18, seven mice (3M+4F) died during recovery prior to recording, and we recorded from the remaining 11 mice (9M+2F).
  • Under avertin terminal anesthesia, we perfused mice with ice-cold PBS with 0.25 mM EDTA added (25 mL), followed by cold 4% PFA in PBS (12 mL). Following dissection of the brain and other organs, we post-fixed the brain in 4% PFA in PBS overnight at 4°C.
  • We prepared PFA in PBS in one-liter batches by dissolving PFA powder in PBS with heating, and froze 50 mL aliquots at -20°C until use; this was important because we found that anti-Gad67 and anti-GABA stain quality depended upon the PFA preparation method.
  • mouse monoclonal anti-FLAG clone M2 (Millipore-Sigma # F1804)
  • rabbit monoclonal anti-HA clone C29F4 (1/1000, Cell Signaling # 3724S)
  • mouse monoclonal anti-HA clone 16B12 (1/1000, Biolegend # 901513)
  • mouse monoclonal anti-HA clone HA.C5 (1/1000, Thermo Fisher Scientific # MA5-27543)
  • mouse monoclonal anti-Gad67 clone 1G10.2 (1/250, Millipore-Sigma # MAB5406)
  • mouse monoclonal anti-NeuN clone 1B7 (1/500, Novus Biologicals # NBP1-92693AF647)
  • guinea pig polyclonal anti-GABA (1/500, Millipore-Sigma # AB175).
  • This residue is alanine in the NCBI RefSeq sequence but is a threonine in commercial clones available from Origene (catalog # RG220167), as well as a conserved threonine across most other mammalian species (Fig. 8C), and finally we observed this residue to be a threonine in three of three human tissue donors sequenced in our prior work (Fig. 8D, data available at dbGaP # phs002292.v1.p1). From the Genome Aggregation Database (gnomAD) the reference alanine allele appears to be the minor allele in the human population (27%), whereas the majority of alleles in the population encode threonine at that position (73%). As a result, we used threonine in that position for our SCN1A transgene.
  • HA and FLAG epitope tags (HA at the N-terminus of the N-terminal half, and FLAG at the C-terminus of the C-terminal half), with short dipeptide linkers between the epitope tags and SCN1A coding sequence.
  • We reverse-translated these protein sequences and performed codon optimization using the Integrated DNA Technologies online codon optimization tool (www.idtdna.com/pages/tools/codon-optimization-tool). We then manually adjusted the codon usage to minimize cryptic splice donors and acceptors, which could reduce expression through unwanted splicing, and manually minimized repeats over 10 bp, which might negatively impact cloning or expression.
  • ORF full protein open reading frame
  • plasmids containing the split fusion protein halves alone did not exhibit rearrangements or require special culturing techniques, and we amplified these plasmids in either the pSMART-HC-Kan or pAAV backbone using Stbl3 cells (Thermo Fisher Scientific # C737303).
  • To prepare mouse brain membrane protein samples, we dissected motor cortex samples (~25-50 mg wet tissue) into 1.7 mL Eppendorf Lo-Bind tubes, froze them on dry ice, and stored them at -20°C. We prepared membrane protein samples using the Mem-Per Plus Membrane Extraction Kit (Thermo Fisher #89842) with small adjustments to the protocol. Briefly, we washed the tissue once with ice-cold Cell Wash Solution, spun it down, and aspirated off the wash buffer.
  • mouse anti-FLAG clone M2 (1/1000 dilution, Sigma # F1804)
  • mouse anti-HA clone 16B12 (1/3000 dilution for HEK-293 cell lysates, Biolegend # 901513)
  • rabbit anti-HA clone C29F4 (1/1000 dilution for brain membrane protein preparations, Cell Signaling # 3724S)
  • NeuroMab mouse anti-NaV1.1 clone K74/71 (1/300 dilution, Antibodies Incorporated # 75-023)
  • loading control: rabbit polyclonal anti-alpha tubulin (1/5000 dilution, Cell Signaling #2144S) or mouse anti-alpha tubulin clone DM1A (1/5000 dilution, Santa Cruz Biotechnology sc-32293) in 2.5% milk in PBS with 0.1% T
  • HEK293 cells (CRL-1573; ATCC, Gaithersburg, MD) were cultured in standard media consisting of DMEM, high glucose, glutaMAX (Gibco 10566016; Thermo Fisher Scientific, Waltham, MA), supplemented with 10% (v/v) fetal calf serum (FCS) (Gibco A5670401; Thermo Fisher Scientific, Waltham, MA) and 1% (v/v) Penicillin-Streptomycin (P/S) (10,000 U/ml) (Gibco 15140148; Thermo Fisher Scientific, Waltham, MA), grown at 37°C and 5% CO2.
  • Cells were passaged in T25 tissue culture flasks (FB012935; Thermo Fisher Scientific, Waltham, MA) approximately twice a week. Only cells passaged less than 20 times were used for transfection and expression studies.
  • Plasmid constructs were acutely transfected into HEK293 cells using Viafect reagent (E4981; Promega, Madison, WI), following the manufacturer’s protocol.
  • HEK293 cells were first prepared for transfection by plating into 12-well tissue culture plates (Nunc 12-565-321; Thermo Fisher Scientific, Waltham, MA) at a density of ~0.5-2 × 10^5 cells per well and grown to ~80-90% confluence with standard media, allowing for one confluent well per transfection condition. On the day of transfection, media in confluent wells to be transfected were replaced with 0.5 mL fresh DMEM with 10% FCS, without P/S.
  • Lipophilic/DNA transfection complexes were generated for each well to be transfected. This was achieved by combining a total of ~1.0 µg of plasmid DNAs with serum-free OptiMEM (Gibco 31985062; Thermo Fisher Scientific, Waltham, MA) to a final volume of 100 µL, then adding 3.0 µL Viafect with gentle trituration and allowing the mixture to assemble at 24°C for 30 mins. After 30 mins, this mixture was added to each well. Live transfected cells were incubated overnight at 37°C, and visually monitored for transfection efficiency in situ using a plate microscope equipped with fluorescence (Invitrogen EVOS M7000; Thermo Fisher Scientific, Waltham, MA). Transfection efficiencies were typically >70-80%.
  • plasmid DNAs used for transfections per well: a) Full-length SCN1A (SCN1A-FL) 0.8 µg of pDNA, b) SCN1A-Ntm (SCN1A-N) 0.4 µg pDNA combined with SCN1A-Ctm (SCN1A-C) 0.4 µg pDNA, c) SCN1A-Ntm (SCN1A-N) 0.8 µg pDNA, d) SCN1A-Ctm (SCN1A-C) 0.8 µg pDNA, e) empty vector (SYFP only) 0.8 µg pDNA.
  • transfection conditions were performed in a background of 80 ng pDNA of a bi-cistronic construct expressing SCN1B and SCN2B (β1/2), which encodes two sodium channel β-subunits, yielding an α:β subunit-encoding pDNA mass ratio of 10:1.
  • One additional control utilized only 80 ng of β1/2 pDNA.
  • coverslips containing adherent transfected cells were transferred to the stage of a Zeiss Axio Examiner.A1 microscope, equipped with a 40X water immersion objective and epifluorescence capability. Pipettes were positioned with a Sutter MPC-325 micromanipulator (Novato, CA). Whole-cell voltage-clamp recordings were acquired with an AxoClamp200B amplifier (Molecular Devices, Union City, CA), using pClamp 10.4.
  • composition of recording solutions was: Bath (in mM, 140 NaCl, 2 CaCl2, 2 MgCl2, 10 HEPES, pH 7.4); Pipette internal solution (in mM, 35 NaCl, 105 CsF, 10 EGTA, 10 HEPES, pH 7.4). Patch pipettes were pulled from borosilicate glass (1B120F-4; World Precision Instruments, Sarasota, FL) on a P-97 Sutter Instruments puller (Novato, CA), and fire-polished on a Micro-Forge MF-830 (Narishige International USA, Amityville, NY) to a resistance of 0.8-1.5 MΩ.
  • Na+ reversal potential was 35 mV, based on a calculated Nernst equilibrium potential with the recording solutions used.
  • Peak currents were recorded in response to a family of voltage steps from a holding potential of -120 mV to 40 mV, in 5 mV increments, with an inter-pulse interval of 2 seconds to allow channels to fully deactivate to the deep closed state.
  • Conductance/voltage plots were fitted to a single Boltzmann function:
  • the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are useful to an embodiment, yet open to the inclusion of unspecified elements, whether useful or not. It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.).
  • Beta-Globin Minimal Promoter (pBGmin/minBGlobin/minBGprom):
  • ATGACGATGACAAG (SEQ ID NO: 68)
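
The electrophysiology passages above state a Na+ reversal potential of 35 mV derived from the Nernst equation for the listed recording solutions (140 mM NaCl in the bath, 35 mM NaCl in the pipette) and mention fitting conductance/voltage plots to a single Boltzmann function, whose equation is not reproduced in this text. The following is a minimal sketch of those calculations, not the authors' analysis code. The recording temperature (22°C), the function names, and the Boltzmann parameterization (half-activation voltage and slope factor) are assumptions chosen for illustration; the Boltzmann form shown is a commonly used one and may differ from the form in the original document.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_mv(conc_out_mm, conc_in_mm, temp_c=22.0, valence=1):
    """Nernst equilibrium potential in mV (temp_c is an assumed value)."""
    temp_k = temp_c + 273.15
    return 1000.0 * (R * temp_k) / (valence * F) * math.log(conc_out_mm / conc_in_mm)


def conductance(peak_current, v_mv, e_rev_mv):
    """Chord conductance G = I / (V - E_rev); undefined at V == E_rev."""
    driving_force = v_mv - e_rev_mv
    if driving_force == 0:
        raise ValueError("driving force is zero at the reversal potential")
    return peak_current / driving_force


def boltzmann(v_mv, v_half_mv, slope_mv):
    """Assumed single Boltzmann form: G/Gmax = 1 / (1 + exp((V_half - V) / k))."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


# Bath and pipette Na+ concentrations taken from the recording solutions above.
e_na = nernst_mv(conc_out_mm=140.0, conc_in_mm=35.0)
print(f"E_Na = {e_na:.1f} mV")
```

With these inputs the computed value lands close to the 35 mV quoted in the text; the exact figure shifts slightly with the assumed temperature.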

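The thermal seizure protocol described above raises core body temperature by 0.5°C every 2 minutes from a measured baseline until a seizure occurs or 42°C is reached. The stepwise target schedule can be sketched as follows. This is only an illustration of the arithmetic: the function name `ramp_schedule`, the example baseline of 37.0°C, and the tuple layout are hypothetical, and the sketch does not model the feedback controller or the early stop when a seizure occurs.

```python
def ramp_schedule(baseline_c, step_c=0.5, interval_min=2, max_c=42.0):
    """Return (elapsed_minutes, target_temperature_c) pairs for the ramp."""
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    schedule = []
    i = 0
    while True:
        # Multiply rather than accumulate to avoid floating-point drift.
        target = min(baseline_c + i * step_c, max_c)
        schedule.append((i * interval_min, target))
        if target >= max_c:
            break
        i += 1
    return schedule


# Hypothetical baseline; the real baseline is measured per mouse.
for minutes, target in ramp_schedule(37.0):
    print(f"{minutes:>3} min -> {target:.1f} C")
```

In practice the schedule would be cut short at the first observed seizure, so the list gives the maximum duration of a trial for a given baseline.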

Abstract

Systems and methods are provided for split intein-mediated, AAV-mediated delivery and expression of large protein molecules with improved safety features. Specifically, degron sequences serving as degradation signals are included at a specific terminus of the N- or C-split intein fragments, each of which is coupled to a portion of the coding sequence of a voltage-gated sodium channel alpha subunit (e.g., alpha subunit 1, Nav1.1). Upon expression and trans-splicing of the split intein fragments, Nav1.1 is reconstituted, whereas the trans-spliced intein is degraded along with the coupled degron via degron degradation pathways.

Description

SYSTEMS AND METHODS FOR IMPROVING SAFETY OF SPLIT INTEIN AAV MEDIATED GENE THERAPY
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application includes a claim of priority under 35 U.S.C. §119(e) to U.S. provisional patent application No. 63/532,994, filed August 16, 2023, the entirety of which is hereby incorporated by reference.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] This invention was made with Government support under grant no. MH 120095 awarded by the National Institutes of Health. The Government has certain rights in the invention.
REFERENCE TO SEQUENCE LISTING
[0003] This application contains a Sequence Listing submitted as a computer readable form named “067505_000010WOPT_SequenceListing.xml”, having a size in bytes of 198,105 bytes, and created on August 16, 2024. The information contained in this computer readable form is hereby incorporated by reference in its entirety.
FIELD OF DISCLOSURE
[0004] This invention relates to split intein mediated gene expression systems incorporating a degradation signal for removal of gene expression byproducts.
BACKGROUND
[0005] In gene therapy, the relatively small DNA packaging capacity of adeno-associated virus (AAV), which is restricted to the size of the parental genome (approximately 5 kb), prevents the application of AAV vectors for the treatment of diseases that arise due to mutations in genes with larger coding sequences. Split intein-mediated protein trans-splicing (PTS) has been evaluated as a strategy to reconstitute large proteins via AAV vectors, thus overcoming their limited cargo capacity. In this system, a large protein coding sequence is split into two or more parts, each tagged on at least one end (or, for internal parts of the split sequence, flanked) by sequences that encode split-inteins, which are independently cloned in two or more AAV vectors. Split-inteins are expressed as two independent polypeptides (N-intein and C-intein) at the extremities of the host polypeptides (N-polypeptide and C-polypeptide) and remain inactive until encountering their complementary partner. On counterpart association, the reconstituted intein excises itself from the host protein while mediating ligation of the N- and C-polypeptides via a peptide bond, in a traceless manner. However, AAV intein vectors encode components (e.g., excised inteins) of nonmammalian origin that could elicit immune or toxic responses in target cells and/or raise regulatory concerns for clinical translation.
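The packaging constraint described above can be made concrete with simple arithmetic. The sketch below compares coding-sequence lengths against the approximately 5 kb capacity stated in the text, using the 1998-residue length and the 1049/949 split that this description gives later for human sodium channel protein type 1 subunit alpha isoform 2. It counts only the sodium channel coding portion; intein, degron, promoter, enhancer, and other vector elements are deliberately left out, and the constant and function names are illustrative.

```python
AAV_CAPACITY_BP = 5000  # "approximately 5 kb" per the text; a round figure


def coding_length_bp(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally plus one stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


FULL_LENGTH_RESIDUES = 1998  # isoform 2 length implied by "residues 1050-1998"

full_bp = coding_length_bp(FULL_LENGTH_RESIDUES)
# The N-terminal part is followed by the N-intein, so no stop codon is counted.
n_part_bp = coding_length_bp(1049, include_stop=False)
c_part_bp = coding_length_bp(949)

for label, bp in [("full-length", full_bp),
                  ("N-terminal part", n_part_bp),
                  ("C-terminal part", c_part_bp)]:
    verdict = "exceeds" if bp > AAV_CAPACITY_BP else "fits within"
    print(f"{label}: {bp} bp {verdict} ~{AAV_CAPACITY_BP} bp")
```

The full-length coding sequence alone exceeds the nominal capacity, while each part leaves room for the additional elements a real vector needs.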
[0006] One exemplary application of AAV intein vectors is the delivery for expression and reconstitution of large protein molecules to supplement or salvage the function of endogenously deficient or mutated protein molecules due to one or more diseases or disorders.
[0007] Epilepsy is a neurological disorder that occurs when the brain presents an enduring predisposition to generate two or more epileptic seizures. An epileptic seizure is a temporary disruption of brain function due to abnormal excessive or synchronous neuronal activity. Its manifestation may include periods of unusual behavior, sensations and sometimes loss of consciousness. Dravet Syndrome (DS) particularly is a rare and catastrophic form of intractable epilepsy that begins in infancy. Initially, the patient experiences prolonged seizures. In their second year, additional types of seizures begin to occur and this typically coincides with a developmental decline. This leads to poor development of language and motor skills. Diseases such as Dravet syndrome, myoclonic seizures, and intractable childhood epilepsy with generalized tonic-clonic seizures often arise from mutations in the SCN1A gene.
[0008] Various sodium channelopathies resulting in seizure disorders and associated comorbidities are associated with reduction or loss of function of the sodium channels in cell membranes. Sodium channels are made up of large alpha subunits that may associate with accessory proteins, such as beta subunits. An alpha subunit forms the core of the channel and is functional on its own. When the alpha subunit protein is expressed by a cell, it is able to form a pore in the cell membrane that conducts Na+ in a voltage-dependent way, even if beta subunits or other known modulating proteins are not expressed. When accessory proteins assemble with alpha subunits, the resulting complex can display altered voltage dependence and cellular localization.
[0009] Therefore, it is an object of the present invention to provide a trans-splicing system with enhanced safety or reduced immunogenicity by removing byproducts.
[0010] It is another object of the present invention to improve the safety of a trans-splicing system for use in safe delivery of the coding sequence of a sodium channel protein in the treatment of Dravet Syndrome or epilepsy disorders.
[0011] All publications herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
SUMMARY OF THE DISCLOSURE
[0012] The following embodiments and aspects thereof are described and illustrated in conjunction with compositions and methods which are meant to be exemplary and illustrative, not limiting in scope.
[0013] Systems are provided to express one or more coding sequences. In various embodiments, a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; and (c) a third expression construct comprising a polynucleotide sequence encoding a degron.
[0014] In some embodiments, a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; wherein the first expression construct, the second expression construct, or both further comprise a polynucleotide sequence encoding a degron, located, if within the first expression construct, at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, or located, if within the second expression construct, at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit.
[0015] In other embodiments, a system includes the first expression construct and the second expression construct, either one or both further comprising the polynucleotide sequence encoding the degron, and a third expression construct coding for a same or different degron.
[0016] In some embodiments, a system for expressing one or more coding sequences includes (a) a first expression construct coding for a first fusion protein comprising a first segment of a sodium channel alpha subunit and an N-fragment of a split intein, wherein the first segment of the sodium channel alpha subunit is at the N-terminus relative to the N-fragment of the split intein; (b) a second expression construct coding for a second fusion protein comprising a C-fragment of the split intein and a second segment of the sodium channel alpha subunit, wherein the second segment of the sodium channel alpha subunit is at the C-terminus relative to the C-fragment of the split intein; and (c) a polynucleotide sequence encoding a degron, wherein the polynucleotide sequence encoding the degron is in a third expression construct different from the first or the second expression constructs OR is within the first and/or the second expression constructs. In various aspects, the first fusion protein and the second fusion protein are spliced together, thereby joining the first segment and the second segment of the sodium channel alpha subunit and joining the N-fragment and the C-fragment of the split intein. In various embodiments, the system comprising the first, second, and third expression constructs is co-transduced to a cell for expression of the first fusion protein, the second fusion protein, and the degron in the cell.
[0017] In various embodiments, the split intein is one having a fast splicing rate (e.g., half-lives within 5 min), such as consensus fast intein. In various embodiments, the degron is a polypeptide having no more than 30 amino-acid residues in length. In various embodiments, the split intein in the system is consensus fast intein, and the degron in the system is one having 9-26 or no more than 30 amino-acid residues in length. Without wishing to be bound by a particular theory, the splicing that joins the first and second segments of the sodium channel alpha subunit takes place faster than degron-mediated degradation, resulting in the joined segments (preferably a full sodium channel alpha subunit) locating in a cell membrane, whereas the joined intein is degraded via the degron.
[0018] In some embodiments, the first and/or the second expression construct further comprises an enhancer sequence, configured for targeted expression within a targeted cell type. In various embodiments, the expression construct(s) further comprises a promoter sequence. In some embodiments, the first and/or the second expression construct further comprises an intron having a polynucleotide sequence of SEQ ID NO: 107. In some embodiments, the first and/or the second expression construct further comprises the enhancer sequence, the promoter, and the intron.
[0019] In some embodiments, the gene encoding the sodium channel alpha subunit is selected from the group consisting of SCN1A, SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, and SCN7A. In some embodiments, the sodium channel alpha subunit comprises sodium channel protein type 1 subunit alpha, or the gene encoding the sodium channel alpha subunit comprises SCN1A. In some embodiments, the sodium channel alpha subunit comprises human sodium channel protein type 1 subunit alpha isoform 2. In some embodiments, the sodium channel alpha subunit comprises a variant of human sodium channel protein type 1 subunit alpha isoform 2 having an amino acid substitution of A1056T, wherein amino acid residue numbering is according to NCBI accession number NP_001340878.1.
[0020] In some embodiments, the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’ : the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the N-intein - the polynucleotide sequence encoding the degron. For example, the degron has an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO: 92).
[0021] In some embodiments, the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the degron - the polynucleotide sequence encoding the N-intein.
[0022] In some embodiments, the second expression construct comprises the polynucleotide sequence encoding the degron, and when expressed, the degron is two or more amino acid residues at the N-terminus relative to a protein product encoded by the second portion of the polynucleotide sequence of the SCN1A. For example, the degron has an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93).
[0023] In some embodiments, the second expression construct comprises the polynucleotide sequence encoding the degron, and the second expression construct comprises from 5’ to 3’: a first portion of the polynucleotide sequence encoding the C-intein - the polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence encoding the C-intein - the second portion of the polynucleotide sequence of the SCN1A, wherein when expressed, a protein product of the first portion of the polynucleotide sequence encoding the C-intein and a protein product of the second portion of the polynucleotide sequence encoding the C-intein together form the C-intein.
[0024] In various aspects, when the first and the second expression constructs are expressed, a protein product of the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit and a protein product of the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the sodium channel alpha subunit.
[0025] In various aspects, the degron has an amino acid sequence selected from the group consisting of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO: 92), and GSLIIFIIL (SEQ ID NO:93).
[0026] In various embodiments, the breakpoint of the first and second segment of the sodium channel alpha subunit is at a place wherein the first residue of the C-terminus segment (fragment) is Cys, Ser, or Thr.
[0027] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-1049 and residues 1050-1998 of sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
[0028] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-956 and residues 957-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
[0029] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-947 and residues 948-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.
[0030] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60).
[0031] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62).
[0032] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).
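The three breakpoints and the construct names above can be cross-checked with simple arithmetic: for a 1998-residue protein, the numeric suffix of each N-terminal and C-terminal construct name should equal the residue count of the corresponding segment. The sketch below performs that check; the dictionary layout and function name are illustrative and not part of the disclosure.

```python
TOTAL_RESIDUES = 1998  # isoform 2 length implied by the residue ranges above

# Last residue of the N-terminal segment -> (N suffix, C suffix) from the construct names
SPLITS = {
    1049: (1049, 949),   # Nterm1049 / Cterm949
    956: (956, 1042),    # Nterm956 / Cterm1042
    947: (947, 1051),    # Nterm947 / Cterm1051
}


def segment_lengths(last_n_residue, total=TOTAL_RESIDUES):
    """Return (N-segment length, C-segment length) for a split after last_n_residue."""
    return last_n_residue, total - last_n_residue


for last_n, named in SPLITS.items():
    computed = segment_lengths(last_n)
    status = "consistent" if computed == named else "MISMATCH"
    print(f"split {last_n}/{last_n + 1}: N={computed[0]}, C={computed[1]} ({status})")
```

All three splits come out consistent with the naming, which suggests the suffixes denote segment lengths in residues.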
[0033] In various embodiments, the split intein comprises consensus fast intein (Cfa); the degron is a polypeptide being 5-30 amino-acid residues in length or preferably 9-26 amino-acid residues in length; and a polypeptide product encoded by the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit starts with a cysteine, serine, or threonine residue. Preferably the first segment and the second segment of the sodium channel alpha subunits are even or about even sized; or the lengths of segments are not more than 20%, 30%, 40%, or 50% different.
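The two selection criteria stated above (a C-terminal segment that starts with Cys, Ser, or Thr, and segments of roughly even size) can be sketched as a simple scan over a protein sequence. The text does not define how the percentage difference between segment lengths is computed; the sketch assumes it is the absolute length difference divided by the longer segment. The toy sequence is hypothetical and is not a fragment of any sodium channel.

```python
def pct_length_difference(n_len, c_len):
    """Assumed definition: absolute difference relative to the longer segment, in percent."""
    return 100.0 * abs(n_len - c_len) / max(n_len, c_len)


def candidate_breakpoints(protein, max_pct_diff=20.0, allowed="CST"):
    """List 1-based positions whose residue could start the C-terminal segment.

    A position qualifies if its residue is in `allowed` and the two resulting
    segments differ in length by no more than max_pct_diff percent.
    """
    total = len(protein)
    hits = []
    for pos in range(2, total + 1):      # pos = first residue of the C-terminal segment
        n_len = pos - 1
        c_len = total - n_len
        if protein[pos - 1] in allowed and pct_length_difference(n_len, c_len) <= max_pct_diff:
            hits.append(pos)
    return hits


toy = "MAGLKVCAGSLLTAGKVAGL"             # hypothetical 20-residue sequence
print(candidate_breakpoints(toy, max_pct_diff=20.0))
print(candidate_breakpoints(toy, max_pct_diff=50.0))
print(round(pct_length_difference(1049, 949), 1))  # the 1049/949 split described above
```

Loosening the size tolerance admits more candidate sites, which mirrors the graded 20% to 50% thresholds in the text.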
[0034] In some embodiments, the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58).
[0035] In some embodiments, the first expression construct and the second expression construct independently further comprise the promoter sequence selected from a minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3, an hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 52, or a CMV promoter having a polynucleotide sequence of SEQ ID NO:53; optionally a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54. In some embodiments, the first expression construct and the second expression construct independently further comprise the minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3.
[0036] In some embodiments, the enhancer sequence is configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a targeted central nervous system cell type. In some embodiments, the targeted central nervous system cell type is a GABAergic neuron, a glutamatergic neuron, or both cell types. In some embodiments, the targeted central nervous system cell type is a GABAergic interneuron.
[0037] In some embodiments, the enhancer sequence is set forth in SEQ ID NO: 2 (DLX2.0). In some embodiments, the enhancer sequence has a concatemerized core having a polynucleotide sequence of SEQ ID NO: 1. In some embodiments, the enhancer sequence is a concatemerized repeat (2, 3, 4, 5, 6, 7, 8, 9, 10, or more contiguous repeats) of a polynucleotide sequence of SEQ ID NO: 1. In some embodiments, an enhancer sequence is set forth in SEQ ID NO: 55 (eHGT_078h), or the targeted central nervous system cell type comprises a glutamatergic neuron.
[0038] In some embodiments, the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a selected central nervous system cell type. In some embodiments, the miRNA binding site sequence is set forth in SEQ ID NO: 56 (4x2C miRNA binding site) or SEQ ID NO:87 (8x2C miRNA binding site), or the selected central nervous system cell type comprises a pan-GABAergic neuron.
[0039] An artificial expression construct is also provided, which includes the first, the second and/or the third expression construct of a system disclosed herein, wherein each expression construct is associated with a capsid that crosses the blood brain barrier. In some embodiments, a capsid that crosses the blood brain barrier comprises PHP.eB. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-BR1. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PHP.S. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PHP.B. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PPS.
[0040] Also provided is an administrable composition, which includes one or more artificial expression constructs disclosed herein, preferably in association with a capsid that crosses the blood brain barrier; and a pharmaceutically acceptable excipient.
[0041] A transgenic cell is also provided comprising a system of one or more expression constructs disclosed herein. In some embodiments, the transgenic cell comprises a GABAergic neuron, or more specifically a GABAergic interneuron. In some embodiments, the transgenic cell comprises a glutamatergic neuron.
[0042] Methods are also provided for rescuing voltage-gated sodium channel function within a targeted population of cells, the method comprising: co-administering a therapeutically effective amount of the two or more expression constructs of a system disclosed herein to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
[0043] Methods are also provided for administering a system of expression constructs to a subject in need thereof, the method comprising administering a therapeutically effective amount of a system disclosed herein, or co-administering two or more expression constructs of a system disclosed herein, to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
[0044] In some embodiments, the subject has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome. In some embodiments, the subject is a pediatric patient having Dravet syndrome.
[0045] Other features and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, various features of embodiments of the invention.
BRIEF DESCRIPTION OF THE FIGURES
[0046] Exemplary embodiments are illustrated in referenced figures. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than restrictive.
[0047] Figures 1A-1G: A functional split-intein design of SCN1A that reconstitutes functional Nav1.1 activity. (FIG. 1A) Design of split-intein fusion protein halves of SCN1A. We inserted the breakpoint and Cfa-N and Cfa-C split-intein peptides just before the native Cys1050 with no additional amino acids. After joining, the Cfa-N and Cfa-C intein fragments self-excise and yield scarless reconstituted SCN1A. HA and FLAG epitopes are inserted at the N- and C-termini of the N- and C-terminal halves for detection. (FIG. 1B) Cloning SCN1A split-intein fusion protein halves into CMV-driven plasmid vectors for testing functionality in cell lines, as well as IRES2-SYFP2 and IRES2-mScarlet transfection reporters. (FIG. 1C) Reconstitution of full-length SCN1A after co-transfection of split-intein fusion protein halves into HEK-293 cells. We analyzed whole cell protein preparations by western blot for HA epitope tag after transfection. Lanes: 1) empty vector, 2) full-length HA-SCN1A, 3) full-length SCN1A-FLAG, 4) HA-SCN1A-Ntm, 5) SCN1A-FLAG-Ctm, and 6) HA-SCN1A-Ntm plus SCN1A-FLAG-Ctm. Expected sizes: full-length HA-SCN1A 232 kDa, HA-SCN1A-Ntm 134 kDa, reconstituted full-length HA-SCN1A-FLAG after joining 233 kDa. Anti-tubulin is a loading control. (FIG. 1D) Exemplary currents evoked in HEK293 cells transfected with full-length SCN1A (SCN1A-FL, green), a combination of SCN1A-Ntm and SCN1A-Ctm (SCN1A-N+C, orange), SCN1A-Ntm only (SCN1A-N, blue) and SCN1A-Ctm only (SCN1A-C, blue), in response to a family of step depolarizations from a holding potential of -120 mV to 40 mV, in 5 mV increments. Co-transfection with SCN1A-Ntm and SCN1A-Ctm produced functional Nav1.1 currents comparable in size to full-length SCN1A. Scale bar: 1 nA, 10 msec. (FIG. 1E) Peak current densities, normalized to capacitance, for full-length SCN1A (FL, n= 39), SCN1A-Ntm + SCN1A-Ctm (N+C, n= 28), SCN1A-Ntm (N, n= 15), SCN1A-Ctm (C, n= 11), GFP/empty vector (GFP, n= 11) and SCN1B/2B (β1/2, n= 13).
Medians displayed with data points. FL and N+C datasets are significantly different (**, p= 0.0083). N+C is highly significantly different compared to N or C (****, p< 0.0001). N and C are not significantly different from GFP and β1/2 controls (p> 0.05, ns). Statistical comparisons performed by pairwise Mann-Whitney U tests. All HEK293 transfections performed in the background of a separate SCN1B/2B expression plasmid, at a ratio of 10:1. (FIG. 1F) Voltage-dependent gating properties of WT full-length human SCN1A (SCN1A-FL) and reconstituted SCN1A channels formed by co-expression of N-terminal SCN1A-Cfa intein (SCN1A-N) and Cfa intein-C-terminal SCN1A (SCN1A-C) plasmid constructs, acutely expressed in HEK293 cells and characterized by whole-cell patch-clamp recordings. Currents activated by a family of step depolarizations from SCN1A-FL (green) and reconstituted SCN1A-N+C (orange), from a holding potential of -120 mV (left panels). Steady-state inactivation current traces evoked by a step to -20 mV, after a family of 1 sec preconditioning steps from -120 mV to 30 mV (right panels). Scale bars for activation currents (1F, left panels) equal 1.0 nA, and for SSI currents (1F, right panels) equal 0.5 nA; time equals 1.0 msec for all panels. (FIG. 1G) Conductance-voltage (G/V) and steady-state inactivation (SSI) plots for SCN1A-FL (green) and reconstituted SCN1A-N+C (orange) channels, fitted to single Boltzmann functions. G/V plots for both constructs are indistinguishable, whereas the SSI plot for SCN1A-N+C is shifted by +5 mV relative to SCN1A-FL. Statistical significance derived from unpaired pairwise t-tests, assuming equal variance (* p< 0.05; ** p< 0.01).
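The caption above refers to G/V and SSI curves fitted to single Boltzmann functions but does not reproduce the functional form. The sketch below uses the commonly used two-parameter form (half-maximal voltage and slope factor), which is an assumption about the exact parameterization, together with a chord-conductance conversion that uses the 35 mV Na+ reversal potential stated in the recording methods. The half-maximal voltage and slope values are hypothetical and serve only to illustrate what a +5 mV shift in the SSI curve means.

```python
import math

E_REV_MV = 35.0  # Na+ reversal potential stated in the recording methods


def chord_conductance(i_peak_na, v_mv, e_rev_mv=E_REV_MV):
    """Conductance in microsiemens from peak current (nA) and step voltage (mV).

    Returns None at the reversal potential, where the driving force is zero.
    """
    driving_force = v_mv - e_rev_mv
    if driving_force == 0:
        return None
    return i_peak_na / driving_force


def boltzmann_activation(v_mv, v_half_mv, slope_mv):
    """Normalized conductance G/Gmax; rises with depolarization."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


def boltzmann_inactivation(v_mv, v_half_mv, slope_mv):
    """Normalized availability I/Imax; falls with depolarization."""
    return 1.0 / (1.0 + math.exp((v_mv - v_half_mv) / slope_mv))


# Hypothetical parameters, NOT values from the source: a baseline SSI curve
# and one shifted by +5 mV, evaluated at the same preconditioning voltage.
V_HALF_BASE, SLOPE = -60.0, 6.0
baseline = boltzmann_inactivation(-60.0, V_HALF_BASE, SLOPE)
shifted = boltzmann_inactivation(-60.0, V_HALF_BASE + 5.0, SLOPE)
print(f"availability at -60 mV: baseline {baseline:.2f}, shifted {shifted:.2f}")
```

A depolarizing shift of the SSI curve leaves a larger fraction of channels available at any given preconditioning voltage, which is the sense in which the reconstituted channel differs from the full-length channel in FIG. 1G.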
[0048] Figures 2A-2F: Cell class-specific delivery of SCN1A to telencephalic GABAergic interneurons using optimized enhancer DLX2.0. (FIG. 2A) Recombinant AAV2/PHP.eB vectors for delivery of DLX2.0-split-intein-SCN1A. (FIG. 2B) Efficient SCN1A reconstitution in mouse brain with DLX2.0-split-intein-SCN1A vectors. In panels 2B-2F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector. At P24 we analyzed mouse brain membrane protein fractions by western blotting with antibodies targeting HA or FLAG epitope tags, the C-terminus of NaV1.1, and alpha-tubulin as a loading control. Note the PBS-injected negative control is the same lane as that shown in Fig. 6B as these experiments were performed together. (FIG. 2C) Specific detection of HA- and FLAG-expressing GABA+ cells in cortex. HA- and FLAG-expressing cells can be observed when either half is delivered alone or together. Layer 2/3 of VISp is shown at P20. (FIG. 2D) Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in scattered telencephalic neurons. Expression is pseudo-colored black. (FIG. 2E) Representative stitched fluorescence image of HA and FLAG epitopes, and Gad67+ neurons in VISp. In panels D-E we show expression at P47. (FIG. 2F) High specificity and completeness of expression within Gad67+ neurons in multiple telencephalic regions across multiple animals. We counted cells that express both HA and FLAG epitopes in VISp, MO, and HPF. Layer 1 was excluded from VISp and MO analysis due to DLX2.0-PHP.eB vectors inefficiently targeting that layer. Each point represents one mouse, bars represent the means, and error bars represent standard error of the mean. Mice span ages P47-P139, mean age P85. Abbreviations: CTX cerebral cortex, OLF olfactory areas, HPF hippocampal formation, OT olfactory tubercle, STR striatum, VISp primary visual cortex, MO motor cortex.
[0049] Figures 3A-3E: Recovery of mortality and epileptic symptoms in DS model mice with DLX2.0-split intein-SCN1A. (FIG. 3A) Experimental timeline for rescue of epileptic symptoms in DS model mice. We used Scn1afl/+;Meox2-Cre animals on a pure C57BL/6 background to model DS45. We injected P0-P3 pups with empty control, single-part alone control, or telencephalic GABAergic interneuron-targeting dual DLX2.0 N+C SCN1A AAVs (3e10 gc each vector per animal), and monitored mortality until P70. Some animals were tested for seizure susceptibility by thermal challenge between P25-P35. (FIG. 3B) Mortality protection in DS model mice after treatment with DLX2.0-SCN1A AAVs. Mice treated with DLX2.0 N+C SCN1A AAVs (n= 27) exhibit significantly greater survival than untreated mice (n= 68, *** Log-rank test p= 1.4e-5) or mice treated with empty or single-part vector controls (n= 40, *** Log-rank test p= 6.6e-4). Note the untreated control group in Fig. 3B-3E represents the same set of untreated animals as that shown in Fig. 7B, 7D-7F. (FIG. 3C-3E) Protection from heat-induced seizures in DS model mice after treatment with DLX2.0-SCN1A AAVs. (FIG. 3C) DS model mice treated with DLX2.0 N+C SCN1A AAVs (n= 16) are significantly less likely to exhibit MC seizures by 42°C than untreated mice (n= 11, * Fisher’s exact test p= 0.042) or mice treated with empty or single-part vector controls (n= 27, * Fisher’s exact test p= 0.011). (FIG. 3D) DS model mice treated with DLX2.0 N+C SCN1A AAVs are significantly less likely to exhibit GTC seizures by 42°C than untreated mice (** Fisher’s exact test p= 0.0025) or mice treated with empty or single-part vector controls (** Fisher’s exact test p= 0.00010). (FIG. 3E) DS model mice treated with DLX2.0 N+C SCN1A AAVs exhibit significantly fewer MC events during thermal challenge assay than untreated mice or mice treated with empty or single-part vector controls (** unpaired t-test p< 0.001 at each indicated timepoint, for comparison to Untreated and Empty/single part negative control animals).
[0050] Figures 4A-4F: Recovery of spontaneous epileptic symptoms in DS model mice with DLX2.0-SCN1A AAVs. (FIG. 4A-4B) Interictal spike reduction in DS model mice. We implanted Scn1afl/+; Meox2-Cre DS model mice with ECoG/EMG electrodes, which revealed frequent interictal spikes generalized across the brain in both left (L) and right (R) channels as in our previous work45,50. (FIG. 4A) Example interictal spikes are shown in untreated mice. (FIG. 4B) Counting interictal spikes in untreated and treated (3e10 gc each DLX2.0 N+C SCN1A AAV vector delivered BL-ICV at P0-P3) mice reveals a significantly decreased frequency of interictal spikes with treatment. n= 10 animals per condition, each circle represents one animal, bars and error bars represent means and standard error of the means. *** p = 0.0001 by two-tailed unpaired t-test. (FIG. 4C-4D) Spontaneous MC event prevention in DS model mice. (FIG. 4C) Example MC event observed in untreated Scn1afl/+; Meox2-Cre DS model mice, which have sharp spikes followed closely by EMG signals. (FIG. 4D) We categorized mice as having MC events or not having MC events during the recording period, which revealed a highly significant prevention of MCs in the treated animals. *** p = 7.1e-4 by Fisher’s exact test. (FIG. 4E-4F) Spontaneous GTCs in DS model mice. (FIG. 4E) Example spontaneous GTC seizure observed in an untreated Scn1afl/+; Meox2-Cre DS model mouse. (FIG. 4F) We categorized mice as having or not having GTC seizures during the recording period, which revealed treated animals did not exhibit GTC seizures although this effect was not significant (ns, p = 0.47 by Fisher’s exact test).
[0051] Figures 5A-5D: Recovery of severe epileptic phenotypes in mice lacking SCN1A in telencephalic GABAergic interneurons using DLX2.0-SCN1A AAVs. (FIG. 5A) Genetic cross resulting in 100% Scn1a+/fl; Dlx5/6-Cre animals. (FIG. 5B) Survival curves for treated and untreated Scn1a+/fl; Dlx5/6-Cre animals. Untreated pups exhibit 100% mortality by the 6th week of life (n= 31/31). In contrast, pups injected with DLX2.0-split-intein-SCN1A (3e10 gc each vector at P2 by BL-ICV) exhibit 100% survival through the 10th week of life (n= 9/9). *** p= 3.3e-14 by Fisher’s exact test. (FIG. 5C-5D) Protection from seizures in treated Scn1a+/fl; Dlx5/6-Cre animals during thermal challenge assay. (FIG. 5C) Significant protection from thermally induced MC seizures (*** p= 7.1e-4, Fisher’s exact test). (FIG. 5D) Significant protection from thermally induced GTC seizures (*** p= 1.1e-5, Fisher’s exact test).
[0052] Figures 6A-6F: Nonselective delivery of SCN1A to neurons using hSyn1 promoter. (FIG. 6A) Recombinant AAV2/PHP.eB vectors for delivery of hSyn1-split-intein-SCN1A. (FIG. 6B) Efficient SCN1A reconstitution in mouse brain with hSyn1-split-intein-SCN1A vectors. In panels 6B-6F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector (6e10 total gc in the N+C animals). At P24 we analyzed mouse brain membrane protein fractions by western blotting with antibodies targeting HA or FLAG epitope tags, the C-terminus of Nav1.1, and alpha-tubulin as a loading control. Note the PBS-injected negative control lane is the same lane as that shown in Fig. 2B as these experiments were performed together. (FIG. 6C) HA- and FLAG-expressing NeuN+ and Gad67+ cells in cortex after co-injection with N- and C-terminal vectors. White arrows indicate NeuN+ and Gad67+ cells that express FLAG but not HA. Cyan arrows indicate NeuN+ and Gad67+ cells that express both FLAG and HA. Layer 2/3 of VISp is shown at P84. (FIG. 6D) Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in neurons after co-injection with N- and C-terminal vectors. Expression is pseudo-colored black. Expression shown at P84. (FIG. 6E) Representative stitched fluorescence image of HA and FLAG epitopes and NeuN+ neurons throughout the layers of VISp. Expression shown at P84. (FIG. 6F) Quantification of specificity and completeness of expression within Gad67+ and NeuN+ neurons in multiple telencephalic regions. We counted cells that express both HA and FLAG epitopes in VISp, MO, and HPF (layer 1 was excluded from VISp and MO analysis due to hSyn1-PHP.eB vectors inefficiently targeting that layer). Each point represents one mouse, bars represent the means, and error bars represent standard error of the mean. As expected, hSyn1-driven expression shows specificity for NeuN+ cells but not Gad67+ cells. Mice span ages P76-P86, mean age P81. Abbreviations: CTX cerebral cortex, OLF olfactory areas, HPF hippocampal formation, STR striatum, MB midbrain, HB hindbrain, CBX cerebellar cortex, VISp primary visual cortex, MO motor cortex.
[0053] Figures 7A-7F: Early pre-weaning toxicity and weak protection from epileptic symptoms with nonselective SCN1A vectors. (FIG. 7A) Experimental timeline for rescue of epileptic symptoms in DS model mice with nonselective hSyn1 promoter-driven SCN1A vectors. (FIG. 7B-7C) Preweaning mortality in DS model mice after treatment with nonselective SCN1A AAVs. (FIG. 7B) Mice treated with hSyn1 N+C SCN1A AAVs at high dose (3e10 gc each vector, n= 18) or low dose (1e10 gc each vector, n= 36) exhibit significantly greater preweaning mortality by P21 than untreated mice (n= 68) and empty/single part negative control mice (n= 33). *** p < .001 at P21 timepoint versus untreated by Fisher’s exact test. Low dose versus untreated p= 3.6e-4; low dose versus empty/single p= 0.014; high dose versus untreated p= 6.6e-6; high dose versus empty/single p= 4.9e-4. We did not observe significant effects on survival in the post-weaning period. Note the untreated control groups in Fig. 7B, 7D-7F represent the same set of untreated animals as that shown in Fig. 3B-3E. (FIG. 7C) From analysis of recovered genotypes at P21 we inferred that both DS and littermate control animals were similarly affected by nonselective SCN1A AAV lethality. FD: found dead. (FIG. 7D-7F) Protection from heat-induced seizures in DS model mice after treatment with high-dose nonselective SCN1A AAVs. (FIG. 7D) DS model mice treated with high-dose hSyn1 N+C SCN1A AAVs (n= 5) are significantly less likely to exhibit MC seizures by 42°C than Empty/single part mice (n= 18, * Log-rank test p= 0.020). (FIG. 7E) DS model mice treated with high-dose hSyn1 N+C SCN1A AAVs exhibit a trend towards less GTC seizure likelihood by 42°C than empty/single part mice (* Log-rank test p= 0.047). (FIG. 7F) Mice treated with high-dose hSyn1 N+C SCN1A AAVs exhibit significantly fewer MC events during thermal challenge assay than mice treated with empty or single-part vector controls (* p < 0.05, unpaired t-test at each indicated temperature for High-dose versus Untreated or Empty/single animals). In D-F, we did not observe a protective effect against seizures from low-dose hSyn1 N+C AAVs (n= 12).
[0054] Figures 8A-8E: Isoform usage and allele prevalence of human SCN1A. (FIG. 8A-8B) SCN1A isoform usage across cortical cell type subclasses in mice (8A) and humans (8B). Mouse VISp cortical cell type-specific RNA-seq profiles are from Tasic, et al., Nature 563, 72 (2018) and human middle temporal gyrus (MTG) cortical cell type-specific profiles are from Hodge, et al., Nature 573, 61-68 (2019). Genome-aligned reads are aggregated according to cell type subclasses, and visualized as pileups on UCSC genome browser alongside the positions of the exon whose splice donor usage determines whether the 2009-, 1998-, or 1981-amino acid isoform of SCN1A is expressed. Regions shown: mm10 chr2:66324527-66325003, hg38 chr2:166043623-166044099 (reverse complement reference sequences for legibility). Full vertical scale represents 0.65 (mouse) or 0.4 (human) read counts per million. (FIG. 8C) Alignment of mammalian SCN1A protein sequences. The alanine residue at position 1056 in the NCBI RefSeq human SCN1A sequence is orthologous to a conserved threonine residue in most other mammalian species. Additionally, Origene commercial clones of human SCN1A contain a threonine residue at this position, but agree at all other positions. Sequences used for alignment: human NP_001340878.1, gorilla XP_055236229.1, chimpanzee XP_054535897.1, macaque XP_001101023.1, mouse lemur XP_020143996.1, domestic ferret XP_004744052.1, rat NP_110502.2, mouse NP_001300926.1, human Origene clone catalog number RG220167. Amino acid position numbering is according to the human 1998 amino acid isoform (NP_001340878.1). (FIG. 8D) SCN1A sequences in donated human brain samples. DNA sequences represent unique (non-PCR duplicate) reads from assay for transposase-accessible chromatin with sequencing (ATAC-seq) from three independent human patient brain samples (H17.26.001, H17.26.003, H18.03.001). (FIG. 8E) Population allele frequencies in human SCN1A via gnomAD database. gnomAD v3 and v2 populations represent partially overlapping healthy patient populations subject to genome and/or exome sequencing. The threonine-encoding allele is represented by a T on the + strand (corresponding to A on the - strand) in 73-74% of the population.
[0055] Figures 9A-9C: Cell class specificity observed with an independent marker of telencephalic GABAergic interneurons. Cell type-specific expression of SCN1A in Dlx5/6-Cre; Ai14 reporter mice. We injected these mice at P2 BL-ICV with 3e10 gc each DLX2.0-SCN1A AAV vector produced at PackGene and analyzed expression at P30-P35 with both sagittal and coronal sections (n = 4 mice analyzed). We analyzed expression specifically in somatosensory cortex (SS) and motor cortex (MC), and counted absolute numbers of expressing cells in somatosensory cortex (9A), hippocampus (9B), and motor cortex (9C).
[0056] Figures 10A-10E: Biodistribution and rescue of epileptic symptoms from independently produced batches of DLX2.0-split intein SCN1A vectors. Specific expression of HA-tagged N-terminal and FLAG-tagged C-terminal SCN1A half-channels in Gad67+ neurons with independent packaging of DLX2.0-SCN1A AAVs. Animals were dosed at low dose (1e10 gc each vector) or high dose (3e10 gc each vector) of DLX2.0-SCN1A AAVs by BL-ICV at P0-P3. (FIG. 10A) Telencephalic GABAergic interneuron specificity is maintained while completeness increases with greater dose of AAV. (FIG. 10B-10E) Protection from mortality and heat-induced seizures in Scn1afl/+; Meox2-Cre mice with independently packaged DLX2.0-SCN1A AAV vectors. (FIG. 10C) Trend towards protection from mortality with high dose of DLX2.0-SCN1A AAV vectors (p = 0.06 by Log-rank test, High dose [n = 10] versus Untreated [n = 68]). (FIG. 10D-10E) Dose-dependent protection from heat-induced MC (10D) and GTC (10E) seizures. * p < 0.05 for comparisons of High dose (n= 10) versus Untreated (n= 11) or Empty/single vector (n= 23) negative controls by Log-rank test. We did not observe significant seizure protection with low doses of DLX2.0-SCN1A AAVs.
[0057] Figures 11A-11E: Rescue of mortality and epileptic symptoms in a second independent mouse model of DS. (FIG. 11A) Testing DLX2.0-split-intein-SCN1A in an independent mouse model of DS. We generated Scn1a+/R613X mice on a mixed F1 C57Bl/6:129Sv background, injected neonates BL-ICV with DLX2.0-split-intein-SCN1A (3e10 gc each vector), monitored mortality over the first 70 days of life, and then performed ECoG seizure monitoring on rescued animals following mortality monitoring. (FIG. 11B) Mortality monitoring after DLX2.0-split-intein-SCN1A administration. N+C DLX2.0-SCN1A provides highly significant complete rescue from mortality to beyond 365 days. *** p < .001 by Mantel-Cox test (chi-square= 77.69, df= 7) after correction for multiple planned comparisons (3 comparisons of injection materials against untreated Scn1a+/R613X mice). All other within-genotype comparisons are not significant (p> 0.05) after accounting for multiple comparisons. (FIG. 11C) Seizure monitoring and ECoG in DS model mice. After P70, we implanted headmounts and monitored animals for seizures and epileptic activity by paired ECoG with channels in somatosensory (SS) and parietal (P) cortex, EMG, and video monitoring. Example #1: untreated Scn1a+/R613X mouse displays several non-uniformly distributed GTC seizures over 228 hours of recording (one GTC seizure shown), as well as interictal spikes marked by red dots (zoom in on one example spike). Example #2: N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse displays no GTCs and few spikes. Example #3: N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse shows no GTCs but frequent spikes with aberrant spike morphology, likely an outlier. (FIG. 11D) Protection from GTCs with treatment in DS model mice. We manually quantified GTC events over the full recording session for each animal, and displayed results as events per 24 hour recording period, which revealed a significant seizure reduction with administration of N+C DLX2.0-SCN1A. * Mann-Whitney U test p= 0.020; data are non-normally distributed according to Shapiro-Wilk test (untreated p= 0.018; N+C DLX2.0-SCN1A p= 1.0e-6). We also observe that all untreated Scn1a+/R613X mice with zero GTCs during the recording session eventually survive beyond P150. Marked mice (#1, #2, #3) correspond to the example mice displayed in panel 11C. (FIG. 11E) Fewer spikes with treatment in DS model mice. We identified spikes over the final day of the recording session for each animal using the line length threshold method. Quantified spikes are contemporaneous in both SS and P channels, as understood for genetic generalized epilepsy such as DS. Marked mice (#1, #2, #3) correspond to the example mice in panel 11C. Despite outlier injected example mouse #3 having aberrant frequent spikes, we observe significantly fewer spikes in the injected mice versus untreated mice (* Mann-Whitney U test p= 0.033). Data are non-normally distributed according to Shapiro-Wilk test (untreated p= 0.0015; N+C DLX2.0-SCN1A p= 1.5e-6).
DESCRIPTION OF THE DISCLOSURE
[0058] All references cited herein are incorporated by reference in their entirety as though fully set forth. Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 3rd ed., Revised, J. Wiley & Sons (New York, NY 2006); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 7th ed., J. Wiley & Sons (New York, NY 2013); and Sambrook and Russel, Molecular Cloning: A Laboratory Manual 4th ed., Cold Spring Harbor Laboratory Press (Cold Spring Harbor, NY 2012), provide one skilled in the art with a general guide to many of the terms used in the present application.
[0059] One skilled in the art will recognize many methods and materials similar or equivalent to those described herein, which could be used in the practice of the present invention. Indeed, the present invention is in no way limited to the methods and materials described. For purposes of the present invention, the following terms are defined below.
[0060] “Intein” refers to a polypeptide sequence capable of catalyzing a protein splicing reaction that excises its own sequence from the host protein and joins the flanking sequences (N- and C-exteins) with a peptide bond. Intein excision is a posttranslational process that does not require auxiliary enzymes or cofactors. This self-excision process is called “protein splicing,” by analogy to the splicing of RNA introns from pre-mRNA (Perler F et al, Nucl Acids Res. 22:1125-1127 (1994)). The segments are called “intein” for internal protein sequence, and “extein” for external protein sequence, with upstream exteins termed “N-exteins” and downstream exteins called “C-exteins.” The products of the protein splicing process are two stable proteins: the mature protein and the intein. Inteins are typically 150-550 amino acids in size and may also contain a homing endonuclease domain. A list of known inteins, and exemplary mutually orthogonal split inteins, are shown at www.inteins.com, described by Pinto et al. in Nature Communications 2020, 11:1529, and provided in, for example, US2023/0116688, which is incorporated by reference herein.
[0061] Known inteins share a low degree of sequence similarity, with conserved residues only at the N- and C-termini. Most inteins begin with Ser or Cys and end in His-Asn or in His-Gln. In various embodiments, the first amino acid of the C-extein is an invariant Ser, Thr, or Cys, but the residue preceding the intein at the N-extein is not conserved.
[0062] The term “split intein” as used herein refers to any intein in which the N-terminal and C-terminal amino acid sequences are not directly linked via a peptide bond, such that the N-terminal and C-terminal sequences become separate fragments that can non-covalently re-associate, or reconstitute, into an intein that is functional for trans-splicing reactions.
[0063] A split intein involves two complementary half inteins, termed the N-intein and C-intein, that associate selectively and extremely tightly to form an active intein enzyme (Shah N.H., et al, J. Amer. Chem. Soc. 135:18673-18681; Dassa B., et al, Nucl. Acids Res., 37:2560-2573 (2009)). The two fragments of the split intein are encoded by two separately transcribed and translated genes. These so-called split inteins self-associate and catalyze protein-splicing activity in trans. Split inteins have been identified in diverse cyanobacteria and archaea (Caspi et al, Mol Microbiol. 50:1569-1577 (2003); Choi J. et al, J Mol Biol. 556:1093-1106 (2006); Dassa B. et al, Biochemistry. 46:322-330 (2007); Liu X. and Yang J., J Biol Chem. 275:26315-26318 (2003); Wu H. et al, Proc Natl Acad Sci USA. 95:9226-9231 (1998); and Zettler J. et al, FEBS Letters. 553:909-914 (2009)), but have not been found in eukaryotes thus far. Recently, a bioinformatic analysis of environmental metagenomic data revealed 26 different loci with a novel genomic arrangement. At each locus, a conserved enzyme coding region is interrupted by a split intein, with a freestanding endonuclease gene inserted between the sections coding for intein subdomains. Among them, five loci were completely assembled: DNA helicases (gp41-1, gp41-8); Inosine-5'-monophosphate dehydrogenase (IMPDH-1); and Ribonucleotide reductase catalytic subunits (NrdA-2 and NrdJ-1). This fractured gene organization appears to be present mainly in phages (Dassa et al, Nucleic Acids Research. 37:2560-2573 (2009)).
[0064] The terms “split intein N-fragment,” “N-fragment of a split intein,” “N-terminal split intein,” “N-terminal intein fragment,” and “N-terminal intein sequence” (abbreviated “IntN” or “N-intein”) refer to any intein sequence that comprises an N-terminal amino acid sequence that is functional for trans-splicing reactions, that is, that is capable of associating with a functional split intein C-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split intein C-fragment catalyzes the “N-terminal cleavage”, that is, the nucleophilic attack of the peptide bond between the extein and the N-terminus of the split intein N-fragment resulting in the breaking of said peptide bond. An IntN thus also comprises a sequence that is spliced out when trans-splicing occurs. An IntN can comprise a sequence that is a modification of the N-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntN non-functional in trans-splicing. Preferably, the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntN. In various embodiments, the N-intein is fused to the N-terminal fragment of a protein to be reconstituted, wherein the N-intein is at the C-terminus relative to the N-terminal fragment of the protein to be reconstituted.
[0065] The terms “split intein C-fragment,” “C-fragment of a split intein,” “C-terminal split intein,” “C-terminal intein fragment,” and “C-terminal intein sequence” (abbreviated “IntC” or “C-intein”) refer to any intein sequence that comprises a C-terminal amino acid sequence that is functional for trans-splicing reactions, that is, that is capable of associating with a functional split intein N-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split N-intein catalyzes the “C-terminal cleavage”, that is, the nucleophilic attack of the peptide bond between the extein and the C-terminus of the split intein C-fragment resulting in the breaking of said peptide bond. An IntC thus also comprises a sequence that is spliced out when trans-splicing occurs. An IntC can comprise a sequence that is a modification of the C-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntC non-functional in trans-splicing. Preferably, the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntC. In various embodiments, the C-intein is fused to the C-terminal fragment of a protein to be reconstituted, wherein the C-intein is at the N-terminus relative to the C-terminal fragment of the protein to be reconstituted.
[0066] “Degron,” “degradation signal,” or “destabilizing domain” refers to a naturally-occurring or artificially-constructed polypeptide sequence which, when recombinantly fused to another polypeptide, accelerates degradation of that polypeptide via the proteasomal degradation pathway or any other cellular degradation mechanism.
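The trans-splicing behavior defined above for IntN and IntC can be illustrated with a minimal string-level sketch. This is only a toy model under stated assumptions: the function name `trans_splice`, the class `SplicingProducts`, and all four placeholder strings are hypothetical and are not sequences from this disclosure; the chemistry of splicing is not modeled, only the bookkeeping of which segments are joined and which are spliced out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SplicingProducts:
    """Outcome of the toy trans-splicing step."""
    reconstituted_protein: str                   # N-extein joined directly to C-extein
    released_intein_fragments: Tuple[str, str]   # (IntN, IntC), both spliced out


def trans_splice(n_half: str, c_half: str, int_n: str, int_c: str) -> SplicingProducts:
    """Join the exteins of two half-proteins and release the intein fragments.

    n_half is assumed to be laid out as N-extein followed by IntN.
    c_half is assumed to be laid out as IntC followed by C-extein.
    """
    if not int_n or not int_c:
        raise ValueError("IntN and IntC must be non-empty")
    if not n_half.endswith(int_n):
        raise ValueError("N-terminal half must end with the IntN sequence")
    if not c_half.startswith(int_c):
        raise ValueError("C-terminal half must start with the IntC sequence")
    n_extein = n_half[: -len(int_n)]
    c_extein = c_half[len(int_c):]
    return SplicingProducts(n_extein + c_extein, (int_n, int_c))


# Hypothetical placeholder strings, not real protein or intein sequences.
N_EXTEIN, INT_N = "AAAAA", "xxxx"
INT_C, C_EXTEIN = "yyyy", "BBBBB"

products = trans_splice(N_EXTEIN + INT_N, INT_C + C_EXTEIN, INT_N, INT_C)
print(products.reconstituted_protein)
print(products.released_intein_fragments)
```

In this sketch the two intein fragments are returned separately rather than concatenated, reflecting that the halves of a split intein are not joined by a peptide bond.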
[0067] “Enhancer” or “enhancer element” refers to a cis-acting sequence that increases the level of transcription associated with a promoter and can function in either orientation relative to the promoter and the coding sequence that is to be transcribed and can be located upstream or downstream relative to the promoter or the coding sequence to be transcribed. In preferable embodiments, an “enhancer” is a DNA regulatory element that confers cell type specificity of gene expression. For example, a targeted central nervous system cell type enhancer is an enhancer that is uniquely or predominantly utilized by the targeted central nervous system cell type; and a targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system cell type, but does not substantially direct expression of genes in other non-targeted cell types, thus having neural specific transcriptional activity. Examples of enhancers, especially interneuron-specific enhancers, are provided in US2021/0348195 and US2018/0078658, which are incorporated by reference.
[0068] Neurons found in the mammalian (e.g., human) nervous system can be divided into three classes based on their roles: sensory neurons, motor neurons, and interneurons. Targeted cell types can be identified based on transcriptional profiles, such as those described in Tasic et al., 2018 Nature. For example, GABAergic interneurons express GABA synthesis genes Gad1/GAD1 and/or Gad2/GAD2; whereas glutamatergic neurons express glutamate transporter genes Slc17a6 and/or Slc17a7.
[0069] Ion transporters are transmembrane proteins that mediate transport of ions across cell membranes. In particular embodiments, ion transporters include voltage gated sodium channels, potassium channels, and calcium channels.
[0070] Mammalian voltage-gated sodium (Nav) channels are composed of a highly glycosylated ~260 kDa α subunit, the pore forming protein, linked via disulfide bonds to β2/β4 subunits and non-covalently with β1/β3 subunits. Nine Nav1 α subunit genes (SCN1A-SCN9A) have been identified in mammals, constituting the Nav1 gene subfamily.
[0071] The Nav channel α subunit is a complex of transmembrane helices surrounding a central ion-conducting pore, usually capable of producing functional channels in a heterologous expression system. Approximately 2000 amino acid residues are arranged in 4 homologous domains, each consisting of 6 transmembrane segments, and a hairpin loop that lines the pore and includes the selectivity filter. An additional family of accessory β subunits also exists, split into 2 groups discriminated by their mechanism of interaction with the α subunit: disulphide-linked β2 and β4; and non-covalently associated β1 (including splicing variant) and β3; wherein the extracellular immunoglobulin-like domain of the β subunit is important for surface expression and modulation of α subunit gating, while the transmembrane domain influences Nav voltage-dependence.
[0072] The SCN1A gene codes for the alpha subunit of the Nav1.1 channel. The Nav1.1 channel is mainly responsible for the generation and propagation of neuronal action potentials. Different mutations in this gene are associated with epilepsy and febrile seizures. SCN1A may encode proteins of various isoforms, including but not limited to NP_001340878.1, NP_001159435.1, NP_001189364.1, and those provided in biogps.org/#goto=genereport&id=6323. Situated at position 2q24.3, SCN1A is part of a cluster of voltage-gated sodium channel genes that is home to SCN2A, SCN3A, SCN7A, as well as SCN9A, which encode Nav1.2, Nav1.3, Nax, and Nav1.7, respectively.
[0073] The Nav1.1 open-reading frame is believed to be organized into 26 exons and encodes a protein of between 1976 and 2009 amino acids. Generally, the variance in length stems from alternative splice junctions at the end of exon 11 that produce a full-length isoform or shortened versions thereof.
[0074] Coding sequences encoding molecules (e.g., RNA, proteins) described herein can be readily obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule. The term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
[0075] The term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
[0076] In various embodiments, expression constructs are provided within vectors. “Vector” refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct. Examples of vectors include plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.
[0077] “Adenovirus vectors” refer to those constructs containing adenovirus sequences sufficient to support packaging of an expression construct and to express a coding sequence that has been cloned therein in a sense or antisense orientation. A recombinant Adenovirus vector includes a genetically engineered form of an adenovirus. Both ends of the viral genome contain 100-200 base pair inverted repeats (ITRs), which are cis elements necessary for viral DNA replication and packaging. The early (E) and late (L) regions of the genome contain different transcription units that are divided by the onset of viral DNA replication. The E1 region (E1A and E1B) encodes proteins responsible for the regulation of transcription of the viral genome and a few cellular genes. The expression of the E2 region (E2A and E2B) results in the synthesis of the proteins for viral DNA replication. These proteins are involved in DNA replication, late gene expression, and host cell shut-off. The typical vector is replication defective and will not have an adenovirus E1 region.
[0078] In particular embodiments, vectors (e.g., AAV) with capsids that cross the blood-brain barrier (BBB) are selected. In particular embodiments, vectors are modified to include capsids that cross the BBB. Examples of AAV with viral capsids that cross the blood brain barrier include AAV9, AAVrh.10, AAV1R6, AAV1R7, rAAVrh.8, AAV-BR1, AAV-PHP.S, AAV-PHP.B, and AAV-PPS. The PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO:46) are changed to S-DGTLAVPFK-A (SEQ ID NO:47). Additional description regarding capsids that cross the blood brain barrier is provided by Chan et al., Nat. Neurosci. 2017 August; 20(8):1172-1179.
[0079] Various embodiments provide the inclusion of a degradation signal in a system, wherein the degradation signal is a peptide sequence, known as a degron, which mediates rapid ubiquitination and subsequent proteasomal degradation of a nearby protein or a protein that the degron is embedded in. In some embodiments, the degradation signal, or degron, is included with the N-terminal segment (split) of an intein, i.e., N-intein. In some embodiments, the degron is included with the C-terminal segment (split) of the intein, i.e., C-intein. In some embodiments, the degron is included with both the N-intein and the C-intein.
[0080] In further embodiments, the degron is in an individual expression construct or vector, separate from the expression construct(s) or vector(s) that contain C-intein or N-intein.
[0081] In some embodiments, a degron is included with (or attached to) just one of the C-intein or the N-intein, and not with both the C-intein and the N-intein. In some embodiments, when a degron is attached to the N-intein, the degron is attached to the C-terminal end of the N-intein, and the configuration from N- to C-end of this half of the split intein fusion is: N-terminal fragment of a sodium channel protein (e.g., N-terminal fragment of a sodium channel alpha subunit) - N-intein - degron. In some embodiments, when a degron is included with (or attached to) the C-intein, the degron is inserted into the C-intein a few residues (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 residues) upstream of (i.e., N-terminal to) the fragment of the sodium channel protein; that is, the configuration from N to C is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal end containing fragment of the sodium channel protein. The first portion of C-intein and the second portion of the C-intein, when operably connected, form a C-intein.
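The N-to-C configurations recited above can be written out as ordered component lists, which makes the two degron placements easy to compare side by side. The following is a minimal sketch only; the function names `n_half_layout` and `c_half_layout` and the component labels are hypothetical conveniences for illustration, not terms defined by this disclosure, and no sequences are implied.

```python
from typing import List


def n_half_layout(with_degron: bool) -> List[str]:
    """Ordered N-to-C components of the N-terminal half of the split intein fusion."""
    parts = ["N-terminal fragment of sodium channel alpha subunit", "N-intein"]
    if with_degron:
        # Degron attached to the C-terminal end of the N-intein.
        parts.append("degron")
    return parts


def c_half_layout(with_degron: bool) -> List[str]:
    """Ordered N-to-C components of the C-terminal half of the split intein fusion."""
    channel_fragment = "C-terminal fragment of sodium channel alpha subunit"
    if with_degron:
        # Degron inserted within the C-intein, a few residues upstream of the fragment.
        return ["C-intein (first portion)", "degron",
                "C-intein (second portion)", channel_fragment]
    return ["C-intein", channel_fragment]


# Print the degron-on-N-intein arrangement, then the degron-in-C-intein arrangement.
for n_degron, c_degron in [(True, False), (False, True)]:
    print(" - ".join(n_half_layout(n_degron)))
    print(" - ".join(c_half_layout(c_degron)))
    print()
```

Representing each half as a list keeps the order explicit, so that a change in degron placement shows up as a change in list position.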
[0082] Some embodiments provide that the degradation signal, or degron, is integrated at the 5’ end of the N-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the N-terminal of the excised intein. Some embodiments provide that the degradation signal, or degron, is integrated at the 3’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the C-terminal of the excised intein. Other embodiments provide that the degradation signal, or degron, is integrated at the 3’ end of the N-intein and/or the 5’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be in the middle of the excised intein.
[0083] Preferably, the degron encoded in the system is one having no more than 30 amino acid residues. In some embodiments, the degron encoded in the system is one being 5-27 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 5-10 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 11-20 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 21-26 amino-acid residues in length.
[0084] In various embodiments, the system includes polynucleotides encoding a short degron (e.g., no more than 30 amino-acid residues in length) and an intein having fast splicing rates (e.g., half-lives below 5 min) such as consensus fast intein (Cfa); and the system does not include polynucleotides encoding a degron that is more than 30 amino-acid residues in length or polynucleotides encoding segments that form an intein with a splicing rate slower than Cfa or half-lives greater than 5 min. In various embodiments, the system is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at a faster speed or higher efficiency than degron-mediated degradation of intein, N-intein and/or C-intein. In various embodiments, the system having a polynucleotide sequence encoding the degron is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at an amount or yield that is at least 100%, 95%, 90%, 85%, or 80% compared to that reconstituted in a system without a polynucleotide sequence encoding the degron. In various embodiments, the system having a polynucleotide sequence encoding the degron is effective for reducing the amount of intein (e.g., free intein after target protein splicing) by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%.
[0085] In some embodiments, a degron is from the class II trans-activator (CIITA). In some embodiments, the CIITA degron is a 26 amino acid-long peptide, having an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91). In some embodiments, the CIITA degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:91. In some embodiments, the CIITA degron is an N-terminal degron. That is, preferably, when the CIITA degron is attached to the N-intein, the CIITA degron is at the C-terminal of the N-intein, and the configuration from N to C is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CIITA degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit. In other embodiments, the CIITA degron is inserted within the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
[0086] In some embodiments, a degron is derived from the ornithine decarboxylase 1 (ODC1). In some embodiments, the ODC1 degron is a 23 amino acid-long peptide, having an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:88. In other embodiments, the ODC1 degron is a 37 amino acid-long peptide, having an amino acid sequence of FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:89. In yet other embodiments, the ODC1 degron is a 7 amino acid-long peptide, having an amino acid sequence of MSCAQES (SEQ ID NO:90). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:90. It has been determined that parts of some degron sequences do not participate in binding with ubiquitin ligase; hence, a shorter fragment of a long degron sequence that is involved in binding with ubiquitin ligase may be used to mediate degradation. For example, an active (ligase-binding) fragment of the ODC1 degron consists of an amino acid sequence of MSCAQES. In some embodiments, one or more ODC1 degrons are each a C-terminal degron. That is, preferably, the ODC1 degron is attached to or inserted into the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
[0087] In some embodiments, a degron is the peptide CL1 or a variant thereof. In some embodiments, the CL1 degron is a 16 amino acid-long peptide, having an amino acid sequence of ACKNWFSSLSHFVIHL (SEQ ID NO:92). In some embodiments, the CL1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:92. In some embodiments, the CL1 degron or a variant thereof (“CL degron”) is attached to the N-intein and at the C-terminal of the N-intein, i.e., as a C-terminal tail of N-intein. A configuration from N- to C-terminus is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CL degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit. Description of the CL degron is further provided by Gilon et al., The EMBO Journal, Vol. 17, No. 10, pp. 2759-2766, 1998.
[0088] Variants of peptide CL1 include CL2, CL6, CL9, CL10, CL11, CL12, CL15, CL16, and SL17, whose sequences are summarized below, as well as those having at least 95% or 90% sequence identity thereto:
CL2 SLISLPLPTRVKFSSLLLIRIMKIITMTFPKKLRS (SEQ ID NO:94)
CL6 FYYPIWFARVLLVHYQ (SEQ ID NO: 95)
CL9 SNPFSSLFGASLLIDSVSLKSNWDTSSSSCLISFFSSVMFSSTTRS (SEQ ID NO: 96)
CL10 CRQRFSCHLTASYPQSTVTPFLAFLRRDFFFLRHNSSAD (SEQ ID NO:97)
CL11 GAPHVVLFDFELRITNPLSHIQSVSLQITLIFCSLPSLILSKFLQV (SEQ ID NO:98)
CL12 NTPLFSKSFSTTCGVAKKTLLLAQISSLFFLLLSSNIAV (SEQ ID NO:99)
CL15 PTVKNSPKIFCLSSSPYLAFNLEYLSLRIFSTLSKCSNTLLTSLS (SEQ ID NO: 100)
CL16 SNQLKRLWLWLLEVRSFDRTLRRPWIHLPS (SEQ ID NO: 101)
SL17 SISFVIRSHASIRMGASNDFFHKLYFTKCLTSVILSKFLIHLLLRSTPRV (SEQ ID NO: 102)
[0089] In some embodiments, a degron is a short DEG1 with a hydrophobic end. In some embodiments, the DEG1 degron is a 9 amino acid-long peptide, having an amino acid sequence of GSLIIFIIL (SEQ ID NO:93). In some embodiments, the DEG1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:93. In some embodiments, the DEG1 degron is a C-terminal degron. That is, preferably, the DEG1 degron is inserted in the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
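The stated lengths of the degron peptides recited above can be cross-checked with a short script. The following is a minimal sketch in Python, not part of the disclosed system; the dictionary labels and the function name `degron_report` are illustrative assumptions, while the sequences, stated lengths, and the 30-residue preference are taken from the text.

```python
# Degron peptides as recited in the text (SEQ ID NOs 88-93).
# The dictionary keys are illustrative labels, not official names.
DEGRONS = {
    "ODC1 (SEQ ID NO:88)": "MSCAQESITSLYKKAGSENLYFQ",
    "ODC1 (SEQ ID NO:89)": "FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV",
    "ODC1 (SEQ ID NO:90)": "MSCAQES",
    "CIITA (SEQ ID NO:91)": "RPGSTSPFAPSATDLPSMPEPALTSR",
    "CL1 (SEQ ID NO:92)": "ACKNWFSSLSHFVIHL",
    "DEG1 (SEQ ID NO:93)": "GSLIIFIIL",
}

# Peptide lengths as stated in the text for each degron.
STATED_LENGTHS = {
    "ODC1 (SEQ ID NO:88)": 23,
    "ODC1 (SEQ ID NO:89)": 37,
    "ODC1 (SEQ ID NO:90)": 7,
    "CIITA (SEQ ID NO:91)": 26,
    "CL1 (SEQ ID NO:92)": 16,
    "DEG1 (SEQ ID NO:93)": 9,
}

# Preferred upper bound on degron length stated in the text.
MAX_PREFERRED_LENGTH = 30


def degron_report(degrons, max_len=MAX_PREFERRED_LENGTH):
    """Return {label: (length, within_preferred_limit)} for each degron."""
    return {
        label: (len(seq), len(seq) <= max_len)
        for label, seq in degrons.items()
    }


if __name__ == "__main__":
    for label, (length, within) in degron_report(DEGRONS).items():
        matches = length == STATED_LENGTHS[label]
        print(f"{label}: {length} aa, matches stated length: {matches}, "
              f"within {MAX_PREFERRED_LENGTH}-aa preference: {within}")
```

Under these inputs, every recited length agrees with its sequence, and only the 37-residue ODC1 peptide (SEQ ID NO:89) exceeds the preferred 30-residue limit.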
[0090] Various embodiments provide systems to express a coding sequence of a sodium channel alpha subunit (e.g., Nav1.1 α subunit, or Nav1.1 for short unless otherwise noted) for reconstitution of the sodium channel alpha subunit. In various embodiments, the system expresses a coding sequence of Nav1.1, which is gene SCN1A. In some embodiments, the systems express the coding sequence of a sodium channel alpha subunit selected from Nav1.1 through Nav1.9, and correspondingly encoded by genes SCN1A through SCN11A. In other embodiments, the systems express SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, or SCN7A, for reconstitution of Nav1.2, Nav1.3, Nav1.4, Nav1.5, Nav1.6, Nav1.7, Nav1.8, Nav1.9, or Nax, respectively.
[0091] In some embodiments, a system is provided to express a coding sequence of Nav1.1 alpha subunit, wherein the coding sequence of Nav1.1 alpha subunit is or comprises the polynucleotide sequence of SCN1A, and the system includes: a first expression construct including a first portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein) at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A; and a second expression construct including a second portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A; wherein protein products of the first and the second portions of the polynucleotide sequence of the SCN1A are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the Nav1.1; and wherein the first and/or second expression construct further comprises a polynucleotide sequence encoding a degron, located, if within the first expression construct, at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A, and, if within the second expression construct, at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A, or the system includes a third expression construct encoding the degron.
[0092] In some embodiments, a composition or a system is provided for reconstitution of Nav1.1 alpha subunit, which comprises: a. a first polynucleotide encoding a polypeptide comprising an N-fragment of a split intein, wherein the N-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the N-terminal fragment of Nav1.1 alpha subunit; b. a second polynucleotide encoding a polypeptide comprising a C-fragment of the split intein, wherein the C-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the C-terminal fragment of the Nav1.1 alpha subunit; and c. a third polynucleotide encoding a degron; wherein the polynucleotides of the composition may be packed together in a single formulation or separately in different formulations, wherein the first and the second polynucleotides encode the N- and the C-terminal fragments of the Nav1.1 alpha subunit, respectively, so that when both fragments are spliced together, the N-terminal fragment is linked to the C-terminal fragment, generating whole Nav1.1 alpha subunit.
In some embodiments, the composition is further characterized in that: the split intein N-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein N-fragment via the C-terminus of the intein, with or without a linker between the intein N-fragment and the degron, and wherein the N-terminus of the split intein N-fragment is directly linked via a peptide bond to the N-terminal fragment of the Nav1.1 alpha subunit; and/or the split intein C-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein C-fragment via the N-terminus of the intein, with or without a linker between the intein C-fragment and the degron, and wherein the C-terminus of the split intein C-fragment is directly linked via a peptide bond to the C-terminal fragment of the Nav1.1 alpha subunit.
[0093] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein - a polynucleotide sequence encoding the degron; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A. In further embodiments of this system, the polynucleotide sequence encoding the degron is one that encodes RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO:92) or a variant having a sequence identity of at least 90% or 95% thereto.
[0094] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A.
[0095] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the C-intein - a second portion of the polynucleotide sequence of the SCN1A. In further embodiments of this system, the polynucleotide sequence encoding the degron is one encoding MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93), or a sequence having at least 90% or 95% sequence identity thereto.
[0096] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the C-intein - a polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence of the SCN1A.
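The four paired-construct layouts of paragraphs [0093] through [0096] can be written down as ordered element lists to see where the degron would sit relative to the intein halves. The following Python sketch is a bookkeeping illustration only; the element labels and the functions `excised_intein` and `degron_position` are hypothetical, and the sketch assumes, as paragraph [0082] states, that the degron is excised together with the intein.

```python
# Element order (5' -> 3') of the first and second expression constructs
# for each layout. Labels such as "SCN1A_N" are illustrative only.
LAYOUTS = {
    "[0093]": (["SCN1A_N", "N-intein", "degron"], ["C-intein", "SCN1A_C"]),
    "[0094]": (["SCN1A_N", "degron", "N-intein"], ["C-intein", "SCN1A_C"]),
    "[0095]": (["SCN1A_N", "N-intein"], ["degron", "C-intein", "SCN1A_C"]),
    "[0096]": (["SCN1A_N", "N-intein"], ["C-intein", "degron", "SCN1A_C"]),
}


def excised_intein(first, second):
    """Return the non-SCN1A elements in N-to-C order.

    Simplifying assumption: everything between the two SCN1A portions
    leaves with the intein once the SCN1A portions are joined.
    """
    return [e for e in first + second if not e.startswith("SCN1A")]


def degron_position(first, second):
    """Classify the degron as N-terminal, middle, or C-terminal."""
    parts = excised_intein(first, second)
    index = parts.index("degron")
    if index == 0:
        return "N-terminal"
    if index == len(parts) - 1:
        return "C-terminal"
    return "middle"


if __name__ == "__main__":
    for paragraph, (first, second) in LAYOUTS.items():
        print(paragraph, degron_position(first, second))
```

Under this simplification, the layouts of [0093] and [0095] place the degron in the middle of the excised intein, [0094] at its N-terminal, and [0096] at its C-terminal, which corresponds to the three placements described in paragraph [0082].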
[0097] In some embodiments, the degron has an amino acid sequence selected from the group consisting of: MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO:92), and GSLIIFIIL (SEQ ID NO:93), and variants having at least 90% or 95% sequence identity to any of the above.
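To give a sense of what the 90% and 95% identity thresholds permit for peptides of these lengths, the following Python sketch computes an ungapped percent identity and the largest number of substitutions that still meets a threshold. It is an illustration under stated assumptions: the text does not specify an alignment method, so a simple position-by-position comparison of equal-length peptides is assumed here, and the function names and the example variant are hypothetical.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Ungapped percent identity between two equal-length peptides.

    Assumption: no gaps or alignment step; real comparisons would
    normally use a sequence alignment tool.
    """
    if len(reference) != len(variant):
        raise ValueError("this simplified measure needs equal lengths")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)


def max_substitutions(length: int, threshold_pct: int) -> int:
    """Largest substitution count that keeps identity >= threshold_pct."""
    min_matches = -(-threshold_pct * length // 100)  # ceiling division
    return length - min_matches


if __name__ == "__main__":
    ciita = "RPGSTSPFAPSATDLPSMPEPALTSR"          # SEQ ID NO:91 (from the text)
    hypothetical = "KPGSTSPFAPSATDLPSMPEPALTSR"   # made-up single substitution
    print(round(percent_identity(ciita, hypothetical), 2))
    for name, length in [("CIITA", 26), ("CL1", 16), ("DEG1", 9), ("MSCAQES", 7)]:
        print(name, max_substitutions(length, 90), max_substitutions(length, 95))
```

Under this ungapped measure, a 26-residue degron tolerates two substitutions at the 90% threshold and one at 95%, whereas a 7- or 9-residue degron tolerates none at either threshold, since a single change already lowers identity below 90%.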
[0098] Alternative sites are provided for splitting human SCN1A to make AAV-sized halves. In some aspects, the protein is split at (right before, i.e., at the N-terminus end before) breakpoint Cys1050, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1. In various implementations, an endogenous cysteine residue is required to make half joining scarless and reconstitute a full-length unmutated protein. Alternative split intein breakpoints are at either Cys957 or Cys948, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1. These breakpoints would permit a better AAV size and packaging efficiency of the N-terminal half for better expression than that seen with hSCN1A-CO-Nterm1049 when using the Cys1050 breakpoint. However, they would place the intein junctions in the extracellular/lumenal space. The N-terminal junction sequence at the front of an intein is termed the -1 position; and the +1 position after the intein sequence usually has a Cys, Ser, or Thr residue.
[0099] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-1049 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 1050-1998 of the Nav1.1, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-1049 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 1050-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.
[0100] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-956 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 957-1998, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-956 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 957-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.
[0101] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-947 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 948-1998, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-947 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 948-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.
[0102] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO:59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO:60).
[0103] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO:61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO:62).
[0104] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO:63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO:64).
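The numbers in the fragment names follow from the breakpoint positions by simple arithmetic on the 1998-residue numbering used above. The following Python sketch shows the calculation; the function name `split_lengths` is an illustrative assumption, while the breakpoints and the full length are those recited in the text.

```python
FULL_LENGTH = 1998               # last residue number recited in the text
BREAKPOINTS = (1050, 957, 948)   # Cys1050, Cys957, Cys948


def split_lengths(breakpoint: int, full_length: int = FULL_LENGTH):
    """Return (N-fragment length, C-fragment length) in residues.

    The split is made right before the breakpoint residue, so the
    breakpoint residue is the first residue of the C-terminal fragment.
    """
    n_length = breakpoint - 1
    c_length = full_length - breakpoint + 1
    return n_length, c_length


if __name__ == "__main__":
    for bp in BREAKPOINTS:
        n_length, c_length = split_lengths(bp)
        print(f"Cys{bp}: Nterm{n_length} + Cterm{c_length}")
```

The results, 1049 and 949, 956 and 1042, and 947 and 1051 residues, agree with the Nterm and Cterm suffixes of the fragment names in SEQ ID NOs 59 through 64.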
[0105] In some embodiments, the intein includes a Cfa intein, an Ssp intein, a gp41-1 intein, an IMPDH-1 intein, an NrdJ-1 intein, a gp41-8 intein, or an Npu intein. In various aspects, the split intein comprises consensus fast intein (Cfa). We conceive that with the Cfa, the split intein trans-splicing reaction will occur first, before the binding of ubiquitin ligase to degron and subsequent degradation. In some aspects, the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58). In other embodiments, the intein is functionally similar to a Cfa intein. Herein, functionally similar to a Cfa intein means that the expression construct includes a variant of a Cfa intein, yet still results in construction of a functional protein (e.g., voltage-gated sodium channel). In the event that trans-splicing efficiency is reduced in the presence of degron, we conceive that targeting a specific cell type gives flexibility to deliver higher doses without side effect, which may also compensate for the attenuation in the trans-splicing efficiency.
[0106] In some embodiments, a mature protein Nav1.1 is expressed by splitting the coding sequence into three fragments and putting the N-terminal portion of the coding sequence with a first N-intein into a first artificial expression construct, putting the middle portion of the coding sequence with a first C-intein and second N-intein into a second artificial expression construct, and putting the C-terminal portion of the coding sequence with a second C-intein into a third artificial expression construct, wherein the first N-intein and first C-intein specifically splice together to form an intein, and wherein the second N-intein and second C-intein specifically splice together to form an intein. In further embodiments, at least one, or two or all three fragments of the coding sequence includes a degron. Preferably, the first intein and the second intein are different, so that the first N-intein and the second C-intein do not splice together, and the second N-intein and the first C-intein do not splice together. That is, preferably, the first N-intein and first C-intein, and the second N-intein and second C-intein, are two mutually orthogonal split inteins. Exemplary mutually orthogonal split inteins are described at least by Pinto et al. in Nature Communications (2020) 11:1529. A method of using this system includes administering the first, second, and third artificial expression constructs to a cell. Similarly, mature proteins can be formed from several fragments using the appropriate number of inteins.
[0107] In various aspects, the first expression construct and the second expression construct independently further comprise a promoter sequence selected from a minBglobin promoter, an hSyn1 promoter, or a CMV promoter; optionally, the hSyn1 promoter comprises a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO:54.
[0108] In various aspects, the first expression construct, the second expression construct, or both independently further comprise an enhancer sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a targeted central nervous system cell type.
[0109] In some aspects, the enhancer sequence comprises a polynucleotide sequence of DLX2.0 (SEQ ID NO:2). In some aspects, the enhancer sequence has a concatemerized core of an I56i enhancer, optionally as set forth in SEQ ID NO:1. These artificial enhancer elements provide more rapid onset of transgene expression compared to a single full-length original (native) enhancer. In some aspects, the targeted central nervous system cell type comprises a GABAergic neuron, preferably a GABAergic interneuron, or more preferably a telencephalic GABAergic interneuron. In other embodiments, the enhancer sequence is a 527 bp enhancer sequence (referred to as mI56i or mDlx) from the intergenic interval between the distal-less homeobox 5 and 6 genes (Dlx5/6), which are naturally expressed by forebrain GABAergic interneurons during embryonic development. Further description of enhancer sequences, such as those for selectively modulating gene expression in interneurons, is provided in US20210348195, which is incorporated by reference.
[0110] In some aspects, the enhancer sequence comprises a polynucleotide sequence of eHGT_078h (SEQ ID NO:55). In some aspects, the targeted central nervous system cell type comprises a forebrain glutamatergic neuron.
[0111] In some aspects, the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a selected central nervous system cell type.
[0112] In some aspects, the miRNA binding site sequence comprises a polynucleotide sequence of 4x2C miRNA binding site (SEQ ID NO:56), and the selected central nervous system cell type comprises a pan-GABAergic neuron.
[0113] In particular embodiments, an enhancer is used to drive gene expression in a targeted central nervous system cell population. Particular embodiments of the artificial expression constructs utilize the following enhancers to drive gene expression within targeted central nervous system cell populations as follows (enhancer / targeted cell population): DLX2.0 / forebrain GABAergic; hSyn1 with 4x2C or 8x2C miR binding site / pan-GABAergic neurons; eHGT_078h / forebrain glutamatergic neurons. In particular embodiments, the artificial expression construct can include a shortened promoter or a minimal promoter. In particular embodiments, the shortened promoter includes the hSyn1 promoter (shortened). In particular embodiments, the minimal promoter includes minBglobin.
[0114] Particular embodiments provide artificial expression construct pairs including the features of vectors described herein including vectors: CN3252 and CN3254, CN3683 and CN3684, CN3251 and CN3253, CN3677 and CN3678, CN4541 and CN4542, CN4217 and CN4218, or CN4642 and CN4643, as described in Tables below.
[0115] Various embodiments further provide an artificial expression construct containing a first expression construct as disclosed herein. Further embodiments also provide an artificial expression construct containing a second expression construct as disclosed herein. Preferably, the artificial expression construct is within an adeno-associated viral (AAV) vector. The artificial expression construct can also include other regulatory elements if necessary or beneficial. Examples of regulatory elements utilized within artificial expression constructs disclosed herein include DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and eHGT_078h.
[0116] In particular embodiments, the artificial expression constructs are expressed in all neurons. In particular embodiments, the artificial expression constructs include an hSyn1 promoter and are expressed in neurons. In particular embodiments, the artificial expression constructs are expressed in all cell lines. In particular embodiments, the artificial expression constructs include a CMV promoter and are expressed in cell lines.
[0117] Various embodiments provide an administrable composition, which includes any one or more systems disclosed herein to express the coding sequence of and reconstitute Nav1.1. Additional embodiments provide an administrable composition, which includes either one or both of an artificial expression construct containing the first expression construct, and an artificial expression construct containing the second expression construct.
[0118] A transgenic cell is also provided, comprising any one or more systems disclosed herein. In some aspects, the transgenic cell is a GABAergic neuron or a glutamatergic neuron or a cell line of GABAergic neuron or glutamatergic neuron.
[0119] Additional embodiments provide a method of rescuing voltage-gated sodium channel function in a population of cells, and the method includes co-administering a therapeutically effective amount of a first expression construct and a therapeutically effective amount of a second expression construct, as disclosed herein, to a sample or subject comprising the population of cells, and inducing expression of the first expression construct and the second expression construct to reconstitute Nav1.1, thereby rescuing voltage-gated sodium channel function in the population of cells.
[0120] Preferably, the method is for rescuing voltage-gated sodium channel function in a targeted population of cells. In some embodiments, the methods involve a targeted central nervous system cell type enhancer, which is uniquely or predominantly utilized by the targeted central nervous system cell type. A targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system cell type. In certain embodiments, a targeted central nervous system cell type enhancer is also one that enhances expression of a gene in the targeted central nervous system cell type and does not substantially direct expression of genes in other non-targeted cell types, thus having cell type specific transcriptional activity.
[0121] In some embodiments of the methods, the subject has an SCN1A-related seizure disorder comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
[0122] In some embodiments of the methods, the subject is a pediatric patient having Dravet syndrome. In some embodiments, the subject is a pediatric human. In some embodiments, the subject is an infant (1 year old or younger). In some embodiments, the subject is a young child, e.g., between 1 and 10 years old. In some embodiments, the subject is a teenager. In various embodiments, the human subject is age 1 day through 5 months, 6 months through 4 years, 5 years through 11 years, or 12 years through 17 years.
[0123] In particular embodiments, artificial expression constructs can deliver SCN1A as several fragments of SCN1A delivered by several artificial expression constructs. For example, SCN1A can be delivered in a first artificial expression construct including a first portion of the SCN1A coding sequence and a second artificial expression construct including a second portion of the SCN1A coding sequence. In particular embodiments, the first portion of the SCN1A coding sequence is the N-terminal portion of the coding sequence and the second portion of the SCN1A coding sequence is the C-terminal portion of the coding sequence.
[0124] The sodium channel alpha subunit coding sequence can be split into an N-terminal portion and C-terminal portion at any point, or preferably at a breakpoint wherein the first amino acid residue encoded downstream of the breakpoint is Cys, Ser, or Thr, such that upon intein fusion, a functional sodium channel alpha subunit molecule is expressed. In particular embodiments, an N-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Nterm1049 (SEQ ID NO:59), hSCN1A-CO-Nterm956 (SEQ ID NO:61), or hSCN1A-CO-Nterm947 (SEQ ID NO:63). In particular embodiments, a C-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Cterm949 (SEQ ID NO:60), hSCN1A-CO-Cterm1042 (SEQ ID NO:62), or hSCN1A-CO-Cterm1051 (SEQ ID NO:64).
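The preferred breakpoint rule, that the first residue downstream of the breakpoint is Cys, Ser, or Thr, can be expressed as a simple scan over a protein sequence. The following Python sketch is an illustration only; the function name `candidate_breakpoints` is hypothetical, and the example peptide is a made-up toy sequence, not a Nav1.1 sequence.

```python
def candidate_breakpoints(protein: str, allowed: str = "CST") -> list[int]:
    """Return 1-based positions that could start the C-terminal portion.

    A position qualifies when its residue is Cys (C), Ser (S), or Thr (T).
    Position 1 is skipped because splitting there would leave an empty
    N-terminal portion.
    """
    return [
        position
        for position, residue in enumerate(protein, start=1)
        if position > 1 and residue in allowed
    ]


if __name__ == "__main__":
    toy_peptide = "MACDESTK"  # hypothetical toy sequence for illustration
    print(candidate_breakpoints(toy_peptide))
    # Restricting to cysteine only, as for a scarless Cys junction:
    print(candidate_breakpoints(toy_peptide, allowed="C"))
```

Applied to a real protein sequence, such a scan would only list residues satisfying the Cys/Ser/Thr rule; choosing among them would still depend on considerations discussed elsewhere in the text, such as fragment size for AAV packaging and the location of the junction.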
[0125] Exemplary reporter genes/proteins include those expressed by Addgene ID#s 83894 (pAAV-hDlx-Flex-dTomato-Fishell_7), 83895 (pAAV-hDlx-Flex-GFP-Fishell_6), 83896 (pAAV-hDlx-GiDREADD-dTomato-Fishell-5), 83898 (pAAV-mDlx-ChR2-mCherry-Fishell-3), 83899 (pAAV-mDlx-GCaMP6f-Fishell-2), 83900 (pAAV-mDlx-GFP-Fishell-1), and 89897 (pcDNA3-FLAG-mTET2 (N500)). Exemplary reporter genes particularly can include those which encode an expressible fluorescent protein, or expressible biotin; blue fluorescent proteins (e.g., eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire); cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan, mTurquoise); green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, EGFP, Emerald, Azami Green, Monomeric Azami Green (mAzamigreen), CopGFP, AceGFP, avGFP, ZsGreen1, Oregon Green™ (Thermo Fisher Scientific)); Luciferase; orange fluorescent proteins (mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato, dTomato); red fluorescent proteins (mKate, mKate2, mPlum, DsRed monomer, mCherry, mRuby, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, Jred, Texas Red™ (Thermo Fisher Scientific)); far red fluorescent proteins (e.g., mPlum and mNeptune); yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, SYFP2, Venus, YPet, PhiYFP, ZsYellow1); and tandem conjugates.
[0126] In particular embodiments, artificial expression constructs can include DNA and RNA editing tools such as CRISPR/Cas (e.g., guide RNA and a nuclease, such as Cas, Cas9 or Cpf1). Functional molecules can also include engineered Cpf1s such as those described in US 2018/0030425, US 2016/0208243, WO/2017/184768 and Zetsche et al. (2015) Cell 163: 759-771; single gRNA (see e.g., Jinek et al. (2012) Science 337:816-821; Jinek et al. (2013) eLife 2:e00471; Segal (2013) eLife 2:e00563) or editase, guide RNA molecules, microRNA, or homologous recombination donor cassettes.
[0127] In particular embodiments, artificial expression constructs can include tag cassettes. A tag cassette includes His tag (HHHHHH; SEQ ID NO:34), Flag tag (DYKDDDDK; SEQ ID NO:35), Xpress tag (DLYDDDDK; SEQ ID NO:36), Avi tag (GLNDIFEAQKIEWHE; SEQ ID NO:37), Calmodulin tag (KRRWKKNFIAVSAANRFKKISSSGAL; SEQ ID NO:38), Polyglutamate tag, HA tag (YPYDVPDYA; SEQ ID NO:39), Myc tag (EQKLISEEDL; SEQ ID NO:40), Strep tag (which refers to the original STREP® tag (WRHPQFGG; SEQ ID NO:41) or STREP® tag II (WSHPQFEK; SEQ ID NO:42) (IBA Institut für Bioanalytik, Germany); see, e.g., US 7,981,632), Softag 1 (SLAELLNAGLGGS; SEQ ID NO:43), Softag 3 (TQDPSRVG; SEQ ID NO:44), and V5 tag (GKPIPNPLLGLDST; SEQ ID NO:45). In particular embodiments, a tag cassette includes a fusion of tag cassettes such as 3XFLAG. In particular embodiments, 3XFLAG includes the sequence set forth in SEQ ID NO:15.
[0128] In particular embodiments, the artificial expression constructs include an internal ribosome entry site (IRES) sequence. See, for example, Figure 1B. An IRES allows ribosomes to initiate translation at a second internal site on an mRNA molecule, leading to production of two proteins from one mRNA. In particular embodiments, IRES includes IRES2. In particular embodiments, IRES2 allows for a second protein open reading frame (ORF) to be translated from the same transcript. This is unlike the 2A sequence, which allows for a single ORF to be cleaved into two proteins, with similar efficiencies of production.
[0129] Coding sequences encoding molecules (e.g., RNA, proteins) described herein can be obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule. The term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.
[0130] The term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, insulators, and/or post-regulatory elements, such as termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.
[0131] Promoters can include general promoters, tissue-specific promoters, cell-specific promoters, and/or promoters specific for the cytoplasm. Promoters may include strong promoters, weak promoters, constitutive expression promoters, and/or inducible promoters. Inducible promoters direct expression in response to certain conditions, signals or cellular events. For example, the promoter may be an inducible promoter that requires a particular ligand, small molecule, transcription factor or hormone protein in order to effect transcription from the promoter. Particular examples of promoters include minBglobin (also referred to as minBGprom), CMV promoter, hSyn1 promoter, hSyn1 promoter (shortened), minCMV, minCMV* (minCMV* is minCMV with a SacI restriction site removed), minRho, minRho* (minRho* is minRho with a SacI restriction site removed), SV40 immediate early promoter, the Hsp68 minimal promoter (proHSP68), and the Rous Sarcoma Virus (RSV) long-terminal repeat (LTR) promoter. Minimal promoters have no activity to drive gene expression on their own but can be activated to drive gene expression when linked to a proximal enhancer element.
[0132] In particular embodiments, expression constructs are provided within vectors. The term vector refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct. The transferred nucleic acid is generally linked to, e.g., inserted into, the vector nucleic acid molecule. A vector may include sequences that direct autonomous replication in a cell or may include sequences that permit integration into host cell DNA. Useful vectors include, for example, plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.
[0133] Adeno-Associated Virus (AAV) is a parvovirus, discovered as a contamination of adenoviral stocks. It is a ubiquitous virus (antibodies are present in 85% of the US human population) that has not been linked to any disease. It is also classified as a dependovirus, because its replication is dependent on the presence of a helper virus, such as adenovirus. Various serotypes have been isolated, of which AAV-2 is the best characterized. AAV has a single-stranded linear DNA that is encapsidated into capsid proteins VP1, VP2 and VP3 to form an icosahedral virion of 20 to 24 nm in diameter.
[0134] The AAV DNA is 4.7 kilobases long. It contains two open reading frames and is flanked by two ITRs. There are two major genes in the AAV genome: rep and cap. The rep gene codes for proteins responsible for viral replication, whereas cap codes for capsid proteins VP1-3. Each ITR forms a T-shaped hairpin structure. These terminal repeats are the only essential cis components of the AAV for chromosomal integration. Therefore, the AAV can be used as a vector with all viral coding sequences removed and replaced by the cassette of genes for delivery. Three AAV viral promoters have been identified and named p5, p19, and p40, according to their map position. Transcription from p5 and p19 results in production of rep proteins, and transcription from p40 produces the capsid proteins.
[0135] AAVs stand out for use within the current disclosure because of their superb safety profile and because their capsids and genomes can be tailored to allow expression in targeted cell populations. scAAV refers to a self-complementary AAV. pAAV refers to a plasmid adeno-associated virus. rAAV refers to a recombinant adeno-associated virus. pSMART-HCKan is a high copy number vector with a kanamycin resistance marker for efficient blunt cloning of unstable sequences.
[0136] Other viral vectors may also be employed. For example, vectors derived from viruses such as vaccinia virus, polioviruses and herpes viruses may be employed. They offer several attractive features for various mammalian cells.
[0137] Elements directing the efficient termination and polyadenylation of a heterologous nucleic acid transcript can increase heterologous gene expression. Transcription termination signals are generally found downstream of the polyadenylation signal. In particular embodiments, vectors include a polyadenylation signal 3' of a polynucleotide encoding a molecule (e.g., protein) to be expressed. The term "poly(A) site" or "poly(A) sequence" denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript by RNA polymerase II. Polyadenylation sequences can promote mRNA stability by addition of a poly(A) tail to the 3' end of the coding sequence and thus, contribute to increased translational efficiency. Particular embodiments may utilize BGHpA, hGHpA, SV40pA, or shortPolyA. In particular embodiments, a preferred embodiment of an expression construct includes a terminator element. These elements can serve to enhance transcript levels and to minimize read through from the construct into other plasmid sequences.
[0138] Particular embodiments of vectors include:
Figure imgf000039_0001
Figure imgf000040_0001
[0139] Subcomponent sequences within the larger vector sequences can be readily identified by one of ordinary skill in the art and based on the contents of the current disclosure. Nucleotides between identifiable and enumerated subcomponents reflect restriction enzyme recognition sites used in assembly (cloning) of the constructs, and in some cases, additional nucleotides do not convey any identifiable function. These segments of complete vector sequences can be adjusted based on use of different cloning strategies and/or vectors. In general, short 6-nucleotide palindromic sequences reflect vector construction artifacts that are not important to vector function.
[0140] In particular embodiments, vectors (e.g., AAV) with capsids that cross the blood-brain barrier (BBB) are selected. In particular embodiments, vectors are modified to include capsids that cross the BBB. Examples of AAV with viral capsids that cross the blood-brain barrier include AAV9 (Gombash et al., Front Mol Neurosci. 2014; 7:81), AAVrh.10 (Yang, et al., Mol Ther. 2014; 22(7): 1299-1309), AAV1R6, AAV1R7 (Albright et al., Mol Ther. 2018; 26(2): 510), rAAVrh.8 (Yang, et al., supra), AAV-BR1 (Marchio et al., EMBO Mol Med. 2016; 8(6): 592), AAV-PHP.S (Chan et al., Nat Neurosci. 2017; 20(8): 1172), AAV-PHP.B (Deverman et al., Nat Biotechnol. 2016; 34(2): 204), AAV-PPS (Chen et al., Nat Med. 2009; 15: 1215), and PHP.eB. In particular embodiments, the PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO: 46) are changed to S-DGTLAVPFK-A (SEQ ID NO: 47). In particular embodiments, PHP.eB refers to SEQ ID NO: 30. Further description of capsids that cross the BBB is provided in US20210348195, which is incorporated by reference.
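The residue-586 change described above can be pictured as a simple string substitution. The following Python sketch is illustrative only and makes several assumptions: the function name `substitute_motif` is hypothetical, the toy sequence uses placeholder "X" residues rather than the real AAV9 capsid sequence (which is not reproduced here), and only the two motifs stated in the text (SEQ ID NO: 46 and SEQ ID NO: 47) are taken from the disclosure.

```python
REFERENCE_MOTIF = "SAQA"           # S-AQ-A, SEQ ID NO: 46
REPLACEMENT_MOTIF = "SDGTLAVPFKA"  # S-DGTLAVPFK-A, SEQ ID NO: 47


def substitute_motif(capsid, start_residue=586):
    """Swap the reference motif for the replacement motif.

    start_residue is the 1-based position of the leading S in the
    reference sequence. Raises ValueError if the motif is not there.
    """
    i = start_residue - 1  # convert to a 0-based index
    found = capsid[i:i + len(REFERENCE_MOTIF)]
    if found != REFERENCE_MOTIF:
        raise ValueError(f"expected {REFERENCE_MOTIF} at {start_residue}, found {found!r}")
    return capsid[:i] + REPLACEMENT_MOTIF + capsid[i + len(REFERENCE_MOTIF):]


# Toy stand-in for a capsid protein: 585 placeholder residues, the motif,
# then 10 more placeholders. Not a real sequence.
toy = "X" * 585 + REFERENCE_MOTIF + "X" * 10
modified = substitute_motif(toy)
print(len(toy), len(modified))
```

Because the replacement motif is seven residues longer than the reference motif, every residue downstream of the change shifts by seven positions in the modified sequence.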
[0141] AAV9 is a naturally occurring AAV serotype that, unlike many other naturally occurring serotypes, can cross the BBB following intravenous injection. It transduces large sections of the central nervous system (CNS), thus permitting minimally invasive treatments (Naso et al., BioDrugs. 2017; 31(4): 317), for example, as described in relation to clinical trials for the treatment of spinal muscular atrophy (SMA) syndrome by AveXis (AVXS-101, NCT03505099) and the treatment of CLN3 gene-Related Neuronal Ceroid-Lipofuscinosis (NCT03770572).
[0142] AAVrh.10 was originally isolated from rhesus macaques and shows low seropositivity in humans when compared with other common serotypes used for gene delivery applications (Selot et al., Front Pharmacol. 2017; 8: 441) and has been evaluated in clinical trials LYS-SAF302, LYSOGENE, and NCT03612869.
[0143] AAV1R6 and AAV1R7, two variants isolated from a library of chimeric AAV vectors (AAV1 capsid domains swapped into AAVrh.10), retain the ability to cross the BBB and transduce the CNS while showing significantly reduced hepatic and vascular endothelial transduction.
[0144] rAAVrh.8, also isolated from rhesus macaques, shows a global transduction of glial and neuronal cell types in regions of clinical importance following peripheral administration and also displays reduced peripheral tissue tropism compared to other vectors.
[0145] AAV-BR1 is an AAV2 variant displaying the NRGTEWD (SEQ ID NO:48) epitope that was isolated during in vivo screening of a random AAV display peptide library. It shows high specificity accompanied by high transgene expression in the brain with minimal off-target affinity (including for the liver) (Korbelin et al., EMBO Mol Med. 2016; 8(6): 609).
[0146] AAV-PHP.S (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence QAVRTSL (SEQ ID NO:49), transduces neurons in the enteric nervous system, and strongly transduces peripheral sensory afferents entering the spinal cord and brain stem.
[0147] AAV-PHP.B (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence TLAVPFK (SEQ ID NO:50). It transfers genes throughout the CNS with higher efficiency than AAV9 and transduces the majority of astrocytes and neurons across multiple CNS regions.
[0148] AAV-PPS, an AAV2 variant created by insertion of the DSPAHPS (SEQ ID NO:51) epitope into the capsid of AAV2, shows a dramatically improved brain tropism relative to AAV2.
[0149] For additional information regarding capsids that cross the blood brain barrier, see Chan et al., Nat. Neurosci. 2017 Aug: 20(8): 1172-1179.
[0150] In particular embodiments, a capsid that results in transduction of targeted cell types in a primate following administration (e.g., i.v. administration) is chosen. In particular embodiments, a capsid that results in widespread transduction of tissue and cell types impacted by the loss of Scn1a following administration is chosen. In particular embodiments, targeted cell types are neurons. In particular embodiments, neurons include GABAergic neurons or glutamatergic neurons. In particular embodiments, GABAergic neurons include pan-GABAergic neurons, forebrain GABAergic neurons, hippocampal GABAergic neurons, or cortical GABAergic neurons. In particular embodiments, glutamatergic neurons include forebrain glutamatergic neurons.
[0151] Artificial expression constructs and vectors that result in rescue of voltage-gated sodium channel function of the present disclosure (referred to herein as physiologically active components) can be formulated with a carrier or more than one carrier that is suitable for administration to a cell, tissue slice, animal (e.g., mouse, non-human primate), or human. Physiologically active components within compositions described herein can be prepared in neutral forms, as freebases, or as pharmacologically acceptable salts.
[0152] Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino groups of the protein) and which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like.
[0153] Carriers of physiologically active components can include solvents, dispersion media, vehicles, coatings, diluents, isotonic and absorption delaying agents, buffers, solutions, suspensions, colloids, and the like. The use of such carriers for physiologically active components is well known in the art. Except insofar as any conventional media or agent is incompatible with the physiologically active components, it can be used with compositions as described herein.
[0154] The phrase "pharmaceutically-acceptable carriers" refers to carriers that do not produce an allergic or similar untoward reaction when administered to a human, and in particular embodiments, when administered intravenously.
[0155] In particular embodiments, compositions can be formulated for intravenous, intraparenchymal, intraocular, intravitreal, parenteral, subcutaneous, intracerebro-ventricular, intramuscular, intrathecal, intraspinal, intraperitoneal, oral or nasal inhalation, or by direct injection in or application to one or more cells, tissues, or organs.
[0156] Compositions may include liposomes, lipids, lipid complexes, microspheres, microparticles, nanospheres, and/or nanoparticles.
[0157] The formation and use of liposomes is generally known to those of skill in the art. Liposomes have been developed with improved serum stability and circulation half-times (see, for instance, U.S. Pat. No. 5,741,516). Further, various methods of liposome and liposome like preparations as potential drug carriers have been described (see, for instance U.S. Pat. Nos. 5,567,434; 5,552,157; 5,565,213; 5,738,868; and 5,795,587).
[0158] The disclosure also provides for pharmaceutically acceptable nanocapsule formulations of the physiologically active components. Nanocapsules can generally entrap compounds in a stable and reproducible way (Quintanar-Guerrero et al., Drug Dev Ind Pharm 24(12): 1113-1128, 1998; Quintanar-Guerrero et al., Pharm Res. 15(7): 1056-1062, 1998; Quintanar-Guerrero et al., J. Microencapsul. 15(1):107-119, 1998; Douglas et al., Crit Rev Ther Drug Carrier Syst 3(3):233-261, 1987). To avoid side effects due to intracellular polymeric overloading, such ultrafine particles can be designed using polymers able to be degraded in vivo. Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use in the present disclosure. Such particles can be easily made, as described in Couvreur et al., J Pharm Sci 69(2): 199-202, 1980; Couvreur et al., Crit Rev Ther Drug Carrier Syst. 5(1):1-20, 1988; zur Muhlen et al., Eur J Pharm Biopharm, 45(2): 149-155, 1998; Zambaux et al., J Control Release 50(1-3):31-40, 1998; and U.S. Pat. No. 5,145,684.
[0159] Injectable compositions can include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (U.S. Pat. No. 5,466,468). For delivery via injection, the form is sterile and fluid to the extent that it can be delivered by syringe. In particular embodiments, it is stable under the conditions of manufacture and storage, and optionally contains one or more preservative compounds against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion, and/or by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and/or antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In various embodiments, the preparation will include an isotonic agent(s), for example, sugar(s) or sodium chloride. Prolonged absorption of the injectable compositions can be accomplished by including in the compositions agents that delay absorption, for example, aluminum monostearate and gelatin. Injectable compositions can be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose.
[0160] Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. As indicated, under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms.
[0161] Sterile compositions can be prepared by incorporating the physiologically active component in an appropriate amount of a solvent with other optional ingredients (e.g., as enumerated above), followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized physiologically active components into a sterile vehicle that contains the basic dispersion medium and the required other ingredients (e.g., from those enumerated above). In the case of sterile powders for the preparation of sterile injectable solutions, preferred methods of preparation can be vacuum-drying and freeze-drying techniques which yield a powder of the physiologically active components plus any additional desired ingredient from a previously sterile-filtered solution thereof.
[0162] Oral compositions may be in liquid form, for example, as solutions, syrups or suspensions, or may be presented as a drug product for reconstitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinized maize starch, polyvinyl pyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). Tablets may be coated by methods well-known in the art.
[0163] Inhalable compositions can be delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.
[0164] Compositions can also include microchip devices (U.S. Pat. No. 5,797,898), ophthalmic formulations (Bourlais et al., Prog Retin Eye Res, 17(1):33-58, 1998), transdermal matrices (U.S. Pat. No. 5,770,219 and U.S. Pat. No. 5,783,208) and feedback-controlled delivery (U.S. Pat. No. 5,697,899).
[0165] Supplementary active ingredients can also be incorporated into the compositions.
[0166] Typically, compositions can include at least 0.1% of the physiologically active components or more, although the percentage of the physiologically active components may, of course, be varied and may conveniently be between 1 or 2% and 70% or 80% or more or 0.5-99% of the weight or volume of the total composition. Naturally, the amount of physiologically active components in each physiologically-useful composition may be prepared in such a way that a suitable dosage will be obtained in any given unit dose of the compound. Factors such as solubility, bioavailability, biological half-life, route of administration, product shelf life, as well as other pharmacological considerations will be contemplated by one skilled in the art of preparing such pharmaceutical formulations, and as such, a variety of compositions and dosages may be desirable.
[0167] In particular embodiments, for administration to humans, compositions should meet sterility, pyrogenicity, and the general safety and purity standards as required by United States Food and Drug Administration (FDA) or other applicable regulatory agencies in other countries.
[0168] The present disclosure includes cells including an artificial expression construct described herein. A cell that has been transformed with an artificial expression construct can be used for many purposes, including in neuroanatomical studies, assessments of functioning and/or non-functioning proteins, and drug screens that assess the regulatory properties of enhancers.
[0169] A variety of host cell lines can be used, but in particular embodiments, the cell is a mammalian cell. In particular embodiments, the artificial expression construct includes a regulatory element and/or a vector sequence of DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site and/or eHGT_078h and/or CN3252, CN3254, CN3683, CN3684, CN3251, CN3253, CN3677, CN3678, CN4541, CN4542, CN4217, CN4218, CN4642, or CN4643, and the cell line is a human, primate, or murine cell. Cell lines which can be utilized for transgenesis in the present disclosure also include primary cell lines derived from living tissue such as rat or mouse brains and organotypic cell cultures, including brain slices from animals such as rats, mice, non-human primates, or human neurosurgical tissue.
[0170] WO 91/13150 describes a variety of cell lines, including neuronal cell lines, and methods of producing them. Similarly, WO 97/39117 describes a neuronal cell line and methods of producing such cell lines. The neuronal cell lines disclosed in these patent applications are applicable for use in the present disclosure.
[0171] In particular embodiments, "neuronal" describes something that is of, related to, or includes, neuronal cells. Neuronal cells are defined by the presence of an axon and dendrites.
[0172] The term "neuronal-specific" refers to something that is found, or an activity that occurs, in neuronal cells or cells derived from neuronal cells, but that is not found or does not occur, or is not substantially found or does not substantially occur, in non-neuronal cells or cells not derived from neuronal cells, for example glial cells such as astrocytes or oligodendrocytes.
[0173] In particular embodiments, non-neuronal cell lines may be used, including mouse embryonic stem cells. Cultured mouse embryonic stem cells can be used to analyze expression of genetic constructs using transient transfection with plasmid constructs. Mouse embryonic stem cells are pluripotent and undifferentiated. These cells can be maintained in this undifferentiated state by Leukemia Inhibitory Factor (LIF). Withdrawal of LIF induces differentiation of the embryonic stem cells. In culture, the stem cells form a variety of differentiated cell types. Differentiation is caused by the expression of tissue specific transcription factors, allowing the function of an enhancer sequence to be evaluated. (See for example Fiskerstrand et al., FEBS Lett 458: 171-174, 1999).
[0174] Methods to differentiate stem cells into neuronal cells include replacing a stem cell culture media with a media including basic fibroblast growth factor (bFGF), heparin, an N2 supplement (e.g., transferrin, insulin, progesterone, putrescine, and selenite), laminin and polyornithine. A process to produce myelinating oligodendrocytes from stem cells is described in Hu, et al., 2009, Nat. Protoc. 4: 1614-22. Bibel, et al., 2007, Nat. Protoc. 2: 1034-43 describes a protocol to produce glutamatergic neurons from stem cells while Chatzi, et al., 2009, Exp. Neurol. 217:407-16 describes a procedure to produce GABAergic neurons. This procedure includes exposing stem cells to all-trans-RA for three days.
[0175] U.S. Publication No. 2012/0329714 describes use of prolactin to increase neural stem cell numbers while U.S. Publication No. 2012/0308530 describes a culture surface with amino groups that promotes neuronal differentiation into neurons, astrocytes and oligodendrocytes. Thus, the fate of neural stem cells can be controlled by a variety of extracellular factors. Commonly used factors include brain derived growth factor (BDNF; Shetty and Turner, 1998, J. Neurobiol. 35:395-425); fibroblast growth factor (bFGF; U.S. Pat. No. 5,766,948; FGF-1, FGF-2); Neurotrophin-3 (NT-3) and Neurotrophin-4 (NT-4; Caldwell, et al., 2001, Nat. Biotechnol. 1;19:475-9); ciliary neurotrophic factor (CNTF); BMP-2 (U.S. Pat. Nos. 5,948,428 and 6,001,654); isobutyl 3-methylxanthine; leukemia inhibitory growth factor (LIF; U.S. Patent No. 6,103,530); somatostatin; amphiregulin; neurotrophins (e.g., cyclic adenosine monophosphate; epidermal growth factor (EGF); dexamethasone (glucocorticoid hormone); forskolin; GDNF family receptor ligands; potassium; retinoic acid (U.S. Patent No. 6,395,546); tetanus toxin; and transforming growth factor-α and TGF-β (U.S. Pat. Nos. 5,851,832 and 5,753,506).
[0176] In particular embodiments, yeast one-hybrid systems may also be used to identify compounds that inhibit specific protein/DNA interactions, such as transcription factors for DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and/or eHGT_078h.
[0177] Methods are also provided for administering a system of expression constructs to a subject in need thereof, which include administering a therapeutically effective amount of a system disclosed herein to a sample or subject comprising a targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
[0178] In various embodiments, the subject in need thereof has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
[0179] In various embodiments, the system of expression constructs is administered to a human subject or mammalian subject. In some embodiments, the system of expression constructs is administered to cells or tissue cultures obtained from, or derived from, a human subject or a mammalian subject.
EXAMPLES
[0180] The following examples are provided to better illustrate the claimed invention and are not to be interpreted as limiting the scope of the invention. To the extent that specific materials are mentioned, it is merely for purposes of illustration and is not intended to limit the invention. One skilled in the art may develop equivalent means or reactants without the exercise of inventive capacity and without departing from the scope of the invention.
[0181] AAV-MEDIATED INTERNEURON-SPECIFIC GENE REPLACEMENT FOR DRAVET SYNDROME
[0182] Dravet syndrome (DS) is a devastating developmental epileptic encephalopathy marked by treatment-resistant seizures, developmental delay, intellectual disability, motor deficits, and a 10-20% rate of premature death. Most DS patients harbor loss-of-function mutations in one copy of SCN1A, which has been associated with inhibitory neuron dysfunction. Here we developed an interneuron-targeting AAV human SCN1A gene replacement therapy using cell class-specific enhancers. We generated a split-intein fusion form of SCN1A to circumvent AAV packaging limitations and deliver SCN1A via a dual vector approach using cell class-specific enhancers. These constructs produced full-length NaV1.1 protein and functional sodium channels in HEK293 cells and in brain cells in vivo. After packaging these vectors into enhancer-AAVs and administering to mice, immunohistochemical analyses showed telencephalic GABAergic interneuron-specific and dose-dependent transgene biodistribution.
[0183] These vectors conferred strong dose-dependent protection against postnatal mortality and seizures in two DS mouse models carrying independent loss-of-function alleles of Scn1a, at two independent research sites, supporting the robustness of this approach. No mortality or toxicity was observed in wild-type mice injected with single vectors expressing either the N-terminal or C-terminal halves of SCN1A, or the dual vector system targeting interneurons. In contrast, nonselective neuronal targeting of SCN1A conferred less rescue against mortality and presented substantial preweaning lethality. These findings demonstrate that interneuron-specific AAV-mediated SCN1A gene replacement is sufficient for significant rescue in DS mouse models and show that it could be an effective therapeutic approach for patients with DS.
[0184] Dravet syndrome (DS) is a severe early-onset epileptic encephalopathy marked by spontaneous and febrile seizures, motor disabilities, cognitive dysfunction, developmental delay, and heightened risk of premature death by sudden unexpected death in epilepsy (SUDEP). DS afflicts approximately 1:16,000 births, usually manifests in the first year of life, and produces profound symptoms that require life-long care. Most first-line anti-epileptic drugs are ineffective or contraindicated for DS, although several recently approved drugs now partially ameliorate DS symptoms. Critically, no FDA-approved long-term disease-modifying treatments currently exist for DS despite extensive efforts. As a result, a treatment for DS is a pressing unmet need for patients and their caregivers.
[0185] Several aspects of DS pathophysiology have become clear. Over 80% of patients harbor monoallelic loss-of-function mutations in SCN1A, which encodes NaV1.1, one of the voltage-gated sodium channels expressed in brain. Mouse models with monoallelic Scn1a disruptions recapitulate the major clinical presentations of DS, confirming genetic causality in DS. These mouse models indicate DS is associated with loss of excitability in telencephalic fast-spiking interneurons, consistent with inhibition stimulation studies in patients.
[0186] Furthermore, targeted monoallelic Scn1a disruption in interneurons is sufficient for epileptic symptomology, which can be ameliorated by simultaneous disruption in excitatory neurons. Overall, these results indicate interneurons are the critical pathological cell population in DS.
[0187] Herein, we utilize enhancer-adeno-associated viruses (AAVs) to allow for interneuron-specific expression of a DS therapeutic transgene. To overcome the genome size constraint (~4.7 kb) of AAVs, which precludes delivery of the human SCN1A open reading frame (ORF, 6.0 kb) in a single AAV, we split the SCN1A ORF into two fragments (two “halves”) that undergo intein-mediated ligation to reconstitute a scarless, full-length functional voltage-gated sodium channel NaV1.1.
[0188] In this study, we demonstrate functional rescue in DS mice by restoring SCN1A to telencephalic GABAergic interneurons using AAV vectors. With this split-intein mechanism and class-specific enhancers, we demonstrate interneuron-specific delivery and reconstitution of NaV1.1, which completely rescues mortality in DS model mice and confers strong resistance to epileptic seizures. Importantly, we show that expression of the transgenes or individual SCN1A halves in WT mice does not cause any overt toxicity. In contrast, we also find that nonselective neuronal expression of SCN1A generates early lethal toxicity. Together these data indicate that telencephalic GABAergic interneuron-specific expression of SCN1A could provide a safe and effective therapeutic for DS.
[0189] Split-intein fusion constructs produce full-length functional NaV1.1 channels.
[0190] The open reading frame for human SCN1A (6.0 kb) is larger than the packaging limit of recombinant AAVs (~4.7 kb); this has thus far prevented an AAV gene replacement therapy for Dravet syndrome. To overcome this challenge, we divided the gene into two halves and used split-intein protein splicing to reconstitute the gene product (NaV1.1 channel) after translation. We designed and built DNA constructs using this approach for a dual vector gene replacement therapy for Dravet syndrome in mice.
[0191] We developed split-intein DNA constructs using the sequence of the predominantly expressed 1998-amino-acid SCN1A isoform (Fig. 8A-8E), which was codon optimized to maximize expression. We placed the split-intein breakpoint directly upstream of Cys1050 since intein-mediated protein ligation requires the presence of a cysteine residue adjacent to the split site. We utilized the Cfa split-intein, which was engineered for rapid activity and chemical stability. The N- and C-terminals of the Cfa split-intein were respectively fused to the N- and C-terminals of halves of the split SCN1A gene (Fig. 1A). To the recombinant split-intein SCN1A halves, we added optimized short intron sequences within each coding sequence for improved expression, as well as HA and FLAG epitope tags for immunodetection (Fig. 1A, 1B). Exemplary short intron sequences added are highlighted in italics or underlined in ‘hSCN1A-CO-Nterm1049-Intron,’ ‘hSCN1A-CO-Cterm949-Intron,’ ‘hSCN1A-CO-Nterm956-Intron,’ ‘hSCN1A-CO-Cterm1042-Intron,’ ‘hSCN1A-CO-Nterm947-Intron,’ and ‘hSCN1A-CO-Cterm1051-Intron’; or has a polynucleotide sequence of
GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGG ATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA CCTCTTATCTTCCTCCCACAG (SEQ ID NO: 107).
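As a quick consistency check on the construct design described above, the following minimal sketch (not part of the disclosed methods) verifies that each named pair of N- and C-terminal fragments accounts for all 1998 residues, and that the exemplary intron of SEQ ID NO: 107 begins with GT and ends with AG, the canonical splice-site dinucleotides. Two points are assumptions: that the number in each construct name is the fragment length in residues, and that the ~4.7 kb limit is compared against coding sequence alone (enhancer, intein, tag, and intron sequences are ignored). All function names are hypothetical.

```python
# Sanity checks on the split-intein SCN1A design (illustrative sketch only).

FULL_LENGTH_AA = 1998   # predominant SCN1A isoform, per the text
AAV_LIMIT_NT = 4700     # approximate AAV packaging limit, per the text

# Assumption: the number in each construct name is the fragment length in residues.
SPLIT_DESIGNS = {
    "Nterm1049/Cterm949": (1049, 949),   # breakpoint directly upstream of Cys1050
    "Nterm956/Cterm1042": (956, 1042),
    "Nterm947/Cterm1051": (947, 1051),
}

# SEQ ID NO: 107 as printed in the text; line-wrap whitespace is removed.
INTRON = "".join("""
GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGG
ATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA
CCTCTTATCTTCCTCCCACAG
""".split())


def coding_nt(n_residues: int) -> int:
    """Nucleotides needed to encode n_residues (stop codon not counted)."""
    return 3 * n_residues


def check_design(n_len: int, c_len: int) -> bool:
    """True if the halves cover the full protein and each coding half fits the limit."""
    covers = n_len + c_len == FULL_LENGTH_AA
    fits = coding_nt(n_len) < AAV_LIMIT_NT and coding_nt(c_len) < AAV_LIMIT_NT
    return covers and fits


def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    return intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    print("full-length coding sequence (nt):", coding_nt(FULL_LENGTH_AA))
    for name, (n_len, c_len) in SPLIT_DESIGNS.items():
        print(name, coding_nt(n_len), coding_nt(c_len), check_design(n_len, c_len))
    print("intron length (nt):", len(INTRON), has_canonical_splice_sites(INTRON))
```

The full-length coding sequence (5994 nt for 1998 residues) exceeds the packaging limit, while each half on its own does not, which is the motivation for the dual-vector design.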
[0192] To confirm intein-mediated reconstitution of full-length NaV1.1 channels, the SCN1A gene halves were expressed in HEK-293 cells using the CMV promoter (Fig. 1B, 1C). By western blot analysis, cells transfected with HA-tagged full-length SCN1A produced a single band with an apparent mass of approximately 250 kDa (predicted mass 232 kDa, Fig. 1C Lane 2). Those transfected with the HA-tagged SCN1A-Ntm half produced a band with an apparent mass of approximately 135 kDa (predicted mass 134 kDa, Fig. 1C Lane 4). However, cells transfected with both SCN1A-Ntm and SCN1A-Ctm halves demonstrated a strong band at approximately 250 kDa and only a weak band at approximately 135 kDa (Fig. 1C Lane 6). These results demonstrate that our constructs led to efficient intein-directed ligation of NaV1.1 protein halves.
[0193] To examine whether reconstituted NaV1.1 protein derived from split-intein SCN1A halves was functional, we assessed HEK-293 cells expressing these constructs by whole-cell patch clamp electrophysiology. Fluorescent protein reporters appended to each SCN1A expression construct were used to identify transfected cells and confirm expression (Fig. 1B). To promote NaV1.1 channel cell surface expression, constructs were co-transfected with NaV1.1 β-subunits (SCN1B/SCN2B). We observed that cells expressing full-length SCN1A (SCN1A-FL) constructs show rapid and large inward sodium currents in response to depolarizing voltage steps (Fig. 1D). Cells expressing both SCN1A-Ntm and SCN1A-Ctm (SCN1A-N+C) also showed large inward sodium currents, but not cells expressing either half alone or empty vector (Fig. 1D). Quantification revealed significantly greater capacitance-normalized peak current densities in SCN1A-N+C cells (median current density 130.6 pA/pF) than in cells singly expressing either SCN1A-Ntm or SCN1A-Ctm alone or negative controls (SCN1A-N 4.5 pA/pF, p< 0.0001; SCN1A-C 3.3 pA/pF, p< 0.0001; GFP/empty vector 3.7 pA/pF, p= 0.0084; β-subunits alone 2.3 pA/pF, p= 0.0038, all pairwise Mann-Whitney U tests, Fig. 1E). Median peak current density for SCN1A-N+C cells (130.6 pA/pF) was significantly less than that observed in SCN1A-FL cells (232.0 pA/pF) (p= 0.0083, pairwise Mann-Whitney U test), possibly due to less of each DNA construct used for transfections (0.4 µg each of SCN1A-Ntm and SCN1A-Ctm, versus 0.8 µg for SCN1A-FL). Examinations of the records at faster time scales show that both SCN1A-FL and SCN1A-N+C cells expressed similarly rapidly activating and inactivating inward currents in response to step depolarizations (Fig. 1F).
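The capacitance normalization used above can be sketched as follows. This is a minimal illustration, assuming that current density is the magnitude of the peak inward current divided by the cell capacitance; the per-cell values in `EXAMPLE_CELLS` are hypothetical and are not data from this study. Only the two medians in `REPORTED_MEDIANS` are taken from the text.

```python
from statistics import median


def current_density(peak_current_pA: float, capacitance_pF: float) -> float:
    """Capacitance-normalized peak current density in pA/pF (magnitude)."""
    if capacitance_pF <= 0:
        raise ValueError("capacitance must be positive")
    return abs(peak_current_pA) / capacitance_pF


def median_density(cells) -> float:
    """Median current density over (peak current pA, capacitance pF) pairs."""
    return median(current_density(i, c) for i, c in cells)


# Hypothetical recordings: (peak inward current in pA, capacitance in pF).
EXAMPLE_CELLS = [(-2600.0, 20.0), (-1500.0, 15.0), (-3200.0, 16.0)]

# Medians reported in the text (pA/pF).
REPORTED_MEDIANS = {"SCN1A-N+C": 130.6, "SCN1A-FL": 232.0}

if __name__ == "__main__":
    print("hypothetical median density:", median_density(EXAMPLE_CELLS))
    ratio = REPORTED_MEDIANS["SCN1A-N+C"] / REPORTED_MEDIANS["SCN1A-FL"]
    print(f"reported N+C median as a fraction of FL: {ratio:.2f}")
```

Normalizing by capacitance corrects for differences in cell size, so densities can be compared across cells and conditions.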
[0194] To quantify whether the current mediated by split-intein reconstructed channels exhibits the known normal voltage-dependent gating properties of NaV1.1 channels, we analyzed conductance-voltage (G/V) activation and steady-state inactivation (SSI) relationships. Indistinguishable G/V activation plots were exhibited by cells transfected with SCN1A-FL (V50= -27.9 mV, slope= 6.2) or SCN1A-N+C (V50= -27.3 mV, slope= 5.9), similar to previously reported full-length SCN1A (V50= -26.4 mV, slope= 7.1). Steady-state inactivation profiles for SCN1A-FL and SCN1A-N+C also were both similar to that previously reported for full-length SCN1A (V50= -67.5 mV, slope= -6.2). The SSI profile for reconstituted SCN1A-N+C currents (V50= -64.1 mV, slope= -6.0) was observed to be slightly but significantly depolarized by approximately 5 mV relative to SCN1A-FL (V50= -69.4 mV, slope= -6.0) (Fig. 1G). Overall, these results demonstrate that functional NaV1.1 sodium channels are efficiently reconstituted from our two half SCN1A split-intein constructs, with similar voltage-dependent gating properties to full-length SCN1A, and this permits delivery of NaV1.1 channel activity from two AAV-sized vectors.
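To make the V50 and slope values above concrete, the sketch below evaluates a Boltzmann function with the reported parameters. The text does not state which equation was fitted, so the functional form used here is an assumption; it is the common single-Boltzmann form in which a positive slope gives a rising activation curve and a negative slope gives a falling inactivation curve, matching the signs reported above. The function name is hypothetical.

```python
import math


def boltzmann(v_mV: float, v50_mV: float, slope_mV: float) -> float:
    """Normalized value of a single Boltzmann curve at voltage v_mV.

    Assumed form: 1 / (1 + exp((V50 - V) / slope)).
    Positive slope rises with voltage (activation);
    negative slope falls with voltage (steady-state inactivation).
    """
    return 1.0 / (1.0 + math.exp((v50_mV - v_mV) / slope_mV))


# Parameters reported in the text: (V50 in mV, slope in mV).
ACTIVATION = {"SCN1A-FL": (-27.9, 6.2), "SCN1A-N+C": (-27.3, 5.9)}
INACTIVATION = {"SCN1A-FL": (-69.4, -6.0), "SCN1A-N+C": (-64.1, -6.0)}

if __name__ == "__main__":
    for name, (v50, k) in ACTIVATION.items():
        print(name, "fraction activated at -20 mV:", round(boltzmann(-20.0, v50, k), 3))
    for name, (v50, k) in INACTIVATION.items():
        print(name, "fraction available at -60 mV:", round(boltzmann(-60.0, v50, k), 3))
    shift = INACTIVATION["SCN1A-N+C"][0] - INACTIVATION["SCN1A-FL"][0]
    print("SSI V50 shift (mV):", round(shift, 1))
```

By construction the curve equals 0.5 at V50, and the difference between the two reported SSI V50 values is 5.3 mV, consistent with the depolarizing shift of approximately 5 mV described above.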
[0195] DLX2.0-driven intein-SCN1A vectors specifically transduce telencephalic GABAergic interneurons.
[0196] We cloned the split-intein fusion protein halves HA-SCN1A-Ntm and SCN1A-Ctm-FLAG into AAV plasmid vectors under control of an optimized hDLXI5/6i enhancer (“DLX2.0”) that drives transgene expression in telencephalic GABAergic interneurons in mice and in human organotypic slice cultures. These constructs were packaged into research-grade AAV2/PHP.eB viral vectors at PackGene Inc. (Fig. 2A). Neonatal mice were injected at postnatal day 2 (P2) with 5 µL total volume of these vectors delivered bilaterally via the intracerebroventricular (ICV) route at a dose of 3e10 genome copies (gc) of each half. After twenty days, we analyzed mouse motor cortex membrane protein content by western blot to assess the efficiency of half joining by intein-mediated fusion. Mice injected with either DLX2.0-HA-SCN1A-Ntm or DLX2.0-SCN1A-Ctm-FLAG PHP.eB AAVs alone showed weak HA- or FLAG-immunoreactive bands near the expected sizes of their protein products (N-term predicted size 134 kDa, C-term predicted size 115 kDa, Fig. 2B). The weakness of the unjoined SCN1A half-protein bands in our membrane protein fraction is likely a technical artifact of the western blot membrane protein preparation, since half proteins alone are efficiently detected by immunohistochemistry (IHC) with no obvious change in subcellular distribution when codelivered (see Fig. 2C). Regardless, when we co-injected viral vectors for both halves, we observed strong bands near the expected apparent size of intact NaV1.1 near 250 kDa, demonstrating that intein-mediated reconstitution of full-length human NaV1.1 occurs in mouse telencephalic GABAergic interneurons (Fig. 2B).
[0197] We also analyzed the biodistribution of transgene expression by IHC in animals injected with these vectors. We observed strong HA and FLAG immunoreactivity in scattered telencephalic neurons of mice injected with each vector alone, or when injected together (Fig. 2C). Both HA- and FLAG-expressing cells were present throughout the telencephalon after BL-ICV delivery (Fig. 2D). Co-staining with the GABAergic neuron markers GABA and Gad67 demonstrated high overlap of HA- and FLAG-expressing cells (Fig. 2C, 2E). Quantitative analysis of these IHC data showed that 98-99% of cells co-expressing HA and FLAG were Gad67+ in several telencephalic regions (Fig. 2F), indicating high specificity. Additionally, HA+FLAG+ expression was observed in a substantial proportion of the telencephalic Gad67+ GABAergic neuron population in different brain regions, including those known to be involved in seizure generation such as hippocampus and cortex (mean ± standard deviation: VISp 57 ± 18%, MO 57 ± 10%, HPF 31 ± 11%, n= 8 mice, Fig. 2F). Using a separate cohort of Dlx5/6-Cre;Ai14 mice to provide an independent label for telencephalic GABAergic interneurons, we observed similar high levels of specificity and moderate completeness of AAV transduction (Fig. 9A-9C). These data demonstrate that our vectors effectively deliver both the C- and N-terminal split-intein SCN1A halves with high specificity and moderate coverage for telencephalic GABAergic interneurons in mice. Together with our western blot and electrophysiology results above, these findings indicate that upon their expression, the two protein products of SCN1A halves fuse using the split-intein, leading to expression of functional full-length NaV1.1 channels in telencephalic GABAergic interneurons.
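The two IHC metrics used above can be sketched as simple ratios of cell counts. This is an illustration under stated assumptions: "specificity" is taken to mean the fraction of HA+FLAG+ cells that are Gad67+, and "completeness" the fraction of Gad67+ cells that are HA+FLAG+. The counts below are hypothetical and are not data from this study.

```python
def specificity(labeled_and_marker: int, labeled_total: int) -> float:
    """Fraction of transgene-labeled (HA+FLAG+) cells that are marker-positive."""
    if labeled_total <= 0:
        raise ValueError("no labeled cells counted")
    return labeled_and_marker / labeled_total


def completeness(labeled_and_marker: int, marker_total: int) -> float:
    """Fraction of marker-positive (Gad67+) cells that are transgene-labeled."""
    if marker_total <= 0:
        raise ValueError("no marker-positive cells counted")
    return labeled_and_marker / marker_total


if __name__ == "__main__":
    # Hypothetical counts from one region of one animal.
    ha_flag_cells = 100         # all HA+FLAG+ cells
    ha_flag_gad67_cells = 99    # HA+FLAG+ cells that are also Gad67+
    gad67_cells = 174           # all Gad67+ cells
    print(f"specificity:  {specificity(ha_flag_gad67_cells, ha_flag_cells):.0%}")
    print(f"completeness: {completeness(ha_flag_gad67_cells, gad67_cells):.0%}")
```

The two metrics share a numerator but answer different questions: specificity measures off-target expression, while completeness measures how much of the target population was reached.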
[0198] Dual DLX2.0 split-intein SCN1A vectors protect against Sudden Unexpected Death in Epilepsy (SUDEP) in DS mouse models.
[0199] With the ability to deliver SCN1A with cell class specificity, we next tested whether telencephalic GABAergic interneuron-specific SCN1A gene replacement could rescue DS symptoms in mouse models. We used Scn1a+/fl;Meox2-Cre DS mice bred on a pure C57BL/6 background housed at Seattle Children’s Research Institute. These animals demonstrate ~50% mortality due to SUDEP by P70 (young adulthood), similar to other DS model lines, analogous to the ~15% rate of SUDEP in DS patients. We injected a cohort of neonatal DS model mice (BL-ICV at P0-P3 with 3e10 gc each vector of DLX2.0-split-intein-SCN1A produced at PackGene Inc.) and measured mortality and susceptibility to thermally induced seizures in these animals (Fig. 3A). Remarkably, all treated DS mice (n= 27/27) survived beyond P70, as opposed to negative control mice either untreated or receiving control AAVs (empty or single part-only vectors), which showed significantly higher mortality (Untreated n= 34/68, 50% mortality, Log-rank test p= 1.4e-5; Empty/single-part n= 15/40, 37.5% mortality, Log-rank test p= 6.6e-4; Fig. 3B). No side effects were observed in littermate control mice that received dual vector treatment. Strikingly, a subset of the treated DS animals was maintained longer, and 100% survived to beyond P200 (n= 11/11). These findings demonstrate that telencephalic GABAergic interneuron supplementation of AAV-mediated SCN1A transgene is sufficient to completely prevent SUDEP in DS model mice.
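The survival comparisons above are reported as Kaplan-Meier curves with log-rank tests. As a minimal sketch of how such a curve is built, the function below computes the Kaplan-Meier product-limit estimate from death and censoring times. The function name and the four-animal example are hypothetical; the individual death times underlying Fig. 3B are not given in the text, so only the end-point mortality fractions are taken from it.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  age (e.g. postnatal day) at death or at end of follow-up.
    events: True if death was observed at that time, False if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(1 for e in same_time if e)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)
        i += len(same_time)
    return curve


if __name__ == "__main__":
    # Hypothetical cohort: deaths at P21 and P30, two animals censored at P70.
    print(kaplan_meier([21, 30, 70, 70], [True, True, False, False]))
    # End-point mortality fractions reported in the text.
    print("untreated:", 34 / 68, "empty/single-part:", 15 / 40, "treated:", 0 / 27)
```

With no deaths in the treated group, its estimated survival stays at 1.0 for the whole follow-up period, which is what the 27/27 figure expresses.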
[0200] Dual DLX2.0 split-intein SCN1A vectors protect against thermal and spontaneous myoclonic and generalized tonic-clonic seizures in DS mice.
[0201] DS model mice are sensitive to thermally induced seizures, similar to patients with DS. We previously showed that small elevations of core body temperature trigger several myoclonic (MC) seizures, leading ultimately to a generalized tonic-clonic (GTC) seizure in DS mice. To examine the efficacy of our DLX2.0-SCN1A vectors in preventing thermally induced MC and GTC seizures, we analyzed DS mice between P25 and P35 (Fig. 3A). First, video analysis revealed that mice treated with dual DLX2.0-SCN1A AAVs were protected from MC seizures up to 40°C (n= 16/16), at which temperature fewer than half of negative control mice were MC seizure-free (Untreated n= 5/11, Empty/single part AAVs n= 9/23, Fig. 3C). With further increase in temperature, a growing fraction of the treated mice began to exhibit MC seizures, reaching 50% of treated subjects at 42°C (n= 8/16), but at this temperature significantly greater numbers of negative control mice experienced MC seizures (Untreated n= 9/11, Fisher’s exact test p= 0.042; Empty/single part AAVs n= 21/23, Fisher’s exact test p= 0.011, Fig. 3C). Second, we also observed that treatment conferred GTC seizure protection in this assay. All treated DS mice were free from GTC seizures up to 41.5°C (n= 16/16, 100%, Fig. 3D), and the majority were protected from GTC seizures at 42°C, the last stage of the test (n= 15/16, 94%), which was significantly greater than negative control treatment groups (Untreated n= 4/11, Fisher’s exact test p= 0.0025; Empty/single part AAVs n= 7/23, Fisher’s exact test p= 0.00010, Fig. 3D). Finally, assessment of all the MC seizures preceding a GTC seizure showed that the cumulative MC event number before GTC seizure onset was significantly suppressed when DS mice were treated with dual DLX2.0-SCN1A AAVs compared to control AAV treatment or untreated controls (Untreated or Empty/single versus DLX2.0-SCN1A, unpaired t-test p< 0.01 at each temperature from 39.5 to 41, Fig. 3E). These findings demonstrate that DLX2.0-mediated delivery of SCN1A transgene specifically to telencephalic GABAergic interneurons yields substantial protection from thermally induced seizures.
[0202] To assess the effectiveness of DLX2.0-SCN1A AAV vectors to protect against spontaneous epileptic symptoms, we implanted untreated and treated DS model mice with electrodes for electrocorticography (ECoG) recordings. Untreated DS model mice displayed high-amplitude interictal spikes (Fig. 4A), and the frequency of these spikes was significantly diminished by dual DLX2.0-SCN1A AAV treatment (ANOVA p< 0.05, Fig. 4B). We also observed MC events during recordings of untreated DS model mice (n= 10/10, Fig. 4C), which were absent in recordings from treated DS model mice (n= 0/10, p= 7.1e-4 Fisher’s exact test, Fig. 4D). Finally, some untreated DS model mice exhibited spontaneous generalized tonic-clonic (GTC) seizures during recording (n= 2/10, Fig. 4E), but none of the treated animals exhibited GTCs (n= 0/10, Fig. 4F), although this effect was not significant (p= 0.47 Fisher’s exact test). These results indicate that dual DLX2.0-SCN1A AAV treatment can yield protection against spontaneous epileptic symptoms in DS model mice.
[0203] Reproducibility across AAV batches.
[0204] To confirm these findings, we tested independent batches of DLX2.0-split-intein-SCN1A vectors produced in-house at Allen Institute, and we observed dose-dependent specific expression in telencephalic GABAergic interneurons (Fig. 10A), which correlated with dose-dependent increases in survival (Fig. 10C) and thermal seizure protection (Fig. 10D, 10E). These results show reproducibility of epileptic rescue with independent batches of vector, although this in-house-packaged vector showed slightly reduced levels of transduction and rescue as compared to the commercially produced vector.
[0205] Reproducibility across mouse models and testing sites.
[0206] To further confirm the therapeutic effects were robust, we retested the PackGene batch of DLX2.0-SCN1A AAV vectors in a second independent Scn1a+/R613X mouse model cohort on 129/BL6 F1 background housed at the Allen Institute for Brain Science (Fig. 11A). We observed extensive premature lethality by P70 (n= 19/33, 58%) in uninjected mice, but 100% survival in mice injected with the dual DLX2.0-SCN1A AAVs (n= 30, p< 0.001 by Mantel-Cox test, Fig. 11B). After mortality monitoring, some mice underwent ECoG recordings, which revealed that untreated DS model mice demonstrate spontaneous GTC seizures and interictal spikes between P70 and P120 (Fig. 11C), which were both significantly reduced by the administration of DLX2.0-SCN1A vectors (GTCs, Mann-Whitney U test p= 0.020; spikes, Mann-Whitney U test p= 0.033; Fig. 11C-11E). A separate subset of this cohort was monitored for long-term survival, and the dual-vector injected DS model mice exhibited 100% survival to beyond P365 (n= 14/14), with no evidence of toxicity in littermate control animals receiving either or both halves of the DLX2.0-SCN1A dual AAV vector system (Fig. 11B). Thus, DLX2.0-SCN1A AAV vectors conferred strong and reproducible protection from mortality, induced seizures, and spontaneous seizure burden in two independent genetic mouse models of DS, although this protective effect of the AAV treatment is dose- and AAV quality-dependent.
[0207] Dual DLX2.0 vectors completely protect against SUDEP and seizures in mice with telencephalic GABAergic interneuron-specific Scn1a deletion.
[0208] Much more severe mortality has been seen with specific Scn1a loss in interneurons, likely due to a disrupted excitatory/inhibitory balance. To further investigate the effectiveness of the dual vector DLX2.0-SCN1A AAV, we delivered the AAVs to mice carrying the disease-causing mutation in the same telencephalic GABAergic interneuron population. These mice were generated by crossing mice carrying the Dlx5/6-Cre allele with mice carrying floxed Scn1a (Fig. 5A). Since the site of disease pathogenesis precisely matches the treatment target, in this experiment we directly tested the therapeutic effectiveness of the DLX2.0-SCN1A AAV vectors in a more precise rescue scenario. Scn1afl/+;Dlx5/6-Cre mice were injected with dual DLX2.0-SCN1A AAVs via the BL-ICV route at P0-3, and these mice and untreated controls were monitored for premature death up to P70. Untreated mice showed severe mortality starting at postnatal week 3, with all the mice succumbing by week 6 (n= 31/31, Fig. 5B). In striking contrast, all mice treated with dual DLX2.0-SCN1A AAVs survived up to P70 (n= 9/9, p= 3.3e-14, Fisher’s exact test, Fig. 5B), despite the more severe adverse phenotype compared to mice with a global heterozygous loss of Scn1a. Furthermore, none of the treated mice exhibited either MC or GTC seizures during thermal challenge up to 42°C, unlike untreated animals (MCs: n= 8/10 untreated versus 0/9 treated, p= 7.1e-4; GTCs: n= 10/10 untreated versus 0/9 treated, p= 1.1e-5; both Fisher’s exact test; Fig. 5C). These data demonstrate that rescue of DS phenotypes is possible when the therapeutic transgene is precisely delivered to the critical cell populations carrying disease-causing mutations, even in the face of more severe symptoms.
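The Fisher’s exact tests on the thermal seizure counts above can be illustrated with a short pure-Python sketch. It assumes the common two-sided definition, in which the p-value sums the probabilities of all 2x2 tables with the same margins that are no more likely than the observed table; the text does not state which variant or software was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows are groups (e.g. untreated, treated); columns are outcomes
    (e.g. seizure, no seizure).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


if __name__ == "__main__":
    # Thermally induced MC seizures: 8/10 untreated versus 0/9 treated.
    print(f"MC:  p = {fisher_exact_two_sided(8, 2, 0, 9):.1e}")
    # Thermally induced GTC seizures: 10/10 untreated versus 0/9 treated.
    print(f"GTC: p = {fisher_exact_two_sided(10, 0, 0, 9):.1e}")
```

Under this definition the two thermal-challenge comparisons evaluate to approximately 7.1e-4 and 1.1e-5, in agreement with the values reported for Fig. 5C.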
[0209] hSyn1-driven vectors lead to nonselective neuronal expression of SCN1A.
[0210] To determine whether telencephalic GABAergic interneuron-selective targeting is beneficial for gene replacement therapy in DS, as a comparator we produced and tested split-intein vectors driven by hSyn1, which expresses in most brain neuronal populations, including excitatory and inhibitory neurons (Fig. 6A). Constructs for these vectors were built and packaged in the same way as the DLX2.0 ones, except that the hSyn1 promoter was used in this case. Delivered by BL-ICV at P2, these hSyn1 split-intein vectors led to reconstitution of full-length NaV1.1 in mouse brain (Fig. 6B), with expression observed in both Gad67+NeuN+ and Gad67-NeuN+ neurons (Fig. 6C). This expression pattern was observed throughout the telencephalon, with little expression seen in subtelencephalic structures, likely due to the forebrain-biased delivery route (Fig. 6D). Quantification confirmed all labeled cells to be NeuN+ neurons (Fig. 6E-6F). Average levels of completeness for HA+FLAG+ cells ranged from 6-18% NeuN+ cells and 13-33% Gad67+ cells, depending on the telencephalic structure analyzed (Fig. 6F).
[0211] Dual hSyn1 nonselective neuronal AAV vectors led to pre-weaning mortality.
[0212] To characterize the efficacy of nonselective neural SCN1A AAVs to prevent SUDEP, we conducted spontaneous mortality surveillance in Scn1a+/fl;Meox2-Cre DS model mice after BL-ICV injection of dual N+C AAVs at P0-3, as compared to untreated or empty or single part control animals (Fig. 7A). During the preweaning weeks, mice treated with dual nonselective AAVs exhibited a surfeit of unexpected deaths. The extent of pre-weaning mortality by P21 was dose-dependent and significantly greater than that observed under negative control conditions (low dose 1e10 gc each vector, n= 9/36 [25%] deaths, Fisher’s exact test untreated comparison p= 3.6e-4; high dose 3e10 gc each vector, n= 8/18 [44%] deaths, Fisher’s exact test untreated comparison p= 6.6e-6, Fig. 7B). Since it was not possible to recover the lost pups, the census of DS and control mice during the pre-weaning stage was estimated based on the number of DS and control mice identified after genotyping at P21 in this experiment and the number of DS and control mice observed in our colony in untreated mice in the prior 6-month period. This analysis did not reveal any detectable influence of genotype on nonselective hSyn1 SCN1A-induced mortality during the preweaning period (Fig. 7C), and pathology of surviving nonselective SCN1A-expressing brains indicates no obvious microgliosis but greater DS-associated astrogliosis. For the latter, we sacrificed Scn1a+/fl;Meox2-Cre DS model and Cre-negative littermate control mice following mortality monitoring after P2 BL-ICV injections (3e10 gc each vector) of hSyn1-driven N+C SCN1A or DLX2.0-driven N-only or C-only single-part negative controls as indicated. Animal ages ranged from P72-P89, and conditions represent two (littermate control) or three (DS model) animals analyzed per condition. We analyzed brains by IHC to assess astrogliosis (GFAP) or microgliosis (Iba1) in cortex (VISp shown). For DS model mice, two example mice spanning the range of astrogliosis observed are shown. Nonselective neuronal expression of SCN1A exacerbated astrogliosis in DS model mice but did not cause overt changes in microglial appearance.
[0213] In the post-weaning period, we did not observe significant effects on survival or average age of death with either the high-dose or low-dose nonselective SCN1A AAV treatments as compared to the untreated or control AAV-treated mice (Fig. 7B). Together these findings indicate that nonselective neuronal expression of SCN1A offers little protection from SUDEP and, concerningly, has a dose-dependent mortality side effect during the pre-weaning period.
[0214] Dual hSyn1 AAV vectors confer partial protection against thermal MC and GTC seizures.
[0215] To examine whether the nonselective AAV-mediated gene therapy might counter thermal seizure susceptibility in DS mice, we also tested these mice with the thermal seizure induction protocol. In surviving mice treated with the high dose of dual hSyn1 AAVs, as compared to empty/single part negative controls, we observed significant protection from thermally induced MC seizures (Log-rank test p= 0.020, Fig. 7D) and from thermal GTC seizures (Log-rank test p= 0.047, Fig. 7E), as well as greatly lessened cumulative MC seizure load during the thermal challenge (Untreated or Empty/single versus Treated, unpaired t-test p< 0.05 at each temperature from 40 to 41, Fig. 7F). In contrast, we observed no protection from heat-induced MC or GTC seizures in mice treated with the lower dose of the dual AAVs compared to negative control DS mice (Fig. 7C, 7D-7F). Thus, despite the early mortality induced by nonselective neuronal SCN1A AAVs, they can offer protection from thermally induced MC and GTC seizures at the higher dose in surviving treated adults.
[0216] Overall, in this study we use an AAV viral vector system to deliver functional human NaV1.1 to pathological telencephalic GABAergic interneurons in DS model mice, using the optimized enhancer DLX2.0 to achieve high specificity. Importantly, we find not only that delivery to telencephalic GABAergic interneurons is sufficient for strong rescue of epileptic symptoms, but also that this specificity is required to deliver human NaV1.1 in a manner that can be tolerated. DLX2.0-SCN1A AAV vectors achieve long-term recovery of DS mortality (from 200 days to one year) in two independent DS mouse models at two independent testing sites. This robust mortality protection correlates with strong anti-seizure effect, indicating the mechanism behind mortality rescue is seizure reduction due to resupplying NaV1.1 voltage-gated sodium channel activity.
[0217] The cell types sufficient to rescue DS epilepsy using SCN1A gene replacement
[0218] Mouse models of DS indicate that epileptic symptomology is driven by a disruption of the excitatory/inhibitory balance, and congruently, patients display disrupted inhibition in the cortical microcircuit. Our results confirm the hypothesis that DS epilepsy is primarily a disease of the interneurons, and that interneuron targeting is an effective therapeutic strategy. Within this cell class, several subclasses of telencephalic GABAergic interneurons may contribute to the disease. In particular, PVALB-expressing interneurons are thought to promote beneficial gamma rhythms, possibly through their Scn1a-dependent fast-spiking behavior. However, SST-expressing and other telencephalic interneuron populations may also contribute to the epileptic phenotype. Importantly, the DLX2.0 enhancer used in this study targets both PVALB and SST and also other telencephalic interneuron populations, in both mouse and human tissue, which may explain the strong anti-epileptic effect of DLX2.0-SCN1A.
[0219] A unique SCN1A-targeting disease-modifying strategy for DS.
[0220] Drug development to find pathway modulators that can overcome deficient NaV1.1 has been challenging. In preferable embodiments, we do not utilize high-capacity adenoviral vectors (HC-AdVs, also known as Helper-Dependent or “gutless”), which are devoid of all viral coding genes but offer limited biodistribution. Our AAV-mediated and telencephalic GABAergic interneuron-selective gene replacement strategy is unique in several ways. First, our vectors express a new copy of the SCN1A gene at moderate levels, and do not upregulate the endogenous SCN1A allele, a strategy that could be ineffective or harmful for certain disease alleles. Second, the vectors target telencephalic GABAergic interneurons, the most essential cell type, whereas antisense oligonucleotides lack targeting ability. Since SCN1A is expressed in both excitatory and inhibitory neurons, activation of SCN1A in excitatory neurons could attenuate the corrective effect of SCN1A upregulation. Last, using AAV delivery, with the PHP.eB capsid and neonatal ICV injection, widespread viral transduction of telencephalic GABAergic interneurons is achieved. We are not aware of a vector that can give superior widespread transduction to the brain. None of the previous studies delivering or activating NaV1.1 using exogenous agents have demonstrated complete recovery of mortality as we have observed. Moreover, we have demonstrated robust rescue in two genetic models, at two research sites, and even in a severe Dlx5/6-Cre-driven knockout model. Thus, gene replacement, cell type specificity, and broad coverage of telencephalic interneurons provide a unique and highly effective treatment for DS.
[0221] In some embodiments, for human DS patients, the split-intein fused SCN1A halves are delivered in advanced BBB-penetrant AAV capsids. For example, the AAV capsids comprise AAV-PHP.eB, which efficiently transduces the central nervous system. In another example, the AAV capsids comprise AAV-PHP.S, which efficiently transduces the peripheral nervous system. In some embodiments, for human patients, the split-intein fused SCN1A halves delivered in AAV capsids are administered locally with intraparenchymal delivery.
[0222] We have not observed signs of toxicity over one year in mouse brain, indicating the intein fusion proteins delivered with cell type specificity may be safe. Other regulatory elements that confer a broader distribution pattern extending to sub-telencephalic regions may be used to further treat non-epileptic DS. For example, regulatory elements known to be capable of restricting expression to interneurons include those of the distal-less homeobox 5 and 6 (Dlx5/6) genes, which are specifically expressed by all forebrain GABAergic interneurons during embryonic development. These genes have an inverted orientation relative to one another and share a 400 bp (mI56i or mDlx) and a 300 bp (mI56ii) enhancer sequence in the 10 kb non-coding intergenic region 3' to each of them. The high degree of conservation of these sequences across vertebrate species is indicative of an important role in gene regulation. Indeed, the mDlx enhancer can be used to target reporter genes in a pattern very similar to the normal patterns of Dlx5/6 expression during embryonic development, e.g., selectively expressed within GABAergic interneurons in a wide variety of vertebrate species.
[0223] We demonstrate proof-of-concept for AAV-mediated SCN1A gene replacement therapy in DS, which requires cell class specificity for safety and efficacy. These results and these vectors represent an important step towards a gene replacement therapy for DS patients, and possibly for other conditions of pathologically insufficient sodium channel function.
[0224] Materials and Techniques
[0225] Study design.
[0226] In this study, we tested if SCN1A gene replacement was possible using a dual vector system to deliver both halves of the molecule and split-intein technology to fuse the expressed halves into a single scarless full length protein. We tested if SCN1A could be expressed in a circuit-selective fashion using enhancer- AAVs to deliver the transgenes to telencephalic GABAergic interneurons or all neurons. Then, we tested if these circuit-selective enhancer- AAVs were sufficient to correct major epileptic phenotypes of mouse models of DS. We selected three major phenotypes for testing that correspond to clinical phenotypes seen in DS: mortality, heat-induced seizures, and spontaneous seizures. In vitro studies were conducted with multiple biological replicates to demonstrate the full-length NaVl.l functional channels were formed. NaVl.l assembly in GABAergic interneurons was demonstrated in multiple replicates using viruses from two sources in vivo in mouse brains. Circuit specificity and completeness were demonstrated by immunohistochemistry, and full-length NaVl .1 assembly by western blot analysis. Totest in vivo efficacy, litters of mice were treated between P0 andP3 with the dual viruses or single virus controls by ICV administration prior to genotyping. In vivo efficacy was further confirmed at a second site on a genetically distinct DS model. Animal sexes were noted but no differences in efficacy were observed. All experiments were run across multiple litters, and sufficient animals were injected and evaluated to draw statistically significant conclusions. Dual vectors produced from two sources were tested in vivo for efficacy. Researchers were not blinded to the animal genotypes after they were determined, or tovectors delivered. No data were excluded from the study. For statistical analysis, data are reported as means ± standard error of the means, Kaplan-Meier survivor plots, or grouped bar graphs. 
Comparisons across continuously distributed grouped data were prefaced by Shapiro-Wilk tests for distribution normality. When normality assumptions held, the groups were compared by unpaired t-tests or ANOVA; when normality did not hold, the groups were compared by Mann-Whitney U test. To compare Kaplan-Meier survivorship, we used Mantel-Cox or Log-rank tests. To compare groups of categorical count data, we used Fisher’s exact tests. Differences were considered significant at p < 0.05, with Bonferroni adjustments to significance thresholds for comparisons across multiple groups.
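As an illustration of the categorical comparison and the multiple-comparison adjustment described above, the following is a minimal sketch that uses only the Python standard library. The function names, the 2x2 counts, and the number of comparisons are hypothetical and are not taken from the study; the sketch is not the analysis software actually used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables whose probability does not exceed that of the observed table.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_comparisons

# Hypothetical counts: animals with / without a seizure in two groups.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
threshold = bonferroni_threshold(0.05, 3)  # assumes three pairwise comparisons
print(f"p = {p_value:.5f}, threshold = {threshold:.4f}, significant = {p_value < threshold}")
```

Dividing the significance level by the number of comparisons is the simplest form of the Bonferroni adjustment; the source does not state how many comparisons were made in each analysis.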
[0227] Mice at Seattle Children’s Research Institute (SCRI).
[0228] For studies conducted at SCRI, all experimental procedures were conducted in compliance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and were approved by the Institutional Animal Care and Use Committee (IACUC) of the Seattle Children’s Research Institute under protocol ACU000108 (PI-Kalume). WT and mutant mice within litters were used in these experiments. They were subjected to a treatment or control paradigm as described below. Both male and female mice were included in the studies. Each litter was randomly assigned to the treatment or control group.
[0229] Mice at SCRI were maintained in standard cages for laboratory mice, on a 12h:12h light-dark cycle, with ad libitum access to food and water, at 23 degrees C. Mouse models of DS and their littermates used in these studies were generated using Cre-Lox technology. DS mice carrying a whole-body heterozygous knock-out of Scn1a were obtained by breeding floxed Scn1a mice with Meox2-Cre mice (Strain #: 003755; Jackson Laboratories). DS mice carrying an Scn1a KO allele restricted to forebrain GABAergic neurons were generated by breeding floxed Scn1a mice with Dlx5/6-Cre mice (Strain #: 008199; Jackson Laboratories). All breeder mice were maintained on a C57BL/6J background for at least 10 generations. Animals were genotyped for the Scn1a floxed allele using the following primers: FHY311 (5’-CTTGATGTGTTGAAATTCAC-3’ (SEQ ID NO: 103)) and FHY314 (5’-TATAGAGTGTTTAATCTCAAC-3’ (SEQ ID NO: 104)), which yielded an 846-bp WT allele, a 1019-bp floxed allele, and a 258-bp excised allele. For the Cre alleles, the following primers were used: (5’-GGTTTCCCGCAGAACCTGAA-3’ (SEQ ID NO: 105)) and (5’-CCATCGCTCGACCAGTTTAGT-3’ (SEQ ID NO: 106)) (Jackson Laboratories).
[0230] Thermal seizure test.
[0231] Mouse core body temperature was monitored and controlled using a rectal temperature probe (RET4) and a heat lamp, both connected to a temperature controller in a feedback loop (Physitemp Instruments Inc.). Baseline body temperature was measured, and body temperature was then raised gradually by 0.5°C every 2 minutes until a seizure occurred or a temperature of 42°C was attained. Then, the mouse was immediately cooled down using a small fan. Mouse behavior during the whole test was recorded using a digital video camera and reviewed for MC and GTC seizure scoring.
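The heating ramp can be sketched as a simple schedule of target temperatures. This is a minimal sketch only: the function name and the 37.0°C baseline are hypothetical, and the actual temperature controller logic is not described in the source.

```python
def ramp_schedule(baseline_c, step_c=0.5, interval_min=2.0, cutoff_c=42.0):
    """Return (elapsed_minutes, target_temperature_C) pairs for the heating ramp.

    The ramp starts at the measured baseline and rises by step_c every
    interval_min minutes; targets never exceed cutoff_c.
    """
    schedule = [(0.0, baseline_c)]
    elapsed, temp = 0.0, baseline_c
    while temp + step_c <= cutoff_c:
        elapsed += interval_min
        temp += step_c
        schedule.append((elapsed, temp))
    return schedule

# Hypothetical baseline of 37.0 C: ten 0.5 C steps are needed to reach 42 C.
schedule = ramp_schedule(37.0)
print(f"{len(schedule) - 1} steps, final target {schedule[-1][1]} C "
      f"after {schedule[-1][0]} min")
```

In practice the ramp ends early whenever a seizure occurs, so the schedule gives only the maximum duration of a test for a given baseline.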
[0232] Intracerebroventricular injections.
[0233] Neonatal P0-3 mice were cryo-anesthetized on a small aluminum plate placed on ice. Single AAVs or dual AAVs were injected bilaterally into the lateral ventricles using a 33-gauge needle attached to a Hamilton microliter syringe. 2.5 µl of the AAV solution was injected into each ventricle, for a total of 5 µl per mouse containing a total of 1e10 or 3e10 gc of each viral vector. Following the injection, mice were put back into their nest and placed on a warming pad until their body temperature returned to normal. Subsequently, they were returned to the cage with the mother.
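The dose and volume stated above imply a minimum concentration for each vector in the injected solution. The arithmetic can be sketched as follows; the function names are hypothetical, and the sketch assumes that each vector is evenly mixed in the 5 µl injected per mouse.

```python
def required_titer_gc_per_ml(dose_gc_per_vector, total_volume_ul):
    """Concentration (gc/mL) of one vector needed to deliver the dose in the given volume."""
    return dose_gc_per_vector * 1000.0 / total_volume_ul

def dose_per_ventricle_gc(dose_gc_per_vector, per_ventricle_ul, total_volume_ul):
    """Genome copies of one vector delivered into a single ventricle."""
    return dose_gc_per_vector * per_ventricle_ul / total_volume_ul

for dose in (1e10, 3e10):
    titer = required_titer_gc_per_ml(dose, 5.0)
    per_side = dose_per_ventricle_gc(dose, 2.5, 5.0)
    print(f"dose {dose:.0e} gc -> {titer:.1e} gc/mL per vector, {per_side:.1e} gc per ventricle")
```

For the dual-vector condition both vectors share the same 5 µl, so each stock has to be at least twice as concentrated as the figure computed here before the two are mixed in equal volumes.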
[0234] Electrocorticography (ECoG) electrode implantation surgery.
[0235] Mice underwent survival surgery under isoflurane anesthesia to implant ECoG and EMG electrodes. A midline incision was made above the skull to expose the site of electrode implantation. ECoG electrodes consisted of a micro-screw attached to a silver wire (diameter: 130 µm bare; 180 µm coated). EMG electrodes were made of a silver wire shaped in a loop at one end. An ECoG electrode micro-screw was inserted into a small cranial burr hole above the somatosensory cortex in each hemisphere. Similarly, a reference electrode micro-screw was placed in a burr hole above the cerebellum. EMG electrodes were inserted and secured into the neck muscles. All electrodes were attached to an interface connector and the assembly was affixed to the skull with dental cement (Lang Dental Manufacturing Co., Inc., Wheeling, IL, United States). The incision around the electrode implant was closed using sutures. Mice were allowed to recover from surgery for 1-3 days before recording.
[0236] Video-ECoG-EMG recording.
[0237] Simultaneous video-ECoG-EMG records were collected in conscious mice on a PowerLab 8/35 data acquisition unit using LabChart 8.0 software (AD Instruments, Colorado Springs, CO). All bioelectrical signals were acquired at a 1-kHz sampling rate. The ECoG signals were processed with a 1-70 Hz bandpass filter and the EMG signals with a 10-Hz highpass filter. Power-spectral densities of the electrical signals were computed, and video-ECoG-EMG records were inspected for interictal spikes and ictal epileptiform events. Interictal spikes were characterized on ECoG as discharges with an abrupt onset, a sharp contour, and an amplitude greater than twice the background activity. Interictal spikes were frequently followed by a slow wave, but they were not associated with increased EMG activity or movement on video. Conversely, GTC seizure events were marked at their onset by bursts of generalized spikes and waves of increasing amplitude and decreasing frequency on ECoG. They coincided with increased activity on EMG and video and were followed by a distinct period of post-ictal ECoG suppression.
[0238] For studies conducted at the Allen Institute for Brain Science (AIBS), all mice were handled under appropriate institutional protocols and guidelines. Procedures were approved by the Allen Institute Institutional Animal Care and Use Committee under protocols 2002 and 2301. We housed animals on a 14:10 light:dark cycle in ventilated racks with ad libitum access to food (LabDiet 5001) and water, as well as enrichment items consisting of plastic shelters and nesting materials. Young animals were weaned promptly at 21 days of age. We obtained 129S1/SvImJ-Scn[remainder of strain designation rendered as an image in the source: Figure imgf000063_0001]/J (here Scn1a+/R613X) mice from Jackson Laboratory (strain # 034129) and maintained breeders on a 129S1/SvImJ genetic background. To generate experimental animals to model DS, we crossed these animals with C57BL/6J mice (Jackson strain # 000664), which resulted in a 50:50 129:BL/6 F1 genetic background that has been successfully used to model epileptic phenotypes of DS in animals containing one loss-of-function allele of Scn1a. Genotyping was performed on tail biopsies taken at P2, and we utilized PCR-Sanger genotyping services at Transnetyx for this line. For survival monitoring, we checked cages twice daily for the presence of deceased animals.
[0239] Neonatal ICV injection.
[0240] We used the neonatal intracerebroventricular (ICV) injection technique. Briefly, we anesthetized P2 neonates on ice while shielding them from direct ice exposure. During anesthesia, pups were injected freehand bilaterally with 5 µL (2.5 µL each hemisphere) of AAV-containing solution using a Hamilton syringe. AAVs were diluted in sterile PBS to deliver either 1e10 or 3e10 gc of each of the two halves of the dual-vector system encoding split-intein human SCN1A. In control animals, only one half of the dual-vector system was delivered, or the DLX2.0-SYFP2-only empty control vector (CN1390) was delivered, or mice were left untreated. Control vectors were delivered at 3e10 gc per animal with BL-ICV delivery. After injection, pups were gently warmed on a cage warmer set to 28°C with the mother present.
[0241] Continuous video ECoG/EMG recordings.
[0242] We implanted adult mice (P56-90) with an ECoG/EMG headmount. For stereotaxic surgical procedures, we induced anesthesia in mice first with 5% isoflurane in oxygen, and then maintained anesthesia with 1.5-2.5% isoflurane. We implanted screw electrodes with wire lead (0.08”, #8405, Pinnacle Technology Inc., KS, USA) over the left somatosensory cortex (AP: + 1 mm; ML: - 2.5 mm), the right parietal lobe (AP: - 2 mm; ML: + 1.5 mm), the right frontal area (AP: + 1 mm; ML: + 2.5 mm; as Ground), and the cerebellum (AP: - 5.8 to -6.2 mm; ML: 0 mm; as Reference). Electrode leads were soldered onto the 8-pin headmount (#8431-SM, Pinnacle Technology Inc.). The headmount contains two insulated EMG wire electrodes that are pre-soldered, and these EMG electrodes were inserted into the neck muscles. All wires, pins and the headmount were embedded in light curable dental composite resin (Prime-Dent, Prime Dental Manufacturing Inc., Chicago, IL, USA). Mice were singly housed post-surgery and recovered for at least 7 days prior to recording. Recordings were thus acquired between ages P74 and P123. For recordings at AIBS, mice were singly housed in 10-inch clear acrylic chambers (#8228, Pinnacle Technology Inc.) under a 14-hr on, 10-hr off light/dark cycle. Mice were tethered with the pre-amplifier through a commutator to the data acquisition system (#8401-HR, Pinnacle Technology Inc.). All ECoG/EMG data were recorded with a 500 Hz sampling rate, 10X gain, a low pass (ECoG: 0.5 Hz; EMG: 1 Hz) filter, and a high pass (500 Hz) filter. Videos were recorded synchronously at a frame rate of 10 frames/s with a resolution of 640x480 pixels. We implanted a total of 18 non-injected Scn1a+/R613X mice. Of these 18, seven mice (3M+4F) died during recovery prior to recording, and we recorded from the remaining 11 mice (9M+2F). For these non-injected animals we recorded for 53 to 335 hrs, with some recordings prematurely shortened because of death during the recording session. 
We also implanted a total of 11 DLX2.0-intein-SCN1A-injected Scn1a+/R613X mice. Of these 11, we could not record from three mice due to surgical error (1M), hardware failure (1M), and one death following surgery (1F), and we recorded from the remaining 8 mice (4M+4F). For these DLX2.0-intein-SCN1A-injected Scn1a+/R613X mice we recorded for 207-257 hrs.
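Because recording durations differed between animals (53 to 335 hrs in the non-injected group and 207-257 hrs in the injected group), raw seizure counts are comparable only once they are expressed per unit of recording time. A minimal sketch of normalizing a count to a 24-hr rate follows; the function name and the example counts are hypothetical and are not data from the study.

```python
def seizures_per_24h(n_seizures, recording_hours):
    """Average number of seizures per 24 hrs of recording."""
    if recording_hours <= 0:
        raise ValueError("recording_hours must be positive")
    return n_seizures * 24.0 / recording_hours

# Hypothetical animals: (seizure count, hours recorded).
examples = {"animal_A": (10, 240.0), "animal_B": (3, 53.0)}
for name, (count, hours) in examples.items():
    print(f"{name}: {seizures_per_24h(count, hours):.2f} seizures per 24 hrs")
```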
[0243] To quantify ECoG data at AIBS, we counted GTC seizures manually over the recording period and expressed the result as the average number of GTC seizures per 24 hrs of recording. We also counted the number of interictal spikes (IISs) during the last 24 hrs of recording for each mouse. To do so, we identified candidate interictal spike events by cumulative line length over a time interval of 50 msec with a threshold of 50 microvolts, then plotted each candidate event for visual confirmation of spike-like characteristics (high-amplitude strong deflection and return to baseline within 30 msec) and counting. Because these measurements of epileptic activity were non-normally distributed among animals by Shapiro-Wilk tests for normality (GTCs within non-injected Scn1a+/R613X mice: W = 0.81, p = 0.018; IISs within non-injected Scn1a+/R613X mice: W = 0.72, p = 0.0015), we compared the numbers of seizures and IISs between groups by two-sample Mann-Whitney U test (with significance level at p = 0.05).
[0244] IHC.
[0245] Under avertin terminal anesthesia, we perfused mice with ice-cold PBS with 0.25 mM EDTA added (25 mL), followed by cold 4% PFA in PBS (12 mL). Following dissection of the brain and other organs, we post-fixed the brain in 4% PFA in PBS overnight at 4°C. We prepared PFA in PBS in one-liter batches by dissolving PFA powder in PBS with heating, and froze 50 mL aliquots at -20°C until use, which was important because we found that anti-Gad67 and anti-GABA stain quality depended upon the PFA preparation method. After overnight postfixation, we transferred brains to 30% sucrose solution in PBS, then embedded them in OCT after sinking (48-72 hours), froze them on dry ice, and stored them at -80°C until sectioning. We sectioned brains sagittally at 25 micron thickness using a Leica CM3050S cryostat and stored sections in PBS at 4°C until IHC. For IHC we permeabilized and blocked sections with blocking solution (PBS containing 5% normal goat serum [Thermo Fisher Scientific # 10000C] and 0.1% Triton X-100 [Millipore-Sigma # X100-100ML]) for 60 minutes. Then we probed with diluted primary antibody in blocking solution overnight, washed twice for 15 minutes each with PBS containing 0.1% Triton X-100 (PBSX), then detected with diluted 488-, 555-, and 647-conjugated secondary antibodies (all from Thermo Fisher Scientific) along with DAPI (4',6-Diamidino-2-phenylindole dihydrochloride, Millipore-Sigma # 10236276001, used at 1 µg/mL), washed twice with PBSX for 15 minutes each, then mounted sections on SuperFrost Plus slides (VWR # 48311-703) in Prolong Gold (Thermo Fisher Scientific # P36930), and imaged after overnight curing. We imaged slides on a Nikon TI-Eclipse epifluorescent or an Olympus FV-3000 confocal microscope.
[0246] We used the following primary antibodies: mouse monoclonal anti-FLAG clone M2 (Millipore-Sigma # F1804), rabbit monoclonal anti-HA clone C29F4 (1/1000, Cell Signaling # 3724S), mouse monoclonal anti-HA clone 16B12 (1/1000, Biolegend # 901513), mouse monoclonal anti-HA clone HA.C5 (1/1000, Thermo Fisher Scientific # MA5-27543), mouse monoclonal anti-Gad67 clone 1G10.2 (1/250, Millipore-Sigma # MAB5406), mouse monoclonal anti-NeuN clone 1B7 (1/500, Novus Biologicals # NBP1-92693AF647), guinea pig polyclonal anti-GABA (1/500, Millipore-Sigma # AB175).
[0247] For costains with anti-Gad67, we omitted Triton X-100 detergent permeabilization during Gad67 immunoprobing and detection, then re-fixed antibodies onto sections with 4% PFA in PBS for 15 minutes at room temperature, then washed with PBS twice for 15 minutes, then performed a second round of permeabilization and reblocking and antibody staining with desired co-staining antibodies. This technique permits the best detection of Gad67+ inhibitory neurons.
[0248] SCN1A ORF design, cloning, and packaging.
[0249] To design full-length human SCN1A for delivery as split intein-fusion SCN1A halves, we started with the 1998-amino acid RefSeq sequence NP_001340878.1. This is because previous single-cell RNA-seq studies in mouse and human demonstrate that the major species expressed by cortical cell types includes the 348-bp form of the 11th exon (in the 26-exon numbering scheme), which corresponds to a full open reading frame of 1998 amino acids. The 2009-amino acid isoform is expressed at a lower level across multiple cell types including PVALB neurons (see Figure 8A-8B). We also included a threonine at position 1056 of 1998 (corresponding to position 1067 of the 2009-amino acid isoform). This residue is alanine in the NCBI RefSeq sequence but is a threonine in commercial clones available from Origene (catalog # RG220167), is a conserved threonine across most other mammalian species (Fig. 8C), and was a threonine in three of three human tissue donors sequenced in our prior work (Fig. 8D, data available at dbGaP # phs002292.v1.p1). According to the Genome Aggregation Database (gnomAD), the reference alanine allele appears to be the minor allele in the human population (27%), whereas the majority of alleles in the population encode threonine at that position (73%). As a result, we used threonine in that position for our SCN1A transgene.
[0250] We split this protein sequence into two halves at the natural cysteine at position 1050 (numbering according to the 1998-amino acid isoform), and we appended the Cfa-N and Cfa-C intein proteins to the ends of the split breakpoints with no linkers, so that the natural cysteine residue would work as the C+1 extein and result in scarless protein joining. The native Npu split-intein prefers hydrophobic residues such as the native methionine at the C+2 position, which we reasoned would likely promote half joining efficiency. We also appended HA and FLAG epitope tags (HA at the N-terminus of the N-terminal half, and FLAG at the C-terminus of the C-terminal half), with short dipeptide linkers between the epitope tags and SCN1A coding sequence. We reverse-translated these protein sequences and performed codon optimization using the Integrated DNA Technologies online codon optimization tool (www.idtdna.com/pages/tools/codon-optimization-tool), manually adjusted the codon usage to minimize cryptic splice donors and acceptors, which could negatively impact expression through unwanted splicing, and manually minimized repeats over 10 bp, which might negatively impact cloning or expression. The full protein open reading frame (ORF) was 100% identical at the amino acid level to the native isoform, but only 76.6% identical at the nucleotide level. We also inserted a human intron from hemoglobin, which we engineered to reduce possible cryptic TATA-box promoter sequences and to produce efficient splice donor and acceptor sites without need for exonic splice enhancer sequences. We then synthesized these sequences as G-blocks (Integrated DNA Technologies) and cloned them using Infusion kit (Takara Biosciences catalog # 638948) into pSMART-HC-Kan (Lucigen) vectors along with upstream CMV promoters and downstream IRES2-SYFP2 or IRES2-mScarlet transfection reporters for expression in HEK-293 cells. 
Alternatively, for full-length human SCN1A we assembled synthetic G-blocks lacking the intein fusion proteins but with overlapping ends using Infusion kit into pSMART alongside a CMV promoter and downstream IRES2-mScarlet reporter. The full-length human SCN1A pSMART vector contained only a C-terminal FLAG epitope but not an N-terminal HA epitope. For cloning into pAAV vectors we inserted the intein-fusion halves into CN1390 (Addgene plasmid #163505) in place of the SYFP2 reporter for DLX2.0-driven expression, or into CN1839 (Addgene plasmid #163509) in place of SYFP2-10aa-H2B for hSyn1-driven expression. During this cloning we also replaced the BGH polyA in the original vector sequences with shorter synthetic polyA sequences due to size constraints.
[0251] Propagating full-length SCN1A sequence-containing plasmids in bacteria is challenging. To minimize rearrangement of full-length SCN1A we used the pSMART-HC-Kan backbone (Lucigen), which helps minimize backbone-driven transcription in bacteria. We also propagated SCN1A in CopyCutter EPI400 cells (formerly Lucigen, now BioSearch Technologies, catalog # C400CH10) for low-copy growth in the absence of CopyCutter Induction Solution, requiring larger culture volumes to compensate for reduced plasmid yields (5 mL for minipreps, 0.5-1 L for maxipreps). Due to their low-copy nature, the full-length SCN1A plasmid preps from these CopyCutter EPI400 cells show high contamination from bacterial genomic DNA. We minimized this contamination by adsorbing flocculated genomic DNA against copper-chelated agarose resin (G Biosciences catalog # 786-285), followed by repurification and concentration on Ampure XP beads (Beckman-Coulter # A63882). Each batch of full-length SCN1A plasmid required full sequence validation by Sanger sequencing across the complete ORF to verify absence of rearrangement prior to use; in the case of minipreps, PCR amplification of the full ORF was used to generate enough DNA for sequencing. 
Alternatively, large-scale maxipreps of full-length SCN1A for transfection experiments were produced by Aldevron (Fargo, ND) using their proprietary growth conditions, and were confirmed to be fully intact by full ORF sequencing. In contrast to the special challenges experienced with the full-length SCN1A ORF, plasmids containing the split fusion protein halves alone did not exhibit rearrangements or require special culturing techniques, and we amplified these plasmids in either the pSMART-HC-Kan or pAAV backbone using Stbl3 cells (Thermo Fisher Scientific # C737303). For in-house preps, all bacteria were grown at 32°C with either 50 µg/mL ampicillin (liquid growth, Millipore-Sigma # A8351-5G), 50 µg/mL carbenicillin (agar plates, Teknova # L1008), or 25 µg/mL kanamycin (liquid growth, Millipore-Sigma # K1377-5G, or agar plates, Teknova # L1023). Liquid growth was performed in a 50:50 mix of LB and TB (Teknova L8000 and T7060). Additionally, we cloned a recombinant codon-optimized hSCN1B-P2A-hSCN2B ORF to promote folding and activity of NaV1.1 channels. Packaging into PHP.eB particles with iodixanol gradient purification was performed.
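The split-site logic described in this section (splitting at a native cysteine so that it becomes the C+1 extein residue, appending intein halves with no linkers, and scarless rejoining) can be sketched on a toy sequence. This is an illustrative sketch only: the function names, the toy protein, and the placeholder intein strings are hypothetical and are not the SCN1A or Cfa sequences. Splicing is modeled as an idealized string operation rather than the actual chemistry.

```python
def split_at_cysteine(protein, cys_position):
    """Split at a 1-based cysteine position.

    The N half holds the residues before the cysteine; the C half starts
    with the cysteine, which then serves as the C+1 extein residue.
    """
    idx = cys_position - 1
    if protein[idx] != "C":
        raise ValueError(f"residue {cys_position} is {protein[idx]!r}, not cysteine")
    return protein[:idx], protein[idx:]

def build_fusions(protein, cys_position, intein_n, intein_c):
    """Append the intein halves directly to the breakpoints (no linkers)."""
    n_half, c_half = split_at_cysteine(protein, cys_position)
    return n_half + intein_n, intein_c + c_half

def splice(n_fusion, c_fusion, intein_n, intein_c):
    """Idealized trans-splicing: excise both intein halves and join the exteins."""
    if not (n_fusion.endswith(intein_n) and c_fusion.startswith(intein_c)):
        raise ValueError("fusion halves do not carry the expected intein parts")
    return n_fusion[: len(n_fusion) - len(intein_n)] + c_fusion[len(intein_c):]

# Toy example with placeholder intein markers (not real sequences).
toy_protein = "MEQTVLCMAGK"  # cysteine at position 7, methionine at C+2
n_fus, c_fus = build_fusions(toy_protein, 7, "[IntN]", "[IntC]")
assert splice(n_fus, c_fus, "[IntN]", "[IntC]") == toy_protein  # scarless

# Half lengths implied by a split at position 1050 of a 1998-residue protein.
n_len = 1050 - 1
c_len = 1998 - n_len
print(n_fus, c_fus, n_len, c_len)
```

Under this numbering the N-terminal half covers residues 1-1049 and the C-terminal half covers residues 1050-1998, before counting the epitope tags, dipeptide linkers, and intein sequences.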
[0252] Western blot.
[0253] To prepare protein samples from HEK-293 cells, we plated HEK-293 cells (ATCC catalog # CRL-1573) with fewer than 15 passages onto 12-well plates in HEK-293 growth medium (DMEM [Gibco # 10566-061] with 10% FBS [Gibco #16140-071] and 1x Pen Strep [Gibco #15070-63]) and transfected them when 50-70% confluent. We transfected 1000 ng of DNA per well using PEI-MAX (Polysciences # 24765-100) at a ratio of 1000 ng of DNA : 5 µl of 1 mg/mL PEI-MAX in Opti-MEM (Gibco #11058-021). For expression of SCN1A or split SCN1A halves, we also co-transfected a CMV-hSCN1B-P2A-hSCN2B construct at a ratio of 900 ng alpha subunits : 100 ng beta subunits. We replenished the medium at 12-18 hours post-transfection with fresh HEK-293 growth medium, and then harvested the cells at 48-72 hours post-transfection. For harvesting we rinsed with PBS and then lysed with RIPA buffer (Pierce #89901) containing 1x Halt Protease Inhibitor Cocktail (Thermo Fisher # 87786).
[0254] To prepare mouse brain membrane protein samples, we dissected motor cortex samples (~25-50 mg wet tissue) into 1.7 mL Eppendorf Lo-Bind tubes, froze them on dry ice, and stored them at -20°C. We prepared membrane protein samples using the Mem-Per Plus Membrane Extraction Kit (Thermo Fisher #89842) with small adjustments to the protocol. Briefly, we washed the tissue once with ice-cold Cell Wash Solution, spun it down, and aspirated off the wash buffer. Then we permeabilized the tissue with 200 µL ice-cold Permeabilization Buffer supplemented with 1X Halt Protease Inhibitor Cocktail, pipetted fifty times with a Rainin P200 pipet, tumbled at 4°C for 15 minutes, and centrifuged at 18,000g for 15 minutes at 4°C. This supernatant was saved as the cytoplasmic protein fraction, and we solubilized the pellet containing membrane proteins in 200 µL Solubilization Buffer by pipetting fifty times again with the Rainin P200 pipet and tumbling at 4°C for 15 minutes. Finally, we centrifuged the samples again (18,000g for 15 minutes at 4°C) and saved the supernatant as the membrane protein fraction. Aliquots of this membrane protein were diluted with five volumes of PBS prior to protein quantification using BCA assay.
[0255] For western blots of HEK-293 cells and mouse brain membrane proteins, we quantified protein samples using a BCA assay kit (Thermo Fisher Scientific # 23225) and analyzed them on a plate reader (Perkin Elmer/EnSpire 2300). For each lane we treated 15-20 µg protein samples with 4X NuPAGE LDS Sample Loading buffer (Thermo #NP0007) and 5% 2-mercaptoethanol (Sigma #M3148-25ml), heated them at 70°C for 6 minutes, chilled them on ice, then separated them on a NuPAGE 4-12% Bis-Tris gel (Thermo Fisher Scientific # NP0323BOX) using NuPage MOPS SDS running buffer (Thermo Fisher #NP0001) at 90V for 2 hours, alongside 5 µL Chameleon Duo Pre-stained Protein Ladder (LiCor # 928-60000) as a sizing standard lane. We then transferred the proteins to nitrocellulose membranes with prechilled NuPage Transfer Buffer (Life Technologies #NP0006-1) at 90V for 2 hours on ice.
[0256] After transfer, we blocked membranes with 5% milk in PBS for 1 hour at room temperature and probed overnight with rocking at 4°C with the following primary antibodies: mouse anti-FLAG clone M2 (1/1000 dilution, Sigma # F1804), mouse anti-HA clone 16B12 (1/3000 dilution for HEK-293 cell lysates, Biolegend # 901513), rabbit anti-HA clone C29F4 (1/1000 dilution for brain membrane protein preparations, Cell Signaling # 3724S), NeuroMab mouse anti-NaV1.1 clone K74/71 (1/300 dilution, Antibodies Incorporated # 75-023), and loading control rabbit polyclonal anti-alpha tubulin (1/5000 dilution, Cell Signaling #2144S) or mouse anti-alpha tubulin clone DM1A (1/5000 dilution, Santa Cruz Biotechnology sc-32293) in 2.5% milk in PBS with 0.1% Tween 20 (PBST). Following primary antibody, we washed 3x with PBST for 10 minutes each, then performed secondary antibody detection using appropriate IRDye 680LT- or 800CW-labeled goat secondary antibodies (LI-COR) in PBST for 1 hour, then washed 3x with PBS for 10 minutes each, then imaged the blots on the LI-COR Odyssey imager. We performed all antibody and wash steps at room temperature with gentle agitation.
[0257] Electrophysiology.
[0258] HEK293 cells (CRL-1573; ATCC, Gaithersburg, MD) were cultured in standard media consisting of DMEM, high glucose, GlutaMAX (Gibco 10566016; Thermo Fisher Scientific, Waltham, MA), supplemented with 10% (v/v) fetal calf serum (FCS) (Gibco A5670401; Thermo Fisher Scientific, Waltham, MA) and 1% (v/v) Penicillin-Streptomycin (P/S) (10,000 U/ml) (Gibco 15140148; Thermo Fisher Scientific, Waltham, MA), and grown at 37°C and 5% CO2. Cells were passaged in T25 tissue culture flasks (FB012935; Thermo Fisher Scientific, Waltham, MA) approximately twice a week. Only cells passaged fewer than 20 times were used for transfection and expression studies.
[0259] Plasmid constructs were acutely transfected into HEK293 cells using Viafect reagent (E4981; Promega, Madison, WI), following the manufacturer’s protocol. In brief, HEK293 cells were first prepared for transfection by plating into 12-well tissue culture plates (Nunc 12-565-321; Thermo Fisher Scientific, Waltham, MA) at a density of ~0.5-2 x 10^5 cells per well and grown to ~80-90% confluence with standard media, allowing for one confluent well per transfection condition. On the day of transfection, media in confluent wells to be transfected were replaced with 0.5 mL fresh DMEM with 10% FCS, without P/S. Lipophilic/DNA transfection complexes were generated for each well to be transfected. This was achieved by combining a total of ~1.0 µg of plasmid DNAs with serum-free OptiMEM (Gibco 31985062; Thermo Fisher Scientific, Waltham, MA) to a final volume of 100 µL, then adding 3.0 µL Viafect with gentle trituration and allowing the mixture to assemble at 24°C for 30 mins. After 30 mins, this mixture was added to each well. Live transfected cells were incubated overnight at 37°C, and visually monitored for transfection efficiency in situ using a plate microscope equipped with fluorescence (Invitrogen EVOS M7000; Thermo Fisher Scientific, Waltham, MA). Transfection efficiencies were typically >70-80%.
[0260] Specific amounts of plasmid DNAs (pDNAs) used for transfections per well: a) Full-length SCN1A (SCN1A-FL) 0.8 µg of pDNA, b) SCN1A-Ntm (SCN1A-N) 0.4 µg pDNA combined with SCN1A-Ctm (SCN1A-C) 0.4 µg pDNA, c) SCN1A-Ntm (SCN1A-N) 0.8 µg pDNA, d) SCN1A-Ctm (SCN1A-C) 0.8 µg pDNA, e) empty vector (SYFP only) 0.8 µg pDNA. All transfection conditions (a-e) were performed in a background of 80 ng pDNA of a bi-cistronic construct expressing SCN1B and SCN2B (β1/2), which encodes two sodium channel β-subunits, yielding an α:β subunit-encoding pDNA mass ratio of 10:1. One additional control utilized only 80 ng of β1/2 pDNA.
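The per-well plasmid amounts listed above can be checked for the stated 10:1 mass ratio with a short calculation. This is a minimal sketch; the dictionary keys and variable names are hypothetical labels for conditions a-e, and masses are kept in nanograms to avoid floating-point rounding.

```python
# Test plasmid masses per well, in ng, for conditions a-e.
test_plasmids_ng = {
    "a_full_length": [800],
    "b_N_plus_C": [400, 400],
    "c_N_only": [800],
    "d_C_only": [800],
    "e_empty_vector": [800],
}
BETA_PLASMID_NG = 80  # bi-cistronic SCN1B/SCN2B construct in every condition

for name, masses in test_plasmids_ng.items():
    test_total = sum(masses)
    ratio = test_total / BETA_PLASMID_NG
    total_dna = test_total + BETA_PLASMID_NG
    print(f"{name}: {test_total} ng test pDNA, ratio {ratio:.0f}:1, {total_dna} ng total")
```

Each condition carries the same total test plasmid mass, so the combined N+C condition receives half as much of each half as the single-half controls.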
[0261] Following overnight incubation, cells in transfected wells were dissociated with TrypLE (Gibco 12-604-013, Thermo Fisher Scientific, Waltham, MA) and replated at low density onto 12 mm poly-D-lysine-coated glass coverslips (GG-12-pdl; Neuvitro, Vancouver, WA) in 24-well tissue culture plates (FisherBrand FB012929; Thermo Fisher Scientific, Waltham, MA) for patch-clamp electrophysiology. Typically, ~10,000-15,000 cells were replated per well, at sufficiently low density to isolate individual cells. This was necessary to prevent the formation of electrical junctions between contacting cells, which precludes adequate space-clamp recording conditions. Recordings were performed 0.5-3 days after replating at low density.
[0262] For patch-clamp recordings, coverslips containing adherent transfected cells were transferred to the stage of a Zeiss Axio Examiner.A1 microscope, equipped with a 40X water immersion objective and epifluorescence capability. Pipettes were positioned with a Sutter MPC-325 micromanipulator (Novato, CA). Whole-cell voltage-clamp recordings were acquired with an AxoClamp200B amplifier (Molecular Devices, Union City, CA), using pClamp 10.4. The composition of recording solutions was: Bath (in mM, 140 NaCl, 2 CaCl2, 2 MgCl2, 10 HEPES, pH 7.4); Pipette internal solution (in mM, 35 NaCl, 105 CsF, 10 EGTA, 10 HEPES, pH 7.4). Patch pipettes were pulled from borosilicate glass (1B120F-4; World Precision Instruments, Sarasota, FL) on a P-97 Sutter Instruments puller (Novato, CA), and fire-polished on a Micro-Forge MF-830 (Narishige International USA, Amityville, NY) to a resistance of 0.8-1.5 MΩ. Currents were allowed 5-10 mins to stabilize after achieving whole-cell recording configuration, and were acquired at 50 kHz and filtered at 5 kHz. Capacitive transients were subtracted using a P/4 subtraction scheme, employing a current subtraction template derived from scaling the voltage command protocol by one-fourth. Series resistance compensation was >90% for all recordings. For voltage-dependent measurements of activation (G/V) and steady-state inactivation (SSI), which require optimal voltage-clamp conditions, currents larger than 6 nA were excluded. All currents were included for peak current density measurements. Additional initial recordings were obtained with an Axopatch 1D amplifier, which provided data only for peak current density analysis.
[0263] For conductance/voltage (G/V) plots, conductance (G) was calculated by the formula:
$$G = \frac{I}{V_m - E_{\mathrm{Na\text{-}rev}}}$$
where $I$ = peak current, $V_m$ = membrane potential, and $E_{\mathrm{Na\text{-}rev}}$ = Na+ reversal potential.
[0264] Na+ reversal potential was 35 mV, based on a calculated Nernst equilibrium potential with the recording solutions used.
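The stated reversal potential can be cross-checked against the Nernst equation using the Na+ concentrations of the recording solutions (140 mM in the bath, 35 mM in the pipette). The sketch below is illustrative only: the function name is hypothetical, and the 22°C recording temperature is an assumption, since the source does not state the temperature used for the calculation.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_mv(conc_out_mm, conc_in_mm, temp_c=22.0, valence=1):
    """Nernst equilibrium potential in mV for an ion of the given valence.

    temp_c defaults to 22 C, an assumed room temperature (not stated in the source).
    """
    temp_k = temp_c + 273.15
    return 1000.0 * R * temp_k / (valence * F) * math.log(conc_out_mm / conc_in_mm)

e_na = nernst_mv(140.0, 35.0)
print(f"E_Na = {e_na:.1f} mV")
```

With a fourfold outward-to-inward gradient, the result stays close to 35 mV across typical room temperatures.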
[0265] Peak currents were recorded in response to a family of voltage steps from a holding potential of -120 mV to 40 mV, in 5 mV increments, with an inter-pulse interval of 2 seconds to allow channels to fully deactivate to the deep closed state. Conductance/voltage plots were fitted to a single Boltzmann function:
$$\frac{G}{G_{max}} = -\frac{1}{1 + \exp\left((V - V_{50})/k\right)} + 1$$
where $G_{max}$ = maximal conductance, $V_{50}$ = voltage of half activation, and $k$ = slope of the fitted function.
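The conductance calculation and the Boltzmann function used for the G/V fits can be sketched as follows. This is a minimal illustration, not the fitting procedure itself (fits were performed in Origin). The function names and the example current, V50, and slope values are hypothetical.

```python
import math

E_NA_REV_MV = 35.0  # Na+ reversal potential stated in the text

def conductance_ns(peak_current_pa, vm_mv, e_rev_mv=E_NA_REV_MV):
    """G = I / (Vm - E_rev); current in pA and voltage in mV give nS."""
    driving_force = vm_mv - e_rev_mv
    if driving_force == 0:
        raise ValueError("conductance is undefined at the reversal potential")
    return peak_current_pa / driving_force

def boltzmann_activation(v_mv, v50_mv, k_mv):
    """Normalized conductance G/Gmax = 1 - 1 / (1 + exp((V - V50) / k))."""
    return 1.0 - 1.0 / (1.0 + math.exp((v_mv - v50_mv) / k_mv))

# Hypothetical inward peak current of -1100 pA at -20 mV.
print(conductance_ns(-1100.0, -20.0))

# Hypothetical fit parameters: V50 = -25 mV, slope k = 6 mV.
for v in (-60, -25, 10):
    print(v, round(boltzmann_activation(v, -25.0, 6.0), 3))
```

With a positive slope factor the function rises from 0 to 1 with depolarization and equals 0.5 at V50, which is the behavior expected of an activation curve.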
[0266] For steady-state inactivation (SSI) plots, residual peak currents activated by a step to -20 mV were measured, after a family of 1 second activating preconditioning voltage steps from -120 mV to 30 mV, in 5 mV increments. Peak residual currents after steady-state inactivation were plotted and similarly fitted to a single Boltzmann function.
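The two voltage-step families described above (activation steps from -120 mV to 40 mV and SSI preconditioning steps from -120 mV to 30 mV, both in 5 mV increments) can be enumerated with a short sketch. The function name is hypothetical and the code only lists command voltages; it does not reproduce the acquisition protocol files.

```python
def voltage_steps(start_mv, stop_mv, increment_mv=5):
    """Inclusive list of command voltages from start_mv to stop_mv."""
    return list(range(start_mv, stop_mv + 1, increment_mv))

activation_steps = voltage_steps(-120, 40)   # G/V family
ssi_prepulses = voltage_steps(-120, 30)      # SSI preconditioning family
SSI_TEST_STEP_MV = -20                       # test step following each prepulse

print(len(activation_steps), activation_steps[0], activation_steps[-1])
print(len(ssi_prepulses), ssi_prepulses[0], ssi_prepulses[-1])
```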
[0267] Current traces were analyzed and plotted using pClamp 10.4 (Molecular Devices, San Jose, CA) and Origin 8.5 (Northampton, MA). All G/V and SSI data were plotted as means with standard errors (SE) using Origin 8.5. Current density plots and all statistical calculations were performed in Prism (GraphPad, La Jolla, CA). Figures were composed in Microsoft PowerPoint (Redmond, WA).
[0268] Various embodiments of the invention are described above in the Detailed Description. While these descriptions directly describe the above embodiments, it is understood that those skilled in the art may conceive modifications and/or variations to the specific embodiments shown and described herein. Any such modifications or variations that fall within the purview of this description are intended to be included therein as well. Unless specifically noted, it is the intention of the inventors that the words and phrases in the specification and claims be given the ordinary and accustomed meanings to those of ordinary skill in the applicable art(s).
[0269] The foregoing description of various embodiments of the invention known to the applicant at this time of filing the application has been presented and is intended for the purposes of illustration and description. The present description is not intended to be exhaustive nor limit the invention to the precise form disclosed and many modifications and variations are possible in the light of the above teachings. The embodiments described serve to explain the principles of the invention and its practical application and to enable others skilled in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed for carrying out the invention.
[0270] While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention. It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). As used herein the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are useful to an embodiment, yet open to the inclusion of unspecified elements, whether useful or not. Although the open-ended term “comprising,” as a synonym of terms such as including, containing, or having, is used herein to describe and claim the invention, the present invention, or embodiments thereof, may alternatively be described using alternative terms such as “consisting of” or “consisting essentially of.”
[0271] Sequences of various constructs and elements are provided below.
1xhI56i(core) (core of human I56i enhancer) (131 bp in length)
CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT TATGGCTGCATTTAAGAGAATGG (SEQ ID NO: 1)
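The DLX2.0 element listed next is described as three copies of this 131 bp core, with a stated length of 393 bp. A short sketch of how such a tandem concatemer could be built and checked is given here. The function names are hypothetical, and the assumption that the copies are joined head-to-tail with no spacer is an inference from the stated lengths.

```python
# Core of the human I56i enhancer as listed under SEQ ID NO:1.
CORE = (
    "CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA"
    "TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT"
    "TATGGCTGCATTTAAGAGAATGG"
)


def tandem(unit, copies):
    """Join `copies` head-to-tail repeats of `unit` with no spacer."""
    return unit * copies


def count_tandem_copies(seq, unit):
    """Return n if `seq` is exactly n back-to-back copies of `unit`, else None."""
    if not unit or len(seq) % len(unit) != 0:
        return None
    n = len(seq) // len(unit)
    return n if unit * n == seq else None


three_copies = tandem(CORE, 3)
```

Comparing `three_copies` with a pasted copy of SEQ ID NO:2 (whitespace removed) is a quick way to confirm that a listing survived transcription.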
DLX2.0 (3x human I56i core) (393 bp in length)
CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT TATGGCTGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAG TGGAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGG GCTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGAC GGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGCTG CATTTAAGAGAATGG (SEQ ID NO:2)
eHGT 078h (537 bp in length)
GTAGTCTGCCTCAGGTACACACTGAGAAACTGCTTTAATGTAACCTGACCCACG GTTATTAGTGAAAATATCACTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCT GCTTTAATGGAAAAGATCCGGGTTCACACTAATCAGGCCCAACGGAAGGCCAT ATTAGCAATTTGGCAGGTACCCGAGGGCCATACCTAATCTGCATAAAATGAAGC AGATTGCAACCGCCCTCATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCA TAAAATCTCAGAGGGAGAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGC CTTTATTTTCATTCTTTTTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAG
AAGAATAACAAAATAAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAAT GTTTAATTAAAAAAAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCT CTAGTATACTTTCAAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGT (SEQ ID NO:55)
hSyn1 promoter (495 bp in length)
GTGTCTAGACTGCAGAGGGCCCTGCGTATGAGTGCAAGTGGGTTTTAGGACCAG GATGAGGCGGGGTGGGGGTGCCTACCTGACGACCGACCCCGACCCACTGGACA AGCACCCAACCCCCATTCCCCAAATTGCGCATCCCCTATCAGAGAGGGGGAGGG GAAACAGGATGCGGCGAGGCGCGTGCGCACTGCCAGCTTCAGCACCGCGGACA GTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCACCGCCGCCTCAGCACTGAAGGC
GCGCTGACGTCACTCGCCGGTCCCCCGCAAACTCCCCTTCCCGGCCACCTTGGTC GCGTCCGCGCCGCCGCCGGCCCAGCCGGACCGCACCACGCGAGGCGCGAGATA GGGGGGCACGGGCGCGACCATCTGCGCTGCGGCGCCGGCGACTCAGCGCTGCC TCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGTCGTGCCTGAGAGCGCAGTCGA GAAACCGGCTAGA (SEQ ID NO: 52)
CMV promoter (588 bp in length)
GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGT TCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCC TGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCC CATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACG GTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCC TATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGAC CTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACC ATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCA CGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCAC CAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAA ATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTC (SEQ ID NO:53)
hSyn1 promoter (shortened) (348 bp in length)
AGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGCACTGCCAGCT TCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCACCGCCG CCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCGCAAACTCCCC TTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCCGGACCGCAC CACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGCGCTGCGGCG CCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGTCG
TGCCTGAGAGCGCAGTCGAGAAACCGGCTAGA (SEQ ID NO: 54)
4x2C (214 bp in length):
AAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATG AAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCC TTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCT GAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGT AGCT (SEQ ID NO:56)
8x2C (450 bp in length):
GCGGCCTTAAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGT GAGAATGAAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGT GAGCGGCCTTAAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACT GTGAGAATGAAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACT GTGAGCGGCCTTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAG ACAATGTAGCTGAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCA GCAGACAATGTAGCTGCGGCCTTGAAACCCAGCAGACAATGTAGCTCAGTAGA AACCCAGCAGACAATGTAGCTGAATGGAAACCCAGCAGACAATGTAGCTTCG GAGAAACCCAGCAGACAATGTAGCTGCGGCC (SEQ ID NO: 87)
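The 4x2C and 8x2C elements above read as arrays of two short repeated units separated by brief spacers. The sketch below counts those units in the 4x2C listing. The unit boundaries (`UNIT_A`, `UNIT_B`) were chosen by inspecting the listing and are an assumption, since the source does not name or delimit them.

```python
# 4x2C as listed under SEQ ID NO:56 (214 bp).
SEQ_4X2C = (
    "AAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATG"
    "AAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCC"
    "TTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCT"
    "GAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGT"
    "AGCT"
)

# Repeated units inferred by inspection (hypothetical boundaries).
UNIT_A = "AAAGAGACCGGTTCACTGTGA"
UNIT_B = "GAAACCCAGCAGACAATGTAGCT"


def count_units(seq, units):
    """Count non-overlapping occurrences of each unit in seq."""
    return {unit: seq.count(unit) for unit in units}


counts = count_units(SEQ_4X2C, [UNIT_A, UNIT_B])
```

The same function applied to the 8x2C listing would be expected to report twice as many copies of each unit if that element follows the same layout.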
Cfa-N:
TGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATC GGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAA CGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAG AAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGAC CACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGA GAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAA (SEQ ID NO:57)
Cfa-C:
ATGGTCAAGATCATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGATAT CGGAGTCGAGAAAGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGTCCA AT (SEQ ID NO: 58) hSCNlA-CO-Nterml049-Intron (the Intron is underlined in italics):
ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGC TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCT GAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGAC
CCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGT
GATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTAT
ATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTT
TTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATT
GCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATC
CTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAAC
GTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATC
GCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGG
CTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCA
ACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGT
GTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAA
GCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATC
GGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACC
AACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACT
ATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTAC
ATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGT
GCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAA
GCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGC
TTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAG
CTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATC
TTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCAT
ACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTG
AATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACA
AGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAA
GGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAG
GAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGA
GAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTA
GACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAA
CGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCA
CCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGA
TGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATA
ATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAG
GAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCC
CTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAG
TAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGA
ACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTC
TATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTC
AATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTC
CTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTG
GCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCT
GGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTA
TCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCAC TGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTA CTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTG GTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAG ACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCAT CAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGG CTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCAT ACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCAC ATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGT GGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTT ACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTT CTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGAC GACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGT TGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAA AACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAA GATTCA (SEQ ID NO:59)
hSCN1A-CO-Cterm949-Intron (the Intron is underlined and in italics):
GCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGAC GTGAACGGTACCACAAGTGGAATAGGCACAGG7A4GZ4CZ4GG4GCZ4CA47CC AGCTACCA TTCTGCTTTTA TTCTATGGTTGGGA TAAGGCTGGA TTA TTCTGAGTCCAA GCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCT GTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCC GTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCT CAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAG TTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCC CGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGT GTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAA GAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGT CGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGG TGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTA TGCTGGAATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCC TGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGG CTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTC GGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAG GCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCAC TGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTG GCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTG CATTAACACAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACA CCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGT AAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGC CACATTCAAAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACG TAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTA ATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCA TCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATG ACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAA AGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTC GACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTG AATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGAC GATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGT ACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTT TGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATA GAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATC GGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTT 
CGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTG GTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGG GAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGAT CTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCAT CTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCT ACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGG AAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGAT TTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTT ATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAA TCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGT CTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCT TGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATT CATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAA AAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGG CACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAA AATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAA TCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCC TGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACA
AGAGGGCAAGGATGAGAAGGCCAAAGGCAAA (SEQ ID NO: 60)
hSCN1A-CO-Nterm956-Intron (the Intron is underlined and in italics):
ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGZA4GZ4CZ4GG4GC TACAATCCAGCTACCATTCTGCTTTTA TTCTA TGGTTGGGA TAAGGCTGGA TTA TTCT GAGTCCAAGCTAGGCCCTTTTGCTAA TCATGTTCA TACCTCTTA TCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGA CCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGG TGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTA TATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGT TTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGAT TGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAAT CCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAA CGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAAT CGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTG GCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGC AACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAG TGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGA AGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCAT CGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCAC CAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAA CTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCT ACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTT TGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTA AAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGG
GCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATC
AGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAA TCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGC ATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGC
TGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAA CAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGG AAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAA
AGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAG GAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAAT TAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAA
AACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTT
CACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAG
GATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGA
TAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGA AGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATT CCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTT
AGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGG
GAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTG TCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCT TCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCC
CTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTA
TTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGA
CCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCA TTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTT CACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTA
CTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTC
TTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTT
CAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCT CATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGT TGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT
CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGG CACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCG AGTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATG
(SEQ ID NO:61)
hSCN1A-CO-Cterm1042-Intron (the Intron is underlined and in italics):
TGCCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAAT TTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTG ATGACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAA
AGGCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCA TACGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAAT AAGAAAGATTCATGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTG
ACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTAC
TAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGA TTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTC
CCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCT
TTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCT
GACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGA
ATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTG
GACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTT
GGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCC
AAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGAC
CTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGAT
CTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAA
GACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACATATTCA
TCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGA
ACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGA
CTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACC
CTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGT
AGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGT
TTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGC
AAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAGGA
TGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCA
AGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTC
ACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCAG
TAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATG
TACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGT
TCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGA
CAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAAA
AACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTT
CAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTATG
ATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATCT
GAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTC
ACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATC
GGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTC
TTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATAC
GGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATT
CGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTT
TGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGC
TTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCG
GCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGAT
TGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTG
AATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTT
CTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATA
GCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTT
TCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGA
CGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGG
AGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGAC
CTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTC
ACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGA TGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATC ACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCG AGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCT ACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACAT GATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTA TGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCG AAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAAA (SEQ ID NO: 62)
hSCN1A-CO-Nterm947-Intron (the Intron is underlined and in italics):
ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGZA4GZ4CZ4GG4GC TACAATCCAGCTACCATTCTGCTTTTA TTCTA TGGTTGGGA TAAGGCTGGA TTA TTCT GAGTCCAAGCTAGGCCCTTTTGCTAA TCATGTTCA TACCTCTTA TCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGA CCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGG TGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTA TATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGT TTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGAT TGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAAT CCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAA CGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAAT CGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTG GCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGC AACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAG TGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGA AGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCAT CGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCAC CAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAA CTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCT ACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTT TGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTA AAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGG GCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATC AGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAA TCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGC ATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGC TGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAA CAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGG AAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAA AGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAG GAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAAT TAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAA AACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTT CACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAG GATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGA 
TAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGA AGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATT CCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTT AGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGG GAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTG TCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCT TCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCC CTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTA TTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGA CCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCA TTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTT CACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTA CTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTC TTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTT CAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCT CATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGT TGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGG CACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCG AGTGGATAGAAACTATGTGGGAC (SEQ ID NO:63)
hSCN1A-CO-Cterm1051-Intron (the Intron is underlined and in italics):
TGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTATTCATGATGGTCAT GGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCA TTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTT CAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAAT CTATGAATTCATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAA TCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCATGCATGTCGAACCAT ACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCAC kkGTGGkkTkGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTT TTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCT AATCATGTTCATACCTCTTATCTTCCTCCCACAGGGTCGTGTGTkGAGkkGTACAsAC ATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACC GTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTT CAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCA AGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACC TGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGT GTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAA CAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTT CGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGA GGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGAATATGCAG ACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGT ATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTG TCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAGCGAACTAG GCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTC TCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACC TTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATT
ATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACACAACTAC AGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGTTTGAAGT
TGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGA
CAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATG
GATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGA
AGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATCATCTTTG
GCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATTTCAATC
AGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGAACAGAA
GAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCT
ATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAG
ACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGTTACGAT
GATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTAGCCGAA
TTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAACTGATTA
GTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCGTTGTGG
TCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGTACTTCG
TCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAATTCTCA
GGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCATGATGT
CACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTTTATATA
TGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCG
ATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCTTTCAAA
TTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACAGTAAA
CCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGA
CTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCT
TTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTG
CTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATGTTTTAC
GAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTTGAGAA
GCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAA
CAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACCGAATCC
ACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAG
AAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCGAATCCT
AGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAAGAGGA
AGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAA
CTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCT
AATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTCCAT
TACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACG
ACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATGA
GAAGGCCAAAGGCAAA (SEQ ID NO: 64)
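Each of the hSCN1A-CO constructs above carries an intron inside the coding sequence, so the coding portion can be recovered by removing that intron. The sketch below shows one way to do this. The intron boundaries used here (starting at GT and ending at AG) are inferred from the listings because the underline and italic marking did not survive in text form, and the short flanking exon fragments are taken from the junction as it reads in the Nterm1049 listing. Function and variable names are hypothetical.

```python
# Intron as it reads in the listings above; boundaries are an inference.
INTRON = (
    "GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTT"
    "TTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCT"
    "AATCATGTTCATACCTCTTATCTTCCTCCCACAG"
)


def remove_intron(construct, intron):
    """Return the construct with its single copy of the intron excised."""
    if construct.count(intron) != 1:
        raise ValueError("expected exactly one copy of the intron")
    start = construct.index(intron)
    return construct[:start] + construct[start + len(intron):]


# Toy construct: a few exon bases on either side of the intron.
upstream_exon = "ATAGCTGAG"
downstream_exon = "GAAAAGGCT"
toy_construct = upstream_exon + INTRON + downstream_exon
spliced = remove_intron(toy_construct, INTRON)
```

Raising an error when the intron is absent or duplicated guards against silently processing a listing that was damaged during extraction.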
Human SCN1A:
ATGGAGCAAACAGTGCTTGTACCACCAGGACCTGACAGCTTCAACTTCTTCAC
CAGAGAATCTCTTGCGGCTATTGAAAGACGCATTGCAGAAGAAAAGGCAAAG
AATCCCAAACCAGACAAAAAAGATGACGACGAAAATGGCCCAAAGCCAAATA
GTGACTTGGAAGCTGGAAAGAACCTTCCATTTATTTATGGAGACATTCCTCCA
GAGATGGTGTCAGAGCCCCTGGAGGACCTGGACCCCTACTATATCAATAAGAA
AACTTTTATAGTATTGAATAAAGGGAAGGCCATCTTCCGGTTCAGTGCCACCTC
TGCCCTGTACATTTTAACTCCCTTCAATCCTCTTAGGAAAATAGCTATTAAGAT
TTTGGTACATTCATTATTCAGCATGCTAATTATGTGCACTATTTTGACAAACTG
TGTGTTTATGACAATGAGTAACCCTCCTGATTGGACAAAGAATGTAGAATACA
CCTTCACAGGAATATATACTTTTGAATCACTTATAAAAATTATTGCAAGGGGAT TCTGTTTAGAAGATTTTACTTTCCTTCGGGATCCATGGAACTGGCTCGATTTCA
CTGTCATTACATTTGCGTACGTCACAGAGTTTGTGGACCTGGGCAATGTCTCGG
CATTGAGAACATTCAGAGTTCTCCGAGCATTGAAGACGATTTCAGTCATTCCA
GGCCTGAAAACCATTGTGGGAGCCCTGATCCAGTCTGTGAAGAAGCTCTCAGA
TGTAATGATCCTGACTGTGTTCTGTCTGAGCGTATTTGCTCTAATTGGGCTGCA
GCTGTTCATGGGCAACCTGAGGAATAAATGTATACAATGGCCTCCCACCAATG
CTTCCTTGGAGGAACATAGTATAGAAAAGAATATAACTGTGAATTATAATGGT
ACACTTATAAATGAAACTGTCTTTGAGTTTGACTGGAAGTCATATATTCAAGAT
TCAAGATATCATTATTTCCTGGAGGGTTTTTTAGATGCACTACTATGTGGAAAT
AGCTCTGATGCAGGCCAATGTCCAGAGGGATATATGTGTGTGAAAGCTGGTAG
AAATCCCAATTATGGCTACACAAGCTTTGATACCTTCAGTTGGGCTTTTTTGTC
CTTGTTTCGACTAATGACTCAGGACTTCTGGGAAAATCTTTATCAACTGACATT
ACGTGCTGCTGGGAAAACGTACATGATATTTTTTGTATTGGTCATTTTCTTGGG
CTCATTCTACCTAATAAATTTGATCCTGGCTGTGGTGGCCATGGCCTACGAGGA
ACAGAATCAGGCCACCTTGGAAGAAGCAGAACAGAAAGAGGCCGAATTTCAG
CAGATGATTGAACAGCTTAAAAAGCAACAGGAGGCAGCTCAGCAGGCAGCAA
CGGCAACTGCCTCAGAACATTCCAGAGAGCCCAGTGCAGCAGGCAGGCTCTCA
GACAGCTCATCTGAAGCCTCTAAGTTGAGTTCCAAGAGTGCTAAGGAAAGAAG
AAATCGGAGGAAGAAAAGAAAACAGAAAGAGCAGTCTGGTGGGGAAGAGAA
AGATGAGGATGAATTCCAAAAATCTGAATCTGAGGACAGCATCAGGAGGAAA
GGTTTTCGCTTCTCCATTGAAGGGAACCGATTGACATATGAAAAGAGGTACTC
CTCCCCACACCAGTCTTTGTTGAGCATCCGTGGCTCCCTATTTTCACCAAGGCG
AAATAGCAGAACAAGCCTTTTCAGCTTTAGAGGGCGAGCAAAGGATGTGGGA
TCTGAGAACGACTTCGCAGATGATGAGCACAGCACCTTTGAGGATAACGAGAG
CCGTAGAGATTCCTTGTTTGTGCCCCGACGACACGGAGAGAGACGCAACAGCA
ACCTGAGTCAGACCAGTAGGTCATCCCGGATGCTGGCAGTGTTTCCAGCGAAT
GGGAAGATGCACAGCACTGTGGATTGCAATGGTGTGGTTTCCTTGGTTGGTGG
ACCTTCAGTTCCTACATCGCCTGTTGGACAGCTTCTGCCAGAGGTGATAATAGA
TAAGCCAGCTACTGATGACAATGGAACAACCACTGAAACTGAAATGAGAAAG
AGAAGGTCAAGTTCTTTCCACGTTTCCATGGACTTTCTAGAAGATCCTTCCCAA
AGGCAACGAGCAATGAGTATAGCCAGCATTCTAACAAATACAGTAGAAGAAC
TTGAAGAATCCAGGCAGAAATGCCCACCCTGTTGGTATAAATTTTCCAACATA
TTCTTAATCTGGGACTGTTCTCCATATTGGTTAAAAGTGAAACATGTTGTCAAC
CTGGTTGTGATGGACCCATTTGTTGACCTGGCCATCACCATCTGTATTGTCTTA
AATACTCTTTTCATGGCCATGGAGCACTATCCAATGACGGACCATTTCAATAAT
GTGCTTACAGTAGGAAACTTGGTTTTCACTGGGATCTTTACAGCAGAAATGTTT
CTGAAAATTATTGCCATGGATCCTTACTATTATTTCCAAGAAGGCTGGAATATC
TTTGACGGTTTTATTGTGACGCTTAGCCTGGTAGAACTTGGACTCGCCAATGTG
GAAGGATTATCTGTTCTCCGTTCATTTCGATTGCTGCGAGTTTTCAAGTTGGCA
AAATCTTGGCCAACGTTAAATATGCTAATAAAGATCATCGGCAATTCCGTGGG
GGCTCTGGGAAATTTAACCCTCGTCTTGGCCATCATCGTCTTCATTTTTGCCGT
GGTCGGCATGCAGCTCTTTGGTAAAAGCTACAAAGATTGTGTCTGCAAGATCG
CCAGTGATTGTCAACTCCCACGCTGGCACATGAATGACTTCTTCCACTCCTTCC
TGATTGTGTTCCGCGTGCTGTGTGGGGAGTGGATAGAGACCATGTGGGACTGT
ATGGAGGTTGCTGGTCAAGCCATGTGCCTTACTGTCTTCATGATGGTCATGGTG
ATTGGAAACCTAGTGGTCCTGAATCTCTTTCTGGCCTTGCTTCTGAGCTCATTT
AGTGCAGACAACCTTGCAGCCACTGATGATGATAATGAAATGAATAATCTCCA
AATTGCTGTGGATAGGATGCACAAAGGAGTAGCTTATGTGAAAAGAAAAATAT
ATGAATTTATTCAACAGTCCTTCATTAGGAAACAAAAGATTTTAGATGAAATT
AAACCACTTGATGATCTAAACAACAAGAAAGACAGTTGTATGTCCAATCATAC AGCAGAAATTGGGAAAGATCTTGACTATCTTAAAGATGTAAATGGAACTACAA
GTGGTATAGGAACTGGCAGCAGTGTTGAATACATTATTGATGAAAGTGATTAC
ATGTCATTCATAAACAACCCCAGTCTTACTGTGACTGTACCAATTGCTGTAGGA
GAATCTGACTTTGAAAATTTAAACACGGAAGACTTTAGTAGTGAATCGGATCT
GGAAGAAAGCAAAGAGAAACTGAATGAAAGCAGTAGCTCATCAGAAGGTAGC
ACTGTGGACATCGGCGCACCTGTAGAAGAACAGCCCGTAGTGGAACCTGAAG
AAACTCTTGAACCAGAAGCTTGTTTCACTGAAGGCTGTGTACAAAGATTCAAG
TGTTGTCAAATCAATGTGGAAGAAGGCAGAGGAAAACAATGGTGGAACCTGA
GAAGGACGTGTTTCCGAATAGTTGAACATAACTGGTTTGAGACCTTCATTGTTT
TCATGATTCTCCTTAGTAGTGGTGCTCTGGCATTTGAAGATATATATATTGATC
AGCGAAAGACGATTAAGACGATGTTGGAATATGCTGACAAGGTTTTCACTTAC
ATTTTCATTCTGGAAATGCTTCTAAAATGGGTGGCATATGGCTATCAAACATAT
TTCACCAATGCCTGGTGTTGGCTGGACTTCTTAATTGTTGATGTTTCATTGGTCA
GTTTAACAGCAAATGCCTTGGGTTACTCAGAACTTGGAGCCATCAAATCTCTC
AGGACACTAAGAGCTCTGAGACCTCTAAGAGCCTTATCTCGATTTGAAGGGAT
GAGGGTGGTTGTGAATGCCCTTTTAGGAGCAATTCCATCCATCATGAATGTGCT
TCTGGTTTGTCTTATATTCTGGCTAATTTTCAGCATCATGGGCGTAAATTTGTTT
GCTGGCAAATTCTACCACTGTATTAACACCACAACTGGTGACAGGTTTGACAT
CGAAGACGTGAATAATCATACTGATTGCCTAAAACTAATAGAAAGAAATGAG
ACTGCTCGATGGAAAAATGTGAAAGTAAACTTTGATAATGTAGGATTTGGGTA
TCTCTCTTTGCTTCAAGTTGCCACATTCAAAGGATGGATGGATATAATGTATGC
AGCAGTTGATTCCAGAAATGTGGAACTCCAGCCTAAGTATGAAGAAAGTCTGT
ACATGTATCTTTACTTTGTTATTTTCATCATCTTTGGGTCCTTCTTCACCTTGAA
CCTGTTTATTGGTGTCATCATAGATAATTTCAACCAGCAGAAAAAGAAGTTTG
GAGGTCAAGACATCTTTATGACAGAAGAACAGAAGAAATACTATAATGCAAT
GAAAAAATTAGGATCGAAAAAACCGCAAAAGCCTATACCTCGACCAGGAAAC
AAATTTCAAGGAATGGTCTTTGACTTCGTAACCAGACAAGTTTTTGACATAAGC
ATCATGATTCTCATCTGTCTTAACATGGTCACAATGATGGTGGAAACAGATGA
CCAGAGTGAATATGTGACTACCATTTTGTCACGCATCAATCTGGTGTTCATTGT
GCTATTTACTGGAGAGTGTGTACTGAAACTCATCTCTCTACGCCATTATTATTT
TACCATTGGATGGAATATTTTTGATTTTGTGGTTGTCATTCTCTCCATTGTAGGT
ATGTTTCTTGCCGAGCTGATAGAAAAGTATTTCGTGTCCCCTACCCTGTTCCGA
GTGATCCGTCTTGCTAGGATTGGCCGAATCCTACGTCTGATCAAAGGAGCAAA
GGGGATCCGCACGCTGCTCTTTGCTTTGATGATGTCCCTTCCTGCGTTGTTTAA
CATCGGCCTCCTACTCTTCCTAGTCATGTTCATCTACGCCATCTTTGGGATGTCC
AACTTTGCCTATGTTAAGAGGGAAGTTGGGATCGATGACATGTTCAACTTTGA
GACCTTTGGCAACAGCATGATCTGCCTATTCCAAATTACAACCTCTGCTGGCTG
GGATGGATTGCTAGCACCCATTCTCAACAGTAAGCCACCCGACTGTGACCCTA
ATAAAGTTAACCCTGGAAGCTCAGTTAAGGGAGACTGTGGGAACCCATCTGTT
GGAATTTTCTTTTTTGTCAGTTACATCATCATATCCTTCCTGGTTGTGGTGAACA
TGTACATCGCGGTCATCCTGGAGAACTTCAGTGTTGCTACTGAAGAAAGTGCA
GAGCCTCTGAGTGAGGATGACTTTGAGATGTTCTATGAGGTTTGGGAGAAGTT
TGATCCCGATGCAACTCAGTTCATGGAATTTGAAAAATTATCTCAGTTTGCAGC
TGCGCTTGAACCGCCTCTCAATCTGCCACAACCAAACAAACTCCAGCTCATTG
CCATGGATTTGCCCATGGTGAGTGGTGACCGGATCCACTGTCTTGATATCTTAT
TTGCTTTTACAAAGCGGGTTCTAGGAGAGAGTGGAGAGATGGATGCTCTACGA
ATACAGATGGAAGAGCGATTCATGGCTTCCAATCCTTCCAAGGTCTCCTATCA
GCCAATCACTACTACTTTAAAACGAAAACAAGAGGAAGTATCTGCTGTCATTA
TTCAGCGTGCTTACAGACGCCACCTTTTAAAGCGAACTGTAAAACAAGCTTCC
TTTACGTACAATAAAAACAAAATCAAAGGTGGGGCTAATCTTCTTATAAAAGA AGACATGATAATTGACAGAATAAATGAAAACTCTATTACAGAAAAAACTGATC TGACCATGTCCACTGCAGCTTGTCCACCTTCCTATGACCGGGTGACAAAGCCA ATTGTGGAAAAACATGAGCAAGAAGGCAAAGATGAAAAAGCCAAAGGGA (SEQ ID NO: 65)
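The "CO" constructs differ from the native human SCN1A sequence at the nucleotide level while starting with the same codons in a different spelling. A minimal sketch for checking that two coding fragments encode the same peptide is shown here, assuming the standard genetic code. The function names are hypothetical, and only the first ten codons of each listing are compared.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_differences(a, b):
    """Count codons that differ between two sequences encoding the same peptide."""
    if len(a) != len(b) or translate(a) != translate(b):
        raise ValueError("sequences do not encode the same peptide")
    return sum(a[i:i + 3] != b[i:i + 3] for i in range(0, len(a), 3))


# First ten codons of the codon-optimized and native listings.
co_start = "ATGGAGCAAACAGTTTTGGTCCCTCCGGGA"
native_start = "ATGGAGCAAACAGTGCTTGTACCACCAGGA"
changed_codons = synonymous_differences(co_start, native_start)
```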
Beta-Globin Minimal Promoter (pBGmin/minBGlobin/minBGprom):
GGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTC
TG (SEQ ID NO:3)
MinCMV Promoter:
GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTCGTTTAGTGAAC
CGTCAGATCGCCTGG (SEQ ID NO:4)
Mutated minCMV Promoter (SacI RE site removed):
GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTGGTTTAGTGAAC
CGTCAGATCGCCTGG (SEQ ID NO: 5)
minRho Promoter:
GATTCAGCCGGGAGCTTAGGGAGGGGAGGTCACTTCATAAGGGCCTGGGGGG GGAGTTGGAGCCACGAGTCGTCCAGCCGGAGCCCCGTGTGGCTGAGCTCCGGC CTCAGAAGCATCCCC (SEQ ID NO:6)
minRho* Promoter:
GATTCAGCCGGGAGCTTAGGGAGGGGAGGTCACTTCATAAGGGCTTGGGG GGGGAGTTGGAGCCACGAGTCGTCCAGCCGGAGCCCCGTGTGGCTGTGCTC CGGCCTCAGAAGCATCCCC (SEQ ID NO: 7)
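The mutated minCMV promoter (SEQ ID NO: 5) is annotated above as having its SacI restriction site removed relative to the minCMV promoter (SEQ ID NO: 4). The following is a minimal sketch of how that annotation could be checked, assuming the standard SacI recognition sequence GAGCTC; the helper names (`clean`, `find_sites`, `mismatches`) are hypothetical and are not part of the disclosure.

```python
# Illustrative check only: helper names are hypothetical, not from the source.
SACI_SITE = "GAGCTC"  # SacI recognition sequence (palindromic), assumed here


def clean(seq: str) -> str:
    """Strip whitespace left over from line wrapping and upper-case the sequence."""
    return "".join(seq.split()).upper()


def find_sites(seq: str, site: str = SACI_SITE) -> list:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    s = clean(seq)
    return [i for i in range(len(s) - len(site) + 1) if s[i:i + len(site)] == site]


def mismatches(a: str, b: str) -> list:
    """Return 0-based positions at which two equal-length sequences differ."""
    a, b = clean(a), clean(b)
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# SEQ ID NO: 4 (minCMV promoter), copied from the listing above
MIN_CMV = (
    "GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTCGTTTAGTGAAC"
    "CGTCAGATCGCCTGG"
)
# SEQ ID NO: 5 (mutated minCMV promoter), copied from the listing above
MUT_MIN_CMV = (
    "GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTGGTTTAGTGAAC"
    "CGTCAGATCGCCTGG"
)

if __name__ == "__main__":
    print("SacI sites in minCMV:", find_sites(MIN_CMV))
    print("SacI sites in mutated minCMV:", find_sites(MUT_MIN_CMV))
    print("Differing positions:", mismatches(MIN_CMV, MUT_MIN_CMV))
```

Because GAGCTC is its own reverse complement, scanning a single strand is sufficient for this site.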
Hsp68 minimal Promoter (proHsp68):
CAGGAACATCCAAACTGAGCAGCCGGGGTCCCCCCCACCCCCCACCCCGCCCC ACGCGGCAACTTTGAGCCTGTGCTGGGACAGAGCCTCTAGTTCCTAAATTAGT CCATGAGGTCAGAGGCAGCACTGCCATTGTAACGCGATTGGAGAGGATCACGT
CACCGGACACGCCCCCAGGCATCTCCCTGGGTCTCCTAAACTTGGCGGGGAGA AGTTTTAGCCCTTAAGTTTTAGCCTTTAACCCCCATATTCAGAACTGTGCGAGT TGGCGAAACCCCACAAATCACAACAAACTGTACACAACACCGAGCTAGAGGT GATCTTTCTTGTCCATTCCACACAGGCCTTAGTAATGCGTCGCCATAGCAACAG TGTCACTAGTAGCACCAGCACTTCCCCACACCCTCCCCCTCAGGAATCCGTACT CTCCAGTGAACCCCAGAAACCTCTGGAGAGTTCTGGACAAGGGCGGAACCCAC
AACTCCGATTACTCAAGGGAGGCGGGGAAGCTCCACCAGACGCGAAACTGCT
GGAAGATTCCTGGCCCCAAGGCCTCCTCCGGCTCGCTGATTGGCCCAGCGGAG AGTGGGCGGGGCCGGTGAAGACTCCTTAAAGGCGCAGGGCGGCGAGCAGGTC ACCAGACGCTGACAGCTACTCAGAACCAAATCTGGTTCCATCCAGAGACAAGC
GAAGACAAGAGAAGCAGAGCGAGCGGCGCGTTCCCGATCCTCGGCCAGGACC AGCCTTCCCCAGAGCATCCCTGCCGCGGAGCGCAACCTTCCCAGGAGCATCCC TGCCGCGGAGCGCAACTTTCCCCGGAGCATCCACGCCGCGGAGCGCAGCCTTC
CAGAAGCAGAGCGCGGCGCC (SEQ ID NO: 8)
SYFP2:
ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGA GCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAG GGCGATGCCACCTACGGCAAGCTGACCCTGAAGCTGATCTGCACCACCGGCAA
GCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGGGCTACGGCGTGCAGT GCTTCGCCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC
ATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAA
CTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC
ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACA
AGCTGGAGTACAACTACAACAGCCACAACGTCTATATCACCGCCGACAAGCA
GAAGAACGGCATCAAGGCCAACTTCAAGATCCGCCACAACATCGAGGACGGC
GGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCC
CCGTGCTGCTGCCCGACAACCACTACCTGAGCTACCAGTCCAAGCTGAGCAAA
GACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGC
CGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAA (SEQ ID NO: 9)
EGFP:
ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGA
GCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAG
GGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAA
GCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGT
GCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC
ATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAA
CTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC
ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACA
AGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAG
AAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCA
GCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCC
GTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGA
CCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCG
GGATCACTCTCGGCATGGACGAGCTGTACAAGTAA (SEQ ID NO: 10)
mScarlet:
ATGGTCTCCAAAGGAGAAGCGGTCATTAAAGAGTTCATGAGGTTCAAGGTTCA
TATGGAAGGCTCCATGAATGGTCATGAGTTCGAGATTGAAGGGGAGGGTGAG
GGGAGACCTTATGAGGGCACTCAGACAGCGAAATTGAAGGTGACAAAGGGAG
GACCTCTCCCGTTCAGTTGGGACATATTGTCACCGCAATTTATGTATGGTTCTA
GAGCCTTCACTAAGCACCCTGCCGACATCCCAGATTACTACAAGCAATCCTTCC
CTGAGGGCTTTAAGTGGGAGAGAGTAATGAATTTTGAAGATGGCGGGGCAGTC
ACAGTAACACAAGATACATCCCTGGAAGATGGAACACTTATCTACAAAGTTAA
GCTCAGAGGAACGAATTTTCCACCGGACGGTCCAGTGATGCAAAAAAAAACA
ATGGGTTGGGAAGCATCTACAGAGCGACTGTACCCTGAAGACGGTGTGCTGAA
GGGGGACATCAAAATGGCCCTGCGACTTAAGGATGGAGGGCGCTATTTGGCAG
ATTTCAAGACTACTTACAAAGCCAAAAAGCCTGTACAAATGCCTGGAGCTTAC
AACGTGGATAGGAAGCTTGATATTACCAGTCACAATGAAGATTATACAGTGGT
AGAACAATATGAACGCTCAGAAGGTCGCCACAGCACTGGAGGCATGGATGAG
TTGTACAAG (SEQ ID NO: 66)
3xNLS:
GATCCAAAGAAGAAAAGGAAAGTTGATCCCAAAAAGAAGAGGAAAGTAGATC
CAAAAAAGAAGCGAAAAGTAGGGTACAAGAAG (SEQ ID NO: 67)
Optimized Flp recombinase (FlpO):
ATGGCTCCTAAGAAGAAGAGGAAGGTGATGAGCCAGTTCGACATCCTGTGCAA GACCCCCCCCAAGGTGCTGGTGCGGCAGTTCGTGGAGAGATTCGAGAGGCCCA
GCGGCGAGAAGATCGCCAGCTGTGCCGCCGAGCTGACCTACCTGTGCTGGATG
ATCACCCACAACGGCACCGCCATCAAGAGGGCCACCTTCATGAGCTACAACAC
CATCATCAGCAACAGCCTGAGCTTCGACATCGTGAACAAGAGCCTGCAGTTCA
AGTACAAGACCCAGAAGGCCACCATCCTGGAGGCCAGCCTGAAGAAGCTGAT
CCCCGCCTGGGAGTTCACCATCATCCCTTACAACGGCCAGAAGCACCAGAGCG
ACATCACCGACATCGTGTCCAGCCTGCAGCTGCAGTTCGAGAGCAGCGAGGAG
GCCGACAAGGGCAACAGCCACAGCAAGAAGATGCTGAAGGCCCTGCTGTCCG
AGGGCGAGAGCATCTGGGAGATCACCGAGAAGATCCTGAACAGCTTCGAGTA
CACCAGCAGGTTCACCAAGACCAAGACCCTGTACCAGTTCCTGTTCCTGGCCA
CATTCATCAACTGCGGCAGGTTCAGCGACATCAAGAACGTGGACCCCAAGAGC
TTCAAGCTGGTGCAGAACAAGTACCTGGGCGTGATCATTCAGTGCCTGGTGAC
CGAGACCAAGACAAGCGTGTCCAGGCACATCTACTTTTTCAGCGCCAGAGGCA
GGATCGACCCCCTGGTGTACCTGGACGAGTTCCTGAGGAACAGCGAGCCCGTG
CTGAAGAGAGTGAACAGGACCGGCAACAGCAGCAGCAACAAGCAGGAGTACC
AGCTGCTGAAGGACAACCTGGTGCGCAGCTACAACAAGGCCCTGAAGAAGAA
CGCCCCCTACCCCATCTTCGCTATCAAGAACGGCCCTAAGAGCCACATCGGCA
GGCACCTGATGACCAGCTTTCTGAGCATGAAGGGCCTGACCGAGCTGACAAAC
GTGGTGGGCAACTGGAGCGACAAGAGGGCCTCCGCCGTGGCCAGGACCACCT
ACACCCACCAGATCACCGCCATCCCCGACCACTACTTCGCCCTGGTGTCCAGGT
ACTACGCCTACGACCCCATCAGCAAGGAGATGATCGCCCTGAAGGACGAGACC
AACCCCATCGAGGAGTGGCAGCACATCGAGCAGCTGAAGGGCAGCGCCGAGG
GCAGCATCAGATACCCCGCCTGGAACGGCATCATCAGCCAGGAGGTGCTGGAC
TACCTGAGCAGCTACATCAACAGGCGGATCTGA (SEQ ID NO: 11)
Improved Cre recombinase (iCre):
ATGGTGCCCAAGAAGAAGAGGAAAGTCTCCAACCTGCTGACTGTGCACCAAA
ACCTGCCTGCCCTCCCTGTGGATGCCACCTCTGATGAAGTCAGGAAGAACCTG
ATGGACATGTTCAGGGACAGGCAGGCCTTCTCTGAACACACCTGGAAGATGCT
CCTGTCTGTGTGCAGATCCTGGGCTGCCTGGTGCAAGCTGAACAACAGGAAAT
GGTTCCCTGCTGAACCTGAGGATGTGAGGGACTACCTCCTGTACCTGCAAGCC
AGAGGCCTGGCTGTGAAGACCATCCAACAGCACCTGGGCCAGCTCAACATGCT
GCACAGGAGATCTGGCCTGCCTCGCCCTTCTGACTCCAATGCTGTGTCCCTGGT
GATGAGGAGAATCAGAAAGGAGAATGTGGATGCTGGGGAGAGAGCCAAGCA
GGCCCTGGCCTTTGAACGCACTGACTTTGACCAAGTCAGATCCCTGATGGAGA
ACTCTGACAGATGCCAGGACATCAGGAACCTGGCCTTCCTGGGCATTGCCTAC
AACACCCTGCTGCGCATTGCCGAAATTGCCAGAATCAGAGTGAAGGACATCTC
CCGCACCGATGGTGGGAGAATGCTGATCCACATTGGCAGGACCAAGACCCTG
GTGTCCACAGCTGGTGTGGAGAAGGCCCTGTCCCTGGGGGTTACCAAGCTGGT
GGAGAGATGGATCTCTGTGTCTGGTGTGGCTGATGACCCCAACAACTACCTGT
TCTGCCGGGTCAGAAAGAATGGTGTGGCTGCCCCTTCTGCCACCTCCCAACTG
TCCACCCGGGCCCTGGAAGGGATCTTTGAGGCCACCCACCGCCTGATCTATGG
TGCCAAGGATGACTCTGGGCAGAGATACCTGGCCTGGTCTGGCCACTCTGCCA
GAGTGGGTGCTGCCAGGGACATGGCCAGGGCTGGTGTGTCCATCCCTGAAATC
ATGCAGGCTGGTGGCTGGACCAATGTGAACATTGTGATGAACTACATCAGAAA
CCTGGACTCTGAGACTGGGGCCATGGTGAGGCTGCTCGAGGATGGGGACTAA
(SEQ ID NO: 12)
SP10 insulator (SP10ins):
GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG (SEQ ID NO:13)
3xSP10ins:
GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAA
GCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAAGCT ACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG (SEQ ID NO: 14)
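The name 3xSP10ins suggests three tandem copies of the SP10 insulator (SEQ ID NO: 13). A minimal sketch of how that relationship could be confirmed from the two listed sequences follows; the `clean` helper and variable names are hypothetical, introduced only for illustration.

```python
# Illustrative check only: names are hypothetical, not from the source.

def clean(seq: str) -> str:
    """Strip whitespace left over from line wrapping and upper-case the sequence."""
    return "".join(seq.split()).upper()


# SEQ ID NO: 13 (SP10 insulator), copied from the listing above
SP10_INS = "GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG"

# SEQ ID NO: 14 (3xSP10ins), copied from the listing above with its line wrapping
THREE_X_SP10_INS = """
GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAA
GCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAAGCT
ACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG
"""

# True when the longer sequence is exactly three back-to-back copies of the unit
is_triple_repeat = clean(THREE_X_SP10_INS) == clean(SP10_INS) * 3

if __name__ == "__main__":
    print("unit length:", len(clean(SP10_INS)))
    print("3x length:", len(clean(THREE_X_SP10_INS)))
    print("exact 3x tandem repeat:", is_triple_repeat)
```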
3XFLAG-version 1:
GACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGATTACAAGG
ATGACGATGACAAG (SEQ ID NO: 68)
3XFLAG-version 2:
GACTACAAAGACCATGACGGAGATTATAAAGATCATGACATCGATTACAAGG ATGACGATGACAAG (SEQ ID NO: 15)
2xHA:
TACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGATGTACCCGA
CTATGCCGGCAGT (SEQ ID NO: 69)
10aa:
TCCGGACTCAGATCTGGAGGCTCCGGAGGC (SEQ ID NO: 16)
H2B:
CCAGAGCCAGCGAAGTCTGCTCCCGCCCCGAAAAAGGGCTCCAAGAAGGCGG
TGACTAAGGCGCAGAAGAAAGGCGGCAAGAAGCGCAAGCGCAGCCGCAAGG AGAGCTATTCCATCTATGTGTACAAGGTTCTGAAGCAGGTCCACCCTGACACC GGCATTTCGTCCAAGGCCATGGGCATCATGAATTCGTTTGTGAACGACATTTTC
GAGCGCATCGCAGGAGAGGCTTCCCGCCTGGCGCATTACAACAAGCGCTCGAC CATCACCTCCCGGGAGATCCAGACGGCCGTGCGCCTGCTGCTGCCTGGGGAGT TGGCCAAGCACGCCGTGTCCGAGGGTACTAAGGCCATCACCAAGTACACCAGC GCTAAGTAA (SEQ ID NO: 17)
WPRE3:
ATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACT
ATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGC TATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTT CTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGC
TCGGCTGTTGGGCACTGACAATTCCGTGG (SEQ ID NO: 18)
WPRE:
GCTTATCGATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTA
TTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTT GTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCC TGGTTGCTGTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTG GTGTGCACTGTGTTTGCTGACGCAACCCCCACTGGTTGGGGCATTGCCACCAC CTGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGCGGA ACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCA CTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTCCTTTCCTTGGCTGCTCG CCTATGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGG CCCTCAATCCAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTC TTCCGCGTCTTCGCCTTCGCCCTCAGACGAGTCGGATCTCCCTTTGGGCCGCCT CCCCGCATCGATACCG (SEQ ID NO: 19)
BGHpA:
CGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTT CCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAA ATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGG GCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATG (SEQ ID NO:20)
hGHpA:
ACGGGTGGCATCCCTGTGACCCCTCCCCAGTGCCTCTCCTGGCCCTGGAAGTTG CCACTCCAGTGCCCACCAGCCTTGTCCTAATAAAATTAAGTTGCATCATTTTGT CTGACTAGGTGTCCTTCTATAATATTATGGGGTGGAGGGGGGTGGTATGGAGC AAGGGGCAAGTTGGGAAGACAACCTGTAGGGCCTGCGGGGTCTATTGGGAAC CAAGCTGGAGTGCAGTGGCACAATCTTGGCTCACTGCAATCTCCGCCTCCTGG GTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTTGTTGGGATTCCAGGCATGC ATGACCAGGCTCAGCTAATTTTTGTTTTTTTGGTAGAGACGGGGTTTCACCATA TTGGCCAGGCTGGTCTCCAACTCCTAATCTCAGGTGATCTACCCACCTTGGCCT CCCAAATTGCTGGGATTACAGGCGTGAACCACTGCTCCCTTCCCTGTCCTT (SEQ ID NO:21)
SV40pA:
TGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTA TTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATA AACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAG ATGTGGGAGGTTTTTTAAA (SEQ ID NO:70)
ShortPolyA:
CAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTTTTTGTGTG (SEQ ID NO:71)
IRES2:
CCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGC CGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCTTTTGGCAATG TGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTT CCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTT CCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCA GCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTAT AAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATA GTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAA GGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCA CATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAAC CACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATATGGCCACAACC (SEQ ID NO: 72)
P2A:
GGCAGCGGCGCCACCAACTTCAGCCTGCTGAAGCAGGCCGGCGACGTGGAGG
AGAACCCCGGCCCCGGAGCTAGCGGA (SEQ ID NO:22)
T2A:
(GSG)EGRGSLLTCGDVEENPGP (SEQ ID NO:23)
E2A:
(GSG)QCTNYALLKLAGDVESNPGPP (SEQ ID NO:24)
F2A:
(GSG)VKQTLNFDLLKLAGDVESNPGP (SEQ ID NO:25)
Exemplary Plasmid Backbone 1 - Left ITR:
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCGTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTG GCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:26)
Exemplary Plasmid Backbone 1 - Right ITR:
CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCC GGGCGTCGGGCGACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAG AGAGGGAGTGGCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:27)
Exemplary Plasmid Backbone 2 - Left ITR:
CATGTCCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAA AGCCCGGGCGTCGGGCGACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCG CGCAGAGAGGGAGTGGCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:28)
Exemplary Plasmid Backbone 2 - Right ITR:
AGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCA CTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCC TCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTG (SEQ ID NO:29)
PHP.eB capsid:
MAADGYLPDWLEDNLSEGIREWWALKPGAPQPKANQQHQDNARGLVLPGYKYL GPGNGLDKGEPVNAADAAALEHDKAYDQQLKAGDNPYLKYNHADAEFQERLKE DTSFGGNLGRAVFQAKKRLLEPLGLVEEAAKTAPGKKRPVEQSPQEPDSSAGIGK
SGAQPAKKRLNFGQTGDTESVPDPQPIGEPPAAPSGVGSLTMASGGGAPVADNNE GADGVGSSSGNWHCDSQWLGDRVITTSTRTWALPTYNNHLYKQISNSTSGGSSN DNAYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQVKE
VTDNNGVKTIANNLTSTVQVFTDSDYQLPYVLGSAHEGCLPPFPADVFMIPQYGY LTLNDGSQAVGRSSFYCLEYFPSQMLRTGNNFQFSYEFENVPFHSSYAHSQSLDRL
MNPLIDQYLYYLSKTINGSGQNQQTLKFSVAGPSNMAVQGRNYIPGPSYRQQRVS TTVTQNNNSEFAWPGASSWALNGRNSLMNPGPAMASHKEGEDRFFPLSGSLIFGK QGTGRDNVDADKVMITNEEEIKTTNPVATESYGQVATNHQSDGTLAVPFKAQAQ
TGWVQNQGILPGMVWQDRDVYLQGPIWAKIPHTDGNFHPSPLMGGFGMKHPPPQ ILIKNTPVPADPPTAFNKDKLNSFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTS NYYKSNNVEFAVNTEGVYSEPRPIGTRYLTRNL (SEQ ID NO: 30)
AAV9 VP1 capsid protein:
MAADGYLPDWLEDNLSEGIREWWALKPGAPQPKANQQHQDNARGLVLPGYKYL GPGNGLDKGEPVNAADAAALEHDKAYDQQLKAGDNPYLKYNHADAEFQERLKE DTSFGGNLGRAVFQAKKRLLEPLGLVEEAAKTAPGKKRPVEQSPQEPDSSAGIGK SGAQPAKKRLNFGQTGDTESVPDPQPIGEPPAAPSGVGSLTMASGGGAPVADNNE GADGVGSSSGNWHCDSQWLGDRVITTSTRTWALPTYNNHLYKQISNSTSGGSSN
DNAYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQVKE VTDNNGVKTIANNLTSTVQVFTDSDYQLPYVLGSAHEGCLPPFPADVFMIPQYGY LTLNDGSQAVGRSSFYCLEYFPSQMLRTGNNFQFSYEFENVPFHSSYAHSQSLDRL MNPLIDQYLYYLSKTINGSGQNQQTLKFSVAGPSNMAVQGRNYIPGPSYRQQRVS TTVTQNNNSEFAWPGASSWALNGRNSLMNPGPAMASHKEGEDRFFPLSGSLIFGK
QGTGRDNVDADKVMITNEEEIKTTNPVATESYGQVATNHQSAQAQAQTGWVQN QGILPGMVWQDRDVYLQGPIWAKIPHTDGNFHPSPLMGGFGMKHPPPQILIKNTP VPADPPTAFNKDKLNSFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTSNYYKSN NVEFAVNTEGVYSEPRPIGTRYLTRNL (SEQ ID NO:31)
tet-Transactivator version 2 (tTA2):
ATGTCTAGACTGGACAAGAGCAAAGTCATAAACTCTGCTCTGGAATTACTCAA TGAAGTCGGTATCGAAGGCCTGACGACAAGGAAACTCGCTCAAAAGCTGGGA GTTGAGCAGCCTACCCTGTACTGGCACGTGAAGAACAAGCGGGCCCTGCTCGA TGCCCTGGCAATCGAGATGCTGGACAGGCATCATACCCACTTCTGCCCCCTGG AAGGCGAGTCATGGCAAGACTTTCTGCGGAACAACGCCAAGTCATTCCGCTGT
GCTCTCCTCTCACATCGCGACGGGGCTAAAGTGCATCTCGGCACCCGCCCAAC AGAGAAACAGTACGAAACCCTGGAAAATCAGCTCGCGTTCCTGTGTCAGCAAG GCTTCTCCCTGGAGAACGCACTGTACGCTCTGTCCGCCGTGGGCCACTTTACAC TGGGCTGCGTATTGGAGGATCAGGAGCATCAAGTAGCAAAAGAGGAAAGAGA GACACCTACCACCGATTCTATGCCCCCACTTCTGAGACAAGCAATTGAGCTGTT
CGACCATCAGGGAGCCGAACCTGCCTTCCTTTTCGGCCTGGAACTAATCATATG TGGCCTGGAGAAACAGCTAAAGTGCGAAAGCGGCGGGCCGGCCGACGCCCTT GACGATTTTGACTTAGACATGCTCCCAGCCGATGCCCTTGACGACTTTGACCTT GATATGCTGCCTGCTGACGCTCTTGACGATTTTGACCTTGACATGCTCCCCGGG TAA (SEQ ID NO:32)
GTPase HRas [Homo sapiens]:
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDIL DTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPM VLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQH KLRKLNPPDESGPGCMSCKCVLS (SEQ ID NO:33)
CN3252 (4586 bp between ITRs)
GCGGCCGCACGCGTGGTACCCTAAATAAAGATGGCTTTTTAGTATTAAAAGTG GAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGG CTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGA CGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGC
TGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAA GAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTA CATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGAGCTCGGGCTG GGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGGGATCC
ACCATGGTCTACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGA
TGTACCCGACTATGCCGGCAGTATGGAGCAAACAGTTTTGGTCCCTCCGGGAC
CAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGCCGCCATTGAGAGGCGC
ATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTC
TATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAA
TCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAAGGCTAAGAATCCAAAA
CCTGACAAGAAAGACGACGACGAAAACGGACCCAAACCTAACTCAGATCTCG
AAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATCCCTCCAGAAATGGTT
TCAGAACCTCTAGAAGATCTCGATCCATACTATATCAATAAAAAGACCTTCAT
CGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTA
TATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCA
TAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAAATTGTGTCTTTAT
GACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAATACACGTTCACTG
GAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTG
AGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTA
CGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAA
CATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATACCAGGATTGAAA
ACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGAT
CCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATG
GGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAATGCGAGCCTTGA
GGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATGGGACCCTCATAA
ACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAGGATTCCAGATAT
CATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGAT
GCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAA
CTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGG
CTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGC
TGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTAC
CTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCA
AGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATT
GAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTG
CATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCA
AGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGAC
GGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAG
ACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGCAAAGGATTCAG
ATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATATTCCTCACCAC
ATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAGACGAAATTCC
AGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAGGCTCAGAAAA
TGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAGAGCAGGCGAG
ACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACAGCAACCTTAG
CCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTAATGGCAAGA
TGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTGGACCTTCAG
TTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACTACTGAGACT
GAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGATTTTTTGGA
AGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCTGACAAACA
CCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTTGGTACAAG TTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGAAAGTCAAG
CACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTATAACGATA
TGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGATGACTGAT
CATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCATCTTTACT
GCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACTTTCAAGAA
GGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTGAATTGGGC
TTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTCTCCGGGT
ATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGATTATCG
GAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCATAGTA
TTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGGACTG
TGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAACGATT
TCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAGAAA
CTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTATTC
ATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTG
TTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAG
ATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGT
CAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAAAACAGAAG
ATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCATG
TCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCG
GAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAA
CGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAG
AAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGAC
CACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGA
GAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATC
AACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTG
CTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGC
TTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCC
ACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCT
GTTGGGCACTGACAATTCCGTGGTGTTTATTTGTGAAATTTGTGATGCTATTGC
TTTATTTGTAACCATCTAGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATT
TGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATT
TTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAACGGACCGAGCG
GCCGC (SEQ ID NO:73)
CN3254 (4091 bp between ITRs)
GCGGCCGCACGCGTGGTACCCTAAATAAAGATGGCTTTTTAGTATTAAAAGTG
GAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGG
CTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA
AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGA
CGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGC
TGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAA
GAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTA
CATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGAGCTCGGGCTG
GGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGGGATCC
ACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGA
TATCGGAGTCGAGAAAGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGT
CCAATTGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACTACCTT AAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTACTAGCAGC
TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTA
TTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCT
CCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTACATGTC
TTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATC
TGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGG
AATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGT
GGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTT
TGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGC
CAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGA
CCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGA
TCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCAACGAA
AGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACATATTC
ATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACG
AACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTG
ACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAAC
CCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAG
TAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCG
TTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGG
CAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAGG
ATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGC
AAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTT
CACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCA
GTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATAT
GTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTG
TTCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGG
ACAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAA
AAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTT
TCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTAT
GATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATC
TGAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTT
CACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAAT
CGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTT
CTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATA
CGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAAT
TCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGT
TTGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCG
CTTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTC
GGCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGA
TTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGT
GAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCT
TCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATAT
AGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACT
TTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGA
CGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGG
AGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGAC CTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTC
ACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGA
TGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATC
ACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCG
AGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCT
ACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACAT
GATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTA
TGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCG
AAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAAAGCCGGCG
ACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGATTACAAGGA
TGACGATGACAAGTAATGATATCATAATCAACCTCTGGATTACAAAATTTGTG
AAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACG
CTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTC
CTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTG
CCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGG
TGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTA
TTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATA
AACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAG
ATGTGGGAGGTTTTTTAAACGGACCGAGCGGCCGC (SEQ ID NO: 74)
CN3683 (4528 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA
GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC
GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC
ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC
ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC
GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG
CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC
GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC
GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA
GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA
GCTAGCGCTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCT
GGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGT
TTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGC
CGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTA
CCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAG
CTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAA
GGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAA
CCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATC
CCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAAT
AAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCT
ACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTA
AGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAA
ATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAAT
ACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGG
GGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACT
TCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGT CTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATA
CCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTC
AGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTC
CAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAA
TGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATG
GGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAG
GATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGA
AATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGG
AAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCT
ATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGAC
ACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTT
GGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGA
GGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTT
CAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAG
CTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTT
TCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACG
GAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGA
GAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGC
AAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATA
TTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAG
ACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAG
GCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAG
AGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACA
GCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTA
ATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTG
GACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACT
ACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGA
TTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCT
GACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTT
GGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGA
AAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTA
TAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGA
TGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCA
TCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACT
TTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTG
AATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTC
TCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGA
TTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCA
TAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGG
ACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAAC
GATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAG
AAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTA
TTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGT
TGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACG
AGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTAC
GTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAAAACAGAA GATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCAT
GTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCG
GAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAAC
GGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAGA
AGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGACC
ACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGAG
AGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATCA
ACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGC
TCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCT
TCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCA
CGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTG
TTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGT
GTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 75)
CN3684 (4033 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA
GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC
GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC
ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC
ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC
GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG
CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC
GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC
GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA
GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA
GCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGG
TACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTGA
AAAACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACCACAGAGATAGGC
AAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCAC
AGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTT
GGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTT
CATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGAC
GAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCC
ATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTC
CGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCA
TCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGT
CGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTC
AACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGG
TGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGAC
GTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATC
TACATCGACCAACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGG
TTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTT
ACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACG
TCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTA
TTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGG
TTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCAT
TATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGT GTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACACAACTACAGGAGA
TAGATTTGATATTGAGGATGTAAACAACCACACCGACTGTTTGAAGTTGATAG
AGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGACAATGT
CGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATGGATGG
ACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTAT
GAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCAT
TCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATTTCAATCAGCAGA
AAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGAACAGAAGAAATA
CTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTA
GACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTG
TTTGATATATCTATTATGATTCTGATATGTCTGAATATGGTTACGATGATGGTT
GAGACTGATGATCAATCTGAATACGTTACGACGATACTTAGCCGAATTAACTT
GGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAACTGATTAGTTTAAG
GCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTT
TCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCA
ACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATC
AAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCA
GCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTTTATATATGCGATCT
TCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCGATGACATG
TTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCTTTCAAATTACCACGT
CAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGAT
TGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAA
TCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTT
GTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAG
GAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTG
GGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTTGAGAAGCTCTCAC
AGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTAC
AACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTT
GATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGA
CGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAG
TGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCT
GCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAACTGTTAA
GCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGC
TGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTCCATTACAGAA
AAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTC
ACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCA
AAGGCAAAGCCGGCGACTACAAAGACCATGACGGAGACTATAAAGATCATGA
CATCGATTACAAGGATGACGATGACAAGTAATGATATCATAATCAACCTCTGG
ATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTA
CGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTAT
GGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAA
CTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACT
GACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTT
TTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 76)
CN3251 (Cassette length 5956 bp, no ITRs)
GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAG TTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGC
CTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTT
CCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT
ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGC
CCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC
ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCT
ATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTT
TGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTT
TTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCAT
TGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCT
CTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGACT
CACTATAGGGAGACCCAAGCTGGCTAGCCACCATGGTCTACCCGTATGATGTC
CCGGATTACGCTGGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTAT
GGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCG
GGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCT
ACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTAT
TCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTC
CCACAGGAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAA
ACGGACCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATC
TATGGTGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCA
TACTATATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTT
CCGGTTTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGC
AAGATTGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGT
ACAATCCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACC
AAGAACGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAA
GATAATCGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTG
GAATTGGCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGA
TCTTGGCAACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAA
CCATAAGTGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGT
GTAAAGAAGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTC
GCACTCATCGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCA
ATGGCCACCAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATC
ACTGTTAACTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTG
GAAATCCTACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGA
CGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATAT
GTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACAT
TTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAA
TTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGT
TCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTC
GCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGA
AAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGC
TGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTG
CAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAG
TCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAA
TCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGG
ACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACT
TATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTC
ACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAG
GGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTT
TTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGC
GAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAG
CTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCG
TCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGC
CGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTC
CATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCT
ATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAA
GTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCA
CCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTT
GTCGACCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATG
GAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTG
GTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGAC
CCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACA
CTTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGA
AGTTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAAC
ATGCTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATT
GGTGTTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGG
GAAGTCATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGA
GGTGGCACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTG
TGGCGAGTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCG
ATGTGCCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTG
AATTTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCA
CTGATGACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCAC
AAAGGCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTT
CATACGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATA
ATAAGAAAGATTCATGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATAC
GGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTA
TACCGTCGATAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATA
ATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATA
CGGGCAACCAAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAAT
AGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCAT
AATGAAGGCCTCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCT
TGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGT
CTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTC
CTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGA
AGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACC
CTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAA
GCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTG
TGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAAC
AAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGG
GCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAG
GCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATAT
GGCCACAACCCGCGGCCGCCACCATGGTGAGCAAGGGCGAGGAGCTGTTCAC
CGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT
TCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTG
AAGCTGATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGAC
CACCCTGGGCTACGGCGTGCAGTGCTTCGCCCGCTACCCCGACCACATGAAGC
AGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACC
ATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGA
GGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG
GACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACG
TCTATATCACCGCCGACAAGCAGAAGAACGGCATCAAGGCCAACTTCAAGATC
CGCCACAACATCGAGGACGGCGGCGTGCAGCTCGCCGACCACTACCAGCAGA
ACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGC
TACCAGTCCAAGCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCT
GCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACA
AGTAAGAATTCCACCACACTGGACTAGTGGATCCGAGCTCGGTACCAAGCTTA
AGTTTAAACCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTT
GTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC
CTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT
ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACA
ATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGC (SEQ ID NO: 77)
CN3253 (Cassette length 5486 bp, no ITRs)
GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAG
TTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGC
CTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTT
CCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT
ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGC
CCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC
ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCT
ATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTT
TGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTT
TTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCAT
TGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGC
TCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGAC
TCACTATAGGGAGACCCAAGCTGGCTAGCCACCATGGTCAAGATCATCTCCAG
GAAGTCTCTGGGTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACA
ATTTTCTCCTGAAAAACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACC
ACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACAA
GTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCT
TTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTT
TTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGA
AGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTC
ACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATAC
GGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAAC
GAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGA
GGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCA
CGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGT
CGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACA
TAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTG
GCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGA
ATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATG
GGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTT
CCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAG
CGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCC
GCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGA
GCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATAT
TCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACA
CAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGT
TTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCA
ACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCA
AAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTG
CAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATC
ATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATT
TCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGA
ACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAA
AAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGT
AACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGT
TACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTA
GCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAAC
TGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCG
TTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGT
ACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAA
TTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCA
TGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTT
TATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGG
GAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCT
TTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACA
GTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAG
GGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATA
ATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTT
CTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATG
TTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTT
GAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACA
GCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACC
GAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT
CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG
AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACA
AGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCA
AACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGG
TGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGA
ACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCC
TCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCA
AGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAG
ACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGAGCC CGGGCAGTTCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTG GAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCT
TTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCT AGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAA GGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCC
TTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAG CCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGT GAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACA
AGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGG CCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGG CCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATATG GCCACAACCCGCGGCCGCCACCATGGTCTCCAAAGGAGAAGCGGTCATTAAA GAGTTCATGAGGTTCAAGGTTCATATGGAAGGCTCCATGAATGGTCATGAGTT CGAGATTGAAGGGGAGGGTGAGGGGAGACCTTATGAGGGCACTCAGACAGCG AAATTGAAGGTGACAAAGGGAGGACCTCTCCCGTTCAGTTGGGACATATTGTC ACCGCAATTTATGTATGGTTCTAGAGCCTTCACTAAGCACCCTGCCGACATCCC AGATTACTACAAGCAATCCTTCCCTGAGGGCTTTAAGTGGGAGAGAGTAATGA
ATTTTGAAGATGGCGGGGCAGTCACAGTAACACAAGATACATCCCTGGAAGAT GGAACACTTATCTACAAAGTTAAGCTCAGAGGAACGAATTTTCCACCGGACGG TCCAGTGATGCAAAAAAAAACAATGGGTTGGGAAGCATCTACAGAGCGACTG TACCCTGAAGACGGTGTGCTGAAGGGGGACATCAAAATGGCCCTGCGACTTAA GGATGGAGGGCGCTATTTGGCAGATTTCAAGACTACTTACAAAGCCAAAAAGC CTGTACAAATGCCTGGAGCTTACAACGTGGATAGGAAGCTTGATATTACCAGT CACAATGAAGATTATACAGTGGTAGAACAATATGAACGCTCAGAAGGTCGCC ACAGCACTGGAGGCATGGATGAGTTGTACAAGAGCGCTGATCCAAAGAAGAA AAGGAAAGTTGATCCCAAAAAGAAGAGGAAAGTAGATCCAAAAAAGAAGCG
AAAAGTAGGGTACAAGAAGTGAGTTTAAACCGCTGATCAGCCTCGACTGTGCC TTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTG GAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCA TTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCA AGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTC TATGGC (SEQ ID NO: 78)
CN3677 (4583 bp between ITRs)
GCGGCCGCACGCGTAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGT GCGCACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGC GCGCGCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCC
CCCGCAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCC AGCCGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCAT CTGCGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGG AGGAGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGATCGACTGG ATCCTTCGAAGCTAGCGCTGCCACCATGGTCTACCCGTATGATGTCCCGGATTA CGCTGGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAA CAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTC TTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCA GCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTC
CAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGG
AAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACC
CAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTG
ATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATA
TCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTT
TCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTG
CGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCC
TTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACG
TAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCG
CCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGC
TTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAA
CGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTG
TCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAG
CTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCG
GGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCA
ACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTA
TAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACA
TTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTG
CGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAG
CAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTT
TCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGC
TGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCT
TTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATA
CGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGA
ATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAA
GCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAG
GCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGG
AACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAG
AAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAG
ACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAAC
GATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCAC
CGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGAT
GTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAA
TGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGG
AACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCC
TGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGT
AGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAA
CCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCT
ATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTC
AATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTC
CTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTG
GCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCT
GGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTA
TCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCAC
TGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTA
CTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTG
GTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAG
ACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCAT
CAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGG
CTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCAT
ACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCAC
ATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGT
GGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTT
ACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTT
CTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGAC
GACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGT
TGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAA
AACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAA
GATTCATGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTT
CCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGA
TAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAG
AACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACC
AAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGAT
ATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATC
AAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATG
AAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCC
TTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCT
GAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGT
AGCTATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTT
AACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATC
ATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTT
AGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAG
GGGCTCGGCTGTTGGGCACTGACAATTCCGTGGAATAAAAGATCTTTATTTTCA
TTAGATCTGTGTGTTGGTTTTTTGTGTGCGGACCGAGCGGCCGC (SEQ ID NO: 79)
CN3678 (4088 bp between ITRs)
GCGGCCGCACGCGTAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGT
GCGCACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCG
CGCGCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCC
CGCAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAG
CCGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG
CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG
AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGATCGACTGGATCC
TTCGAAGCTAGCGCTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGGT
ACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTGAAA
AACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACCACAGAGATAGGCAA
GGACCTTGACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAG
GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGG
ATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA
CCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGA
GCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGC
CGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATC
AGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGG
GCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCT
GAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTC
AAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCT
CCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGT
TTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGAC
CAACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATA
CATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTAT
TTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGT
CATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCA
GAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGA
GAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCT
CGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCA
GGCAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAG
GATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGC
AAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTC
ACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCAGT
AGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTA
CCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCA
TTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAG
ACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTA
GGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGG
CATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTG
ATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATAC
GTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCG
AATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAA
CATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAAT
TGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCC
GAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGC
TTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTT
TTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAAC
GGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGA
TCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT
TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATC
TGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTAC
ATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAA
ATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGA
GATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGA
ATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCA
CAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGAC
CGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT
CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG
AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAA
GAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAA
CGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGG
TGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTC
CATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATA
CGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATG
AGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAGACTATAA
AGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGATATCAAAGAG
ACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATGAAAGAGA
CCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCCTTGAAACC
CAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCTGAATGGAA
ACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGTAGCTATAAT
CAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTG
CTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCT
TCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCAC
GGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTT
GGGCACTGACAATTCCGTGGAATAAAAGATCTTTATTTTCATTAGATCTGTGTGT
TGGTTTTTTGTGTGCGGACCGAGCGGCCGC (SEQ ID NO: 80)
CN4541 (4790 bp between ITRs)
GCGGCCGCACGCGTGAAATCATAAATGCTGAGGGTAGTCTGCCTCAGGTACAC
ACTGAGAAACTGCTTTAATGTAACCTGACCCACGGTTATTAGTGAAAATATCA
CTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCTGCTTTAATGGAAAAGATC
CGGGTTCACACTAATCAGGCCCAACGGAAGGCCATATTAGCAATTTGGCAGGT
ACCCGAGGGCCATACCTAATCTGCATAAAATGAAGCAGATTGCAACCGCCCTC
ATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCATAAAATCTCAGAGGGA
GAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGCCTTTATTTTCATTCTTT
TTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAGAAGAATAACAAAAT
AAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAATGTTTAATTAAAAA
AAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCTCTAGTATACTTTC
AAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGTTTTGTGAAGATACGCGAGC
TCGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCT
GGGATCCAGATCTTTCGAAGCTAGCGCTACCGGTCGCCACCATGGTCTACCCG
TATGATGTCCCGGATTACGCTGGCAGCTACCCATACGATGTACCCGACTATGC
CGGCAGTATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATT
TCTTTACTCGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGT
ACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAG
GCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTC
TTATCTTCCTCCCACAGGAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGAC
GACGACGAAAACGGACCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATC
TCCCATTCATCTATGGTGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAG
ATCTCGATCCATACTATATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGA
AAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTA
ATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGC
TGATTATGTGTACAATCCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGC
CGGATTGGACCAAGAACGTAGAATACACGTTCACTGGAATCTATACGTTCGAG
TCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTC
CGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTACGTTCGCTTACGTTACT
GAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAACATTCAGAGTGCTTAG
AGCACTTAAAACCATAAGTGTCATACCAGGATTGAAAACGATCGTGGGAGCTC
TGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTC
TTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACA
AATGCATTCAATGGCCACCAACAAATGCGAGCCTTGAGGAACATAGCATAGA
AAAGAATATCACTGTTAACTATAATGGGACCCTCATAAACGAAACCGTGTTCG
AATTTGACTGGAAATCCTACATTCAGGATTCCAGATATCATTATTTTCTTGAGG
GCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTG
AAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATACACATCT
TTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACT
TTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTATATG
ATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGATAC
TTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAAGA
AGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAAA
CAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAG
AGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAAT
TGAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCA
GAAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAG
TGAATCAGAGGACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAA
ATAGATTGACTTATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGT
ATACGCGGGTCACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCA
TTCCGGGGAAGGGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATG
AGCATTCCACTTTTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCA
CGAAGACATGGCGAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCA
GTAGAATGCTAGCTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATT
GTAATGGGGTCGTCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTG
GACAATTGCTGCCGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACG
TTCTTCAAGCTTCCATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCA
AAGAGCTATGTCTATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGG
AGTCACGCCAGAAGTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGA
TTTGGGATTGTTCACCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCG
TAATGGATCCTTTTGTCGACCTGGCTATAACGATATGTATCGTCCTGAACACAC
TCTTCATGGCTATGGAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTA
CCGTGGGTAATCTGGTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGA
TTATTGCAATGGACCCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATG
GTTTTATCGTCACACTTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGC
TCTCAGTTCTTAGAAGTTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCT
GGCCTACTTTGAACATGCTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTT
GGCAATCTGACATTGGTGTTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGA
ATGCAGTTGTTTGGGAAGTCATACAAGGACTGTGTGTGCAAGATAGCGTCCGA
CTGTCAACTTCCGAGGTGGCACATGAACGATTTCTTTCATTCATTCCTCATTGT
GTTTCGGGTCCTCTGTGGCGAGTGGATAGAAACTATGTGGGACTGTATGGAAG
TAGCTGGGCAGGCGATGTGCCTTACGGTATTCATGATGGTCATGGTCATCGGA
AATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCG
ATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTTCAGATAGCT
GTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAATCTATGAATT
CATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAATCAAACCCC
TTGATGATCTCAATAATAAGAAAGATTCATGTCTCAGTTATGACACAGAAATC
TTGACGGTGGAATACGGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCAT
AGAGTGTACGGTGTATACCGTCGATAAGAACGGATTTGTCTACACACAGCCTA
TCGCACAATGGCATAATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAG
GACGGATCAATCATACGGGCAACCAAAGACCACAAGTTTATGACAACAGATG
GACAGATGTTGCCAATAGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAA
GTGGATGGTCTGCCATAATGATATCATAATCAACCTCTGGATTACAAAATTTGT
GAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATAC
GCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCT
CCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCT
GCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTG
GTGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTT
ATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAAT
AAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGA
GATGTGGGAGGTTTTTTAAACGGACCGAGCGGCCGC (SEQ ID NO: 81)
CN4542 (4295 bp between ITRs)
GCGGCCGCACGCGTGAAATCATAAATGCTGAGGGTAGTCTGCCTCAGGTACAC
ACTGAGAAACTGCTTTAATGTAACCTGACCCACGGTTATTAGTGAAAATATCA
CTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCTGCTTTAATGGAAAAGATC
CGGGTTCACACTAATCAGGCCCAACGGAAGGCCATATTAGCAATTTGGCAGGT
ACCCGAGGGCCATACCTAATCTGCATAAAATGAAGCAGATTGCAACCGCCCTC
ATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCATAAAATCTCAGAGGGA
GAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGCCTTTATTTTCATTCTTT
TTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAGAAGAATAACAAAAT
AAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAATGTTTAATTAAAAA
AAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCTCTAGTATACTTTC
AAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGTTTTGTGAAGATACGCGAGC
TCGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCT
GGGATCCAGATCTTTCGAAGCTAGCGCTACCGGTCGCCACCATGGTCAAGATC
ATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGATATCGGAGTCGAGAA
AGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGTCCAATTGCATGTCGA
ACCATACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGT
ACCACAAGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACC
ATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCT
AGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCT
GTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCC
GTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCT
CAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAG
TTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCC
CGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGT
GTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAA
GAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGT
CGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGG
TGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTA
TGCTGGAATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCC
TGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGG
CTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTC
GGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAG
GCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCAC
TGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTG
GCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTG
CATTAACACAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACA
CCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGT
AAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGC
CACATTCAAAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACG
TAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTA
ATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCA
TCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATG
ACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAA
AGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTC
GACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTG
AATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGAC
GATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGT
ACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTT
TGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATA
GAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATC
GGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTT
CGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTG
GTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGG
GAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGAT
CTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT
TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCAT
CTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCT
ACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGG
AAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGAT
TTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTT
ATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAA
TCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGT
CTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCT
TGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATT
CATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAA
AAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGG
CACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAA
AATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAA
TCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCC
TGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACA
AGAGGGCAAGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCA
TGACGGAGACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAG
TAATGATATCATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGG
TATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCT
TTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAAT
CCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCT
GGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTTATTTGTGAA
ATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTATTTGTGAAATTTGT
GATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAAC
AACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTT
TAAACGGACCGAGCGGCCGC (SEQ ID NO: 82)
CN4217 (4249 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGAG
TGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGACG
ACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGCAT
CCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGCACT
GCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCA
CCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCGCAAAC
TCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCCGGACC
GCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGCGCTGCG
GCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGT
CGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAAGCTAGCG
CTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCTGGCAGCT
ACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGTTTTGGTCC
CTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGCCGCCATTGA
GAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCT
TTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTT
TGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAAGGCTAAGAATCC
AAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAACCTAACTCAGATC
TCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATCCCTCCAGAAATGG
TTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAATAAAAAGACCTTCA
TCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTA
TATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCAT
AGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAAATTGTGTCTTTATGA
CTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAATACACGTTCACTGGAA
TCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTGAGGA
TTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTACGTTC
GCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAACATTCA
GAGTGCTTAGAGCACTTAAAACCATAAGTGTCATACCAGGATTGAAAACGATCG
TGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGATCCTTACTG
TCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATGGGTAACCT
CAGAAACAAATGCATTCAATGGCCACCAACAAATGCGAGCCTTGAGGAACATA
GCATAGAAAAGAATATCACTGTTAACTATAATGGGACCCTCATAAACGAAACC
GTGTTCGAATTTGACTGGAAATCCTACATTCAGGATTCCAGATATCATTATTTTC
TTGAGGGCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAAT
GTCCTGAAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATAC
ACATCTTTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACA
AGACTTTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTA
TATGATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGA
TACTTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAA
GAAGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAA
ACAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAG
AGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATT
GAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCAG
AAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTG
AATCAGAGGACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAAT
AGATTGACTTATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGTATA
CGCGGGTCACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCC
GGGGAAGGGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCAT
TCCACTTTTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGA
CATGGCGAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAAT
GCTAGCTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGG
GGTCGTCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTG
CTGCCGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAG
CTTCCATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTAT
GTCTATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCA
GAAGTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGT
TCACCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTT
TTGTCGACCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTAT
GGAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCT
GGTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGAC
CCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACAC
TTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAG
TTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATG
CTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTG
TTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT
CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGC
ACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGA
GTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGTCT
CAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCGGAAA
GATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAACGGATT
TGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAGAAGTCTT
CGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGACCACAAGT
TTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGAGAGGGGA
CTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATCAACCTCTG
GATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTA
CGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTAT
GGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAA
CTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACT
GACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTT
TTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO:83)
CN4218 (4312 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA
GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC
GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC
ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC
ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC
GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG
CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGC
CGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG
CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG
AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGA
AGCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGG
GTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTG
AAAAACGGACTCGTGGCGTCCAATTGCCTTACGGTATTCATGATGGTCATGGT
CATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCATTT
TCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTTCA
GATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAATCT
ATGAATTCATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAATC
AAACCCCTTGATGATCTCAATAATAAGAAAGATTCATGCATGTCGAACCATAC
CACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACA
AGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGC
TTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCT
TTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGA
AGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTC
ACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATAC
GGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAAC
GAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGA
GGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCA
CGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGT
CGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACA
TAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTG
GCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGA
ATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATG
GGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTT
CCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAG
CGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCC
GCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGA
GCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATAT
TCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACA
CAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGT
TTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCA
ACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCA
AAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTG
CAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATC
ATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATT
TCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGA
ACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAA
AAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGT
AACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGT
TACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTA
GCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAAC
TGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCG
TTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGT
ACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAA
TTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCA
TGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTT TATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGG GAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCT
TTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACA GTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAG GGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATA ATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTT CTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATG TTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTT GAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACA GCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACC GAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT
CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACA
AGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCA AACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGG TGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGA
ACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCC TCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCA
AGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAG ACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGATAT
CATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAAC TATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATG CTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGT TCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGG CTCGGCTGTTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTTATTTTCATT AGATCTGTGTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID
NO: 84)
CN4642 (4222 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC
ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG
CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA GCTAGCGCTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCT GGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGT TTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGC CGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTA CCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAG
CTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAA GGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAA CCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATC CCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAAT AAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCT ACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTA AGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAA ATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAAT ACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGG GGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACT TCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGT CTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATA CCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTC AGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTC CAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAA TGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATG GGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAG GATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGA AATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGG AAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCT ATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGAC ACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTT GGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGA GGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTT CAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAG CTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTT TCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACG GAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGA GAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGC AAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATA TTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAG ACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAG GCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAG AGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACA GCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTA ATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTG 
GACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACT ACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGA TTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCT GACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTT GGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGA
AAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTA TAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGA TGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCA TCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACT TTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTG AATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTC TCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGA TTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCA TAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGG
ACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAAC
GATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAG
AAACTATGTGGGACTGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATAC
GGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTA
TACCGTCGATAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATA
ATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATA
CGGGCAACCAAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAAT
AGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCAT
AATGATATCATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGT
ATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTT
TGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATC
CTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTG
GACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTT
ATTTTCATTAGATCTGTGTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO:85)
CN4643 (4339 bp between ITRs)
GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA
GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC
GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC
ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC
ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC
GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG
CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGC
CGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG
CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG
AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGA
AGCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGG
GTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTG
AAAAACGGACTCGTGGCGTCCAATTGTATGGAAGTAGCTGGGCAGGCGATGTG
CCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTT
ATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGAT
GACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAG
GCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATA
CGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAA
GAAAGATTCATGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACT
ACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTACTA
GCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTG
GATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTAT
CTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTA
CATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGG
AGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACT
TGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAG
CACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGG
AAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAG
TGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCG CAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTT
CATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCA
ACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACA
TATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATT
TCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGT
CATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTC
AGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAAT
GAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGC
TTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTT
CGCAGGCAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATA
TTGAGGATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGA
GACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGAT
ATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTAC
GCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTT
GTATATGTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTT
AACCTGTTCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATT
TGGTGGACAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCA
ATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCA
ACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATAT
CTATTATGATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATG
ATCAATCTGAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTG
TTCTTTTCACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATT
TCACAATCGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGG
CATGTTTCTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCG
AGTTATACGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTA
AAGGAATTCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCA
ACATCGGTTTGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGA
GTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTG
AGACATTCGGCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGAT
GGGACGGATTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCT
AACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCG
TCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAA
CATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGC
AGAACCACTTTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGT
TTGATCCTGACGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAG
CTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTG
CGATGGACCTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTC
TTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAG
AATACAGATGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATC
AACCCATCACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATT
ATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAG
TTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAG
AGGACATGATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGA
TCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACC
TATAGTCGAAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAA
AGCCGGCGACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGAT TACAAGGATGACGATGACAAGTAATGATATCATAATCAACCTCTGGATTACAA
AATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATG
TGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTC
ATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATC
GCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAA
TTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTTTTTGT
GTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 86)
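
The listings above are wrapped across many lines, and each ends in a "(SEQ ID NO: n)" label. Before such a listing can be used computationally it has to be joined into one string and checked for characters that are not nucleotides. The following Python sketch shows one way to do that. The function name `clean_listing` and its return shape are illustrative assumptions and are not part of the application. The base count it yields is the length of the listed text only, and is not assumed to equal the "bp between ITRs" figure given in each heading.

```python
import re

# Matches the trailing label, even when it is wrapped across two lines.
SEQ_LABEL = re.compile(r"\(SEQ\s+ID\s+NO\s*[:.]\s*(\d+)\)")


def clean_listing(raw):
    """Return (seq_id, bases, suspicious_tokens) for one wrapped listing.

    seq_id is the integer from the "(SEQ ID NO: n)" label, or None.
    bases is the concatenated upper-case sequence.
    suspicious_tokens lists whitespace-separated chunks that contain
    anything other than A, C, G or T; they are left out of bases so a
    human can review them.
    """
    match = SEQ_LABEL.search(raw)
    seq_id = int(match.group(1)) if match else None
    body = SEQ_LABEL.sub("", raw)
    tokens = body.split()
    suspicious = [t for t in tokens if re.search(r"[^ACGTacgt]", t)]
    bases = "".join(t for t in tokens if t not in suspicious).upper()
    return seq_id, bases, suspicious


if __name__ == "__main__":
    demo = "GCGGCCGC ACGT\nxx AGTC (SEQ ID\nNO: 84)"
    seq_id, bases, suspicious = clean_listing(demo)
    print(seq_id, len(bases), suspicious)
```

Tokens flagged as suspicious are reported rather than silently dropped, so that any non-nucleotide residue in a listing can be checked against the published PDF.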

Claims

WHAT IS CLAIMED IS:
1. A system to express one or more coding sequences, the system comprising:
(i) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit;
(ii) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; and
(iii) optionally, a third expression construct comprising a polynucleotide sequence encoding a degron; wherein if the third expression construct is not included in the system, the first expression construct, the second expression construct, or both further comprises a polynucleotide sequence encoding a degron, located if within the first expression construct at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, or located if within the second expression construct at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit.
2. The system of claim 1, wherein the system comprises the third expression construct.
3. The system of claim 1, wherein the first and/or the second expression construct further comprises an enhancer sequence and a promoter sequence, and optionally an intron having a polynucleotide sequence of SEQ ID NO: 107.
4. The system of claim 1, wherein the gene encoding the sodium channel alpha subunit is selected from the group consisting of SCN1A, SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, and SCN7A; preferably the sodium channel alpha subunit comprising sodium channel protein type 1 subunit alpha, or the gene encoding the sodium channel alpha subunit comprises SCN1A.
5. The system of claim 1, wherein the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the N-intein - the polynucleotide sequence encoding the degron.
6. The system of claim 1, wherein the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the degron - the polynucleotide sequence encoding the N-intein.
7. The system of claim 1, wherein the second expression construct comprises the polynucleotide sequence encoding the degron, and when expressed, the degron is two or more amino acid residues at the N-terminus relative to a protein product encoded by the second portion of the polynucleotide sequence of the SCN1A.
8. The system of claim 1, wherein the second expression construct comprises the polynucleotide sequence encoding the degron, and the second expression construct comprises from 5’ to 3’: a first portion of the polynucleotide sequence encoding the C-intein - the polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence encoding the C-intein - the second portion of the polynucleotide sequence of the SCN1A, wherein when expressed, a protein product of the first portion of the polynucleotide sequence encoding the C-intein and a protein product of the second portion of the polynucleotide sequence encoding the C-intein together form the C-intein.
9. The system of any one of claims 1-8, wherein when the first and the second expression constructs are expressed, a protein product of the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit and a protein product of the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the sodium channel alpha subunit.
10. The system of any one of claims 1-8, wherein the degron has an amino acid sequence selected from the group consisting of: MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO: 90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO: 92), and GSLIIFIIL (SEQ ID NO:93).
11. The system of claim 5, wherein the degron has an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO:92).
12. The system of claim 7, wherein the degron has an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93).
13. The system of any one of claims 1-8, wherein the sodium channel protein type 1 subunit alpha is sodium channel protein type 1 subunit alpha isoform 2, having amino acid residue numbering according to NCBI accession number NP_001340878.1 and optionally having an amino acid substitution of A1056T, or an isoform variant thereof; and wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-1049 and residues 1050-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2, wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-956 and residues 957-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2, or wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-947 and residues 948-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2.
14. The system of claim 1, wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60); wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62); or wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).
15. The system of any one of claims 1-8, wherein the split intein comprises consensus fast intein (Cfa); the degron is a polypeptide being 5-30 amino-acid residues in length or preferably 9-26 amino-acid residues in length; and a polypeptide product encoded by the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit starts with a cysteine, serine, or threonine residue.
16. The system of any one of claims 1-8, wherein the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58).
17. The system of any one of claims 1-8, wherein the first expression construct and the second expression construct independently further comprise the promoter sequence selected from a minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3, an hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 52, or a CMV promoter having a polynucleotide sequence of SEQ ID NO:53; optionally a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54.
18. The system of claim 3, wherein the enhancer sequence is configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a targeted central nervous system cell type; preferably the targeted central nervous system cell type being GABAergic, glutamatergic, or both cell types.
19. The system of claim 3, wherein the enhancer sequence is set forth in SEQ ID NO: 2 (DLX2.0) or has a concatemerized core having a polynucleotide sequence of SEQ ID NO: 1, or the targeted central nervous system cell type comprises a GABAergic interneuron; or wherein the enhancer sequence is set forth in SEQ ID NO: 55 (eHGT_078h), or the targeted central nervous system cell type comprises a glutamatergic neuron.
20. The system of claim 3, wherein the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a selected central nervous system cell type.
21. The system of claim 20, wherein the miRNA binding site sequence is set forth in SEQ ID NO: 56 (4x2C miRNA binding site) or SEQ ID NO:87 (8x2C miRNA binding site), or the selected central nervous system cell type comprises a pan-GABAergic neuron.
22. An artificial expression construct, comprising the first, the second or the third expression construct of the system of any one of claims 1-21, wherein the artificial expression construct is associated with a capsid that crosses the blood brain barrier.
23. The artificial expression construct of claim 22, wherein the capsid comprises PHP.eB, AAV-BR1, AAV-PHP.S, AAV-PHP.B, or AAV-PPS.
24. An administrable composition, comprising: the artificial expression construct of claim 22; and a pharmaceutically acceptable excipient.
25. A transgenic cell comprising the system of any one of claims 1-21, wherein optionally the transgenic cell comprises a GABAergic neuron or a glutamatergic neuron.
26. A method of rescuing voltage-gated sodium channel function within a targeted population of cells, the method comprising: administering a therapeutically effective amount of the system of any one of claims 1-21 to a sample or subject comprising the targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.
27. The method of claim 26, wherein the subject has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
28. The method of claim 26, wherein the subject is a pediatric patient having Dravet syndrome.
29. A method of administering a system of expression constructs to a subject in need thereof, comprising administering a therapeutically effective amount of the system of any one of claims 1-21 to a sample or subject comprising the targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells, wherein the subject in need thereof has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
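
The residue ranges recited in claim 13 and the start-residue condition in claim 15 lend themselves to a short arithmetic check. The Python sketch below is illustrative only. It assumes 1-based, inclusive residue numbering on a 1998-residue reference, as the claim ranges imply, and every function and constant name in it is hypothetical. It computes the lengths of the two fragments for each of the three split points and tests whether a C-terminal fragment begins with cysteine, serine, or threonine.

```python
# Last residue of the N-terminal fragment for each split in claim 13.
FULL_LENGTH = 1998
SPLIT_POINTS = (1049, 956, 947)

# One-letter codes for Cys, Ser, Thr (the start residues named in claim 15).
ALLOWED_START = {"C", "S", "T"}


def fragment_lengths(last_n_residue, full_length=FULL_LENGTH):
    """Return (N-fragment length, C-fragment length) for a split point."""
    if not 0 < last_n_residue < full_length:
        raise ValueError("split point must fall inside the protein")
    return last_n_residue, full_length - last_n_residue


def split_protein(protein, last_n_residue):
    """Split a one-letter protein string after residue last_n_residue."""
    return protein[:last_n_residue], protein[last_n_residue:]


def c_fragment_start_ok(c_fragment):
    """True if the C-terminal fragment starts with Cys, Ser or Thr."""
    return bool(c_fragment) and c_fragment[0] in ALLOWED_START


if __name__ == "__main__":
    for point in SPLIT_POINTS:
        n_len, c_len = fragment_lengths(point)
        print(f"residues 1-{point}: {n_len} aa; "
              f"residues {point + 1}-{FULL_LENGTH}: {c_len} aa")
```

For the three split points the fragment lengths come out as 1049 and 949, 956 and 1042, and 947 and 1051 residues. To apply `split_protein` and `c_fragment_start_ok` to the actual protein, the reference sequence would have to be obtained separately; it is not reproduced here.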

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363532994P 2023-08-16 2023-08-16
US63/532,994 2023-08-16

Publications (2)

Publication Number Publication Date
WO2025038955A1 (en) 2025-02-20
WO2025038955A9 (en) 2025-04-24

Family

ID=94632728

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/042744 Pending WO2025038955A1 (en) 2023-08-16 2024-08-16 Systems and methods for improving safety of split intein aav mediated gene therapy

Country Status (1)

Country Link
WO (1) WO2025038955A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003072751A2 (en) * 2002-02-25 2003-09-04 Vanderbilt University Expression system for human brain-specific voltage-gated sodium channel, type 1
WO2017191274A2 (en) * 2016-05-04 2017-11-09 Curevac Ag Rna encoding a therapeutic protein
WO2019199867A1 (en) * 2018-04-09 2019-10-17 Allen Institute Rescuing voltage-gated sodium channel function in inhibitory neurons
WO2020051561A1 (en) * 2018-09-07 2020-03-12 Beam Therapeutics Inc. Compositions and methods for delivering a nucleobase editing system
WO2021163492A1 (en) * 2020-02-14 2021-08-19 Ohio State Innovation Foundation Nucleobase editors and methods of use thereof
WO2022103859A1 (en) * 2020-11-10 2022-05-19 Allen Institute Artificial expression constructs for modulating gene expression in chandelier cells
US20230116688A1 (en) * 2020-03-26 2023-04-13 Splicebio, S.L. Split inteins and their uses
WO2024163796A2 (en) * 2023-02-01 2024-08-08 Allen Institute Intein-mediated reconstitution of voltage-gated sodium channel function

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN 19 January 2018 (2018-01-19), XP093283160, Database accession no. PNI86222.1 *
DATABASE PROTEIN 24 April 2002 (2002-04-24), XP093283155, Database accession no. AAK95360.1 *
TAN PENG, HE LIAN, HUANG YUN, ZHOU YUBIN: "Optophysiology: Illuminating cell physiology with optogenetics", PHYSIOLOGICAL REVIEWS, AMERICAN PHYSIOLOGICAL SOCIETY, US, vol. 102, no. 3, 1 July 2022 (2022-07-01), US , pages 1263 - 1325, XP093283202, ISSN: 0031-9333, DOI: 10.1152/physrev.00021.2021 *

Also Published As

Publication number Publication date
WO2025038955A9 (en) 2025-04-24

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
Ref document number: 24855010
Country of ref document: EP
Kind code of ref document: A1