WO2025038955A1

WO2025038955A1 - Systems and methods for improving safety of split intein aav mediated gene therapy

Info

Publication number: WO2025038955A1
Application number: PCT/US2024/042744
Authority: WO
Inventors: Franck KALUME
Original assignee: Seattle Childrens Hospital
Current assignee: Seattle Childrens Hospital
Priority date: 2023-08-16
Filing date: 2024-08-16
Publication date: 2025-02-20
Anticipated expiration: 2026-02-16
Also published as: WO2025038955A9

Abstract

Systems and methods are provided for split intein-mediated, AAV-mediated delivery and expression of large protein molecules with improved safety features. Specifically, degron sequences serving as degradation signals are included at a specific terminus of N- or C-split intein fragments that are each coupled to a portion of the coding sequence of a voltage-gated sodium channel alpha subunit (e.g., alpha subunit 1, Nav1.1), and upon expression and trans-splicing of the split intein fragments, Nav1.1 is reconstituted whereas the trans-spliced intein is digested along with the coupled degron via degron degradation pathways.

Description

SYSTEMS AND METHODS FOR IMPROVING SAFETY OF SPLIT INTEIN AAV MEDIATED GENE THERAPY

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application includes a claim of priority under 35 U.S.C. §119(e) to U.S. provisional patent application No. 63/532,994, filed August 16, 2023, the entirety of which is hereby incorporated by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] This invention was made with Government support under grant no. MH 120095 awarded by the National Institutes of Health. The Government has certain rights in the invention.

REFERENCE TO SEQUENCE LISTING

[0003] This application contains a Sequence Listing submitted as a computer readable form named “067505_000010WOPT_SequenceListing.xml”, having a size in bytes of 198,105 bytes, and created on August 16, 2024. The information contained in this computer readable form is hereby incorporated by reference in its entirety.

FIELD OF DISCLOSURE

[0004] This invention relates to split intein mediated gene expression systems incorporating a degradation signal for removal of gene expression byproducts.

BACKGROUND

[0005] In gene therapy, the relatively small DNA packaging capacity of adeno-associated virus (AAV), which is restricted to the size of the parental genome (approximately 5 kb), prevents the application of AAV vectors to the treatment of diseases that arise due to mutations in genes with larger coding sequences. Split intein-mediated protein trans-splicing (PTS) has been evaluated as a strategy to reconstitute large proteins via AAV vectors, thus overcoming their limited cargo capacity. In this system, a large protein coding sequence is split into two or more parts, each at least tagged on one end (or, for internal parts of the split sequence, flanked) by sequences that encode split-inteins, which are independently cloned in two or more AAV vectors. Split-inteins are expressed as two independent polypeptides (N-intein and C-intein) at the extremities of the host polypeptides (N-polypeptide and C-polypeptide) and remain inactive until encountering their complementary partner. On counterpart association, the reconstituted intein excises itself from the host protein while mediating ligation of the N- and C-polypeptides via a peptide bond, in a traceless manner. However, AAV intein vectors encode components (e.g., excised inteins) of nonmammalian origin that could elicit immune or toxic responses in target cells and/or raise regulatory concerns for clinical translation.

[0006] One exemplary application of AAV intein vectors is the delivery for expression and reconstitution of large protein molecules to supplement or salvage the function of endogenously deficient or mutated protein molecules due to one or more diseases or disorders.

[0007] Epilepsy is a neurological disorder that occurs when the brain presents an enduring predisposition to generate two or more epileptic seizures. An epileptic seizure is a temporary disruption of brain function due to abnormal excessive or synchronous neuronal activity. Its manifestation may include periods of unusual behavior, sensations and sometimes loss of consciousness. Dravet Syndrome (DS) particularly is a rare and catastrophic form of intractable epilepsy that begins in infancy. Initially, the patient experiences prolonged seizures. In their second year, additional types of seizures begin to occur and this typically coincides with a developmental decline. This leads to poor development of language and motor skills. Diseases such as Dravet syndrome, myoclonic seizures, and intractable childhood epilepsy with generalized tonic-clonic seizures often arise from mutations in the SCN1A gene.

[0008] Various sodium channelopathies resulting in seizure disorders and associated comorbidities are associated with reduction or loss of function of the sodium channels in cell membranes. Sodium channels are made up of large alpha subunits that may associate with accessory proteins, such as beta subunits. An alpha subunit forms the core of the channel and is functional on its own. When the alpha subunit protein is expressed by a cell, it is able to form a pore in the cell membrane that conducts Na+ in a voltage-dependent way, even if beta subunits or other known modulating proteins are not expressed. When accessory proteins assemble with α subunits, the resulting complex can display altered voltage dependence and cellular localization.

[0009] Therefore, it is an object of the present invention to provide a trans-splicing system with enhanced safety or reduced immunogenicity by removing byproducts.

[0010] It is another object of the present invention to improve the safety of a trans-splicing system for use in safe delivery of the coding sequence of a sodium channel protein in the treatment of Dravet Syndrome or epilepsy disorders.

[0011] All publications herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.

SUMMARY OF THE DISCLOSURE

[0012] The following embodiments and aspects thereof are described and illustrated in conjunction with compositions and methods which are meant to be exemplary and illustrative, not limiting in scope.

[0013] Systems are provided to express one or more coding sequences. In various embodiments, a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; and (c) a third expression construct comprising a polynucleotide sequence encoding a degron.

[0014] In some embodiments, a system includes: (a) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; (b) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; wherein the first expression construct, the second expression construct, or both further comprise a polynucleotide sequence encoding a degron, located, if within the first expression construct, at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, or located, if within the second expression construct, at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit.

[0015] In other embodiments, a system includes the first expression construct and the second expression construct, either one or both further comprising the polynucleotide sequence encoding the degron, and a third expression construct coding for a same or different degron.

[0016] In some embodiments, a system for expressing one or more coding sequences includes (a) a first expression construct coding for a first fusion protein comprising a first segment of a sodium channel alpha subunit and an N-fragment of a split intein, wherein the first segment of the sodium channel alpha subunit is at the N-terminus relative to the N-fragment of the split intein; (b) a second expression construct coding for a second fusion protein comprising a C-fragment of the split intein and a second segment of the sodium channel alpha subunit, wherein the second segment of the sodium channel alpha subunit is at the C-terminus relative to the C-fragment of the split intein; and (c) a polynucleotide sequence encoding a degron, wherein the polynucleotide sequence encoding the degron is in a third expression construct different from the first or the second expression constructs OR is within the first and/or the second expression constructs. In various aspects, the first fusion protein and the second fusion protein are spliced together, thereby joining the first segment and the second segment of the sodium channel alpha subunit and joining the N-fragment and the C-fragment of the split intein. In various embodiments, the system comprising the first, second, and third expression constructs is co-transduced to a cell for expression of the first fusion protein, the second fusion protein, and the degron in the cell.

[0017] In various embodiments, the split intein is one having a fast splicing rate (e.g., half-lives within 5 min), such as consensus fast intein. In various embodiments, the degron is a polypeptide having no more than 30 amino-acid residues in length. In various embodiments, the split intein in the system is consensus fast intein, and the degron in the system is one having 9-26 or no more than 30 amino-acid residues in length. Without wishing to be bound by a particular theory, the splicing that joins the first and second segments of the sodium channel alpha subunit takes place faster than degron-mediated degradation, resulting in the joined segments (preferably a full sodium channel alpha subunit) locating in a cell membrane, whereas the joined intein is degraded via the degron.

[0018] In some embodiments, the first and/or the second expression construct further comprises an enhancer sequence, configured for targeted expression within a targeted cell type. In various embodiments, the expression construct(s) further comprises a promoter sequence. In some embodiments, the first and/or the second expression construct further comprises an intron having a polynucleotide sequence of SEQ ID NO: 107. In some embodiments, the first and/or the second expression construct further comprises the enhancer sequence, the promoter, and the intron.

[0019] In some embodiments, the gene encoding the sodium channel alpha subunit is selected from the group consisting of SCN1A, SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, and SCN7A. In some embodiments, the sodium channel alpha subunit comprises sodium channel protein type 1 subunit alpha, or the gene encoding the sodium channel alpha subunit comprises SCN1A. In some embodiments, the sodium channel alpha subunit comprises human sodium channel protein type 1 subunit alpha isoform 2. In some embodiments, the sodium channel alpha subunit comprises a variant of human sodium channel protein type 1 subunit alpha isoform 2 having an amino acid substitution of A1056T, wherein amino acid residue numbering is according to NCBI accession number NP_001340878.1.

[0020] In some embodiments, the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’ : the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the N-intein - the polynucleotide sequence encoding the degron. For example, the degron has an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO: 92).

[0021] In some embodiments, the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the degron - the polynucleotide sequence encoding the N-intein.

[0022] In some embodiments, the second expression construct comprises the polynucleotide sequence encoding the degron, and when expressed, the degron is two or more amino acid residues at the N-terminus relative to a protein product encoded by the second portion of the polynucleotide sequence of the SCN1A. For example, the degron has an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93).

[0023] In some embodiments, the second expression construct comprises the polynucleotide sequence encoding the degron, and the second expression construct comprises from 5’ to 3’: a first portion of the polynucleotide sequence encoding the C-intein - the polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence encoding the C-intein - the second portion of the polynucleotide sequence of the SCN1A, wherein when expressed, a protein product of the first portion of the polynucleotide sequence encoding the C-intein and a protein product of the second portion of the polynucleotide sequence encoding the C-intein together form the C-intein.

[0024] In various aspects, when the first and the second expression constructs are expressed, a protein product of the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit and a protein product of the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the sodium channel alpha subunit.
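The 5’ to 3’ element orders recited in paragraphs [0020], [0021], and [0023] can be summarised as a small data structure. The following is a minimal illustrative sketch, not part of the disclosed constructs: the class name, element labels, and helper function are hypothetical, and only the element order is taken from the text. The helper checks the placement rule of paragraph [0014], namely that the degron lies 3’ of the first SCN1A portion or 5’ of the second SCN1A portion.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConstructLayout:
    """Ordered coding elements of one expression construct (labels are hypothetical)."""
    name: str
    elements_5_to_3: Tuple[str, ...]


# Element orders as recited in the text; regulatory elements are omitted.
LAYOUTS = (
    ConstructLayout(
        "first construct, degron after N-intein ([0020])",
        ("SCN1A first portion", "N-intein", "degron"),
    ),
    ConstructLayout(
        "first construct, degron before N-intein ([0021])",
        ("SCN1A first portion", "degron", "N-intein"),
    ),
    ConstructLayout(
        "second construct, degron within C-intein ([0023])",
        ("C-intein part 1", "degron", "C-intein part 2", "SCN1A second portion"),
    ),
)


def degron_on_intein_side(layout: ConstructLayout) -> bool:
    """True if the degron is 3' of the first SCN1A portion or 5' of the second."""
    elements = layout.elements_5_to_3
    degron_index = elements.index("degron")
    if "SCN1A first portion" in elements:
        return degron_index > elements.index("SCN1A first portion")
    return degron_index < elements.index("SCN1A second portion")


for layout in LAYOUTS:
    print(layout.name, "->", degron_on_intein_side(layout))
```

The layout of paragraph [0022] is left out because the text states only that the degron is N-terminal to the product of the second SCN1A portion, without fixing its order relative to the C-intein.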

[0025] In various aspects, the degron has an amino acid sequence selected from the group consisting of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO: 92), and GSLIIFIIL (SEQ ID NO:93).
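As an illustrative check only, the lengths of the five degron sequences listed above can be compared against the length limits stated in paragraph [0017] (no more than 30 residues, or 9-26 residues). The function name and report format below are hypothetical; the sequences and SEQ ID numbers are copied from the text.

```python
# Degron sequences as listed in paragraph [0025].
DEGRONS = {
    "SEQ ID NO:88": "MSCAQESITSLYKKAGSENLYFQ",
    "SEQ ID NO:90": "MSCAQES",
    "SEQ ID NO:91": "RPGSTSPFAPSATDLPSMPEPALTSR",
    "SEQ ID NO:92": "ACKNWFSSLSHFVIHL",
    "SEQ ID NO:93": "GSLIIFIIL",
}


def length_report(degrons, narrow_range=(9, 26), maximum=30):
    """Return, per sequence, its length and whether it meets each stated limit."""
    low, high = narrow_range
    return {
        seq_id: {
            "length": len(seq),
            "within_maximum": len(seq) <= maximum,
            "within_narrow_range": low <= len(seq) <= high,
        }
        for seq_id, seq in degrons.items()
    }


for seq_id, row in length_report(DEGRONS).items():
    print(seq_id, row)
```

The report simply flags any listed sequence that falls outside the narrower 9-26 residue range while still satisfying the 30-residue maximum.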

[0026] In various embodiments, the breakpoint of the first and second segments of the sodium channel alpha subunit is at a place wherein the first residue of the C-terminus segment (fragment) is Cys, Ser, or Thr.
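The residue rule above can be illustrated with a short scan over a protein sequence. This is a sketch under stated assumptions: the function name is hypothetical, and the example sequence is a made-up toy string, not a sodium channel sequence.

```python
ALLOWED_FIRST_RESIDUES = frozenset("CST")  # Cys, Ser, Thr (one-letter codes)


def candidate_breakpoints(protein: str) -> list:
    """Return 1-based positions where a C-terminal segment could start.

    A position p is reported when residue p is Cys, Ser, or Thr, so that the
    N-terminal segment would be residues 1..p-1 and the C-terminal segment p..end.
    Position 1 is skipped because it would leave an empty N-terminal segment.
    """
    return [
        position
        for position, residue in enumerate(protein, start=1)
        if position > 1 and residue in ALLOWED_FIRST_RESIDUES
    ]


# Toy sequence for illustration only (hypothetical, not from the disclosure).
toy_sequence = "MAGCKLSTPE"
print(candidate_breakpoints(toy_sequence))
```

In practice such a scan would only narrow the choice; other considerations recited in this disclosure, such as roughly even segment sizes, would further limit the candidates.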

[0027] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-1049 and residues 1050-1998 of sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.

[0028] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-956 and residues 957-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.

[0029] In some embodiments, the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-947 and residues 948-1998 of the sodium channel protein type 1 subunit alpha isoform 2 (or variants containing A1056T substitution), respectively.

[0030] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60).

[0031] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62).

[0032] In some embodiments, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).

[0033] In various embodiments, the split intein comprises consensus fast intein (Cfa); the degron is a polypeptide being 5-30 amino-acid residues in length or preferably 9-26 amino-acid residues in length; and a polypeptide product encoded by the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit starts with a cysteine, serine, or threonine residue. Preferably the first segment and the second segment of the sodium channel alpha subunits are even or about even sized; or the lengths of segments are not more than 20%, 30%, 40%, or 50% different.
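The arithmetic behind the three split points recited in paragraphs [0027] to [0029] can be sketched as follows. This is an illustration under stated assumptions: the size difference is measured relative to the longer segment (the text does not define how the percentage is calculated), coding length is approximated as three nucleotides per residue, and the intein, degron, promoter, enhancer, and other vector elements are not counted. The 5 kb figure is the approximate AAV packaging capacity mentioned in paragraph [0005]; all names are hypothetical.

```python
FULL_LENGTH_RESIDUES = 1998        # from the residue ranges in [0027]-[0029]
N_SEGMENT_LENGTHS = (1049, 956, 947)
APPROX_AAV_CAPACITY_NT = 5000      # "approximately 5 kb" per [0005]


def describe_split(n_residues: int, total: int = FULL_LENGTH_RESIDUES) -> dict:
    """Summarise segment sizes for a split after residue `n_residues`."""
    c_residues = total - n_residues
    longer = max(n_residues, c_residues)
    return {
        "n_residues": n_residues,
        "c_residues": c_residues,
        # Coding nucleotides only (3 per residue); stop codon and all
        # non-SCN1A elements are deliberately ignored in this sketch.
        "n_coding_nt": 3 * n_residues,
        "c_coding_nt": 3 * c_residues,
        # Assumed definition: difference relative to the longer segment.
        "relative_difference": abs(n_residues - c_residues) / longer,
    }


print("full-length coding nt:", 3 * FULL_LENGTH_RESIDUES)
for n_len in N_SEGMENT_LENGTHS:
    print(describe_split(n_len))
```

Under these assumptions the full-length coding sequence alone exceeds the approximate capacity, while each half leaves room for additional elements; how much room is actually needed depends on the elements chosen for a given construct.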

[0034] In some embodiments, the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58).

[0035] In some embodiments, the first expression construct and the second expression construct independently further comprise the promoter sequence selected from a minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3, an hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 52, or a CMV promoter having a polynucleotide sequence of SEQ ID NO:53; optionally a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54. In some embodiments, the first expression construct and the second expression construct independently further comprise the minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3.

[0036] In some embodiments, the enhancer sequence is configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a targeted central nervous system cell type. In some embodiments, the targeted central nervous system cell type is GABAergic neuron, glutamatergic neuron, or both cell types. In some embodiments, the targeted central nervous system cell type is GABAergic interneuron.

[0037] In some embodiments, the enhancer sequence is set forth in SEQ ID NO: 2 (DLX2.0). In some embodiments, the enhancer sequence has a concatemerized core having a polynucleotide sequence of SEQ ID NO: 1. In some embodiments, the enhancer sequence is a concatemerized repeat (2, 3, 4, 5, 6, 7, 8, 9, 10, or more contiguous repeats) of a polynucleotide sequence of SEQ ID NO: 1. In some embodiments, an enhancer sequence is set forth in SEQ ID NO: 55 (eHGT_078h), or the targeted central nervous system cell type comprises a glutamatergic neuron.

[0038] In some embodiments, the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a selected central nervous system cell type. In some embodiments, the miRNA binding site sequence is set forth in SEQ ID NO: 56 (4x2C miRNA binding site) or SEQ ID NO:87 (8x2C miRNA binding site), or the selected central nervous system cell type comprises a pan-GABAergic neuron.

[0039] An artificial expression construct is also provided, which includes the first, the second and/or the third expression construct of a system disclosed herein, wherein each expression construct is associated with a capsid that crosses the blood brain barrier. In some embodiments, a capsid that crosses the blood brain barrier comprises PHP.eB. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-BR1. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PHP.S. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PHP.B. In some embodiments, a capsid that crosses the blood brain barrier comprises AAV-PPS.

[0040] Also provided is an administrable composition, which includes one or more artificial expression constructs disclosed herein, preferably in association with a capsid that crosses the blood brain barrier; and a pharmaceutically acceptable excipient.

[0041] A transgenic cell is also provided comprising a system of one or more expression constructs disclosed herein. In some embodiments, the transgenic cell comprises a GABAergic neuron, or more specifically a GABAergic interneuron. In some embodiments, the transgenic cell comprises a glutamatergic neuron.

[0042] Methods are also provided for rescuing voltage-gated sodium channel function within a targeted population of cells, the method comprising: co-administering a therapeutically effective amount of the two or more expression constructs of a system disclosed herein to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.

[0043] Methods are also provided for administering a system of expression constructs to a subject in need thereof, the method comprising administering a therapeutically effective amount of a system disclosed herein, or co-administering two or more expression constructs of a system disclosed herein, to a sample or subject comprising the targeted population of cells, and inducing expression of the expression constructs to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.

[0044] In some embodiments, the subject has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome. In some embodiments, the subject is a pediatric patient having Dravet syndrome.

[0045] Other features and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, various features of embodiments of the invention.

BRIEF DESCRIPTION OF THE FIGURES

[0046] Exemplary embodiments are illustrated in referenced figures. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than restrictive.

[0047] Figures 1A-1G: A functional split-intein design of SCN1A that reconstitutes functional Nav1.1 activity. (FIG. 1A) Design of split-intein fusion protein halves of SCN1A. We inserted the breakpoint and Cfa-N and Cfa-C split-intein peptides just before the native Cys1050 with no additional amino acids. After joining, the Cfa-N and Cfa-C intein fragments self-excise and yield scarless reconstituted SCN1A. HA and FLAG epitopes are inserted at the N- and C-termini of the N- and C-terminal halves for detection. (FIG. 1B) Cloning SCN1A split-intein fusion protein halves into CMV-driven plasmid vectors for testing functionality in cell lines, as well as IRES2-SYFP2 and IRES2-mScarlet transfection reporters. (FIG. 1C) Reconstitution of full-length SCN1A after co-transfection of split-intein fusion protein halves into HEK-293 cells. We analyzed whole cell protein preparations by western blot for HA epitope tag after transfection. Lanes: 1) empty vector, 2) full-length HA-SCN1A, 3) full-length SCN1A-FLAG, 4) HA-SCN1A-Ntm, 5) SCN1A-FLAG-Ctm, and 6) HA-SCN1A-Ntm plus SCN1A-FLAG-Ctm. Expected sizes: full-length HA-SCN1A 232 kDa, HA-SCN1A-Ntm 134 kDa, reconstituted full-length HA-SCN1A-FLAG after joining 233 kDa. Anti-tubulin is a loading control. (FIG. 1D) Exemplary currents evoked in HEK293 cells transfected with full-length SCN1A (SCN1A-FL, green), a combination of SCN1A-Ntm and SCN1A-Ctm (SCN1A-N+C, orange), SCN1A-Ntm only (SCN1A-N, blue) and SCN1A-Ctm only (SCN1A-C, blue), in response to a family of step depolarizations from a holding potential of -120 mV to 40 mV, in 5 mV increments. Co-transfection with SCN1A-Ntm and SCN1A-Ctm produced functional Nav1.1 currents comparable in size to full-length SCN1A. Scale bar: 1 nA, 10 msec. (FIG. 1E) Peak current densities, normalized to capacitance, for full-length SCN1A (FL, n= 39), SCN1A-Ntm + SCN1A-Ctm (N+C, n= 28), SCN1A-Ntm (N, n= 15), SCN1A-Ctm (C, n= 11), GFP/empty vector (GFP, n= 11) and SCN1B/2B (β1/2, n= 13). Medians displayed with data points. FL and N+C datasets are significantly different (**, p= 0.0083). N+C is highly significantly different compared to N or C (****, p< 0.0001). N and C are not significantly different from GFP and β1/2 controls (p> 0.05, ns). Statistical comparisons performed by pairwise Mann-Whitney U tests. All HEK293 transfections performed in the background of a separate SCN1B/2B expression plasmid, at a ratio of 10:1. (FIG. 1F) Voltage-dependent gating properties of WT full-length human SCN1A (SCN1A-FL) and reconstituted SCN1A channels formed by co-expression of N-terminal SCN1A-Cfa intein (SCN1A-N) and Cfa intein-C-terminal SCN1A (SCN1A-C) plasmid constructs, acutely expressed in HEK293 cells and characterized by whole-cell patch-clamp recordings. Currents activated by a family of step depolarizations from SCN1A-FL (green) and reconstituted SCN1A-N+C (orange), from a holding potential of -120 mV (left panels). Steady-state inactivation current traces evoked by a step to -20 mV, after a family of 1 sec preconditioning steps from -120 mV to 30 mV (right panels). Scale bars for activation currents (1F, left panels) equal 1.0 nA, and for SSI currents (1F, right panels) equal 0.5 nA; time equals 1.0 msec for all panels. (FIG. 1G) Conductance-voltage (G/V) and steady-state inactivation (SSI) plots for SCN1A-FL (green) and reconstituted SCN1A-N+C (orange) channels, fitted to single Boltzmann functions. G/V plots for both constructs are indistinguishable, whereas the SSI plot for SCN1A-N+C is shifted by +5 mV relative to SCN1A-FL. Statistical significance derived from unpaired pairwise t-tests, assuming equal variance (* p< 0.05; ** p< 0.01).
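For readers unfamiliar with the single Boltzmann functions mentioned for FIG. 1G, the following minimal sketch shows the commonly used forms for a conductance-voltage curve and a steady-state inactivation curve. The half-maximal voltages and slope factors are hypothetical placeholders, not fitted values from this disclosure; only the +5 mV shift of the SSI curve is taken from the text, and the voltage grid from -120 mV to 40 mV in 5 mV steps is used here purely as an evaluation grid.

```python
import math


def boltzmann_activation(v_mv: float, v_half_mv: float, slope_mv: float) -> float:
    """Normalised conductance G/Gmax, increasing with depolarisation."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


def boltzmann_inactivation(v_mv: float, v_half_mv: float, slope_mv: float) -> float:
    """Normalised availability I/Imax, decreasing with depolarisation."""
    return 1.0 / (1.0 + math.exp((v_mv - v_half_mv) / slope_mv))


# Hypothetical parameters for illustration only.
V_HALF_ACT_MV = -25.0
V_HALF_SSI_FL_MV = -60.0
SLOPE_MV = 6.0
SSI_SHIFT_MV = 5.0  # shift of N+C relative to FL, as stated for FIG. 1G

voltages_mv = list(range(-120, 45, 5))
activation = [boltzmann_activation(v, V_HALF_ACT_MV, SLOPE_MV) for v in voltages_mv]
ssi_fl = [boltzmann_inactivation(v, V_HALF_SSI_FL_MV, SLOPE_MV) for v in voltages_mv]
ssi_nc = [
    boltzmann_inactivation(v, V_HALF_SSI_FL_MV + SSI_SHIFT_MV, SLOPE_MV)
    for v in voltages_mv
]

for v, g, a_fl, a_nc in zip(voltages_mv, activation, ssi_fl, ssi_nc):
    print(f"{v:5d} mV  G/Gmax={g:.3f}  SSI(FL)={a_fl:.3f}  SSI(N+C)={a_nc:.3f}")
```

A positive shift of the half-inactivation voltage moves the whole SSI curve to the right, so at any given preconditioning voltage a slightly larger fraction of channels remains available.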

[0048] Figures 2A-2F: Cell class-specific delivery of SCN1A to telencephalic GABAergic interneurons using optimized enhancer DLX2.0. (FIG. 2A) Recombinant AAV2/PHP.eB vectors for delivery of DLX2.0-split-intein-SCN1A. (FIG. 2B) Efficient SCN1A reconstitution in mouse brain with DLX2.0-split-intein-SCN1A vectors. In panels 2B-2F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector. At P24 we analyzed mouse brain membrane protein fractions by western blotting with antibodies targeting HA or FLAG epitope tags, the C-terminus of NaV1.1, and alpha-tubulin as a loading control. Note the PBS-injected negative control is the same lane as that shown in Fig. 6B as these experiments were performed together. (FIG. 2C) Specific detection of HA- and FLAG-expressing GABA+ cells in cortex. HA- and FLAG-expressing cells can be observed when either half is delivered alone or together. Layer 2/3 of VISp is shown at P20. (FIG. 2D) Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in scattered telencephalic neurons. Expression is pseudo-colored black. (FIG. 2E) Representative stitched fluorescence image of HA and FLAG epitopes, and Gad67+ neurons in VISp. In panels D-E we show expression at P47. (FIG. 2F) High specificity and completeness of expression within Gad67+ neurons in multiple telencephalic regions across multiple animals. We counted cells that express both HA and FLAG epitopes in visual VISp, MO, and HPF. Layer 1 was excluded from VISp and MO analysis due to DLX2.0-PHP.eB vectors inefficiently targeting that layer. Each point represents one mouse, bars represent the means, and error bars represent standard error of the mean. Mice span ages P47-P139, mean age P85. Abbreviations: CTX cerebral cortex, OLF olfactory areas, HPF hippocampal formation, OT olfactory tubercle, STR striatum, VISp primary visual cortex, MO motor cortex.

[0049] Figures 3A-3E: Recovery of mortality and epileptic symptoms in DS model mice with DLX2.0-split intein-SCN1A. (FIG. 3A) Experimental timeline to rescue of epileptic symptoms in DS model mice. We used Scn1afl/+;Meox2-Cre animals on a pure C57BL/6 background to model DS45. We injected P0-P3 pups with empty control, single-part alone control, or telencephalic GABAergic interneuron-targeting dual DLX2.0 N+C SCN1A AAVs (3e10 gc each vector per animal), and monitored mortality until P70. Some animals were tested for seizure susceptibility by thermal challenge between P25-P35. (FIG. 3B) Mortality protection in DS model mice after treatment with DLX2.0-SCN1A AAVs. Mice treated with DLX2.0 N+C SCN1A AAVs (n= 27) exhibit significantly greater survival than untreated mice (n= 68, *** Log-rank test p= 1.4e-5) or mice treated with empty or single-part vector controls (n= 40, *** Log-rank test p= 6.6e-4). Note the untreated control groups in Fig. 3B-3E represent the same set of untreated animals as that shown in Fig. 7B, 7D-7F. (FIG. 3C-3E) Protection from heat-induced seizures in DS model mice after treatment with DLX2.0-SCN1A AAVs. (FIG. 3C) DS model mice treated with DLX2.0 N+C SCN1A AAVs (n= 16) are significantly less likely to exhibit MC seizures by 42°C than untreated mice (n= 11, * Fisher’s exact test p= 0.042) or mice treated with empty or single-part vector controls (n= 27, * Fisher’s exact test p= 0.011). (FIG. 3D) DS model mice treated with DLX2.0 N+C SCN1A AAVs are significantly less likely to exhibit GTC seizures by 42°C than untreated mice (** Fisher’s exact test p= 0.0025) or mice treated with empty or single-part vector controls (** Fisher’s exact test p= 0.00010). (FIG. 3E) DS model mice treated with DLX2.0 N+C SCN1A AAVs exhibit significantly fewer MC events during thermal challenge assay than untreated mice or mice treated with empty or single-part vector controls (** unpaired t-test p< 0.001 at each indicated timepoint, for comparison to Untreated and Empty/single part negative control animals).
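Several of the group comparisons in these figure descriptions use Fisher's exact test on counts of animals with and without a given seizure type. The following is a minimal sketch of how a two-sided p-value can be computed for a 2x2 table using only the standard library. The function name is hypothetical, the counts in the usage line are made-up placeholders rather than data from this disclosure, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention that may differ from the software used by the inventors.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows could be, e.g., treated and untreated groups; columns, animals
    with and without seizures.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as unlikely as the observed one; the small
    # tolerance guards against floating-point ties being missed.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts for illustration only: 2 of 12 vs 9 of 12 animals affected.
print(fisher_exact_two_sided(2, 10, 9, 3))
```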

[0050] Figures 4A-4F: Recovery of spontaneous epileptic symptoms in DS model mice with DLX2.0-SCN1A AAVs. (FIG. 4A-4B) Interictal spike reduction in DS model mice. We implanted Scn1afl/+; Meox2-Cre DS model mice with ECoG/EMG electrodes, which revealed frequent interictal spikes generalized across the brain in both left (L) and right (R) channels as in our previous work45,50. (FIG. 4A) Example interictal spikes are shown in untreated mice. (FIG. 4B) Counting interictal spikes in untreated and treated (3e10 gc each DLX2.0 N+C SCN1A AAV vector delivered BL-ICV at P0-P3) mice reveals a significantly decreased frequency of interictal spikes with treatment. n= 10 animals per condition, each circle represents one animal, bars and error bars represent means and standard error of the means. *** p = 0.0001 by two-tailed unpaired t-test. (FIG. 4C-4D) Spontaneous MC event prevention in DS model mice. (FIG. 4C) Example MC event observed in untreated Scn1afl/+; Meox2-Cre DS model mice, which have sharp spikes followed closely by EMG signals. (FIG. 4D) We categorized mice as having MC events or not having MC events during the recording period, which revealed a highly significant prevention of MCs in the treated animals. *** p = 7.1e-4 by Fisher’s exact test. (FIG. 4E-4F) Spontaneous GTCs in DS model mice. (FIG. 4E) Example spontaneous GTC seizure observed in an untreated Scn1afl/+; Meox2-Cre DS model mouse. (FIG. 4F) We categorized mice as having or not having GTC seizures during the recording period, which revealed treated animals did not exhibit GTC seizures although this effect was not significant (ns, p = 0.47 by Fisher’s exact test).

[0051] Figures 5A-5D: Recovery of severe epileptic phenotypes in mice lacking SCN1A in telencephalic GABAergic interneurons using DLX2.0-SCN1A AAVs. (FIG. 5A) Genetic cross resulting in 100% Scn1a+/fl; Dlx5/6-Cre animals. (FIG. 5B) Survival curves for treated and untreated Scn1a+/fl; Dlx5/6-Cre animals. Untreated pups exhibit 100% mortality by the 6th week of life (n= 31/31). In contrast, pups injected with DLX2.0-split-intein-SCN1A (3e10 gc each vector at P2 by BL-ICV) exhibit 100% survival through the 10^th week of life (n= 9/9). *** p= 3.3e-14 by Fisher’s exact test. (FIG. 5C-5D) Protection from seizures in treated Scn1a+/fl; Dlx5/6-Cre animals during thermal challenge assay. (FIG. 5C) Significant protection from thermally induced MC seizures (*** p= 7.1e-4, Fisher’s exact test). (FIG. 5D) Significant protection from thermally induced GTC seizures (*** p= 1.1e-5, Fisher’s exact test).

[0052] Figures 6A-6F: Nonselective delivery of SCN1A to neurons using hSyn1 promoter. (FIG. 6A) Recombinant AAV2/PHP.eB vectors for delivery of hSyn1-split-intein-SCN1A. (FIG. 6B) Efficient SCN1A reconstitution in mouse brain with hSyn1-split-intein-SCN1A vectors. In panels 6B-6F we injected P2 neonatal mice BL-ICV with 3e10 gc of each indicated vector (6e10 total gc in the N+C animals). At P24 we analyzed mouse brain membrane protein fractions by western blotting with antibodies targeting HA or FLAG epitope tags, the C-terminus of Nav1.1, and alpha-tubulin as a loading control. Note the PBS-injected negative control lane is the same lane as that shown in Fig. 2B as these experiments were performed together. (FIG. 6C) HA- and FLAG-expressing NeuN+ and Gad67+ cells in cortex after co-injection with N- and C-terminal vectors. White arrows indicate NeuN+ and Gad67+ cells that express FLAG but not HA. Cyan arrows indicate NeuN+ and Gad67+ cells that express both FLAG and HA. Layer 2/3 of VISp is shown at P84. (FIG. 6D) Representative stitched fluorescence image of biodistribution of HA and FLAG epitopes in neurons after co-injection with N- and C-terminal vectors. Expression is pseudo-colored black. Expression shown at P84. (FIG. 6E) Representative stitched fluorescence image of HA and FLAG epitopes and NeuN+ neurons throughout the layers of VISp. Expression shown at P84. (FIG. 6F) Quantification of specificity and completeness of expression within Gad67+ and NeuN+ neurons in multiple telencephalic regions. We counted cells that express both HA and FLAG epitopes in VISp, MO, and HPF (layer 1 was excluded from VISp and MO analysis due to hSyn1-PHP.eB vectors inefficiently targeting that layer). Each point represents one mouse, bars represent the means, and error bars represent standard error of the mean. As expected, hSyn1-driven expression shows specificity for NeuN+ cells but not Gad67+ cells. Mice span ages P76-P86, mean age P81. Abbreviations: CTX cerebral cortex, OLF olfactory areas, HPF hippocampal formation, STR striatum, MB midbrain, HB hindbrain, CBX cerebellar cortex, VISp primary visual cortex, MO motor cortex.

[0053] Figures 7A-7F: Early pre-weaning toxicity and weak protection from epileptic symptoms with nonselective SCN1A vectors. (FIG. 7A) Experimental timeline for rescue of epileptic symptoms in DS model mice with nonselective hSyn1 promoter-driven SCN1A vectors. (FIG. 7B-7C) Preweaning mortality in DS model mice after treatment with nonselective SCN1A AAVs. (FIG. 7B) Mice treated with hSyn1 N+C SCN1A AAVs at high dose (3e10 gc each vector, n= 18) or low dose (1e10 gc each vector, n= 36) exhibit significantly greater preweaning mortality by P21 than untreated mice (n= 68) and empty/single part negative control mice (n= 33). *** p < .001 at P21 timepoint versus untreated by Fisher’s exact test. Low dose versus untreated p= 3.6e-4; low dose versus empty/single p= 0.014; high dose versus untreated p= 6.6e-6; high dose versus empty/single p= 4.9e-4. We did not observe significant effects on survival in the post-weaning period. Note the untreated control groups in Fig. 7B, 7D-7F represent the same set of untreated animals as that shown in Fig. 3B-3E. (FIG. 7C) From analysis of recovered genotypes at P21 we inferred that both DS and littermate control animals were similarly affected by nonselective SCN1A AAV lethality. FD: found dead. (FIG. 7D-7F) Protection from heat-induced seizures in DS model mice after treatment with high-dose nonselective SCN1A AAVs. (FIG. 7D) DS model mice treated with high-dose hSyn1 N+C SCN1A AAVs (n= 5) are significantly less likely to exhibit MC seizures by 42°C than Empty/single part mice (n= 18, * Log-rank test p= 0.020). (FIG. 7E) DS model mice treated with high-dose hSyn1 N+C SCN1A AAVs exhibit a trend towards less GTC seizure likelihood by 42°C than empty/single part mice (* Log-rank test p= 0.047). (FIG. 7F) Mice treated with high-dose hSyn1 N+C SCN1A AAVs exhibit significantly fewer MC events during thermal challenge assay than mice treated with empty or single-part vector controls (* p < 0.05, unpaired t-test at each indicated temperature for High-dose versus Untreated or Empty/single animals). In D-F, we did not observe a protective effect against seizures from low-dose hSyn1 N+C AAVs (n= 12).

[0054] Figures 8A-8E: Isoform usage and allele prevalence of human SCN1A. (FIG. 8A-8B) SCN1A isoform usage across cortical cell type subclasses in mice (8A) and humans (8B). Mouse VISp cortical cell type-specific RNA-seq profiles are from Tasic, et al., Nature 563, 72 (2018) and human middle temporal gyrus (MTG) cortical cell type-specific profiles are from Hodge, et al., Nature 573, 61-68 (2019). Genome-aligned reads are aggregated according to cell type subclasses, and visualized as pileups on UCSC genome browser alongside the positions of the exon whose splice donor usage determines whether the 2009-, 1998-, or 1981-amino acid isoform of SCN1A is expressed. Regions shown: mm10 chr2:66324527-66325003, hg38 chr2:166043623-166044099 (reverse complement reference sequences for legibility). Full vertical scale represents 0.65 (mouse) or 0.4 (human) read counts per million. (FIG. 8C) Alignment of mammalian SCN1A protein sequences. The alanine residue at position 1056 in the NCBI RefSeq human SCN1A sequence is orthologous to a conserved threonine residue in most other mammalian species. Additionally, Origene commercial clones of human SCN1A contain a threonine residue at this position, but agree at all other positions. Sequences used for alignment: human NP_001340878.1, gorilla XP_055236229.1, chimpanzee XP_054535897.1, macaque XP_001101023.1, mouse lemur XP_020143996.1, domestic ferret XP_004744052.1, rat NP_110502.2, mouse NP_001300926.1, human Origene clone catalog number RG220167. Amino acid position numbering follows the human 1998-amino acid isoform (NP_001340878.1). (FIG. 8D) SCN1A sequences in donated human brain samples. DNA sequences represent unique (non-PCR duplicate) reads from assay for transposase-accessible chromatin with sequencing (ATAC-seq) from three independent human patient brain samples (H17.26.001, H17.26.003, H18.03.001). (FIG. 8E) Population allele frequencies in human SCN1A via gnomAD database. gnomAD v3 and v2 populations represent partially overlapping healthy patient populations subject to genome and/or exome sequencing. The threonine-encoding allele is represented by a T on the + strand (corresponding to A on the - strand) in 73-74% of the population.

[0055] Figures 9A-9C: Cell class specificity observed with an independent marker of telencephalic GABAergic interneurons. Cell type-specific expression of SCN1A in Dlx5/6-Cre; Ai14 reporter mice. We injected these mice at P2 BL-ICV with 3e10 gc each DLX2.0-SCN1A AAV vector produced at PackGene and analyzed expression at P30-P35 with both sagittal and coronal sections (n = 4 mice analyzed). We analyzed expression specifically in somatosensory cortex (SS) and motor cortex (MC), and counted absolute numbers of expressing cells in somatosensory cortex (9A), hippocampus (9B), and motor cortex (9C).

[0056] Figures 10A-10E: Biodistribution and rescue of epileptic symptoms from independently produced batches of DLX2.0-split intein SCN1A vectors. Specific expression of HA-tagged N-terminal and FLAG-tagged C-terminal SCN1A half-channels in Gad67+ neurons with independent packaging of DLX2.0-SCN1A AAVs. Animals were dosed at low dose (1e10 gc each vector) or high dose (3e10 gc each vector) of DLX2.0-SCN1A AAVs by BL-ICV at P0-P3. (FIG. 10A) Telencephalic GABAergic interneuron specificity is maintained while completeness increases with greater dose of AAV. (FIG. 10B-10E) Protection from mortality and heat-induced seizures in Scn1afl/+; Meox2-Cre mice with independently packaged DLX2.0-SCN1A AAV vectors. (FIG. 10C) Trend towards protection from mortality with high dose of DLX2.0-SCN1A AAV vectors (p = 0.06 by Log-rank test, High dose [n = 10] versus Untreated [n = 68]). (FIG. 10D-10E) Dose-dependent protection from heat-induced MC (10D) and GTC (10E) seizures. * p < 0.05 for comparisons of High dose (n= 10) versus Untreated (n= 11) or Empty/single vector (n= 23) negative controls by Log-rank test. We did not observe significant seizure protection with low doses of DLX2.0-SCN1A AAVs.

[0057] Figures 11A-11E: Rescue of mortality and epileptic symptoms in a second independent mouse model of DS. (FIG. 11A) Testing DLX2.0-split-intein-SCN1A in an independent mouse model of DS. We generated Scn1a+/R613X mice on a mixed F1 C57Bl/6:129Sv background, injected neonates BL-ICV with DLX2.0-split-intein-SCN1A (3e10 gc each vector), monitored mortality over the first 70 days of life, and then performed ECoG seizure monitoring on rescued animals following mortality monitoring. (FIG. 11B) Mortality monitoring after DLX2.0-split-intein-SCN1A administration. N+C DLX2.0-SCN1A provides highly significant complete rescue from mortality to beyond 365 days. *** p < .001 by Mantel-Cox test (chi-square= 77.69, df= 7) after correction for multiple planned comparisons (3 comparisons of injection materials against untreated Scn1a+/R613X mice). All other within-genotype comparisons are not significant (p> 0.05) after accounting for multiple comparisons. (FIG. 11C) Seizure monitoring and ECoG in DS model mice. After P70, we implanted headmounts and monitored animals for seizures and epileptic activity by paired ECoG with channels in somatosensory (SS) and parietal (P) cortex, EMG, and video monitoring. Example #1: untreated Scn1a+/R613X mouse displays several non-uniformly distributed GTC seizures over 228 hours of recording (one GTC seizure shown), as well as interictal spikes marked by red dots (zoom in on one example spike). Example #2: N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse displays no GTCs and few spikes. Example #3: N+C DLX2.0-SCN1A-injected Scn1a+/R613X mouse shows no GTCs but frequent spikes with aberrant spike morphology, likely an outlier. (FIG. 11D) Protection from GTCs with treatment in DS model mice. We manually quantified GTC events over the full recording session for each animal, and displayed results as events per 24 hour recording period, which revealed a significant seizure reduction with administration of N+C DLX2.0-SCN1A. * Mann-Whitney U test p= 0.020; data are non-normally distributed according to Shapiro-Wilk test (untreated p= 0.018; N+C DLX2.0-SCN1A p= 1.0e-6). We also observe that all untreated Scn1a+/R613X mice with zero GTCs during the recording session eventually survive beyond P150. Marked mice (#1, #2, #3) correspond to the example mice displayed in panel 11C. (FIG. 11E) Fewer spikes with treatment in DS model mice. We identified spikes over the final day of the recording session for each animal using the line length threshold method. Quantified spikes are contemporaneous in both SS and P channels, as understood for genetic generalized epilepsy such as DS. Marked mice (#1, #2, #3) correspond to the example mice in panel 11C. Despite outlier injected example mouse #3 having aberrant frequent spikes, we observe significantly fewer spikes in the injected mice versus untreated mice (* Mann-Whitney U test p= 0.033). Data are non-normally distributed according to Shapiro-Wilk test (untreated p= 0.0015; N+C DLX2.0-SCN1A p= 1.5e-6).

DESCRIPTION OF THE DISCLOSURE

[0058] All references cited herein are incorporated by reference in their entirety as though fully set forth. Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 3^rd ed., Revised, J. Wiley & Sons (New York, NY 2006); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 7^th ed., J. Wiley & Sons (New York, NY 2013); and Sambrook and Russel, Molecular Cloning: A Laboratory Manual 4^th ed., Cold Spring Harbor Laboratory Press (Cold Spring Harbor, NY 2012), provide one skilled in the art with a general guide to many of the terms used in the present application.

[0059] One skilled in the art will recognize many methods and materials similar or equivalent to those described herein, which could be used in the practice of the present invention. Indeed, the present invention is in no way limited to the methods and materials described. For purposes of the present invention, the following terms are defined below.

[0060] “Intein” refers to a polypeptide sequence capable of catalyzing a protein splicing reaction that excises the intein sequence from its host protein and joins the flanking sequences (N- and C-exteins) with a peptide bond. Intein excision is a posttranslational process that does not require auxiliary enzymes or cofactors. This self-excision process is called “protein splicing,” by analogy to the splicing of RNA introns from pre-mRNA (Perler F et al, Nucl Acids Res. 22:1125-1127 (1994)). The segments are called “intein” for internal protein sequence, and “extein” for external protein sequence, with upstream exteins termed “N-exteins” and downstream exteins called “C-exteins.” The products of the protein splicing process are two stable proteins: the mature protein and the intein. Inteins are typically 150-550 amino acids in size and may also contain a homing endonuclease domain. A list of known inteins, and exemplary mutually orthogonal split inteins, are shown at www.inteins.com, described by Pinto et al. in Nature Communications 2020, 11:1529, and provided in, for example, US2023/0116688, which is incorporated by reference herein.

[0061] Known inteins share a low degree of sequence similarity, with conserved residues only at the N- and C-termini. Most inteins begin with Ser or Cys and end in His-Asn or in His-Gln. In various embodiments, the first amino acid of the C-extein is an invariant Ser, Thr, or Cys, but the residue preceding the intein at the N-extein is not conserved.

[0062] The term “split intein” as used herein refers to any intein in which the N-terminal and C-terminal amino acid sequences are not directly linked via a peptide bond, such that the N-terminal and C-terminal sequences become separate fragments that can non-covalently re-associate, or reconstitute, into an intein that is functional for trans-splicing reactions.

[0063] A split intein involves two complementary half inteins, termed the N-intein and C-intein, that associate selectively and extremely tightly to form an active intein enzyme (Shah N.H., et al, J. Amer. Chem. Soc. 135:18673-18681; Dassa B., et al, Nucl. Acids Res. 37:2560-2573 (2009)). The two fragments of the split intein are encoded by two separately transcribed and translated genes. These so-called split inteins self-associate and catalyze protein-splicing activity in trans. Split inteins have been identified in diverse cyanobacteria and archaea (Caspi et al, Mol Microbiol. 50:1569-1577 (2003); Choi J. et al, J Mol Biol. 356:1093-1106 (2006); Dassa B. et al, Biochemistry. 46:322-330 (2007); Liu X. and Yang J., J Biol Chem. 278:26315-26318 (2003); Wu H. et al, Proc Natl Acad Sci USA. 95:9226-9231 (1998); and Zettler J. et al, FEBS Letters. 583:909-914 (2009)), but have not been found in eukaryotes thus far. Recently, a bioinformatic analysis of environmental metagenomic data revealed 26 different loci with a novel genomic arrangement. At each locus, a conserved enzyme coding region is interrupted by a split intein, with a freestanding endonuclease gene inserted between the sections coding for intein subdomains. Among them, five loci were completely assembled: DNA helicases (gp41-1, gp41-8); Inosine-5'-monophosphate dehydrogenase (IMPDH-1); and Ribonucleotide reductase catalytic subunits (NrdA-2 and NrdJ-1). This fractured gene organization appears to be present mainly in phages (Dassa et al, Nucleic Acids Research. 37:2560-2573 (2009)).

[0064] The terms “split intein N-fragment,” “N-fragment of a split intein,” “N-terminal split intein,” “N-terminal intein fragment,” and “N-terminal intein sequence” (abbreviated “IntN” or “N-intein”) refer to any intein sequence that comprises an N-terminal amino acid sequence that is functional for trans-splicing reactions, that is, that is capable of associating with a functional split intein C-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split intein C-fragment catalyzes the “N-terminal cleavage”, that is, the nucleophilic attack of the peptide bond between the extein and the N-terminus of the split intein N-fragment resulting in the breaking of said peptide bond. An IntN thus also comprises a sequence that is spliced out when trans-splicing occurs. An IntN can comprise a sequence that is a modification of the N-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntN non-functional in trans-splicing. Preferably, the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntN. In various embodiments, the N-intein is fused to the N-terminal fragment of a protein to be reconstituted, wherein the N-intein is at the C-terminus relative to the N-terminal fragment of the protein to be reconstituted.

[0065] The terms “split intein C-fragment,” “C-fragment of a split intein,” “C-terminal split intein,” “C-terminal intein fragment,” and “C-terminal intein sequence” (abbreviated “IntC” or “C-intein”) refer to any intein sequence that comprises a C-terminal amino acid sequence that is functional for trans-splicing reactions, that is, that is capable of associating with a functional split intein N-fragment to form a complete intein that is capable of excising itself from the host protein, catalyzing the ligation of the extein or flanking sequences with a peptide bond, or that upon association with a split N-intein catalyzes the “C-terminal cleavage”, that is, the nucleophilic attack of the peptide bond between the extein and the C-terminus of the split intein C-fragment resulting in the breaking of said peptide bond. An IntC thus also comprises a sequence that is spliced out when trans-splicing occurs. An IntC can comprise a sequence that is a modification of the C-terminal portion of a naturally occurring intein sequence. For example, it can comprise additional amino acid residues and/or mutated residues so long as the inclusion of such additional and/or mutated residues does not render the IntC non-functional in trans-splicing. Preferably, the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the IntC. In various embodiments, the C-intein is fused to the C-terminal fragment of a protein to be reconstituted, wherein the C-intein is at the N-terminus relative to the C-terminal fragment of the protein to be reconstituted.

[0066] “Degron,” “degradation signal,” or “destabilizing domain” refers to a naturally-occurring or artificially-constructed polypeptide sequence which, when recombinantly fused to another polypeptide, accelerates degradation of that polypeptide via the proteasomal degradation pathway, or any other cellular degradation mechanism.

[0067] “Enhancer” or “enhancer element” refers to a cis-acting sequence that increases the level of transcription associated with a promoter and can function in either orientation relative to the promoter and the coding sequence that is to be transcribed and can be located upstream or downstream relative to the promoter or the coding sequence to be transcribed. In preferable embodiments, an “enhancer” is a DNA regulatory element that confers cell type specificity of gene expression. For example, a targeted central nervous system cell type enhancer is an enhancer that is uniquely or predominantly utilized by the targeted central nervous system cell type; and a targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system cell type, but does not substantially direct expression of genes in other non-targeted cell types, thus having neural specific transcriptional activity. Examples of enhancers, especially interneuron-specific enhancers, are provided in US2021/0348195 and US2018/0078658, which are incorporated by reference.

[0068] Neurons found in the mammalian (e.g., human) nervous system can be divided into three classes based on their roles: sensory neurons, motor neurons, and interneurons. Targeted cell types can be identified based on transcriptional profiles, such as those described in Tasic et al., 2018 Nature. For example, GABAergic interneurons express GABA synthesis genes Gad1/GAD1 and/or Gad2/GAD2; whereas glutamatergic neurons express glutamate transporters Slc17a6 and/or Slc17a7.

[0069] Ion transporters are transmembrane proteins that mediate transport of ions across cell membranes. In particular embodiments, ion transporters include voltage gated sodium channels, potassium channels, and calcium channels.

[0070] Mammalian voltage-gated sodium (Na_v) channels are composed of a highly glycosylated ~260 kDa α subunit, the pore forming protein, linked via disulfide bonds to β2/β4 subunits and non-covalently with β1/β3 subunits. Nine Na_v1 α subunit genes (SCN1A-SCN9A) have been identified in mammals, constituting the Na_v1 gene subfamily.

[0071] The Na_v channel α subunit is a complex of transmembrane helices surrounding a central ion-conducting pore, usually capable of producing functional channels in a heterologous expression system. Approximately 2000 amino acid residues are arranged in 4 homologous domains, each consisting of 6 transmembrane segments, and a hairpin loop that lines the pore and includes the selectivity filter. An additional family of accessory β subunits also exists, split into 2 groups discriminated by their mechanism of interaction with the α subunit: disulphide-linked β2 and β4; and non-covalently associated β1 (including a splice variant) and β3; wherein the extracellular immunoglobulin-like domain of the β subunit is important for surface expression and modulation of α subunit gating, while the transmembrane domain influences Na_v voltage-dependence.

[0072] The SCN1A gene codes for the alpha subunit of the Nav1.1 channel. The Nav1.1 channel is mainly responsible for the generation and propagation of neuronal action potentials. Different mutations in this gene are associated with epilepsy and febrile seizures. SCN1A may encode proteins of various isoforms, including but not limited to NP_001340878.1, NP_001159435.1, NP_001189364.1, and those provided in biogps.org/#goto=genereport&id=6323. Situated at position 2q24.3, SCN1A is part of a cluster of voltage-gated sodium channel genes that is home to SCN2A, SCN3A, SCN7A, as well as SCN9A, which encode Nav1.2, Nav1.3, Na_x, and Nav1.7, respectively.

[0073] The Nav1.1 open-reading frame is believed to be organized into 26 exons and encodes a protein of between 1976 and 2009 amino acids. The variance in length generally stems from alternative splice junctions at the end of exon 11 that produce a full-length isoform or shortened versions thereof.

[0074] Coding sequences encoding molecules (e.g., RNA, proteins) described herein can be readily obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule. The term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.

[0075] The term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.

[0076] In various embodiments, expression constructs are provided within vectors. “Vector” refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct. Examples of vectors include plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.

[0077] “Adenovirus vectors” refer to those constructs containing adenovirus sequences sufficient to support packaging of an expression construct and to express a coding sequence that has been cloned therein in a sense or antisense orientation. A recombinant adenovirus vector includes a genetically engineered form of an adenovirus. Both ends of the viral genome contain 100-200 base pair inverted repeats (ITRs), which are cis elements necessary for viral DNA replication and packaging. The early (E) and late (L) regions of the genome contain different transcription units that are divided by the onset of viral DNA replication. The E1 region (E1A and E1B) encodes proteins responsible for the regulation of transcription of the viral genome and a few cellular genes. The expression of the E2 region (E2A and E2B) results in the synthesis of the proteins for viral DNA replication. These proteins are involved in DNA replication, late gene expression, and host cell shut-off. The typical vector is replication defective and will not have an adenovirus E1 region.

[0078] In particular embodiments, vectors (e.g., AAV) with capsids that cross the blood-brain barrier (BBB) are selected. In particular embodiments, vectors are modified to include capsids that cross the BBB. Examples of AAV with viral capsids that cross the blood-brain barrier include AAV9, AAVrh.10, AAV1R6, AAV1R7, rAAVrh.8, AAV-BR1, AAV-PHP.S, AAV-PHP.B, and AAV-PPS. The PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO:46) are changed to S-DGTLAVPFK-A (SEQ ID NO:47). Additional description regarding capsids that cross the blood-brain barrier is provided by Chan et al., Nat. Neurosci. 2017 August; 20(8):1172-1179.
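
As an illustrative sketch only, the AAV9-to-PHP.eB change described above can be expressed on the local four-residue window alone. The function name `apply_php_eb_change` and the use of a bare window string rather than a full capsid sequence are assumptions made for illustration; they are not part of the disclosed sequences.

```python
# Illustrative sketch: the AAV9 -> PHP.eB change on the local window only.
# The full capsid sequence is deliberately not modeled here.
AAV9_WINDOW = "SAQA"  # S-AQ-A, starting at residue 586 of AAV9 (SEQ ID NO:46)


def apply_php_eb_change(window: str) -> str:
    """Replace the 'AQ' between the flanking S and A with 'DGTLAVPFK'."""
    if window != AAV9_WINDOW:
        raise ValueError("expected the AAV9 window 'SAQA'")
    return window[0] + "DGTLAVPFK" + window[-1]


result = apply_php_eb_change(AAV9_WINDOW)
# Net number of residues added to the window by the change.
net_added = len(result) - len(AAV9_WINDOW)
```

Under these assumptions, `result` reproduces the S-DGTLAVPFK-A window given for SEQ ID NO:47.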

[0079] Various embodiments provide the inclusion of a degradation signal in a system, wherein the degradation signal is a peptide sequence, known as a degron, which mediates rapid ubiquitination and subsequent proteasomal degradation of a nearby protein or a protein that the degron is embedded in. In some embodiments, the degradation signal, or degron, is included with the N-terminal segment (split) of an intein, i.e., N-intein. In some embodiments, the degron is included with the C-terminal segment (split) of the intein, i.e., C-intein. In some embodiments, the degron is included with both the N-intein and the C-intein.

[0080] In further embodiments, the degron is in an individual expression construct or vector, separate from the expression construct(s) or vector(s) that contain C-intein or N-intein.

[0081] In some embodiments, a degron is included with (or attached to) just one of the C-intein or the N-intein, and not with both the C-intein and the N-intein. In some embodiments, when a degron is attached to the N-intein, the degron is attached to the C-terminal end of the N-intein, and the configuration from N- to C-end of this half of the split intein fusion is: N-terminal fragment of a sodium channel protein (e.g., N-terminal fragment of a sodium channel alpha subunit) - N-intein - degron. In some embodiments, when a degron is included with (or attached to) the C-intein, the degron is inserted into the C-intein and a few residues (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 residues) before/upstream/at the N-terminus relative to the fragment of the sodium channel protein; that is, the configuration from N to C is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal end containing fragment of the sodium channel protein. The first portion of C-intein and the second portion of the C-intein, when operably connected, form a C-intein.
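
The N-to-C configurations described for a degron attached to just one intein half can be written out as ordered part lists. The following is a minimal sketch under stated assumptions: the part labels (`Nav_N_fragment`, `C_intein_part_1`, and so on) and the function names are hypothetical placeholders, not sequences or identifiers from the disclosure.

```python
# Hypothetical part labels; these are placeholders, not real sequences.
def n_half(with_degron: bool) -> list[str]:
    """N-to-C order of the N-terminal half of the split intein fusion."""
    parts = ["Nav_N_fragment", "N_intein"]
    if with_degron:
        # Degron attached to the C-terminal end of the N-intein.
        parts.append("degron")
    return parts


def c_half(with_degron: bool) -> list[str]:
    """N-to-C order of the C-terminal half of the split intein fusion."""
    if with_degron:
        # Degron inserted within the C-intein, upstream of the channel fragment.
        return ["C_intein_part_1", "degron", "C_intein_part_2", "Nav_C_fragment"]
    return ["C_intein", "Nav_C_fragment"]


def fusion_layouts(degron_on: str) -> tuple[list[str], list[str]]:
    """Return (N half, C half) layouts with the degron on exactly one half."""
    if degron_on not in ("N-intein", "C-intein"):
        raise ValueError("degron_on must be 'N-intein' or 'C-intein'")
    return n_half(degron_on == "N-intein"), c_half(degron_on == "C-intein")


n_layout, c_layout = fusion_layouts("N-intein")
```

Calling `fusion_layouts("C-intein")` gives the alternative arrangement, with the degron flanked by the two C-intein portions and the N half left unmodified.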

[0082] Some embodiments provide that the degradation signal, or degron, is integrated at the 5’ end of the N-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the N-terminal of the excised intein. Some embodiments provide that the degradation signal, or degron, is integrated at the 3’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be placed at the C-terminal of the excised intein. Other embodiments provide that the degradation signal, or degron, is integrated at the 3’ end of the N-intein and/or the 5’ end of the C-intein, so that upon intein-mediated protein trans-splicing and intein excision, the degron would be in the middle of the excised intein.
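
The relationship between where the degron is integrated and where it ends up in the excised intein can be sketched as a simple ordering exercise. This sketch assumes, as a simplification, that the excised intein can be represented as the N-intein parts followed by the C-intein parts in N-to-C order; the function name `degron_positions` and the part labels are hypothetical.

```python
def degron_positions(n_intein_parts: list[str], c_intein_parts: list[str]) -> list[str]:
    """Classify each degron by its position in the excised intein.

    The excised intein is modeled as N-intein parts followed by C-intein
    parts (a simplifying assumption made for illustration).
    """
    excised = n_intein_parts + c_intein_parts
    last = len(excised) - 1
    positions = []
    for i, part in enumerate(excised):
        if part != "degron":
            continue
        if i == 0:
            positions.append("N-terminal")
        elif i == last:
            positions.append("C-terminal")
        else:
            positions.append("middle")
    return positions


# Degron at the 5' end of the N-intein coding sequence.
at_5prime_of_n = degron_positions(["degron", "N_intein"], ["C_intein"])
# Degron at the 3' end of the C-intein coding sequence.
at_3prime_of_c = degron_positions(["N_intein"], ["C_intein", "degron"])
# Degron at the 3' end of the N-intein or the 5' end of the C-intein.
at_3prime_of_n = degron_positions(["N_intein", "degron"], ["C_intein"])
at_5prime_of_c = degron_positions(["N_intein"], ["degron", "C_intein"])
```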

[0083] Preferably, the degron encoded in the system is one having no more than 30 amino acid residues. In some embodiments, the degron encoded in the system is one being 5-27 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 5-10 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 11-20 amino-acid residues in length. In some embodiments, the degron encoded in the system is one being 21-26 amino-acid residues in length.
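
The length ranges recited above overlap, so a given degron length may satisfy several of them at once. A minimal sketch of that bookkeeping follows; the function name `degron_length_classes` and the range labels are assumptions introduced for illustration, and the example lengths are arbitrary inputs.

```python
# Length ranges as recited: (label, minimum, maximum), bounds inclusive.
LENGTH_RANGES = [
    ("<=30", 1, 30),
    ("5-27", 5, 27),
    ("5-10", 5, 10),
    ("11-20", 11, 20),
    ("21-26", 21, 26),
]


def degron_length_classes(length: int) -> list[str]:
    """Return the labels of every recited length range that `length` falls in."""
    return [label for label, low, high in LENGTH_RANGES if low <= length <= high]


classes_for_7 = degron_length_classes(7)
classes_for_26 = degron_length_classes(26)
classes_for_37 = degron_length_classes(37)
```

A length outside every recited range, such as 37, returns an empty list.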

[0084] In various embodiments, the system includes polynucleotides encoding a short degron (e.g., no more than 30 amino-acid residues in length) and an intein having fast splicing rates (e.g., half-lives below 5 min) such as consensus fast intein (Cfa); and the system does not include polynucleotides encoding a degron that is more than 30 amino-acid residues in length or polynucleotides encoding segments that form an intein with a splicing rate slower than Cfa or half-lives greater than 5 min. In various embodiments, the system is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at a faster speed or higher efficiency than degron-mediated degradation of intein, N-intein and/or C-intein. In various embodiments, the system having a polynucleotide sequence encoding the degron is effective for reconstituting a target protein (e.g., sodium channel alpha subunit) at an amount or yield that is at least 100%, 95%, 90%, 85%, or 80% compared to that reconstituted in a system without a polynucleotide sequence encoding the degron. In various embodiments, the system having a polynucleotide sequence encoding the degron is effective for reducing the amount of intein (e.g., free intein after target protein splicing) by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%.
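
The relative-yield and intein-reduction percentages recited above reduce to two simple ratios. The sketch below shows that arithmetic only; the function names and the input values (88, 100, 25) are made-up illustrations, not measurements from the disclosure.

```python
from typing import Optional

YIELD_THRESHOLDS = (100, 95, 90, 85, 80)                       # percent of no-degron yield
REDUCTION_THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90, 95)  # percent intein reduction


def relative_yield_percent(with_degron: float, without_degron: float) -> float:
    """Reconstituted protein with degron, as a percent of the no-degron system."""
    return 100.0 * with_degron / without_degron


def percent_reduction(with_degron: float, without_degron: float) -> float:
    """Percent decrease in intein amount relative to the no-degron system."""
    return 100.0 * (without_degron - with_degron) / without_degron


def highest_threshold_met(value: float, thresholds: tuple) -> Optional[int]:
    """Largest recited 'at least' threshold that `value` satisfies, if any."""
    met = [t for t in thresholds if value >= t]
    return max(met) if met else None


# Hypothetical measurements in arbitrary units (illustration only).
yield_pct = relative_yield_percent(88, 100)
reduction_pct = percent_reduction(25, 100)
yield_level = highest_threshold_met(yield_pct, YIELD_THRESHOLDS)
reduction_level = highest_threshold_met(reduction_pct, REDUCTION_THRESHOLDS)
```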

[0085] In some embodiments, a degron is from the class II trans-activator (CIITA). In some embodiments, the CIITA degron is a 26 amino acid-long peptide, having an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91). In some embodiments, the CIITA degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:91. In some embodiments, the CIITA degron is an N-terminal degron. That is, preferably, when the CIITA degron is attached to the N-intein, the CIITA degron is at the C-terminal of the N-intein, and the configuration from N to C is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CIITA degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit. In other embodiments, the CIITA degron is inserted within the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.

[0086] In some embodiments, a degron is derived from the ornithine decarboxylase 1 (ODC1). In some embodiments, the ODC1 degron is a 23 amino acid-long peptide, having an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:88. In other embodiments, the ODC1 degron is a 37 amino acid-long peptide, having an amino acid sequence of FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:89. In yet other embodiments, the ODC1 degron is a 7 amino acid-long peptide, having an amino acid sequence of MSCAQES (SEQ ID NO:90). In some embodiments, the ODC1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:90. It has been determined that part of some degron sequences does not participate in the binding with ubiquitin ligase; hence, a shorter fragment of some long degron sequences, namely the fragment involved in binding with ubiquitin ligase, may be used to mediate degradation. For example, an active (ligase-binding) fragment of the ODC1 degron consists of an amino acid sequence of MSCAQES. In some embodiments, one or more ODC1 degrons are each a C-terminal degron. That is, preferably, the ODC1 degron is attached to the C-intein by insertion into it, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.

[0087] In some embodiments, a degron is the peptide CL1 or a variant thereof. In some embodiments, the CL1 degron is a 16 amino acid-long peptide, having an amino acid sequence of ACKNWFSSLSHFVIHL (SEQ ID NO:92). In some embodiments, the CL1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:92. In some embodiments, the CL1 degron or a variant thereof (“CL degron”) is attached to the N-intein and at the C-terminal of the N-intein, i.e., as a C-terminal tail of N-intein. A configuration from N- to C-terminus is: N-terminal fragment of the sodium channel alpha subunit - N-intein - CL degron; and the other half of the split-intein has a configuration from N to C being C-intein - C-terminal fragment of the sodium channel alpha subunit. Description of the CL degron is further provided by Gilon et al., The EMBO Journal, Vol. 17, No. 10, pp. 2759-2766, 1998.

[0088] Variants of peptide CL1 include CL2, CL6, CL9, CL10, CL11, CL12, CL15, CL16, and SL17, whose sequences are summarized below, as well as those having at least 95% or 90% sequence identity thereto:

CL2 SLISLPLPTRVKFSSLLLIRIMKIITMTFPKKLRS (SEQ ID NO:94)

CL6 FYYPIWFARVLLVHYQ (SEQ ID NO: 95)

CL9 SNPFSSLFGASLLIDSVSLKSNWDTSSSSCLISFFSSVMFSSTTRS (SEQ ID NO: 96)

CL10 CRQRFSCHLTASYPQSTVTPFLAFLRRDFFFLRHNSSAD (SEQ ID NO:97)

CL11 GAPHVVLFDFELRITNPLSHIQSVSLQITLIFCSLPSLILSKFLQV (SEQ ID NO:98)

CL12 NTPLFSKSFSTTCGVAKKTLLLAQISSLFFLLLSSNIAV (SEQ ID NO:99)

CL15 PTVKNSPKIFCLSSSPYLAFNLEYLSLRIFSTLSKCSNTLLTSLS (SEQ ID NO:100)

CL16 SNQLKRLWLWLLEVRSFDRTLRRPWIHLPS (SEQ ID NO:101)

SL17 SISFVIRSHASIRMGASNDFFHKLYFTKCLTSVILSKFLIHLLLRSTPRV (SEQ ID NO: 102)

[0089] In some embodiments, a degron is a short DEG1 with a hydrophobic end. In some embodiments, the DEG1 degron is a 9 amino acid-long peptide, having an amino acid sequence of GSLIIFIIL (SEQ ID NO:93). In some embodiments, the DEG1 degron is a variant having at least 95% or 90% sequence identity with SEQ ID NO:93. In some embodiments, the DEG1 degron is a C-terminal degron. That is, preferably, the DEG1 degron is inserted in the C-intein, and the configuration from N to C of this half of the split intein fusion is: a first portion of C-intein - degron - a second portion of the C-intein - C-terminal fragment of the sodium channel alpha subunit; and the other half of the split-intein has a configuration from N to C being N-terminal fragment of the sodium channel alpha subunit - N-intein.
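The "at least 95% or 90% sequence identity" thresholds recited for the degron variants above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, an ungapped position-by-position comparison of equal-length sequences; the specification does not state which alignment method is used to measure identity, and the function names and the example variant are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Ungapped, position-by-position identity (an illustrative assumption,
    not a method stated in the specification)."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def max_substitutions(length: int, threshold_pct: int) -> int:
    """Largest number of substituted positions that still satisfies
    'at least threshold_pct percent identity' under the assumption above."""
    return (length * (100 - threshold_pct)) // 100


CIITA_DEGRON = "RPGSTSPFAPSATDLPSMPEPALTSR"  # SEQ ID NO:91
DEG1_DEGRON = "GSLIIFIIL"                    # SEQ ID NO:93

# Hypothetical single-substitution variant, for illustration only.
hypothetical_variant = "A" + CIITA_DEGRON[1:]

identity = percent_identity(CIITA_DEGRON, hypothetical_variant)
print(f"identity to SEQ ID NO:91: {identity:.1f}%")
print("substitutions allowed at 90%:", max_substitutions(len(CIITA_DEGRON), 90))
print("substitutions allowed at 95%:", max_substitutions(len(CIITA_DEGRON), 95))
```

Under this substitution-only reading, a short degron such as the 9-residue DEG1 peptide tolerates no substitution at either threshold, whereas the 26-residue CIITA degron tolerates up to two at 90%.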

[0090] Various embodiments provide systems to express a coding sequence of a sodium channel alpha subunit (e.g., Nav1.1 α subunit, or Nav1.1 for short unless otherwise noted) for reconstitution of the sodium channel alpha subunit. In various embodiments, the system expresses a coding sequence of Nav1.1, which is encoded by gene SCN1A. In some embodiments, the systems express the coding sequence of a sodium channel alpha subunit selected from Nav1.1 through Nav1.9, correspondingly encoded by genes SCN1A through SCN11A. In other embodiments, the systems express SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, or SCN7A, for reconstitution of Nav1.2, Nav1.3, Nav1.4, Nav1.5, Nav1.6, Nav1.7, Nav1.8, Nav1.9, or Nax, respectively.

[0091] In some embodiments, a system is provided to express a coding sequence of Nav1.1 alpha subunit, wherein the coding sequence of Nav1.1 alpha subunit is or comprises the polynucleotide sequence of SCN1A, and the system includes: a first expression construct including a first portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein) at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A; and a second expression construct including a second portion of the polynucleotide sequence of the SCN1A, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A; wherein protein products of the first and the second portions of the polynucleotide sequence of the SCN1A are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the Nav1.1; and wherein the first and/or second expression construct further comprises a polynucleotide sequence encoding a degron, located, if within the first expression construct, at the 3’ end relative to the first portion of the polynucleotide sequence of the SCN1A, and, if within the second expression construct, at the 5’ end relative to the second portion of the polynucleotide sequence of the SCN1A, OR the system includes a third expression construct encoding the degron.

[0092] In some embodiments, a composition or a system is provided for reconstitution of Nav1.1 alpha subunit, which comprises: a. a first polynucleotide encoding a polypeptide comprising an N-fragment of a split intein, wherein the N-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the N-terminal fragment of Nav1.1 alpha subunit; b. a second polynucleotide encoding a polypeptide comprising a C-fragment of the split intein, wherein the C-fragment of the split intein is directly linked via a peptide bond, optionally through a peptide linker, to the C-terminal fragment of the Nav1.1 alpha subunit; and c. a third polynucleotide encoding a degron; wherein the polynucleotides of the composition may be packed together in a single formulation or separately in different formulations, wherein the first and the second polynucleotides encode the N- and the C-terminal fragments of the Nav1.1 alpha subunit, respectively, so that when both fragments are spliced together, the N-terminal fragment is linked to the C-terminal fragment, generating whole Nav1.1 alpha subunit.

In some embodiments, the composition is further characterized in that: the split intein N-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein N-fragment via the C-terminus of the intein, with or without a linker between the intein N-fragment and the degron, and wherein the N-terminus of the split intein N-fragment is directly linked via a peptide bond to the N-terminal fragment of the Nav1.1 alpha subunit; and/or the split intein C-fragment is further directly linked via a peptide bond to a degron, wherein the degron is linked to the intein C-fragment via the N-terminus of the intein, with or without a linker between the intein C-fragment and the degron, and wherein the C-terminus of the split intein C-fragment is directly linked via a peptide bond to the C-terminal fragment of the Nav1.1 alpha subunit.

[0093] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein - a polynucleotide sequence encoding the degron; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A. In further embodiments of this system, the polynucleotide sequence encoding the degron is one that encodes RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO:92) or a variant having a sequence identity of at least 90% or 95% thereto.

[0094] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding a C-fragment of the split intein (C-intein) - a second portion of the polynucleotide sequence of the SCN1A.

[0095] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the degron - a polynucleotide sequence encoding the C-intein - a second portion of the polynucleotide sequence of the SCN1A. In further embodiments of this system, the polynucleotide sequence encoding the degron is one encoding MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93), or a sequence having at least 90% or 95% sequence identity thereto.

[0096] In some embodiments, a system to express a coding sequence of Nav1.1 for its reconstitution includes a first expression construct and a second expression construct, wherein the first expression construct includes from 5’ to 3’: a first portion of the polynucleotide sequence of the SCN1A - a polynucleotide sequence encoding the N-intein; and wherein the second expression construct includes from 5’ to 3’: a polynucleotide sequence encoding the C-intein - a polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence of the SCN1A.
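The four 5’-to-3’ layouts of paragraphs [0093] through [0096] can be tabulated in a short Python sketch that reports where the degron would sit in the excised intein. This is a simplified bookkeeping model, not a biochemical simulation: it assumes the excised intein can be read as the N-half elements followed by the C-half elements once the two SCN1A portions are removed, and all identifiers are hypothetical labels chosen for illustration.

```python
# Labels for the two SCN1A portions that are joined in the reconstituted protein.
EXTEINS = {"SCN1A_first_portion", "SCN1A_second_portion"}

# (first construct, second construct), each listed 5' to 3', keyed by paragraph.
LAYOUTS = {
    "[0093]": (["SCN1A_first_portion", "N-intein", "degron"],
               ["C-intein", "SCN1A_second_portion"]),
    "[0094]": (["SCN1A_first_portion", "degron", "N-intein"],
               ["C-intein", "SCN1A_second_portion"]),
    "[0095]": (["SCN1A_first_portion", "N-intein"],
               ["degron", "C-intein", "SCN1A_second_portion"]),
    "[0096]": (["SCN1A_first_portion", "N-intein"],
               ["C-intein", "degron", "SCN1A_second_portion"]),
}


def degron_position(first: list[str], second: list[str]) -> str:
    """Return where the degron sits within the excised (non-extein) elements."""
    excised = [element for element in first + second if element not in EXTEINS]
    index = excised.index("degron")
    if index == 0:
        return "N-terminal"
    if index == len(excised) - 1:
        return "C-terminal"
    return "middle"


for paragraph, (first, second) in LAYOUTS.items():
    print(paragraph, "->", degron_position(first, second))
```

In this model the reconstituted Nav1.1 is the same in all four layouts (first portion joined to second portion); only the placement of the degron within the excised material differs.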

[0097] In some embodiments, the degron has an amino acid sequence selected from the group consisting of: MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV (SEQ ID NO:89), MSCAQES (SEQ ID NO:90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO:92), and GSLIIFIIL (SEQ ID NO:93), and variants having at least 95% or 90% sequence identity to any of the above.
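As a quick consistency check, the lengths of the six degron sequences listed above can be computed and compared with the preference of paragraph [0083] for degrons of no more than 30 residues. The following Python sketch uses only the sequences given in this description; the dictionary and variable names are hypothetical.

```python
# Degron sequences as listed in the description, keyed by sequence identifier.
DEGRONS = {
    "SEQ ID NO:88": "MSCAQESITSLYKKAGSENLYFQ",
    "SEQ ID NO:89": "FPPEVEEQDDGTLPMSCAQESGMDRHPAACASARINV",
    "SEQ ID NO:90": "MSCAQES",
    "SEQ ID NO:91": "RPGSTSPFAPSATDLPSMPEPALTSR",
    "SEQ ID NO:92": "ACKNWFSSLSHFVIHL",
    "SEQ ID NO:93": "GSLIIFIIL",
}

MAX_PREFERRED_LENGTH = 30  # "no more than 30 amino acid residues" per [0083]

lengths = {seq_id: len(seq) for seq_id, seq in DEGRONS.items()}
within_preference = [
    seq_id for seq_id, n in lengths.items() if n <= MAX_PREFERRED_LENGTH
]

for seq_id, n in lengths.items():
    note = "within" if n <= MAX_PREFERRED_LENGTH else "exceeds"
    print(f"{seq_id}: {n} residues ({note} the 30-residue preference)")
```

The computed lengths agree with the residue counts stated in the description (23, 37, 7, 26, 16, and 9 residues); only the 37-residue ODC1 degron falls outside the preferred range.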

[0098] Alternative sites are provided for splitting human SCN1A to make AAV-sized halves. In some aspects, the protein is split immediately before (i.e., on the N-terminal side of) breakpoint Cys1050, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1. In various implementations, an endogenous cysteine residue is required to make the joining of the halves scarless and reconstitute a full-length unmutated protein. Alternative split intein breakpoints are at either Cys957 or Cys948, according to amino acid positions in sodium channel protein type 1 subunit alpha isoform 2 of NCBI reference no. NP_001340878.1. These breakpoints would permit a better AAV size and packaging efficiency of the N-terminal half for better expression than that seen with hSCN1A-CO-Nterm1049 when using the Cys1050 breakpoint. However, they would place the intein junctions in the extracellular/lumenal space. The N-terminal junction sequence at the front of an intein is termed the -1 position; and the +1 position after the intein sequence usually has a Cys, Ser, or Thr residue.

[0099] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-1049 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 1050-1998 of the Nav1.1, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-1049 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 1050-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.

[0100] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-956 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 957-1998, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-956 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 957-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.

[0101] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A encodes residues 1-947 of the Nav1.1, and the second portion of the polynucleotide sequence of the SCN1A encodes residues 948-1998, wherein the amino acid position is based on numberings in NP_001340878.1. In some embodiments, the first portion of the polynucleotide sequence encodes a sodium channel alpha subunit N-terminus fragment that, when sequence aligned, corresponds to residues 1-947 of the Nav1.1 having an NCBI reference no. NP_001340878.1, and the second portion of the polynucleotide sequence encodes a sodium channel alpha subunit C-terminus fragment that, when sequence aligned, corresponds to residues 948-1998 of the Nav1.1 having an NCBI reference no. NP_001340878.1.

[0102] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO:59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO:60).

[0103] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO:61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO:62).

[0104] In some embodiments of the one or more systems disclosed herein, the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO:63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO:64).
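The relationship between the three breakpoints and the fragment sizes reflected in the construct names can be checked with a short Python sketch. It takes the full length of 1998 residues and the breakpoint positions from the residue ranges recited above; the function name is hypothetical, and the nucleotide count assumes three nucleotides per residue without a stop codon or any regulatory elements.

```python
FULL_LENGTH = 1998  # last residue recited for the C-terminal fragments

# Residue that becomes the first residue of the C-terminal fragment.
BREAKPOINTS = {"Cys1050": 1050, "Cys957": 957, "Cys948": 948}


def split_lengths(breakpoint: int, full_length: int = FULL_LENGTH) -> tuple[int, int]:
    """Return (N-terminal fragment length, C-terminal fragment length) when the
    protein is split immediately before the breakpoint residue."""
    n_length = breakpoint - 1
    c_length = full_length - breakpoint + 1
    return n_length, c_length


for name, position in BREAKPOINTS.items():
    n_length, c_length = split_lengths(position)
    # Coding nucleotides only: 3 per residue, stop codon not counted.
    print(f"{name}: N-fragment {n_length} aa ({3 * n_length} nt), "
          f"C-fragment {c_length} aa ({3 * c_length} nt)")
```

The resulting pairs (1049/949, 956/1042, and 947/1051 residues) match the numeric suffixes of the Nterm and Cterm construct names, and each pair sums to 1998 residues.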

[0105] In some embodiments, the intein includes a Cfa intein, an Ssp intein, a gp41-1 intein, an IMPDH-1 intein, an NrdJ-1 intein, a gp41-8 intein, or an Npu intein. In various aspects, the split intein comprises consensus fast intein (Cfa). We conceive that with the Cfa, the split intein trans-splicing reaction will occur first, before the binding of ubiquitin ligase to degron and subsequent degradation. In some aspects, the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58). In other embodiments, the intein is functionally similar to a Cfa intein. Herein, functionally similar to a Cfa intein means that the expression construct includes a variant of a Cfa intein, yet still results in construction of a functional protein (e.g., voltage-gated sodium channel). In the event that trans-splicing efficiency is reduced in the presence of degron, we conceive that targeting a specific cell type gives flexibility to deliver higher doses without side effects, which may also compensate for the attenuation in the trans-splicing efficiency.

[0106] In some embodiments, a mature protein Nav1.1 is expressed by splitting the coding sequence into three fragments and putting the N-terminal portion of the coding sequence with a first N-intein into a first artificial expression construct, putting the middle portion of the coding sequence with a first C-intein and second N-intein into a second artificial expression construct, and putting the C-terminal portion of the coding sequence with a second C-intein into a third artificial expression construct, wherein the first N-intein and first C-intein specifically splice together to form an intein, and wherein the second N-intein and second C-intein specifically splice together to form an intein. In further embodiments, at least one, two, or all three fragments of the coding sequence include a degron. Preferably, the first intein and the second intein are different, so that the first N-intein and the second C-intein do not splice together, and the second N-intein and the first C-intein do not splice together. That is, preferably, first N-intein and first C-intein, and second N-intein and second C-intein, are two mutually orthogonal split inteins. Exemplary mutually orthogonal split inteins are described at least by Pinto et al. in Nature Communications (2020) 11:1529. A method of using this system includes administering the first, second, and third artificial expression constructs to a cell. Similarly, mature proteins can be formed from several fragments using the appropriate number of inteins.

[0107] In various aspects, the first expression construct and the second expression construct independently further comprise a promoter sequence, selected from a minBglobin promoter, an hSyn1 promoter, or a CMV promoter; optionally the hSyn1 promoter comprising a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO:54.

[0108] In various aspects, the first expression construct, the second expression construct, or both independently further comprise an enhancer sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a targeted central nervous system cell type.

[0109] In some aspects, the enhancer sequence comprises a polynucleotide sequence of DLX2.0 (SEQ ID NO:2). In some aspects, the enhancer sequence has a concatemerized core of an I56i enhancer, optionally as set forth in SEQ ID NO:1. These artificial enhancer elements provide more rapid onset of transgene expression compared to a single full length original (native) enhancer. In some aspects, the targeted central nervous system cell type comprises a GABAergic neuron, preferably a GABAergic interneuron, or more preferably a telencephalic GABAergic interneuron. In other embodiments, the enhancer sequence is a 527 bp enhancer sequence (referred to as mI56i or mDlx) from the intergenic interval between the distal-less homeobox 5 and 6 genes (Dlx5/6), which are naturally expressed by forebrain GABAergic interneurons during embryonic development. Further description of enhancer sequences, such as those for selectively modulating gene expression in interneurons, is provided in US20210348195, which is incorporated by reference.

[0110] In some aspects, the enhancer sequence comprises a polynucleotide sequence of eHGT_078h (SEQ ID NO:55). In some aspects, the targeted central nervous system cell type comprises a forebrain glutamatergic neuron.

[0111] In some aspects, the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the sodium channel alpha subunit within a selected central nervous system cell type.

[0112] In some aspects, the miRNA binding site sequence comprises a polynucleotide sequence of 4x2C miRNA binding site (SEQ ID NO:56), and the selected central nervous system cell type comprises a pan-GABAergic neuron.

[0113] In particular embodiments, an enhancer is used to drive gene expression in a targeted central nervous system cell population. Particular embodiments of the artificial expression constructs utilize the following enhancers to drive gene expression within targeted central nervous system cell populations as follows (enhancer / targeted cell population): DLX2.0 / forebrain GABAergic; hSyn1 with 4x2C or 8x2C miR binding site / pan-GABAergic neurons; eHGT_078h / forebrain glutamatergic neurons. In particular embodiments, the artificial expression construct can include a shortened promoter or a minimal promoter. In particular embodiments, the shortened promoter includes the hSyn1 promoter (shortened). In particular embodiments, the minimal promoter includes minBglobin.

[0114] Particular embodiments provide artificial expression construct pairs including the features of vectors described herein including vectors: CN3252 and CN3254, CN3683 and CN3684, CN3251 and CN3253, CN3677 and CN3678, CN4541 and CN4542, CN4217 and CN4218, or CN4642 and CN4643, as described in Tables below.

[0115] Various embodiments further provide an artificial expression construct containing a first expression construct as disclosed herein. Further embodiments also provide an artificial expression construct containing a second expression construct as disclosed herein. Preferably, the artificial expression construct is within an adeno-associated viral (AAV) vector. The artificial expression construct can also include other regulatory elements if necessary or beneficial. Examples of regulatory elements utilized within artificial expression constructs disclosed herein include DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and eHGT_078h.

[0116] In particular embodiments, the artificial expression constructs are expressed in all neurons. In particular embodiments, the artificial expression constructs include an hSyn1 promoter and are expressed in neurons. In particular embodiments, the artificial expression constructs are expressed in all cell lines. In particular embodiments, the artificial expression constructs include a CMV promoter and are expressed in cell lines.

[0117] Various embodiments provide an administrable composition, which includes any one or more systems disclosed herein to express the coding sequence of, and reconstitute, Nav1.1. Additional embodiments provide an administrable composition, which includes either one or both of an artificial expression construct containing the first expression construct, and an artificial expression construct containing the second expression construct.

[0118] A transgenic cell is also provided, comprising any one or more systems disclosed herein. In some aspects, the transgenic cell is a GABAergic neuron or a glutamatergic neuron or a cell line of GABAergic neuron or glutamatergic neuron.

[0119] Additional embodiments provide a method of rescuing voltage-gated sodium channel function in a population of cells, and the method includes co-administering a therapeutically effective amount of a first expression construct and a therapeutically effective amount of a second expression construct, as disclosed herein, to a sample or subject comprising the population of cells, and inducing expression of the first expression construct and the second expression construct to reconstitute Nav1.1, thereby rescuing voltage-gated sodium channel function in the population of cells.

[0120] Preferably, the method is for rescuing voltage-gated sodium channel function in a targeted population of cells. In some embodiments, the methods involve a targeted central nervous system cell type enhancer, which is uniquely or predominantly utilized by the targeted central nervous system cell type. A targeted central nervous system cell type enhancer enhances expression of a gene in the targeted central nervous system. In certain embodiments, a targeted central nervous system cell type enhancer is also a targeted central nervous system type enhancer that enhances expression of a gene in the targeted central nervous system and does not substantially direct expression of genes in other non-targeted cell types, thus having cell type specific transcriptional activity.

[0121] In some embodiments of the methods, the subject has an SCN1A-related seizure disorder comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.

[0122] In some embodiments of the methods, the subject is a pediatric patient having Dravet syndrome. In some embodiments, the subject is a pediatric human. In some embodiments, the subject is an infant (1 year old or younger). In some embodiments, the subject is a young child, e.g., between 1 and 10 years old. In some embodiments, the subject is a teenager. In various embodiments, the human subject is age 1 day through 5 months, 6 months through 4 years, 5 years through 11 years, or 12 years through 17 years.

[0123] In particular embodiments, artificial expression constructs can deliver SCN1A as several fragments of SCN1A delivered by several artificial expression constructs. For example, SCN1A can be delivered in a first artificial expression construct including a first portion of the SCN1A coding sequence and a second artificial expression construct including a second portion of the SCN1A coding sequence. In particular embodiments, the first portion of the SCN1A coding sequence is the N-terminal portion of the coding sequence and the second portion of the SCN1A coding sequence is the C-terminal portion of the coding sequence.

[0124] The sodium channel alpha subunit coding sequence can be split into an N-terminal portion and C-terminal portion at any point, or preferably at a breakpoint wherein the first amino acid residue encoded downstream of the breakpoint is Cys, Ser, or Thr, such that upon intein fusion, a functional sodium channel alpha subunit molecule is expressed. In particular embodiments, an N-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Nterm1049 (SEQ ID NO:59), hSCN1A-CO-Nterm956 (SEQ ID NO:61), or hSCN1A-CO-Nterm947 (SEQ ID NO:63). In particular embodiments, a C-terminal portion of the SCN1A coding sequence includes hSCN1A-CO-Cterm949 (SEQ ID NO:60), hSCN1A-CO-Cterm1042 (SEQ ID NO:62), or hSCN1A-CO-Cterm1051 (SEQ ID NO:64).

[0125] Exemplary reporter genes/proteins include those expressed by Addgene ID#s 83894 (pAAV-hDlx-Flex-dTomato-Fishell_7), 83895 (pAAV-hDlx-Flex-GFP-Fishell_6), 83896 (pAAV-hDlx-GiDREADD-dTomato-Fishell-5), 83898 (pAAV-mDlx-ChR2-mCherry-Fishell-3), 83899 (pAAV-mDlx-GCaMP6f-Fishell-2), 83900 (pAAV-mDlx-GFP-Fishell-1), and 89897 (pcDNA3-FLAG-mTET2 (N500)). Exemplary reporter genes particularly can include those which encode an expressible fluorescent protein, or expressible biotin; blue fluorescent proteins (e.g., eBFP, eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire); cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan, mTurquoise); green fluorescent proteins (e.g., GFP, GFP-2, tagGFP, turboGFP, EGFP, Emerald, Azami Green, Monomeric Azami Green (mAzamigreen), CopGFP, AceGFP, avGFP, ZsGreen1, Oregon Green™ (Thermo Fisher Scientific)); Luciferase; orange fluorescent proteins (mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato, dTomato); red fluorescent proteins (mKate, mKate2, mPlum, DsRed monomer, mCherry, mRuby, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, HcRed1, AsRed2, eqFP611, mRaspberry, mStrawberry, Jred, Texas Red™ (Thermo Fisher Scientific)); far red fluorescent proteins (e.g., mPlum and mNeptune); yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, SYFP2, Venus, YPet, PhiYFP, ZsYellow1); and tandem conjugates.

[0126] In particular embodiments, artificial expression constructs can include DNA and RNA editing tools such as CRISPR/Cas (e.g., guide RNA and a nuclease, such as Cas, Cas9 or Cpf1). Functional molecules can also include engineered Cpf1s such as those described in US 2018/0030425, US 2016/0208243, WO/2017/184768 and Zetsche et al. (2015) Cell 163: 759-771; single gRNA (see e.g., Jinek et al. (2012) Science 337:816-821; Jinek et al. (2013) eLife 2:e00471; Segal (2013) eLife 2:e00563) or editase, guide RNA molecules, microRNA, or homologous recombination donor cassettes.

[0127] In particular embodiments, artificial expression constructs can include tag cassettes. A tag cassette includes His tag (HHHHHH; SEQ ID NO:34), Flag tag (DYKDDDDK; SEQ ID NO:35), Xpress tag (DLYDDDDK; SEQ ID NO:36), Avi tag (GLNDIFEAQKIEWHE; SEQ ID NO:37), Calmodulin tag (KRRWKKNFIAVSAANRFKKISSSGAL; SEQ ID NO:38), Polyglutamate tag, HA tag (YPYDVPDYA; SEQ ID NO:39), Myc tag (EQKLISEEDL; SEQ ID NO:40), Strep tag (which refers to the original STREP® tag (WRHPQFGG; SEQ ID NO:41) or STREP® tag II (WSHPQFEK; SEQ ID NO:42) (IBA Institut fur Bioanalytik, Germany); see, e.g., US 7,981,632), Softag 1 (SLAELLNAGLGGS; SEQ ID NO:43), Softag 3 (TQDPSRVG; SEQ ID NO:44), and V5 tag (GKPIPNPLLGLDST; SEQ ID NO:45). In particular embodiments, a tag cassette includes a fusion of tag cassettes such as 3XFLAG. In particular embodiments, 3XFLAG includes the sequence set forth in SEQ ID NO:15.

[0128] In particular embodiments, the artificial expression constructs include an internal ribosome entry site (IRES) sequence. See, for example, Fig. 1B. An IRES allows ribosomes to initiate translation at a second internal site on an mRNA molecule, leading to production of two proteins from one mRNA. In particular embodiments, IRES includes IRES2. In particular embodiments, IRES2 allows for a second protein open reading frame (ORF) to be translated from the same transcript. This is unlike the 2A sequence, which allows for a single ORF to be cleaved into two proteins, with similar efficiencies of production.

[0129] Coding sequences encoding molecules (e.g., RNA, proteins) described herein can be obtained from publicly available databases and publications. Coding sequences can further include various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the encoded molecule. The term “encode” or “encoding” refers to a property of sequences of nucleic acids, such as a vector, a plasmid, a gene, cDNA, mRNA, to serve as templates for synthesis of other molecules such as proteins.

[0130] The term “gene” may include not only coding sequences but also regulatory regions such as promoters, enhancers, insulators, and/or post-regulatory elements, such as termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.

[0131] Promoters can include general promoters, tissue-specific promoters, cell-specific promoters, and/or promoters specific for the cytoplasm. Promoters may include strong promoters, weak promoters, constitutive expression promoters, and/or inducible promoters. Inducible promoters direct expression in response to certain conditions, signals or cellular events. For example, the promoter may be an inducible promoter that requires a particular ligand, small molecule, transcription factor or hormone protein in order to effect transcription from the promoter. Particular examples of promoters include minBglobin (also referred to as minBGprom), CMV promoter, hSyn1 promoter, hSyn1 promoter (shortened), minCMV, minCMV* (minCMV* is minCMV with a SacI restriction site removed), minRho, minRho* (minRho* is minRho with a SacI restriction site removed), SV40 immediate early promoter, the Hsp68 minimal promoter (proHSP68), and the Rous Sarcoma Virus (RSV) long-terminal repeat (LTR) promoter. Minimal promoters have no activity to drive gene expression on their own but can be activated to drive gene expression when linked to a proximal enhancer element.

[0132] In particular embodiments, expression constructs are provided within vectors. The term vector refers to a nucleic acid molecule capable of transferring or transporting another nucleic acid molecule, such as an expression construct. The transferred nucleic acid is generally linked to, e.g., inserted into, the vector nucleic acid molecule. A vector may include sequences that direct autonomous replication in a cell or may include sequences that permit integration into host cell DNA. Useful vectors include, for example, plasmids (e.g., DNA plasmids or RNA plasmids), transposons, cosmids, bacterial artificial chromosomes, and viral vectors.

[0133] Adeno-Associated Virus (AAV) is a parvovirus, discovered as a contaminant of adenoviral stocks. It is a ubiquitous virus (antibodies are present in 85% of the US human population) that has not been linked to any disease. It is also classified as a dependovirus, because its replication is dependent on the presence of a helper virus, such as adenovirus. Various serotypes have been isolated, of which AAV-2 is the best characterized. AAV has a single-stranded linear DNA that is encapsidated into capsid proteins VP1, VP2 and VP3 to form an icosahedral virion of 20 to 24 nm in diameter.

[0134] The AAV DNA is 4.7 kilobases long. It contains two open reading frames and is flanked by two ITRs. There are two major genes in the AAV genome: rep and cap. The rep gene codes for proteins responsible for viral replication, whereas cap codes for capsid proteins VP1-3. Each ITR forms a T-shaped hairpin structure. These terminal repeats are the only essential cis components of the AAV for chromosomal integration. Therefore, the AAV can be used as a vector with all viral coding sequences removed and replaced by the cassette of genes for delivery. Three AAV viral promoters have been identified and named p5, p19, and p40, according to their map position. Transcription from p5 and p19 results in production of rep proteins, and transcription from p40 produces the capsid proteins.

[0135] AAVs stand out for use within the current disclosure because of their superb safety profile and because their capsids and genomes can be tailored to allow expression in targeted cell populations. scAAV refers to a self-complementary AAV. pAAV refers to a plasmid adeno-associated virus. rAAV refers to a recombinant adeno-associated virus. pSMART-HCKan is a high copy number vector with a kanamycin resistance marker for efficient blunt cloning of unstable sequences.

[0136] Other viral vectors may also be employed. For example, vectors derived from viruses such as vaccinia virus, polioviruses and herpes viruses may be employed. They offer several attractive features for various mammalian cells.

[0137] Elements directing the efficient termination and polyadenylation of a heterologous nucleic acid transcript can increase heterologous gene expression. Transcription termination signals are generally found downstream of the polyadenylation signal. In particular embodiments, vectors include a polyadenylation signal 3' of a polynucleotide encoding a molecule (e.g., protein) to be expressed. The term "poly(A) site" or "poly(A) sequence" denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript by RNA polymerase II. Polyadenylation sequences can promote mRNA stability by addition of a poly(A) tail to the 3' end of the coding sequence and thus, contribute to increased translational efficiency. Particular embodiments may utilize BGHpA, hGHpA, SV40pA, or shortPolyA. In particular embodiments, a preferred embodiment of an expression construct includes a terminator element. These elements can serve to enhance transcript levels and to minimize read through from the construct into other plasmid sequences.

[0138] Particular embodiments of vectors include:

[0139] Subcomponent sequences within the larger vector sequences can be readily identified by one of ordinary skill in the art and based on the contents of the current disclosure. Nucleotides between identifiable and enumerated subcomponents reflect restriction enzyme recognition sites used in assembly (cloning) of the constructs, and in some cases, additional nucleotides do not convey any identifiable function. These segments of complete vector sequences can be adjusted based on use of different cloning strategies and/or vectors. In general, short 6-nucleotide palindromic sequences reflect vector construction artifacts that are not important to vector function.

[0140] In particular embodiments vectors (e.g., AAV) with capsids that cross the blood-brain barrier (BBB) are selected. In particular embodiments, vectors are modified to include capsids that cross the BBB. Examples of AAV with viral capsids that cross the blood brain barrier include AAV9 (Gombash et al., Front Mol Neurosci. 2014; 7:81), AAVrh.10 (Yang, et al., Mol Ther. 2014; 22(7): 1299-1309), AAV1R6, AAV1R7 (Albright et al., Mol Ther. 2018; 26(2): 510), rAAVrh.8 (Yang, et al., supra), AAV-BR1 (Marchio et al., EMBO Mol Med. 2016; 8(6): 592), AAV-PHP.S (Chan et al., Nat Neurosci. 2017; 20(8): 1172), AAV-PHP.B (Deverman et al., Nat Biotechnol. 2016; 34(2): 204), AAV-PPS (Chen et al., Nat Med. 2009; 15: 1215), and PHP.eB. In particular embodiments, the PHP.eB capsid differs from AAV9 such that, using AAV9 as a reference, amino acids starting at residue 586: S-AQ-A (SEQ ID NO: 46) are changed to S-DGTLAVPFK-A (SEQ ID NO: 47). In particular embodiments, PHP.eB refers to SEQ ID NO: 30. Further description of capsids that cross the BBB is provided in US20210348195, which is incorporated by reference.
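
For illustration only, the residue change described above for PHP.eB (S-AQ-A changed to S-DGTLAVPFK-A, starting at residue 586 of the AAV9 reference) can be expressed as a checked, position-based substitution. The sketch below is a hypothetical example: the helper `substitute_at` is not part of the disclosure, and the reference string is a toy placeholder rather than the actual AAV9 capsid sequence.

```python
def substitute_at(seq, start, old, new):
    """Replace `old` with `new` at 1-based position `start`.

    Raises ValueError if the residues found at that position are not `old`,
    so that a numbering mismatch is caught rather than silently applied.
    """
    i = start - 1
    if seq[i:i + len(old)] != old:
        raise ValueError("reference residues do not match at the given position")
    return seq[:i] + new + seq[i + len(old):]


# Toy placeholder (assumption): 585 glycines, the S-AQ-A motif, 10 glycines.
# This is NOT the real AAV9 VP1 sequence.
toy_reference = "G" * 585 + "SAQA" + "G" * 10

# Motifs as given in the text (SEQ ID NO: 46 changed to SEQ ID NO: 47).
toy_variant = substitute_at(toy_reference, 586, "SAQA", "SDGTLAVPFKA")

print(toy_variant[585:596])
print(len(toy_variant) - len(toy_reference))
```

In this toy example the variant is seven residues longer than the reference, and the substituted segment contains the 7-mer TLAVPFK that the disclosure attributes to AAV-PHP.B.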

[0141] AAV9 is a naturally occurring AAV serotype that, unlike many other naturally occurring serotypes, can cross the BBB following intravenous injection. It transduces large sections of the central nervous system (CNS), thus permitting minimally invasive treatments (Naso et al., BioDrugs. 2017; 31(4): 317), for example, as described in relation to clinical trials for the treatment of spinal muscular atrophy (SMA) syndrome by AveXis (AVXS-101, NCT03505099) and the treatment of CLN3 gene-Related Neuronal Ceroid-Lipofuscinosis (NCT03770572).

[0142] AAVrh.10 was originally isolated from rhesus macaques and shows low seropositivity in humans when compared with other common serotypes used for gene delivery applications (Selot et al., Front Pharmacol. 2017; 8: 441) and has been evaluated in clinical trials LYS-SAF302, LYSOGENE, and NCT03612869.

[0143] AAV1R6 and AAV1R7, two variants isolated from a library of chimeric AAV vectors (AAV1 capsid domains swapped into AAVrh.10), retain the ability to cross the BBB and transduce the CNS while showing significantly reduced hepatic and vascular endothelial transduction.

[0144] rAAVrh.8, also isolated from rhesus macaques, shows a global transduction of glial and neuronal cell types in regions of clinical importance following peripheral administration and also displays reduced peripheral tissue tropism compared to other vectors.

[0145] AAV-BR1 is an AAV2 variant displaying the NRGTEWD (SEQ ID NO:48) epitope that was isolated during in vivo screening of a random AAV display peptide library. It shows high specificity accompanied by high transgene expression in the brain with minimal off-target affinity (including for the liver) (Korbelin et al., EMBO Mol Med. 2016; 8(6): 609).

[0146] AAV-PHP.S (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence QAVRTSL (SEQ ID NO:49), transduces neurons in the enteric nervous system, and strongly transduces peripheral sensory afferents entering the spinal cord and brain stem.

[0147] AAV-PHP.B (Addgene, Watertown, MA) is a variant of AAV9 generated with the CREATE method that encodes the 7-mer sequence TLAVPFK (SEQ ID NO:50). It transfers genes throughout the CNS with higher efficiency than AAV9 and transduces the majority of astrocytes and neurons across multiple CNS regions.

[0148] AAV-PPS, an AAV2 variant created by insertion of the DSPAHPS (SEQ ID NO:51) epitope into the capsid of AAV2, shows a dramatically improved brain tropism relative to AAV2.

[0149] For additional information regarding capsids that cross the blood brain barrier, see Chan et al., Nat. Neurosci. 2017 Aug; 20(8): 1172-1179.

[0150] In particular embodiments, a capsid that results in transduction of targeted cell types in a primate following administration (e.g., i.v. administration) is chosen. In particular embodiments, a capsid that results in widespread transduction of tissue and cell types impacted by the loss of Scn1a following administration is chosen. In particular embodiments, targeted cell types are neurons. In particular embodiments, neurons include GABAergic neurons or glutamatergic neurons. In particular embodiments, GABAergic neurons include pan-GABAergic neurons, forebrain GABAergic neurons, hippocampal GABAergic neurons, or cortical GABAergic neurons. In particular embodiments, glutamatergic neurons include forebrain glutamatergic neurons.

[0151] Artificial expression constructs and vectors that result in rescue of voltage-gated sodium channel function of the present disclosure (referred to herein as physiologically active components) can be formulated with a carrier or more than one carrier that is suitable for administration to a cell, tissue slice, animal (e.g., mouse, non-human primate), or human. Physiologically active components within compositions described herein can be prepared in neutral forms, as freebases, or as pharmacologically acceptable salts.

[0152] Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino groups of the protein) and which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like.

[0153] Carriers of physiologically active components can include solvents, dispersion media, vehicles, coatings, diluents, isotonic and absorption delaying agents, buffers, solutions, suspensions, colloids, and the like. The use of such carriers for physiologically active components is well known in the art. Except insofar as any conventional media or agent is incompatible with the physiologically active components, it can be used with compositions as described herein.

[0154] The phrase "pharmaceutically-acceptable carriers" refers to carriers that do not produce an allergic or similar untoward reaction when administered to a human, and in particular embodiments, when administered intravenously.

[0155] In particular embodiments, compositions can be formulated for intravenous, intraparenchymal, intraocular, intravitreal, parenteral, subcutaneous, intracerebro-ventricular, intramuscular, intrathecal, intraspinal, intraperitoneal, oral or nasal inhalation, or by direct injection in or application to one or more cells, tissues, or organs.

[0156] Compositions may include liposomes, lipids, lipid complexes, microspheres, microparticles, nanospheres, and/or nanoparticles.

[0157] The formation and use of liposomes is generally known to those of skill in the art. Liposomes have been developed with improved serum stability and circulation half-times (see, for instance, U.S. Pat. No. 5,741,516). Further, various methods of preparing liposomes and liposome-like preparations as potential drug carriers have been described (see, for instance, U.S. Pat. Nos. 5,567,434; 5,552,157; 5,565,213; 5,738,868; and 5,795,587).

[0158] The disclosure also provides for pharmaceutically acceptable nanocapsule formulations of the physiologically active components. Nanocapsules can generally entrap compounds in a stable and reproducible way (Quintanar-Guerrero et al., Drug Dev Ind Pharm 24(12): 1113-1128, 1998; Quintanar-Guerrero et al., Pharm Res. 15(7): 1056-1062, 1998; Quintanar-Guerrero et al., J. Microencapsul. 15(1): 107-119, 1998; Douglas et al., Crit Rev Ther Drug Carrier Syst 3(3): 233-261, 1987). To avoid side effects due to intracellular polymeric overloading, such ultrafine particles can be designed using polymers able to be degraded in vivo. Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use in the present disclosure. Such particles can be easily made, as described in Couvreur et al., J Pharm Sci 69(2): 199-202, 1980; Couvreur et al., Crit Rev Ther Drug Carrier Syst. 5(1): 1-20, 1988; zur Muhlen et al., Eur J Pharm Biopharm, 45(2): 149-155, 1998; Zambaux et al., J Control Release 50(1-3): 31-40, 1998; and U.S. Pat. No. 5,145,684.

[0159] Injectable compositions can include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (U.S. Pat. No. 5,466,468). For delivery via injection, the form is sterile and fluid to the extent that it can be delivered by syringe. In particular embodiments, it is stable under the conditions of manufacture and storage, and optionally contains one or more preservative compounds against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion, and/or by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and/or antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In various embodiments, the preparation will include an isotonic agent(s), for example, sugar(s) or sodium chloride. Prolonged absorption of the injectable compositions can be accomplished by including in the compositions agents that delay absorption, for example, aluminum monostearate and gelatin. Injectable compositions can be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose.

[0160] Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. As indicated, under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms.

[0161] Sterile compositions can be prepared by incorporating the physiologically active component in an appropriate amount of a solvent with other optional ingredients (e.g., as enumerated above), followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized physiologically active components into a sterile vehicle that contains the basic dispersion medium and the required other ingredients (e.g., from those enumerated above). In the case of sterile powders for the preparation of sterile injectable solutions, preferred methods of preparation can be vacuum-drying and freeze-drying techniques which yield a powder of the physiologically active components plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[0162] Oral compositions may be in liquid form, for example, as solutions, syrups or suspensions, or may be presented as a drug product for reconstitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinized maize starch, polyvinyl pyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). Tablets may be coated by methods well-known in the art.

[0163] Inhalable compositions can be delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.

[0164] Compositions can also include microchip devices (U.S. Pat. No. 5,797,898), ophthalmic formulations (Bourlais et al., Prog Retin Eye Res, 17(1): 33-58, 1998), transdermal matrices (U.S. Pat. No. 5,770,219 and U.S. Pat. No. 5,783,208) and feedback-controlled delivery (U.S. Pat. No. 5,697,899).

[0165] Supplementary active ingredients can also be incorporated into the compositions.

[0166] Typically, compositions can include at least 0.1% of the physiologically active components or more, although the percentage of the physiologically active components may, of course, be varied and may conveniently be between 1 or 2% and 70% or 80% or more or 0.5-99% of the weight or volume of the total composition. Naturally, the amount of physiologically active components in each physiologically-useful composition may be prepared in such a way that a suitable dosage will be obtained in any given unit dose of the compound. Factors such as solubility, bioavailability, biological half-life, route of administration, product shelf life, as well as other pharmacological considerations will be contemplated by one skilled in the art of preparing such pharmaceutical formulations, and as such, a variety of compositions and dosages may be desirable.

[0167] In particular embodiments, for administration to humans, compositions should meet sterility, pyrogenicity, and the general safety and purity standards as required by United States Food and Drug Administration (FDA) or other applicable regulatory agencies in other countries.

[0168] The present disclosure includes cells including an artificial expression construct described herein. A cell that has been transformed with an artificial expression construct can be used for many purposes, including in neuroanatomical studies, assessments of functioning and/or non-functioning proteins, and drug screens that assess the regulatory properties of enhancers.

[0169] A variety of host cell lines can be used, but in particular embodiments, the cell is a mammalian cell. In particular embodiments, the artificial expression construct includes a regulatory element and/or a vector sequence of DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site and/or eHGT_078h and/or CN3252, CN3254, CN3683, CN3684, CN3251, CN3253, CN3677, CN3678, CN4541, CN4542, CN4217, CN4218, CN4642, or CN4643, and the cell line is a human, primate, or murine cell. Cell lines which can be utilized for transgenesis in the present disclosure also include primary cell lines derived from living tissue such as rat or mouse brains and organotypic cell cultures, including brain slices from animals such as rats, mice, non-human primates, or human neurosurgical tissue.

[0170] WO 91/13150 describes a variety of cell lines, including neuronal cell lines, and methods of producing them. Similarly, WO 97/39117 describes a neuronal cell line and methods of producing such cell lines. The neuronal cell lines disclosed in these patent applications are applicable for use in the present disclosure.

[0171] In particular embodiments, "neuronal" describes something that is of, related to, or includes, neuronal cells. Neuronal cells are defined by the presence of an axon and dendrites.

[0172] The term "neuronal-specific" refers to something that is found, or an activity that occurs, in neuronal cells or cells derived from neuronal cells, but is not found or does not occur, or is not substantially found or does not substantially occur, in non-neuronal cells or cells not derived from neuronal cells, for example glial cells such as astrocytes or oligodendrocytes.

[0173] In particular embodiments, non-neuronal cell lines may be used, including mouse embryonic stem cells. Cultured mouse embryonic stem cells can be used to analyze expression of genetic constructs using transient transfection with plasmid constructs. Mouse embryonic stem cells are pluripotent and undifferentiated. These cells can be maintained in this undifferentiated state by Leukemia Inhibitory Factor (LIF). Withdrawal of LIF induces differentiation of the embryonic stem cells. In culture, the stem cells form a variety of differentiated cell types. Differentiation is caused by the expression of tissue specific transcription factors, allowing the function of an enhancer sequence to be evaluated. (See for example Fiskerstrand et al., FEBS Lett 458: 171-174, 1999).

[0174] Methods to differentiate stem cells into neuronal cells include replacing a stem cell culture media with a media including basic fibroblast growth factor (bFGF), heparin, an N2 supplement (e.g., transferrin, insulin, progesterone, putrescine, and selenite), laminin and polyornithine. A process to produce myelinating oligodendrocytes from stem cells is described in Hu, et al., 2009, Nat. Protoc. 4: 1614-22. Bibel, et al., 2007, Nat. Protoc. 2: 1034-43 describes a protocol to produce glutamatergic neurons from stem cells while Chatzi, et al., 2009, Exp. Neurol. 217:407-16 describes a procedure to produce GABAergic neurons. This procedure includes exposing stem cells to all-trans-RA for three days.

[0175] U.S. Publication No. 2012/0329714 describes use of prolactin to increase neural stem cell numbers while U.S. Publication No. 2012/0308530 describes a culture surface with amino groups that promotes neuronal differentiation into neurons, astrocytes and oligodendrocytes. Thus, the fate of neural stem cells can be controlled by a variety of extracellular factors. Commonly used factors include brain-derived neurotrophic factor (BDNF; Shetty and Turner, 1998, J. Neurobiol. 35:395-425); fibroblast growth factor (bFGF; U.S. Pat. No. 5,766,948; FGF-1, FGF-2); Neurotrophin-3 (NT-3) and Neurotrophin-4 (NT-4; Caldwell, et al., 2001, Nat. Biotechnol. 19:475-9); ciliary neurotrophic factor (CNTF); BMP-2 (U.S. Pat. Nos. 5,948,428 and 6,001,654); isobutyl 3-methylxanthine; leukemia inhibitory factor (LIF; U.S. Patent No. 6,103,530); somatostatin; amphiregulin; neurotrophins (e.g., cyclic adenosine monophosphate; epidermal growth factor (EGF); dexamethasone (glucocorticoid hormone); forskolin; GDNF family receptor ligands; potassium; retinoic acid (U.S. Patent No. 6,395,546); tetanus toxin; and transforming growth factor-α and TGF-β (U.S. Pat. Nos. 5,851,832 and 5,753,506).

[0176] In particular embodiments, yeast one-hybrid systems may also be used to identify compounds that inhibit specific protein/DNA interactions, such as transcription factors for DLX2.0, minBglobin promoter, hSyn1 promoter, CMV promoter, hSyn1 promoter (shortened), 4x2C miR binding site, 8x2C miR binding site, and/or eHGT_078h.

[0177] Methods are also provided for administering a system of expression constructs to a subject in need thereof, which include administering a therapeutically effective amount of a system disclosed herein to a sample or subject comprising a targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.

[0178] In various embodiments, the subject in need thereof has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.

[0179] In various embodiments, the system of expression constructs is administered to a human subject or mammalian subject. In some embodiments, the system of expression constructs is administered to cells or tissue cultures obtained from, or derived from, a human subject or a mammalian subject.

EXAMPLES

[0180] The following examples are provided to better illustrate the claimed invention and are not to be interpreted as limiting the scope of the invention. To the extent that specific materials are mentioned, it is merely for purposes of illustration and is not intended to limit the invention. One skilled in the art may develop equivalent means or reactants without the exercise of inventive capacity and without departing from the scope of the invention.

[0181] AAV-MEDIATED INTERNEURON-SPECIFIC GENE REPLACEMENT FOR DRAVET SYNDROME

[0182] Dravet syndrome (DS) is a devastating developmental epileptic encephalopathy marked by treatment-resistant seizures, developmental delay, intellectual disability, motor deficits, and a 10-20% rate of premature death. Most DS patients harbor loss-of-function mutations in one copy of SCN1A, which has been associated with inhibitory neuron dysfunction. Here we developed an interneuron-targeting AAV human SCN1A gene replacement therapy using cell class-specific enhancers. We generated a split-intein fusion form of SCN1A to circumvent AAV packaging limitations and deliver SCN1A via a dual vector approach using cell class-specific enhancers. These constructs produced full-length NaV1.1 protein and functional sodium channels in HEK293 cells and in brain cells in vivo. After packaging these vectors into enhancer-AAVs and administering to mice, immunohistochemical analyses showed telencephalic GABAergic interneuron-specific and dose-dependent transgene biodistribution.

[0183] These vectors conferred strong dose-dependent protection against postnatal mortality and seizures in two DS mouse models carrying independent loss-of-function alleles of Scn1a, at two independent research sites, supporting the robustness of this approach. No mortality or toxicity was observed in wild-type mice injected with single vectors expressing either the N-terminal or C-terminal halves of SCN1A, or the dual vector system targeting interneurons. In contrast, nonselective neuronal targeting of SCN1A conferred less rescue against mortality and presented substantial preweaning lethality. These findings demonstrate that interneuron-specific AAV-mediated SCN1A gene replacement is sufficient for significant rescue in DS mouse models and show that it could be an effective therapeutic approach for patients with DS.

[0184] Dravet syndrome (DS) is a severe early-onset epileptic encephalopathy marked by spontaneous and febrile seizures, motor disabilities, cognitive dysfunction, developmental delay, and heightened risk of premature death by sudden unexpected death in epilepsy (SUDEP). DS affects approximately 1 in 16,000 births, usually manifests in the first year of life, and produces profound symptoms that require life-long care. Most first-line anti-epileptic drugs are ineffective or contraindicated for DS, although several recently approved drugs now partially ameliorate DS symptoms. Critically, no FDA-approved long-term disease-modifying treatments currently exist for DS despite extensive efforts. As a result, a treatment for DS is a pressing unmet need for patients and their caregivers.

[0185] Several aspects of DS pathophysiology have become clear. Over 80% of patients harbor monoallelic loss-of-function mutations in SCN1A, which encodes NaV1.1, one of the voltage-gated sodium channels expressed in brain. Mouse models with monoallelic Scn1a disruptions recapitulate the major clinical presentations of DS, confirming genetic causality in DS. These mouse models indicate DS is associated with loss of excitability in telencephalic fast-spiking interneurons, consistent with inhibition stimulation studies in patients.

[0186] Furthermore, targeted monoallelic Scn1a disruption in interneurons is sufficient for epileptic symptomology, which can be ameliorated by simultaneous disruption in excitatory neurons. Overall, these results indicate interneurons are the critical pathological cell population in DS.

[0187] Herein, we utilize enhancer-adeno-associated viruses (AAVs) to allow for interneuron-specific expression of a DS therapeutic transgene. To overcome the genome size constraint of AAVs (~4.7kb), which precludes delivery of the human SCN1A open reading frame (ORF, 6.0kb) in a single AAV, we split the SCN1A ORF into two fragments (two “halves”) that undergo intein-mediated ligation to reconstitute a scarless, full-length functional voltage-gated sodium channel NaV1.1.

[0188] In this study, we demonstrate functional rescue in DS mice by restoring SCN1A to telencephalic GABAergic interneurons using AAV vectors. With this split-intein mechanism and class-specific enhancers, we demonstrate interneuron-specific delivery and reconstitution of NaV1.1, which completely rescues mortality in DS model mice and confers strong resistance to epileptic seizures. Importantly, we show that expression of the transgenes or individual SCN1A halves in WT mice does not cause any overt toxicity. In contrast, we also find that nonselective neuronal expression of SCN1A generates early lethal toxicity. Together these data indicate that telencephalic GABAergic interneuron-specific expression of SCN1A could provide a safe and effective therapeutic for DS.

[0189] Split-intein fusion constructs produce full-length functional NaV1.1 channels.

[0190] The open reading frame for human SCN1A (6.0 kb) is larger than the packaging limit of recombinant AAVs (~4.7kb); this has thus far prevented an AAV gene replacement therapy for Dravet syndrome. To overcome this challenge, we divided the gene into two halves and used split-intein protein splicing to reconstitute the gene product (NaV1.1 channel) after translation. We designed and built DNA constructs using this approach for a dual vector gene replacement therapy for Dravet syndrome in mice.

[0191] We developed split-intein DNA constructs using the predominantly expressed 1998-amino-acid SCN1A isoform (Fig. 8A-8E) sequence, which was codon optimized to maximize expression. We placed the split-intein breakpoint directly upstream of Cys1050 since intein-mediated protein ligation requires the presence of a cysteine residue adjacent to the split site. We utilized the Cfa split-intein, which was engineered for rapid activity and chemical stability. The N- and C-terminal parts of the Cfa split-intein were respectively fused to the N- and C-terminal halves of the split SCN1A gene (Fig. 1A). To the recombinant split-intein SCN1A halves, we added optimized short intron sequences within each coding sequence for improved expression, as well as HA and FLAG epitope tags for immunodetection (Fig. 1A, 1B). Exemplary short intron sequences added are highlighted in italics or underlined in ‘hSCN1A-CO-Nterm1049-Intron,’ ‘hSCN1A-CO-Cterm949-Intron,’ ‘hSCN1A-CO-Nterm956-Intron,’ ‘hSCN1A-CO-Cterm1042-Intron,’ ‘hSCN1A-CO-Nterm947-Intron,’ and ‘hSCN1A-CO-Cterm1051-Intron’; or has a polynucleotide sequence of

GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGG ATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA CCTCTTATCTTCCTCCCACAG (SEQ ID NO: 107).
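The size arithmetic behind the split described in [0190]-[0191] can be sketched as follows. This is only an illustrative check, not part of the disclosed method: it assumes the sizes count coding sequence alone (three nucleotides per residue) and ignores the enhancer, intein, epitope tags, introns, and polyadenylation signal, whose lengths are not stated here. The function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the SCN1A split, using numbers from the text:
# a 1998-residue isoform, a breakpoint directly upstream of Cys1050, and an
# AAV packaging limit of roughly 4.7 kb.
# ASSUMPTION: only coding sequence is counted (3 nt per residue); regulatory
# elements, intein, tags, and introns are NOT included.

AAV_LIMIT_BP = 4700
TOTAL_RESIDUES = 1998
SPLIT_BEFORE_RESIDUE = 1050  # Cys1050 is the first residue of the C-terminal half


def half_sizes(total_residues: int, split_before: int) -> dict:
    """Return residue and base-pair counts for the N- and C-terminal halves."""
    n_res = split_before - 1          # residues 1..1049
    c_res = total_residues - n_res    # residues 1050..1998
    return {
        "N": {"residues": n_res, "bp": 3 * n_res},
        "C": {"residues": c_res, "bp": 3 * c_res},
    }


def headroom(bp: int, limit: int = AAV_LIMIT_BP) -> int:
    """Base pairs left under the packaging limit for non-coding elements."""
    return limit - bp


sizes = half_sizes(TOTAL_RESIDUES, SPLIT_BEFORE_RESIDUE)
full_orf_bp = 3 * TOTAL_RESIDUES

print(f"full ORF: {full_orf_bp} bp, fits in one AAV: {full_orf_bp <= AAV_LIMIT_BP}")
for name, s in sizes.items():
    print(f"{name} half: {s['residues']} aa, {s['bp']} bp, headroom {headroom(s['bp'])} bp")
```

Under these assumptions the halves are 1049 and 949 residues, matching the construct names ‘Nterm1049’ and ‘Cterm949’, and each coding half fits under the limit whereas the full ORF does not.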

[0192] To confirm intein-mediated reconstitution of full-length NaV1.1 channels, the SCN1A gene halves were expressed in HEK-293 cells using the CMV promoter (Fig. 1B, 1C). By western blot analysis, cells transfected with HA-tagged full-length SCN1A produced a single band with an apparent mass of approximately 250 kDa (predicted mass 232 kDa, Fig. 1C Lane 2). Those transfected with the HA-tagged SCN1A-Ntm half produced a band with an apparent mass of approximately 135 kDa (predicted mass 134 kDa, Fig. 1C Lane 4). However, cells transfected with both SCN1A-Ntm and SCN1A-Ctm halves demonstrated a strong band at approximately 250 kDa and only a weak band at approximately 135 kDa (Fig. 1C Lane 6). These results demonstrate that our constructs led to efficient intein-directed ligation of NaV1.1 protein halves.

[0193] To examine whether reconstituted NaV1.1 protein derived from split-intein SCN1A halves was functional, we assessed HEK-293 cells expressing these constructs by whole-cell patch clamp electrophysiology. Fluorescent protein reporters appended to each SCN1A expression construct were used to identify transfected cells and confirm expression (Fig. 1B). To promote NaV1.1 channel cell surface expression, constructs were co-transfected with NaV1.1 β-subunits (SCN1B/SCN2B). We observed that cells expressing full-length SCN1A (SCN1A-FL) constructs show rapid and large inward sodium currents in response to depolarizing voltage steps (Fig. 1D). Cells expressing both SCN1A-Ntm and SCN1A-Ctm (SCN1A-N+C) also showed large inward sodium currents, but not cells expressing either half alone or empty vector (Fig. 1D). Quantification revealed significantly greater capacitance-normalized peak current densities in SCN1A-N+C cells (median current density 130.6 pA/pF) than in cells expressing either SCN1A-Ntm or SCN1A-Ctm alone or negative controls (SCN1A-N 4.5 pA/pF, p< 0.0001; SCN1A-C 3.3 pA/pF, p< 0.0001; GFP/empty vector 3.7 pA/pF, p= 0.0084; β-subunits alone 2.3 pA/pF, p= 0.0038, all pairwise Mann-Whitney U tests, Fig. 1E). Median peak current density for SCN1A-N+C cells (130.6 pA/pF) was significantly less than that observed in SCN1A-FL cells (232.0 pA/pF) (p= 0.0083, pairwise Mann-Whitney U test), possibly because less of each DNA construct was used for transfections (0.4 µg each of SCN1A-Ntm and SCN1A-Ctm, versus 0.8 µg for SCN1A-FL). Examinations of the records at faster time scales show that both SCN1A-FL and SCN1A-N+C cells expressed similarly rapidly activating and inactivating inward currents in response to step depolarizations (Fig. 1F).

[0194] To quantify whether the current mediated by split-intein reconstructed channels exhibits the known normal voltage-dependent gating properties of NaV1.1 channels, we analyzed conductance-voltage (G/V) activation and steady-state inactivation (SSI) relationships. Indistinguishable G/V activation plots were exhibited by cells transfected with SCN1A-FL (V50= -27.9 mV, slope= 6.2) or SCN1A-N+C (V50= -27.3 mV, slope= 5.9), similar to previously reported full-length SCN1A (V50= -26.4 mV, slope= 7.1). Steady-state inactivation profiles for SCN1A-FL and SCN1A-N+C also were both similar to that previously reported for full-length SCN1A (V50= -67.5 mV, slope= -6.2). The SSI profile for reconstituted SCN1A-N+C currents (V50= -64.1 mV, slope= -6.0) was observed to be slightly but significantly depolarized by approximately 5 mV relative to SCN1A-FL (V50= -69.4 mV, slope= -6.0) (Fig. 1G). Overall, these results demonstrate that functional NaV1.1 sodium channels are efficiently reconstituted from our two half SCN1A split-intein constructs, with similar voltage-dependent gating properties to full-length SCN1A, and this permits delivery of NaV1.1 channel activity from two AAV-sized vectors.
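To make the reported V50 and slope values concrete, the following sketch evaluates a single-Boltzmann curve at a few test voltages. The text does not state the exact fitting equation, so the form used here, 1 / (1 + exp((V50 - V) / k)), is an assumption; it is a common convention in which a positive slope factor k gives a rising activation curve and a negative k gives a falling inactivation curve, consistent with the signs reported above. The function name and the dictionary of fits are hypothetical.

```python
import math


def boltzmann(v_mv: float, v50_mv: float, slope_mv: float) -> float:
    """Fraction of maximum at voltage v_mv.

    ASSUMED form: 1 / (1 + exp((V50 - V) / k)). With k > 0 the curve rises
    with depolarization (activation); with k < 0 it falls (inactivation).
    """
    return 1.0 / (1.0 + math.exp((v50_mv - v_mv) / slope_mv))


# (V50 in mV, slope factor in mV) as reported in the text
FITS = {
    "SCN1A-FL activation": (-27.9, 6.2),
    "SCN1A-N+C activation": (-27.3, 5.9),
    "SCN1A-FL inactivation": (-69.4, -6.0),
    "SCN1A-N+C inactivation": (-64.1, -6.0),
}

for label, (v50, k) in FITS.items():
    curve = [round(boltzmann(v, v50, k), 3) for v in (-100, -80, -60, -40, -20, 0)]
    print(f"{label}: {curve}")

# Depolarizing shift of the SSI midpoint for the reconstituted channel
ssi_shift_mv = FITS["SCN1A-N+C inactivation"][0] - FITS["SCN1A-FL inactivation"][0]
print(f"SSI V50 shift: {ssi_shift_mv:.1f} mV")
```

By construction each curve passes through 0.5 at its V50, so the two activation curves nearly coincide while the two inactivation curves are offset by the difference in their midpoints.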

[0195] DLX2.0-driven intein-SCN1A vectors specifically transduce telencephalic GABAergic interneurons.

[0196] We cloned the split-intein fusion protein halves HA-SCN1A-Ntm and SCN1A-Ctm-FLAG into AAV plasmid vectors under control of an optimized hDLXI5/6i enhancer (“DLX2.0”) that drives transgene expression in telencephalic GABAergic interneurons in mice and in human organotypic slice cultures. These constructs were packaged into research-grade AAV2/PHP.eB viral vectors at PackGene Inc. (Fig. 2A). Neonatal mice were injected at postnatal day 2 (P2) with 5 µL total volume of these vectors delivered bilaterally via intracerebroventricular (ICV) route at a dose of 3e10 genome copies (gc) of each half. After twenty days, we analyzed mouse motor cortex membrane protein content by western blot to assess efficiency of half joining by intein-mediated fusion. Mice injected with either DLX2.0-HA-SCN1A-Ntm or DLX2.0-SCN1A-Ctm-FLAG PHP.eB AAVs alone showed weak HA- or FLAG-immunoreactive bands near the expected sizes of their protein products (N-term predicted size 134 kDa, C-term predicted size 115 kDa, Fig. 2B). The weakness of the unjoined SCN1A half-protein bands in our membrane protein fraction is likely a technical artifact of the western blot membrane protein preparation, since half proteins alone are efficiently detected by immunohistochemistry (IHC) with no obvious change in subcellular distribution when codelivered (see Fig. 2C). Regardless, when we co-injected viral vectors for both halves, we observed strong bands near the expected apparent size of intact NaV1.1 near 250 kDa, demonstrating that intein-mediated reconstitution of full-length human NaV1.1 occurs in mouse telencephalic GABAergic interneurons (Fig. 2B).

[0197] We also analyzed the biodistribution of transgene expression by IHC in animals injected with these vectors. We observed strong HA and FLAG immunoreactivity in scattered telencephalic neurons of mice injected with each vector alone, or when injected together (Fig. 2C). Both HA- and FLAG-expressing cells were present throughout the telencephalon after BL-ICV delivery (Fig. 2D). Co-staining with the GABAergic neuron markers GABA and Gad67 demonstrated high overlap of HA- and FLAG-expressing cells (Fig. 2C, 2E). Quantitative analysis of these IHC data showed that 98-99% of cells co-expressing HA and FLAG were Gad67+ in several telencephalic regions (Fig. 2F), indicating high specificity. Additionally, HA+FLAG+ expression was observed in a substantial proportion of the telencephalic Gad67+ GABAergic neuron population in different brain regions, including those known to be involved in seizure generation such as hippocampus and cortex (mean ± standard deviation: VISp 57 ± 18%, MO 57 ± 10%, HPF 31 ± 11%, n= 8 mice, Fig. 2F). Using a separate cohort of Dlx5/6-Cre;Ai14 mice to provide an independent label for telencephalic GABAergic interneurons, we observed similar high levels of specificity and moderate completeness of AAV transduction (Fig. 9A-9C). These data demonstrate that our vectors effectively deliver both the C- and N-terminal split-intein SCN1A halves with high specificity and moderate coverage for telencephalic GABAergic interneurons in mice. Together with our western blot and electrophysiology results above, these findings indicate that upon their expression, the two protein products of SCN1A halves fuse using the split-intein, leading to expression of functional full-length NaV1.1 channels in telencephalic GABAergic interneurons.
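The two transduction metrics used in [0197] can be written out explicitly. The sketch below assumes the definitions implied by the text: specificity is the fraction of HA+FLAG+ cells that are also Gad67+, and completeness is the fraction of Gad67+ cells that are HA+FLAG+. The cell counts are hypothetical illustration values, not data from this study, and the function names are not from the source.

```python
def specificity(double_pos_and_marker: int, double_pos_total: int) -> float:
    """Fraction of transgene-expressing (HA+FLAG+) cells that carry the marker."""
    return double_pos_and_marker / double_pos_total


def completeness(double_pos_and_marker: int, marker_total: int) -> float:
    """Fraction of marker-positive (e.g. Gad67+) cells that express the transgene."""
    return double_pos_and_marker / marker_total


# HYPOTHETICAL counts for one imaged region (not from the source)
counts = {
    "HA+FLAG+": 200,          # all cells co-expressing both epitope tags
    "HA+FLAG+Gad67+": 197,    # of those, cells also positive for Gad67
    "Gad67+": 350,            # all Gad67+ cells in the region
}

spec = specificity(counts["HA+FLAG+Gad67+"], counts["HA+FLAG+"])
comp = completeness(counts["HA+FLAG+Gad67+"], counts["Gad67+"])
print(f"specificity {spec:.1%}, completeness {comp:.1%}")
```

Keeping the two ratios separate makes the distinction in the text explicit: a vector can label almost exclusively the intended cell class (high specificity) while reaching only part of that class (moderate completeness).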

[0198] Dual DLX2.0 split-intein SCN1A vectors protect against Sudden Unexpected Death in Epilepsy (SUDEP) in DS mouse models.

[0199] With the ability to deliver SCN1A with cell class specificity, we next tested whether telencephalic GABAergic interneuron-specific SCN1A gene replacement could rescue DS symptoms in mouse models. We used Scn1a+/fl;Meox2-Cre DS mice bred on a pure C57BL/6 background housed at Seattle Children’s Research Institute. These animals demonstrate ~50% mortality due to SUDEP by P70 (young adulthood), similar to other DS model lines, analogous to the ~15% rate of SUDEP in DS patients. We injected a cohort of neonatal DS model mice (BL-ICV at P0-P3 with 3e10 gc each vector of DLX2.0-split-intein-SCN1A produced at PackGene Inc.) and measured mortality and susceptibility to thermally induced seizures in these animals (Fig. 3A). Remarkably, all treated DS mice (n= 27/27) survived beyond P70, as opposed to negative control mice either untreated or receiving control AAVs (empty or single part-only vectors), which showed significantly higher mortality (Untreated n= 34/68, 50% mortality, Log-rank test p= 1.4e-5; Empty/single-part n= 15/40, 37.5% mortality, Log-rank test p= 6.6e-4; Fig. 3B). No side effects were observed in littermate control mice that received dual vector treatment. Strikingly, a subset of the treated DS animals was maintained longer, and 100% survived to beyond P200 (n= 11/11). These findings demonstrate that telencephalic GABAergic interneuron supplementation of AAV-mediated SCN1A transgene is sufficient to completely prevent SUDEP in DS model mice.

[0200] Dual DLX2.0 split-intein SCN1A vectors protect against thermal and spontaneous myoclonic and generalized tonic-clonic seizures in DS mice.

[0201] DS model mice are sensitive to thermally induced seizures, similar to patients with DS. We previously showed that small elevations of core body temperature trigger several myoclonic (MC) seizures, leading ultimately to a generalized tonic-clonic (GTC) seizure in DS mice. To examine the efficacy of our DLX2.0-SCN1A vectors in preventing thermally induced MC and GTC seizures, we analyzed DS mice between P25 and P35 (Fig. 3A). First, video analysis revealed that mice treated with dual DLX2.0-SCN1A AAVs were protected from MC seizures up to 40°C (n= 16/16), at which temperature fewer than half of negative control mice were MC seizure-free (Untreated n= 5/11, Empty/single part AAVs n= 9/23, Fig. 3C). With further increase in temperature, a growing fraction of the treated mice began to exhibit MC seizures, reaching 50% of treated subjects at 42°C (n= 8/16), but at this temperature significantly greater numbers of negative control mice experienced MC seizures (Untreated n= 9/11, Fisher’s exact test p= 0.042; Empty/single part AAVs n= 21/23, Fisher’s exact test p= 0.011, Fig. 3C). Second, we also observed that treatment conferred GTC seizure protection in this assay. All treated DS mice were free from GTC seizures up to 41.5°C (n= 16/16, 100%, Fig. 3D), and the majority were protected from GTC seizures at 42°C, the last stage of the test (n= 15/16, 94%), which was significantly greater than negative control treatment groups (Untreated n= 4/11, Fisher’s exact test p= 0.0025; Empty/single part AAVs n= 7/23, Fisher’s exact test p= 0.00010, Fig. 3D). Finally, assessment of all the MC seizures preceding a GTC seizure showed that the cumulative MC event number before GTC seizure onset was significantly suppressed when DS mice were treated with dual DLX2.0-SCN1A AAVs compared to control AAV treatment or untreated controls (Untreated or Empty/single versus DLX2.0-SCN1A, unpaired t-test p< 0.01 at each temperature from 39.5 to 41°C, Fig. 3E). These findings demonstrate that DLX2.0-mediated delivery of SCN1A transgene specifically to telencephalic GABAergic interneurons yields substantial protection from thermally induced seizures.

[0202] To assess the effectiveness of DLX2.0-SCN1A AAV vectors to protect against spontaneous epileptic symptoms, we implanted untreated and treated DS model mice with electrodes for electrocorticography (ECoG) recordings. Untreated DS model mice displayed high-amplitude interictal spikes (Fig. 4A), and the frequency of these spikes was significantly diminished by dual DLX2.0-SCN1A AAV treatment (ANOVA p< 0.05, Fig. 4B). We also observed MC events during recordings of untreated DS model mice (n= 10/10, Fig. 4C), which were absent in recordings from treated DS model mice (n= 0/10, p= 7.1e-4 Fisher’s exact test, Fig. 4D). Finally, some untreated DS model mice exhibited spontaneous generalized tonic-clonic (GTC) seizures during recording (n= 2/10, Fig. 4E), but none of the treated animals exhibited GTCs (n= 0/10, Fig. 4F), although this effect was not significant (p= 0.47 Fisher’s exact test). These results indicate that dual DLX2.0-SCN1A AAV treatment can yield protection against spontaneous epileptic symptoms in DS model mice.

[0203] Reproducibility across AAV batches.

[0204] To confirm these findings, we tested independent batches of DLX2.0-split-intein-SCN1A vectors produced in-house at Allen Institute, and we observed dose-dependent specific expression in telencephalic GABAergic interneurons (Fig. 10A), which correlated with dose-dependent increases in survival (Fig. 10C) and thermal seizure protection (Fig. 10D, 10E). These results show reproducibility of epileptic rescue with independent batches of vector, although this in-house-packaged vector showed slightly reduced levels of transduction and rescue as compared to the commercially produced vector.

[0205] Reproducibility across mouse models and testing sites.

[0206] To further confirm that the therapeutic effects were robust, we retested the PackGene batch of DLX2.0-SCN1A AAV vectors in a second independent Scn1a+/R613X mouse model cohort on 129/BL6 F1 background housed at the Allen Institute for Brain Science (Fig. 11A). We observed extensive premature lethality by P70 (n= 19/33, 58%) in uninjected mice, but 100% survival in mice injected with the dual DLX2.0-SCN1A AAVs (n= 30, p< 0.001 by Mantel-Cox test, Fig. 11B). After mortality monitoring, some mice underwent ECoG recordings, which revealed that untreated DS model mice demonstrate spontaneous GTC seizures and interictal spikes between P70 and P120 (Fig. 11C), which were both significantly reduced by the administration of DLX2.0-SCN1A vectors (GTCs, Mann-Whitney U test p= 0.020; spikes, Mann-Whitney U test p= 0.033; Fig. 11C-11E). A separate subset of this cohort was monitored for long-term survival, and the dual-vector injected DS model mice exhibited 100% survival to beyond P365 (n= 14/14), with no evidence of toxicity in littermate control animals receiving either or both halves of the DLX2.0-SCN1A dual AAV vector system (Fig. 11B). Thus, DLX2.0-SCN1A AAV vectors conferred strong and reproducible protection from mortality, induced seizures, and spontaneous seizure burden in two independent genetic mouse models of DS, although this protective effect of the AAV treatment is dose- and AAV quality-dependent.

[0207] Dual DLX2.0 vectors completely protect against SUDEP and seizures in mice with telencephalic GABAergic interneuron-specific Scn1a deletion.

[0208] Much more severe mortality has been seen with specific Scn1a loss in interneurons, likely due to a disrupted excitatory/inhibitory balance. To further investigate the effectiveness of the dual vector DLX2.0-SCN1A AAV, we delivered the AAVs to mice carrying the disease-causing mutation in the same telencephalic GABAergic interneuron population. These mice were generated by crossing mice carrying the Dlx5/6-Cre allele with mice carrying floxed Scn1a (Fig. 5A). Since the site of disease pathogenesis precisely matches the treatment target, in this experiment we directly tested the therapeutic effectiveness of the DLX2.0-SCN1A AAV vectors in a more precise rescue scenario. Scn1a+/fl;Dlx5/6-Cre mice were injected with dual DLX2.0-SCN1A AAVs via BL-ICV route at P0-3, and these and untreated controls were monitored for premature death up to P70. Untreated mice showed severe mortality starting at postnatal week 3, with all the mice succumbing by week 6 (n= 31/31, Fig. 5B). In striking contrast, all mice treated with dual DLX2.0-SCN1A AAVs survived up to P70 (n= 9/9, p= 3.3e-14, Fisher’s exact test, Fig. 5B), despite the more severe adverse phenotype compared to mice with a global heterozygous loss of Scn1a. Furthermore, none of the treated mice exhibited either MC or GTC seizures during thermal challenge up to 42°C, unlike untreated animals (MCs: n= 8/10 untreated versus 0/9 treated, p= 7.1e-4; GTCs: n= 10/10 untreated versus 0/9 treated, p= 1.1e-5; both Fisher’s exact test; Fig. 5C). These data demonstrate that rescue of DS phenotypes is possible when the therapeutic transgene is precisely delivered to the critical cell populations carrying disease-causing mutations, even in the face of more severe symptoms.

[0209] hSyn1-driven vectors lead to nonselective neuronal expression of SCN1A.

[0210] To determine whether telencephalic GABAergic interneuron-selective targeting is beneficial for gene replacement therapy in DS, as a comparator we produced and tested split-intein vectors driven by hSyn1, which expresses in most brain neuronal populations, including excitatory and inhibitory neurons (Fig. 6A). Constructs for these vectors were built and packaged in the same way as DLX2.0 ones, except that the hSyn1 promoter was used in this case. Delivered by BL-ICV at P2, these hSyn1 split-intein vectors led to reconstitution of full-length NaV1.1 in mouse brain (Fig. 6B), with expression observed in both Gad67+NeuN+ and Gad67-NeuN+ neurons (Fig. 6C). This expression pattern was observed throughout the telencephalon, with little expression seen in subtelencephalic structures, likely due to the forebrain-biased delivery route (Fig. 6D). Quantification confirmed all labeled cells to be NeuN+ neurons (Fig. 6E-6F). Average levels of completeness for HA+FLAG+ cells ranged from 6-18% NeuN+ cells and 13-33% Gad67+ cells, depending on the telencephalic structure analyzed (Fig. 6F).

[0211] Dual hSyn1 nonselective neuronal AAV vectors led to pre-weaning mortality.

[0212] To characterize the efficacy of nonselective neuronal SCN1A AAVs to prevent SUDEP, we conducted spontaneous mortality surveillance in Scn1a+/fl;Meox2-Cre DS model mice after BL-ICV injection of dual N+C AAVs at P0-3, as compared to untreated or empty or single part control animals (Fig. 7A). During the preweaning weeks, mice treated with dual nonselective AAVs exhibited a surfeit of unexpected deaths. The extent of pre-weaning mortality by P21 was dose-dependent and significantly greater than that observed under negative control conditions (low dose 1e10 gc each vector, n= 9/36 [25%] deaths, Fisher’s exact test untreated comparison p= 3.6e-4; high dose 3e10 gc each vector, n= 8/18 [44%] deaths, Fisher’s exact test untreated comparison p= 6.6e-6, Fig. 7B). Since it was not possible to recover the lost pups, the census of DS and control mice during the pre-weaning stage was estimated based on the number of DS and control mice identified after genotyping at P21 in this experiment and the number of DS and control mice observed among untreated mice in our colony in the prior 6-month period. This analysis did not reveal any detectable influence of genotype on nonselective hSyn1 SCN1A-induced mortality during the preweaning period (Fig. 7C), and pathology of surviving nonselective SCN1A-expressing brains indicates no obvious microgliosis but greater DS-associated astrogliosis. For the latter, we sacrificed Scn1a+/fl;Meox2-Cre DS model and Cre-negative littermate control mice following mortality monitoring after P2 BL-ICV injections (3e10 gc each vector) of hSyn1-driven N+C SCN1A or DLX2.0-driven N-only or C-only single-part negative controls as indicated. Animal ages ranged from P72-P89, and conditions represent two (littermate control) or three (DS model) animals analyzed per condition. We analyzed brains by IHC to assess astrogliosis (GFAP) or microgliosis (Iba1) in cortex (VISp shown). For DS model mice, two example mice spanning the range of astrogliosis observed are shown. Nonselective neuronal expression of SCN1A exacerbated astrogliosis in DS model mice but did not cause overt changes in microglial appearance.

[0213] In the post-weaning period, we did not observe significant effects on survival or average age of death with either the high-dose or low-dose nonselective SCN1A AAV treatments as compared to the untreated or control AAV-treated mice (Fig. 7B). Together these findings indicate that nonselective neuronal expression of SCN1A offers little protection from SUDEP and, concerningly, has a dose-dependent mortality side effect during the pre-weaning period.

[0214] Dual hSyn1 AAV vectors confer partial protection against thermal MC and GTC seizures.

[0215] To examine whether the nonselective AAV-mediated gene therapy might counter thermal seizure susceptibility in DS mice, we also tested these mice with the thermal seizure induction protocol. In surviving mice treated with the high dose of dual hSyn1 AAVs, as compared to empty/single part negative controls, we observed significant protection from thermally induced MC seizures (Log-rank test p= 0.020, Fig. 7D) and from thermal GTC seizures (Log-rank test p= 0.047, Fig. 7E), as well as greatly lessened cumulative MC seizure load during the thermal challenge (Untreated or Empty/single versus Treated, unpaired t-test p< 0.05 at each temperature from 40 to 41°C, Fig. 7F). In contrast, we observed no protection from heat-induced MC or GTC seizures in mice treated with the lower dose of the dual AAVs compared to negative control DS mice (Fig. 7C, 7D-7F). Thus, despite the early mortality induced by nonselective neuronal SCN1A AAVs, they can offer protection from thermally induced MC and GTC seizures at the higher dose in surviving treated adults.

[0216] Overall, in this study we use an AAV viral vector system to deliver functional human NaV1.1 to pathological telencephalic GABAergic interneurons in DS model mice, using the optimized enhancer DLX2.0 to achieve high specificity. Importantly, we find not only that delivery to telencephalic GABAergic interneurons is sufficient for strong rescue of epileptic symptoms, but also that this specificity is required to deliver human NaV1.1 in a manner that can be tolerated. DLX2.0-SCN1A AAV vectors achieve long-term recovery of DS mortality (from 200 days to one year) in two independent DS mouse models at two independent testing sites. This robust mortality protection correlates with strong anti-seizure effect, indicating the mechanism behind mortality rescue is seizure reduction due to resupplying NaV1.1 voltage-gated sodium channel activity.

[0217] The cell types sufficient to rescue DS epilepsy using SCN1A gene replacement.

[0218] Mouse models of DS indicate that epileptic symptomology is driven by a disruption of the excitatory/inhibitory balance, and congruently, patients display disrupted inhibition in the cortical microcircuit. Our results confirm the hypothesis that DS epilepsy is primarily a disease of the interneurons, and that interneuron targeting is an effective therapeutic strategy. Within this cell class, several subclasses of telencephalic GABAergic interneurons may contribute to the disease. In particular, PVALB-expressing interneurons are thought to promote beneficial gamma rhythms, possibly through their Scn1a-dependent fast-spiking behavior. However, SST-expressing and other telencephalic interneuron populations may also contribute to the epileptic phenotype. Importantly, the DLX2.0 enhancer used in this study targets both PVALB and SST and also other telencephalic interneuron populations, in both mouse and human tissue, which may explain the strong anti-epileptic effect of DLX2.0-SCN1A.

[0219] A unique SCN1A-targeting disease-modifying strategy for DS.

[0220] Drug development to find pathway modulators that can overcome deficient NaV1.1 has been challenging. In preferable embodiments, we do not utilize high-capacity adenoviral vectors (HC-AdVs, also known as Helper-Dependent or “gutless”), which are devoid of all viral coding genes but offer limited biodistribution. Our AAV-mediated and telencephalic GABAergic interneuron-selective gene replacement strategy is unique in several ways. First, our vectors express a new copy of the SCN1A gene at moderate levels, and do not upregulate the endogenous SCN1A allele, a strategy that could be ineffective or harmful for certain disease alleles. Second, the vectors target telencephalic GABAergic interneurons, the most essential cell type, whereas antisense oligonucleotides lack targeting ability. Since SCN1A is expressed in both excitatory and inhibitory neurons, activation of SCN1A in excitatory neurons could attenuate the corrective effect of SCN1A upregulation. Last, using AAV delivery, with the PHP.eB capsid and neonatal ICV injection, widespread viral transduction of telencephalic GABAergic interneurons is achieved. We are not aware of a vector that can give superior widespread transduction to the brain. None of the previous studies delivering or activating NaV1.1 using exogenous agents have demonstrated complete recovery of mortality as we have observed. Moreover, we have demonstrated robust rescue in two genetic models, at two research sites, and even in a severe Dlx5/6-Cre-driven knockout model. Thus, gene replacement, cell type specificity, and broad coverage of telencephalic interneurons provide a unique and highly effective treatment for DS.

[0221] In some embodiments, for human DS patients, the split-intein fused SCN1A halves are delivered in advanced BBB-penetrant AAV capsids. For example, the AAV capsids comprise AAV-PHP.eB, which efficiently transduces the central nervous system. In another example, the AAV capsids comprise AAV-PHP.S, which efficiently transduces the peripheral nervous system. In some embodiments, for human patients, the split-intein fused SCN1A halves delivered in AAV capsids are administered locally with intraparenchymal delivery.

[0222] We have not observed signs of toxicity over one year in mouse brain, indicating the intein fusion proteins delivered with cell type specificity may be safe. Other regulatory elements that confer a broader distribution pattern extending to sub-telencephalic regions may be used to further treat non-epileptic DS. For example, regulatory elements known to be capable of restricting expression to interneurons include the distal-less homeobox 5 and 6 (Dlx5/6) genes, which are specifically expressed by all forebrain GABAergic interneurons during embryonic development. These genes have an inverted orientation relative to one another and share a 400bp (mI56i or mDlx) and a 300bp (mI56ii) enhancer sequence in the 10kb non-coding intergenic region 3' to each of them. The high degree of conservation of these sequences across vertebrate species is indicative of an important role in gene regulation. Indeed, the mDlx enhancer can be used to target reporter genes in a pattern very similar to the normal patterns of Dlx5/6 expression during embryonic development, e.g., selectively expressed within GABAergic interneurons in a wide variety of vertebrate species.

[0223] We demonstrate proof-of-concept for AAV-mediated SCN1A gene replacement therapy in DS, which requires cell class specificity for safety and efficacy. These results and these vectors represent an important step towards a gene replacement therapy for DS patients, and possibly for other conditions of pathologically insufficient sodium channel function.

[0224] Materials and Techniques

[0225] Study design.

[0226] In this study, we tested whether SCN1A gene replacement was possible using a dual-vector system to deliver both halves of the molecule and split-intein technology to fuse the expressed halves into a single, scarless, full-length protein. We tested whether SCN1A could be expressed in a circuit-selective fashion using enhancer-AAVs to deliver the transgenes to telencephalic GABAergic interneurons or to all neurons. Then, we tested whether these circuit-selective enhancer-AAVs were sufficient to correct major epileptic phenotypes of mouse models of DS. We selected three major phenotypes for testing that correspond to clinical phenotypes seen in DS: mortality, heat-induced seizures, and spontaneous seizures. In vitro studies were conducted with multiple biological replicates to demonstrate that full-length, functional NaV1.1 channels were formed. NaV1.1 assembly in GABAergic interneurons was demonstrated in vivo in mouse brains in multiple replicates using viruses from two sources. Circuit specificity and completeness were demonstrated by immunohistochemistry, and full-length NaV1.1 assembly by western blot analysis. To test in vivo efficacy, litters of mice were treated between P0 and P3 with the dual viruses or single-virus controls by ICV administration prior to genotyping. In vivo efficacy was further confirmed at a second site in a genetically distinct DS model. Animal sexes were noted, but no differences in efficacy were observed. All experiments were run across multiple litters, and sufficient animals were injected and evaluated to draw statistically significant conclusions. Dual vectors produced from two sources were tested in vivo for efficacy. Researchers were not blinded to the animal genotypes after they were determined, or to the vectors delivered. No data were excluded from the study. For statistical analysis, data are reported as means ± standard error of the mean, Kaplan-Meier survival plots, or grouped bar graphs. Comparisons across continuously distributed grouped data were prefaced by Shapiro-Wilk tests for distribution normality. When normality assumptions held, the groups were compared by unpaired t-tests or ANOVA; when they did not hold, the groups were compared by Mann-Whitney U test. To compare Kaplan-Meier survivorship, we used Mantel-Cox or log-rank tests. To compare groups of categorical count data, we used Fisher’s exact tests. Differences were considered significant at p < 0.05, with Bonferroni adjustments to significance thresholds for comparisons across multiple groups.
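
As an illustration of the categorical comparison and the multiple-comparison adjustment described above, the following minimal Python sketch computes a two-sided Fisher's exact p-value for a 2x2 table and compares it with a Bonferroni-adjusted threshold. The counts, the number of comparisons, and the function names are hypothetical and are not data from this study. The two-sided p-value is taken here as the sum of the probabilities of all tables no more likely than the observed one, which is one common convention and an assumption of this sketch.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is at most as probable as the observed table
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_comparisons

# Hypothetical counts: animals with / without a seizure in two groups
p = fisher_exact_two_sided(1, 9, 8, 2)
threshold = bonferroni_threshold(0.05, 3)   # hypothetical: three group comparisons
significant = p < threshold
```

With more groups, only `n_comparisons` changes; the unadjusted threshold of 0.05 is the one stated in the description.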

[0227] Mice at Seattle Children’s Research Institute (SCRI).

[0228] For studies conducted at SCRI, all experimental procedures were conducted in compliance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and were approved by the Institutional Animal Care and Use Committee (IACUC) of the Seattle Children’s Research Institute under protocol ACU000108 (PI-Kalume). WT and mutant mice within litters were used in these experiments. They were subjected to a treatment or control paradigm as described below. Both male and female mice were included in the studies. Each litter was randomly assigned to the treatment or control group.

[0229] Mice at SCRI were maintained in standard cages for laboratory mice, on a 12h:12h light-dark cycle, with ad libitum access to food and water, at 23 °C. DS mouse models and their littermates used in these studies were generated using Cre-Lox technology. DS mice carrying a whole-body heterozygous knock-out of Scn1a were obtained by breeding floxed Scn1a mice with Meox2-Cre mice (Strain #: 003755; Jackson Laboratories). DS mice carrying an Scn1a KO allele restricted to forebrain GABAergic neurons were generated by breeding floxed Scn1a mice with Dlx5/6-Cre mice (Strain #: 008199; Jackson Laboratories). All breeder mice were maintained on a C57BL/6J background for at least 10 generations. Animals were genotyped for the Scn1a floxed allele using the following primers: FHY311 (5’-CTTGATGTGTTGAAATTCAC-3’ (SEQ ID NO: 103)) and FHY314 (5’-TATAGAGTGTTTAATCTCAAC-3’ (SEQ ID NO: 104)), which yielded an 846-bp WT allele, a 1019-bp floxed allele, and a 258-bp excised allele. For the Cre alleles, the following primers were used: (5’-GGTTTCCCGCAGAACCTGAA-3’ (SEQ ID NO: 105)) and (5’-CCATCGCTCGACCAGTTTAGT-3’ (SEQ ID NO: 106)) (Jackson Laboratories).

[0230] Thermal seizure test.

[0231] Mouse core body temperature was monitored and controlled using a rectal temperature probe (RET4) and a heat lamp, both connected to a temperature controller in a feedback loop (Physitemp Instruments Inc.). Baseline body temperature was measured and, subsequently, the temperature was raised gradually in 0.5°C increments every 2 minutes until a seizure occurred or a temperature of 42°C was attained. Then, the mouse was immediately cooled down using a small fan. Mouse behavior during the whole test was recorded using a digital video camera and reviewed for MC and GTC seizure scoring.
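
The ramp protocol above can be sketched as a list of temperature setpoints. This is a minimal illustration only: the 37.0°C baseline and the function name are assumptions, and in practice the ramp ends early if a seizure occurs.

```python
def ramp_schedule(baseline_c, step_c=0.5, step_min=2.0, cutoff_c=42.0):
    """Return (elapsed_minutes, target_temperature_c) setpoints for the ramp."""
    schedule = [(0.0, baseline_c)]
    n = 0
    while schedule[-1][1] < cutoff_c:
        n += 1
        # Raise the setpoint by one step, never exceeding the cutoff temperature
        target = min(baseline_c + n * step_c, cutoff_c)
        schedule.append((n * step_min, target))
    return schedule

# Hypothetical baseline body temperature of 37.0 C
schedule = ramp_schedule(37.0)
longest_test_minutes = schedule[-1][0]
```

Under this assumed baseline, the cutoff is reached after ten steps, which bounds the duration of the heating phase for a mouse that does not seize.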

[0232] Intracerebroventricular Injections.

[0233] Neonatal P0-3 mice were cryo-anesthetized on a small aluminum plate placed on ice. Single AAVs or dual AAVs were injected bilaterally into the lateral ventricles using a 33-gauge needle attached to a Hamilton microliter syringe. 2.5 µL of the AAV solution was injected into each ventricle, for a total of 5 µL per mouse containing a total of 1e10 or 3e10 gc of each viral vector. Following the injection, mice were put back into their nest and placed on a warming pad until their body temperature returned to normal. Subsequently, they were returned to the cage with the mother.

[0234] Electrocorticography (ECoG) electrode implantation surgery.

[0235] Mice underwent survival surgery to implant ECoG and EMG electrodes under isoflurane anesthesia. A midline incision was made above the skull to expose the site of electrode implantation. ECoG electrodes consisted of a micro-screw attached to a silver wire (diameter: 130 µm bare; 180 µm coated). EMG electrodes were made of a silver wire shaped in a loop at one end. An ECoG electrode micro-screw was inserted into a small cranial burr hole above the somatosensory cortex in each hemisphere. Similarly, a reference electrode micro-screw was placed in a burr hole above the cerebellum. EMG electrodes were inserted and secured into the neck muscles. All electrodes were attached to an interface connector and the assembly was affixed to the skull with dental cement (Lang Dental Manufacturing Co., Inc., Wheeling, IL, United States). The incision around the electrode implant was closed using sutures. Mice were allowed to recover from surgery for 1-3 days before recording.

[0236] Video-ECoG-EMG recording.

[0237] Simultaneous video-ECoG-EMG records were collected in conscious mice on a PowerLab 8/35 data acquisition unit using LabChart 8.0 software (AD Instruments, Colorado Springs, CO). All bioelectrical signals were acquired at a 1-kHz sampling rate. The ECoG signals were processed with a 1-70 Hz bandpass filter and the EMG signals with a 10-Hz highpass filter. Power-spectral densities of the electrical signals were computed, and video-ECoG-EMG records were inspected for interictal spikes and ictal epileptiform events. Interictal spikes were characterized on ECoG as discharges with an abrupt onset, a sharp contour, and an amplitude greater than twice the background activity. Interictal spikes were frequently followed by a slow wave, but they were not associated with increased EMG activity or movement on video. Conversely, GTC seizure events were marked at their onset by bursts of generalized spikes and waves of increasing amplitude and decreasing frequency on ECoG. They coincided with increased activity on EMG and video and were followed by a distinct period of post-ictal ECoG suppression.

[0238] For studies conducted at Allen Institute for Brain Science (AIBS), all mice were handled under appropriate institutional protocols and guidelines. Procedures were approved by the Allen Institute Institutional Animal Care and Use Committee under protocols 2002 and 2301. We housed animals on a 14:10 light:dark cycle in ventilated racks with ad libitum access to food (LabDiet 5001) and water, as well as enrichment items consisting of plastic shelters and nesting materials. Young animals were weaned promptly at 21 days of age. We obtained 129S1/SvImJ-Scn1a^em1Dsf/J (here Scn1a^+/R613X) mice from Jackson Laboratory (strain # 034129) and maintained breeders on a 129S1/SvImJ genetic background. To generate experimental animals to model DS, we crossed these animals with C57BL/6J mice (Jackson strain # 000664), which resulted in a 50:50 129:BL/6 F1 genetic background that has been successfully used to model epileptic phenotypes of DS in animals containing one loss-of-function allele of Scn1a. Genotyping was performed on tail biopsies at P2, and we utilized PCR-Sanger genotyping services at Transnetyx for this line. For survival monitoring, we checked cages twice daily for the presence of deceased animals.

[0239] Neonatal ICV injection.

[0240] We used the neonatal intracerebroventricular (ICV) injection technique. Briefly, we anesthetized P2 neonates on ice while shielding them from direct ice exposure. During anesthesia, pups were injected freehand bilaterally with 5 µL (2.5 µL each hemisphere) of AAV-containing solution using a Hamilton syringe. AAVs were diluted in sterile PBS to deliver either 1e10 or 3e10 gc of each of the two halves of the dual vector encoding split-intein human SCN1A. In control animals, either only one half of the dual-vector system or the DLX2.0-SYFP2-only empty control vector (CN1390) was delivered, or mice were left untreated. Control vectors were delivered at 3e10 gc per animal with BL-ICV delivery. After injection, pups were gently warmed on a cage warmer set to 28°C with the mother present.
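
The dilution step above amounts to simple arithmetic, sketched below in Python. The stock titers, the vector labels, and the function name are hypothetical and are not given in the description; only the 5 µL total volume and the 1e10 or 3e10 gc doses per vector come from the text.

```python
def dilution_plan(stock_titers_gc_per_ml, dose_gc_per_vector, total_volume_ul=5.0):
    """Volumes (uL) of each vector stock and of PBS for one pup's injection mix."""
    volumes = {}
    for name, titer in stock_titers_gc_per_ml.items():
        gc_per_ul = titer / 1000.0                 # 1 mL = 1000 uL
        volumes[name] = dose_gc_per_vector / gc_per_ul
    pbs = total_volume_ul - sum(volumes.values())
    if pbs < 0:
        raise ValueError("stocks too dilute to reach the dose in the stated volume")
    volumes["PBS"] = pbs
    return volumes

# Hypothetical stock titers (gc/mL) for the two halves of the dual vector
stocks = {"N-terminal half": 2.0e13, "C-terminal half": 1.0e13}
low_dose = dilution_plan(stocks, dose_gc_per_vector=1e10)
high_dose = dilution_plan(stocks, dose_gc_per_vector=3e10)
```

Half of the resulting 5 µL mix would then go into each hemisphere. The check on a negative PBS volume flags stocks whose titer is too low for the requested dose.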

[0241] Continuous video ECoG/EMG recordings.

[0242] We implanted adult mice (P56-90) with ECoG/EMG headmounts. For stereotaxic surgical procedures, we induced anesthesia in mice first with 5% isoflurane in oxygen, and then maintained anesthesia with 1.5-2.5% isoflurane. We implanted screw electrodes with wire leads (0.08”, #8405, Pinnacle Technology Inc., KS, USA) over the left somatosensory cortex (AP: +1 mm; ML: -2.5 mm), the right parietal lobe (AP: -2 mm; ML: +1.5 mm), the right frontal area (AP: +1 mm; ML: +2.5 mm; as Ground), and the cerebellum (AP: -5.8 to -6.2 mm; ML: 0 mm; as Reference). Electrode leads were soldered onto the 8-pin headmount (#8431-SM, Pinnacle Technology Inc.). The headmount contains two insulated EMG wire electrodes that are pre-soldered, and these EMG electrodes were inserted into the neck muscles. All wires, pins and the headmount were embedded in light-curable dental composite resin (Prime-Dent, Prime Dental Manufacturing Inc., Chicago, IL, USA). Mice were singly housed post-surgery and recovered for at least 7 days prior to recording. Recordings were thus acquired between ages P74 and P123. For recordings at AIBS, mice were singly housed in 10-inch clear acrylic chambers (#8228, Pinnacle Technology Inc.) under a 14-hr on, 10-hr off light/dark cycle. Mice were tethered with the pre-amplifier through a commutator to the data acquisition system (#8401-HR, Pinnacle Technology Inc.). All ECoG/EMG data were recorded with a 500 Hz sampling rate, 10X gain, a low pass (ECoG: 0.5 Hz; EMG: 1 Hz) filter, and a high pass (500 Hz) filter. Videos were recorded synchronously at a frame rate of 10 frames/s with a resolution of 640x480 pixels. We implanted a total of 18 non-injected Scn1a^+/R613X mice. Of these 18, seven mice (3M+4F) died during recovery prior to recording, and we recorded from the remaining 11 mice (9M+2F). For these non-injected animals we recorded for 53 to 335 hrs, with some recordings prematurely shortened because of death during the recording session.
We also implanted a total of 11 DLX2.0-intein-SCN1A-injected Scn1a^+/R613X mice. Of these 11, we could not record from three mice due to surgical error (1M), hardware failure (1M), and one death following surgery (1F), and we recorded from the remaining 8 mice (4M+4F). For these DLX2.0-intein-SCN1A-injected Scn1a^+/R613X mice we recorded for 207-257 hrs.

[0243] To quantify ECoG data at AIBS, we counted GTC seizures manually over the recording period and expressed the result as the average number of GTC seizures per 24 hrs of recording. We also counted the number of interictal spikes (IISs) during the last 24 hrs of recording for each mouse. To do so, we identified candidate interictal spike events by cumulative line length over a time interval of 50 msec with a threshold of 50 microvolts, then plotted each candidate event for visual confirmation of spike-like characteristics (high-amplitude strong deflection and return to baseline within 30 msec) and counting. Because these measurements of epileptic activity are non-normally distributed among animals by Shapiro-Wilk tests for normality (GTCs within non-injected Scn1a^+/R613X mice: W = 0.81, p = 0.018; IISs within non-injected Scn1a^+/R613X mice: W = 0.72, p = 0.0015), we compared the numbers of seizures and IISs between groups by two-sample Mann-Whitney U test (with significance level at p = 0.05).

[0244] IHC.

[0245] Under avertin terminal anesthesia, we perfused mice with ice-cold PBS with 0.25 mM EDTA added (25 mL), followed by cold 4% PFA in PBS (12 mL). Following brain and other organ dissection, we post-fixed brains in 4% PFA in PBS overnight at 4°C. We prepared PFA in PBS in one-liter batches by dissolving PFA powder in PBS with heating, and froze 50 mL aliquots at -20°C until use, which was important as we found that anti-Gad67 and anti-GABA stain quality depended upon the PFA preparation method. After overnight postfixation, we transferred brains to 30% sucrose solution in PBS, and then embedded them in OCT after sinking (48-72 hours), froze them on dry ice and stored them at -80°C until sectioning. We sectioned brains sagittally at 25 micron thickness using a Leica CM3050S cryostat and stored sections in PBS at 4°C until IHC. For IHC we permeabilized and blocked sections with blocking solution (PBS containing 5% normal goat serum [Thermo Fisher Scientific # 10000C] and 0.1% Triton X-100 [Millipore-Sigma # X100-100ML]) for 60 minutes. Then we probed with diluted primary antibody in blocking solution overnight, washed twice for 15 minutes each with PBS containing 0.1% Triton X-100 (PBSX), then detected with diluted 488-, 555-, and 647-conjugated secondary antibodies (all from Thermo Fisher Scientific) along with DAPI (4',6-Diamidino-2-phenylindole dihydrochloride, Millipore-Sigma # 10236276001, used at 1 µg/mL), washed twice with PBSX for 15 minutes each, then mounted sections on SuperFrost Plus slides (VWR # 48311-703) in Prolong Gold (Thermo Fisher Scientific # P36930), and imaged after overnight curing. We imaged slides on a Nikon TI-Eclipse epifluorescent or an Olympus FV-3000 confocal microscope.

[0246] We used the following primary antibodies: mouse monoclonal anti-FLAG clone M2 (Millipore-Sigma # F1804), rabbit monoclonal anti-HA clone C29F4 (1/1000, Cell Signaling # 3724S), mouse monoclonal anti-HA clone 16B12 (1/1000, Biolegend # 901513), mouse monoclonal anti-HA clone HA.C5 (1/1000, Thermo Fisher Scientific # MA5-27543), mouse monoclonal anti-Gad67 clone 1G10.2 (1/250, Millipore-Sigma # MAB5406), mouse monoclonal anti-NeuN clone 1B7 (1/500, Novus Biologicals # NBP1-92693AF647), and guinea pig polyclonal anti-GABA (1/500, Millipore-Sigma # AB175).

[0247] For costains with anti-Gad67, we omitted Triton X-100 detergent permeabilization during Gad67 immunoprobing and detection, then re-fixed antibodies onto sections with 4% PFA in PBS for 15 minutes at room temperature, then washed with PBS twice for 15 minutes, then performed a second round of permeabilization and reblocking and antibody staining with desired co-staining antibodies. This technique permits the best detection of Gad67+ inhibitory neurons.

[0248] SCN1A ORF design, cloning, and packaging.

[0249] To design full-length human SCN1A for delivery in split intein-fusion SCN1A halves, we started with the 1998-amino acid RefSeq sequence NP_001340878.1. This is because previous single cell RNA-seq studies in mouse and human demonstrate that the major species expressed by cortical cell types includes the 348-bp form of the 11th exon (of the 26-exon numbering scheme), which corresponds to a full open reading frame of 1998 amino acids. The 2009-amino acid isoform is expressed at a lower level across multiple cell types including PVALB neurons (see Figure 8A-8B). We also included a threonine at position 1056 of 1998 (corresponding to position 1067 of the 2009-amino acid isoform). This residue is alanine in the NCBI RefSeq sequence but is a threonine in commercial clones available from Origene (catalog # RG220167), and is a conserved threonine across most other mammalian species (Fig. 8C); finally, we observed this residue to be a threonine in three of three human tissue donors sequenced in our prior work (Fig. 8D, data available at dbGaP # phs002292.v1.p1). In the Genome Aggregation Database (gnomAD), the reference alanine allele appears to be the minor allele in the human population (27%), whereas the majority of alleles in the population encode threonine at that position (73%). As a result, we used threonine in that position for our SCN1A transgene.

[0250] From this protein sequence we split the protein into two halves at the natural cysteine at position 1050 (numbering according to the 1998-amino acid isoform), and we appended the Cfa-N and Cfa-C intein proteins to the ends of the split breakpoints with no linkers, so that the natural cysteine residue would work as the C+1 extein and result in scarless protein joining. The native Npu split-intein prefers hydrophobic residues such as the native methionine at the C+2 position, which we reasoned would likely promote half joining efficiency. We also appended HA and FLAG epitope tags (HA at the N-terminus of the N-terminal half, and FLAG at the C-terminus of the C-terminal half), with short dipeptide linkers between the epitope tags and SCN1A coding sequence. With these protein sequences we reverse-translated and performed codon optimization using the Integrated DNA Technologies online codon optimization tool (www.idtdna.com/pages/tools/codon-optimization-tool), manually adjusted the codon usage to minimize cryptic splice donors and acceptors, which could negatively impact expression through unwanted splicing, and manually minimized repeats over 10 bp, which might negatively impact cloning or expression. The full protein open reading frame (ORF) was 100% identical at the amino acid level to the native isoform, but only 76.6% identical at the nucleotide level. We also inserted a human intron from hemoglobin which we engineered to reduce possible cryptic TATA-box promoter sequences, as well as to produce efficient splice donor and acceptor sites without need for exonic splice enhancer sequences. We then synthesized these sequences as G-blocks (Integrated DNA Technologies) and cloned them using Infusion kit (Takara biosciences catalog # 638948) into pSMART-HC-Kan (Lucigen) vectors along with upstream CMV promoters and downstream IRES2-SYFP2 or IRES2-mScarlet transfection reporters for expression in HEK-293 cells.
Alternatively, for full-length human SCN1A, we assembled synthetic G-blocks lacking the intein fusion proteins but with overlapping ends using Infusion kit into pSMART alongside a CMV promoter and downstream IRES2-mScarlet reporter. The full-length human SCN1A pSMART vector contained only a C-terminal FLAG epitope but not an N-terminal HA epitope. For cloning into pAAV vectors we inserted the intein-fusion halves into CN1390 (Addgene plasmid #163505) in place of the SYFP2 reporter for DLX2.0-driven expression or into CN1839 (Addgene plasmid #163509) in place of SYFP2-10aa-H2B for hSyn1-driven expression. During this cloning we also replaced the BGH polyA in the original vector sequences with shorter synthetic polyA sequences due to size constraints.

[0251] Propagating full-length SCN1A sequence-containing plasmids in bacteria is challenging. To minimize rearrangement of full-length SCN1A we used the pSMART-HC-Kan backbone (Lucigen), which helps minimize backbone-driven transcription in bacteria. We also propagated SCN1A in CopyCutter EPI400 cells (formerly Lucigen, now BioSearch Technologies, catalog # C400CH10) for low-copy growth in the absence of CopyCutter Induction Solution, requiring larger culture volumes to compensate for reduced plasmid yields (5 mL for minipreps, 0.5-1 L for maxipreps). Due to their low-copy nature, the full-length SCN1A plasmid preps from these CopyCutter EPI400 cells show high contamination from bacterial genomic DNA. We minimized this contamination by adsorbing flocculated genomic DNA against copper-chelated agarose resin (G Biosciences catalog # 786-285), followed by repurification and concentration on Ampure XP beads (Beckman-Coulter # A63882). Each batch of full-length SCN1A plasmid required full sequence validation by Sanger sequencing across the complete ORF to verify absence of rearrangement prior to use, in the case of minipreps using PCR amplification of the full ORF to generate enough DNA for sequencing. Alternatively, large-scale maxipreps of full-length SCN1A for transfection experiments were produced by Aldevron (Fargo, ND) using their proprietary growth conditions, and were confirmed to be fully intact by full ORF sequencing. In contrast to the special challenges experienced with the full-length SCN1A ORF, plasmids containing the split fusion protein halves alone did not exhibit rearrangements or require special culturing techniques, and we amplified these plasmids in either the pSMART-HC-Kan or pAAV backbone using Stbl3 cells (Thermo Fisher Scientific # C737303). For in-house preps, all bacteria were grown at 32°C with either 50 µg/mL ampicillin (liquid growth, Millipore-Sigma # A8351-5G), 50 µg/mL carbenicillin (agar plates, Teknova # L1008), or 25 µg/mL kanamycin (liquid growth, Millipore-Sigma # K1377-5G, or agar plates, Teknova # L1023). Liquid growth was performed in a 50:50 mix of LB and TB (Teknova L8000 and T7060). Additionally, we cloned a recombinant codon-optimized hSCN1B-P2A-hSCN2B ORF to promote folding and activity of NaV1.1 channels. Packaging into PHP.eB particles with iodixanol gradient purification was performed.

[0252] Western blot.

[0253] To prepare protein samples from HEK-293 cells, we plated HEK-293 cells (ATCC catalog # CRL-1573) with fewer than 15 passages onto 12-well plates in HEK-293 growth medium (DMEM [Gibco # 10566-061] with 10% FBS [Gibco #16140-071] and 1x Pen Strep [Gibco #15070-63]) and transfected them when 50-70% confluent. We transfected 1000 ng of DNA per well using PEI-MAX (Polysciences # 24765-100) at a ratio of 1000 ng of DNA : 5 µL of 1 mg/mL PEI-MAX in Opti-MEM (Gibco #11058-021). For expression of SCN1A or split SCN1A halves, we also co-transfected a CMV-hSCN1B-P2A-hSCN2B construct at a ratio of 900 ng alpha subunits : 100 ng beta subunits. We replenished the medium at 12-18 hours post-transfection with fresh HEK-293 growth medium, and then harvested the cells at 48-72 hours post-transfection. For harvesting we rinsed with PBS and then lysed with RIPA buffer (Pierce #89901) containing 1x Halt Protease Inhibitor Cocktail (Thermo Fisher # 87786).

[0254] To prepare mouse brain membrane protein samples, we dissected motor cortex samples (~25-50 mg wet tissue) into 1.7 mL Eppendorf Lo-Bind tubes, froze them on dry ice, and stored them at -20°C. We prepared membrane protein samples using the Mem-PER Plus Membrane Extraction Kit (Thermo Fisher #89842) with small adjustments to the protocol. Briefly, we washed the tissue once with ice-cold Cell Wash Solution, spun it down, and aspirated the wash buffer. Then we permeabilized the tissue with 200 µL ice-cold Permeabilization Buffer supplemented with 1X Halt Protease Inhibitor Cocktail, pipetted fifty times with a Rainin P200 pipet, tumbled at 4°C for 15 minutes, and centrifuged at 18,000g for 15 minutes at 4°C. This supernatant was saved as the cytoplasmic protein fraction, and we solubilized the pellet containing membrane proteins in 200 µL Solubilization Buffer by pipetting fifty times again with the Rainin P200 pipet and tumbling at 4°C for 15 minutes. Finally, we centrifuged the samples again (18,000g for 15 minutes at 4°C) and saved the supernatant as the membrane protein fraction. Aliquots of this membrane protein were diluted with five volumes of PBS prior to protein quantification using BCA assay.

[0255] For western blots of HEK-293 cells and mouse brain membrane proteins, we quantified protein samples using a BCA assay kit (Thermo Fisher Scientific # 23225) and analyzed them on a plate reader (PerkinElmer EnSpire 2300). For each lane we treated 15-20 µg protein samples with 4X NuPAGE LDS Sample Loading buffer (Thermo #NP0007) and 5% 2-mercaptoethanol (Sigma #M3148-25ml), heated them at 70°C for 6 minutes, chilled them on ice, then separated them on a NuPAGE 4-12% Bis-Tris gel (Thermo Fisher Scientific # NP0323BOX) using NuPAGE MOPS SDS running buffer (Thermo Fisher #NP0001) at 90V for 2 hours, alongside 5 µL Chameleon Duo Pre-stained Protein Ladder (LI-COR # 928-60000) as a sizing standard lane. We then transferred the proteins to nitrocellulose membranes with prechilled NuPAGE Transfer Buffer (Life Technologies #NP0006-1) at 90V for 2 hours on ice.
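
The per-lane loading arithmetic implied above can be sketched as follows. This is an illustrative calculation only: the BCA reading is hypothetical, the function name is an assumption, and the small volume contributed by 2-mercaptoethanol is ignored. The five-volume PBS dilution applies to the brain membrane preparations; for a lysate measured undiluted, `pbs_volumes=0` would be passed instead.

```python
def gel_load(conc_diluted_ug_per_ul, target_ug=20.0, pbs_volumes=5.0, buffer_x=4.0):
    """Return (stock concentration, sample volume, loading-buffer volume) for one lane."""
    # One part sample plus five parts PBS gives a six-fold dilution before BCA
    dilution_factor = 1.0 + pbs_volumes
    stock_conc = conc_diluted_ug_per_ul * dilution_factor
    sample_ul = target_ug / stock_conc
    # A 4X buffer is one quarter of the final volume: b = (s + b) / 4, so b = s / 3
    buffer_ul = sample_ul / (buffer_x - 1.0)
    return stock_conc, sample_ul, buffer_ul

# Hypothetical BCA reading of 0.5 ug/uL on the diluted membrane aliquot
stock_conc, sample_ul, buffer_ul = gel_load(0.5, target_ug=20.0)
```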

[0256] After transfer, we blocked membranes with 5% milk in PBS for 1 hour at room temperature and probed overnight with rocking at 4°C with the following primary antibodies: mouse anti-FLAG clone M2 (1/1000 dilution, Sigma # F1804), mouse anti-HA clone 16B12 (1/3000 dilution for HEK-293 cell lysates, Biolegend # 901513), rabbit anti-HA clone C29F4 (1/1000 dilution for brain membrane protein preparations, Cell Signaling # 3724S), NeuroMab mouse anti-NaV1.1 clone K74/71 (1/300 dilution, Antibodies Incorporated # 75-023), and loading control rabbit polyclonal anti-alpha tubulin (1/5000 dilution, Cell Signaling #2144S) or mouse anti-alpha tubulin clone DM1A (1/5000 dilution, Santa Cruz Biotechnology sc-32293) in 2.5% milk in PBS with 0.1% Tween 20 (PBST). Following primary antibody, we washed 3x with PBST for 10 minutes each, then performed secondary antibody detection using appropriate IRDye 680LT- or 800CW-labeled goat secondary antibodies (LI-COR) in PBST for 1 hour, then washed 3x with PBS for 10 minutes each, then imaged the blots on the LI-COR Odyssey imager. We performed all antibody and wash steps at room temperature with gentle agitation.

[0257] Electrophysiology.

[0258] HEK293 cells (CRL-1573; ATCC, Gaithersburg, MD) were cultured in standard media consisting of DMEM, high glucose, GlutaMAX (Gibco 10566016; Thermo Fisher Scientific, Waltham, MA), supplemented with 10% (v/v) fetal calf serum (FCS) (Gibco A5670401; Thermo Fisher Scientific, Waltham, MA) and 1% (v/v) Penicillin-Streptomycin (P/S) (10,000 U/ml) (Gibco 15140148; Thermo Fisher Scientific, Waltham, MA), and grown at 37°C and 5% CO2. Cells were passaged in T25 tissue culture flasks (FB012935; Thermo Fisher Scientific, Waltham, MA) approximately twice a week. Only cells passaged fewer than 20 times were used for transfection and expression studies.

[0259] Plasmid constructs were acutely transfected into HEK293 cells using Viafect reagent (E4981; Promega, Madison, WI), following the manufacturer’s protocol. In brief, HEK293 cells were first prepared for transfection by plating into 12-well tissue culture plates (Nunc 12-565-321; Thermo Fisher Scientific, Waltham, MA) at a density of ~0.5-2 x 10⁵ cells per well and grown to ~80-90% confluence with standard media, allowing for one confluent well per transfection condition. On the day of transfection, media in confluent wells to be transfected were replaced with 0.5 mL fresh DMEM with 10% FCS, without P/S. Lipophilic/DNA transfection complexes were generated for each well to be transfected. This was achieved by combining a total of ~1.0 µg of plasmid DNAs with serum-free OptiMEM (Gibco 31985062; Thermo Fisher Scientific, Waltham, MA) to a final volume of 100 µL, then adding 3.0 µL Viafect with gentle trituration and allowing the mixture to assemble at 24°C for 30 mins. After 30 mins, this mixture was added to each well. Live transfected cells were incubated overnight at 37°C, and visually monitored for transfection efficiency in situ using a plate microscope equipped with fluorescence (Invitrogen EVOS M7000; Thermo Fisher Scientific, Waltham, MA). Transfection efficiencies were typically >70-80%.

[0260] Specific amounts of plasmid DNAs (pDNAs) used for transfections per well: a) Full-length SCN1A (SCN1A-FL) 0.8 µg of pDNA, b) SCN1A-Ntm (SCN1A-N) 0.4 µg pDNA combined with SCN1A-Ctm (SCN1A-C) 0.4 µg pDNA, c) SCN1A-Ntm (SCN1A-N) 0.8 µg pDNA, d) SCN1A-Ctm (SCN1A-C) 0.8 µg pDNA, e) empty vector (SYFP only) 0.8 µg pDNA. All transfection conditions (a-e) were performed in a background of 80 ng pDNA of a bi-cistronic construct expressing SCN1B and SCN2B (β1/2), which encodes two sodium channel β-subunits, yielding an α:β subunit-encoding pDNA mass ratio of 10:1. One additional control utilized only 80 ng of β1/2 pDNA.

[0261] Following overnight incubation, cells in transfected wells were dissociated with TrypLE (Gibco 12-604-013, Thermo Fisher Scientific, Waltham, MA) and replated at low density onto 12 mm poly-D-lysine-coated glass coverslips (GG-12-pdl; Neuvitro, Vancouver, WA) in 24-well tissue culture plates (FisherBrand FB012929; Thermo Fisher Scientific, Waltham, MA) for patch-clamp electrophysiology. Typically, ~10,000-15,000 cells were replated per well, a density sufficiently low to isolate individual cells. This was necessary to prevent the formation of electrical junctions between contacting cells, which precludes adequate space-clamp recording conditions. Recordings were performed from 0.5 to 3 days after replating at low density.

[0262] For patch-clamp recordings, coverslips containing adherent transfected cells were transferred to the stage of a Zeiss Axio Examiner.A1 microscope, equipped with a 40X water immersion objective and epifluorescence capability. Pipettes were positioned with a Sutter MPC-325 micromanipulator (Novato, CA). Whole-cell voltage-clamp recordings were acquired with an AxoClamp200B amplifier (Molecular Devices, Union City, CA), using pClamp 10.4. The composition of recording solutions was: Bath (in mM, 140 NaCl, 2 CaCl2, 2 MgCl2, 10 HEPES, pH 7.4); Pipette internal solution (in mM, 35 NaCl, 105 CsF, 10 EGTA, 10 HEPES, pH 7.4). Patch pipettes were pulled from borosilicate glass (1B120F-4; World Precision Instruments, Sarasota, FL) on a P-97 Sutter Instruments puller (Novato, CA), and fire-polished on a Micro-Forge MF-830 (Narishige International USA, Amityville, NY) to a resistance of 0.8-1.5 MΩ. Currents were allowed 5-10 mins to stabilize after achieving the whole-cell recording configuration, and were acquired at 50 kHz and filtered at 5 kHz. Capacitive transients were subtracted using a P/4 subtraction scheme, employing a current subtraction template derived from scaling the voltage command protocol by one-fourth. Series resistance compensation was >90% for all recordings. For voltage-dependent measurements of activation (G/V) and steady-state inactivation (SSI), which require optimal voltage-clamp conditions, currents larger than 6 nA were excluded. All currents were included for peak current density measurements. Additional initial recordings were obtained with an Axopatch 1D amplifier, which provided data only for peak current density analysis.

[0263] For conductance/voltage (G/V) plots, conductance (G) was calculated by the formula:

$$G = \frac{I}{V_m - E_{Na\text{-}rev}}$$

where I = peak current, V_m = membrane potential, and E_Na-rev = Na⁺ reversal potential.
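
As a minimal sketch of this calculation, the conductance at each test potential can be computed as below. The peak current value is hypothetical, the function name is an assumption, and the default reversal potential of 35 mV is the value reported in this description. With current in nA and potential in mV, the result is in microsiemens.

```python
def conductance(peak_current_na, vm_mv, e_rev_mv=35.0):
    """Chord conductance G = I / (Vm - E_rev); nA divided by mV gives microsiemens."""
    driving_force = vm_mv - e_rev_mv
    if driving_force == 0:
        raise ValueError("conductance is undefined at the reversal potential")
    return peak_current_na / driving_force

# Hypothetical inward peak current of -2.75 nA recorded at -20 mV
g_us = conductance(-2.75, -20.0)
```

An inward (negative) current below the reversal potential yields a positive conductance, because the driving force is also negative.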

[0264] Na⁺ reversal potential was 35 mV, based on a calculated Nernst equilibrium potential with the recording solutions used.
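
For reference, the stated value can be reproduced from the Nernst equation using the Na⁺ concentrations of the bath (140 mM) and pipette (35 mM) solutions given above. The recording temperature is not stated in this description; the worked value below assumes roughly 22 °C (T = 295 K), so it should be read as an approximate check and not as the exact calculation that was performed.

```latex
E_{Na\text{-}rev}
  = \frac{RT}{zF}\,\ln\frac{[\mathrm{Na}^+]_{o}}{[\mathrm{Na}^+]_{i}}
  = \frac{(8.314\ \mathrm{J\,mol^{-1}\,K^{-1}})(295\ \mathrm{K})}{(1)(96485\ \mathrm{C\,mol^{-1}})}\,\ln\frac{140}{35}
  \approx (25.4\ \mathrm{mV})(1.386)
  \approx 35.2\ \mathrm{mV}
```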

[0265] Peak currents were recorded in response to a family of voltage steps from a holding potential of -120 mV to 40 mV, in 5 mV increments, with an inter-pulse interval of 2 seconds to allow channels to fully deactivate to the deep closed state. Conductance/voltage plots were fitted to a single Boltzmann function:

$$\frac{G}{G_{max}} = -\frac{1}{1 + \exp\left(\frac{V - V_{50}}{k}\right)} + 1$$

where G_max = maximal conductance, V_50 = voltage of half activation, and k = slope of the fitted function.
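
The Boltzmann function above can be evaluated and fitted as in the following sketch. The description does not state the fitting algorithm, so the approach shown here is a substitute chosen for simplicity: because the function rearranges to logit(G/Gmax) = (V - V50)/k, a straight-line least-squares fit on the logit-transformed data recovers V50 and k. The data are synthetic and noise-free, and the parameter values (-25 mV, 6 mV) are hypothetical.

```python
import math

def boltzmann(v, v50, k):
    """G/Gmax = 1 - 1/(1 + exp((V - V50)/k)), as written in the description."""
    return 1.0 - 1.0 / (1.0 + math.exp((v - v50) / k))

def fit_boltzmann_logit(voltages, g_norm):
    """Estimate (V50, k) by least squares on logit(G/Gmax) = (V - V50)/k."""
    pts = [(v, math.log(g / (1.0 - g)))
           for v, g in zip(voltages, g_norm) if 0.0 < g < 1.0]
    n = len(pts)
    mean_v = sum(v for v, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((v - mean_v) ** 2 for v, _ in pts)
    sxy = sum((v - mean_v) * (y - mean_y) for v, y in pts)
    slope = sxy / sxx                      # equals 1/k
    intercept = mean_y - slope * mean_v    # equals -V50/k
    return -intercept / slope, 1.0 / slope

# Synthetic, noise-free data generated with hypothetical parameters
voltages = list(range(-60, 5, 5))          # -60 mV to 0 mV in 5 mV steps
g_norm = [boltzmann(v, -25.0, 6.0) for v in voltages]
v50_est, k_est = fit_boltzmann_logit(voltages, g_norm)
```

With real, noisy data the logit transform over-weights points near 0 and 1, so a nonlinear least-squares fit of the untransformed function is usually preferable; the linearized version is shown only because it needs no external libraries.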

[0266] For steady-state inactivation (SSI) plots, residual peak currents activated by a step to -20 mV were measured, after a family of 1 second activating preconditioning voltage steps from -120 mV to 30 mV, in 5 mV increments. Peak residual currents after steady-state inactivation were plotted and similarly fitted to a single Boltzmann function.

[0267] Current traces were analyzed and plotted using pClamp 10.4 (Molecular Devices, San Jose, CA) and Origin 8.5 (Northampton, MA). All G/V and SSI data were plotted as means with standard errors (SE) using Origin 8.5. Current density plots and all statistical calculations were produced in Prism (GraphPad, La Jolla, CA). Figures were composed in Microsoft PowerPoint (Redmond, WA).

[0268] Various embodiments of the invention are described above in the Detailed Description. While these descriptions directly describe the above embodiments, it is understood that those skilled in the art may conceive modifications and/or variations to the specific embodiments shown and described herein. Any such modifications or variations that fall within the purview of this description are intended to be included therein as well. Unless specifically noted, it is the intention of the inventors that the words and phrases in the specification and claims be given the ordinary and accustomed meanings to those of ordinary skill in the applicable art(s).

[0269] The foregoing description of various embodiments of the invention known to the applicant at this time of filing the application has been presented and is intended for the purposes of illustration and description. The present description is not intended to be exhaustive nor limit the invention to the precise form disclosed and many modifications and variations are possible in the light of the above teachings. The embodiments described serve to explain the principles of the invention and its practical application and to enable others skilled in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed for carrying out the invention.

[0270] While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention. It will be understood by those within the art that, in general, terms used herein are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). As used herein the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are useful to an embodiment, yet open to the inclusion of unspecified elements, whether useful or not. Although the open-ended term “comprising,” as a synonym of terms such as including, containing, or having, is used herein to describe and claim the invention, the present invention, or embodiments thereof, may alternatively be described using alternative terms such as “consisting of” or “consisting essentially of.”

[0271] Sequences of various constructs and elements are provided below.

1xhI56i(core) (core of human I56i enhancer) (131 bp in length)

CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT TATGGCTGCATTTAAGAGAATGG (SEQ ID NO: 1)
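The DLX2.0 element that follows is annotated as three copies of the human I56i core. As a quick consistency check on the stated lengths (131 bp and 393 bp), the core listed above can be repeated in tandem. This is only an illustrative sketch in Python; the helper name `tandem_repeat` is hypothetical, and the sequence is copied from SEQ ID NO: 1 as printed here.

```python
# I56i core enhancer as printed above (SEQ ID NO: 1), one string per printed row.
I56I_CORE = (
    "CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA"
    "TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT"
    "TATGGCTGCATTTAAGAGAATGG"
)


def tandem_repeat(unit, copies):
    """Return `copies` head-to-tail repeats of `unit` with no spacer (assumed layout)."""
    return unit * copies


if __name__ == "__main__":
    three_copies = tandem_repeat(I56I_CORE, 3)
    print(len(I56I_CORE), "bp core;", len(three_copies), "bp for three copies")
    print("core occurrences:", three_copies.count(I56I_CORE))
```

A direct string comparison of `tandem_repeat(I56I_CORE, 3)` against the listed DLX2.0 sequence would confirm whether the three copies are joined without intervening bases.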

DLX2.0 (3x human I56i core) (393 bp in length)

CTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTA TCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAAT TATGGCTGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAG TGGAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGG GCTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGAC GGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGCTG CATTTAAGAGAATGG (SEQ ID NO:2)

eHGT 078h (537 bp in length)

GTAGTCTGCCTCAGGTACACACTGAGAAACTGCTTTAATGTAACCTGACCCACG GTTATTAGTGAAAATATCACTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCT GCTTTAATGGAAAAGATCCGGGTTCACACTAATCAGGCCCAACGGAAGGCCAT ATTAGCAATTTGGCAGGTACCCGAGGGCCATACCTAATCTGCATAAAATGAAGC AGATTGCAACCGCCCTCATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCA TAAAATCTCAGAGGGAGAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGC CTTTATTTTCATTCTTTTTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAG

AAGAATAACAAAATAAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAAT GTTTAATTAAAAAAAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCT CTAGTATACTTTCAAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGT (SEQ ID NO:55)

hSyn1 promoter (495 bp in length)

GTGTCTAGACTGCAGAGGGCCCTGCGTATGAGTGCAAGTGGGTTTTAGGACCAG GATGAGGCGGGGTGGGGGTGCCTACCTGACGACCGACCCCGACCCACTGGACA AGCACCCAACCCCCATTCCCCAAATTGCGCATCCCCTATCAGAGAGGGGGAGGG GAAACAGGATGCGGCGAGGCGCGTGCGCACTGCCAGCTTCAGCACCGCGGACA GTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCACCGCCGCCTCAGCACTGAAGGC

GCGCTGACGTCACTCGCCGGTCCCCCGCAAACTCCCCTTCCCGGCCACCTTGGTC GCGTCCGCGCCGCCGCCGGCCCAGCCGGACCGCACCACGCGAGGCGCGAGATA GGGGGGCACGGGCGCGACCATCTGCGCTGCGGCGCCGGCGACTCAGCGCTGCC TCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGTCGTGCCTGAGAGCGCAGTCGA GAAACCGGCTAGA (SEQ ID NO: 52)

CMV promoter (588 bp in length)

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGT TCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCC TGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCC CATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACG GTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCC TATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGAC CTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACC ATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCA CGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCAC CAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAA ATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTC (SEQ ID NO:53)

hSyn1 promoter (shortened) (348 bp in length)

AGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGCACTGCCAGCT TCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCACCGCCG CCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCGCAAACTCCCC TTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCCGGACCGCAC CACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGCGCTGCGGCG CCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGTCG

TGCCTGAGAGCGCAGTCGAGAAACCGGCTAGA (SEQ ID NO: 54)

4x2C (214 bp in length):

AAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATG AAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCC TTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCT GAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGT AGCT (SEQ ID NO:56)

8x2C (450 bp in length):

GCGGCCTTAAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGT GAGAATGAAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGT GAGCGGCCTTAAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACT GTGAGAATGAAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACT GTGAGCGGCCTTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAG ACAATGTAGCTGAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCA GCAGACAATGTAGCTGCGGCCTTGAAACCCAGCAGACAATGTAGCTCAGTAGA AACCCAGCAGACAATGTAGCTGAATGGAAACCCAGCAGACAATGTAGCTTCG GAGAAACCCAGCAGACAATGTAGCTGCGGCC (SEQ ID NO: 87)

Cfa-N:

TGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATC GGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAA CGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAG AAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGAC CACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGA GAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAA (SEQ ID NO:57)

Cfa-C:

ATGGTCAAGATCATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGATAT CGGAGTCGAGAAAGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGTCCA AT (SEQ ID NO: 58)

hSCN1A-CO-Nterm1049-Intron (the Intron is underlined and in italics):

ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGC TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCT GAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGAC

CCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGT

GATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTAT

ATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTT

TTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATT

GCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATC

CTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAAC

GTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATC

GCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGG

CTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCA

ACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGT

GTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAA

GCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATC

GGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACC

AACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACT

ATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTAC

ATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGT

GCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAA

GCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGC

TTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAG

CTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATC

TTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCAT

ACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTG

AATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACA

AGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAA

GGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAG

GAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGA

GAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTA

GACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAA

CGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCA

CCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGA

TGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATA

ATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAG

GAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCC

CTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAG

TAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGA

ACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTC

TATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTC

AATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTC

CTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTG

GCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCT

GGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTA

TCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCAC TGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTA CTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTG GTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAG ACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCAT CAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGG CTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCAT ACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCAC ATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGT GGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTT ACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTT CTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGAC GACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGT TGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAA AACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAA GATTCA (SEQ ID NO:59)

hSCN1A-CO-Cterm949-Intron (the Intron is underlined and in italics):

GCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGAC GTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCC AGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAA GCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCT GTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCC GTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCT CAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAG TTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCC CGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGT GTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAA GAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGT CGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGG TGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTA TGCTGGAATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCC TGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGG CTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTC GGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAG GCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCAC TGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTG GCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTG CATTAACACAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACA CCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGT AAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGC CACATTCAAAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACG TAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTA ATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCA TCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATG ACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAA AGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTC GACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTG AATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGAC GATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGT ACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTT TGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATA GAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATC GGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTT
CGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTG GTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGG GAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGAT CTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCAT CTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCT ACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGG AAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGAT TTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTT ATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAA TCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGT CTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCT TGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATT CATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAA AAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGG CACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAA AATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAA TCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCC TGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACA

AGAGGGCAAGGATGAGAAGGCCAAAGGCAAA (SEQ ID NO: 60)

hSCN1A-CO-Nterm956-Intron (the Intron is underlined and in italics):

ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGC TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCT GAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGA CCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGG TGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTA TATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGT TTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGAT TGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAAT CCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAA CGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAAT CGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTG GCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGC AACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAG TGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGA AGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCAT CGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCAC CAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAA CTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCT ACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTT TGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTA AAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGG

GCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATC

AGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAA TCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGC ATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGC

TGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAA CAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGG AAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAA

AGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAG GAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAAT TAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAA

AACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTT

CACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAG

GATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGA

TAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGA AGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATT CCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTT

AGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGG

GAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTG TCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCT TCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCC

CTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTA

TTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGA

CCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCA TTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTT CACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTA

CTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTC

TTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTT

CAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCT CATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGT TGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT

CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGG CACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCG AGTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATG

(SEQ ID NO:61)

hSCN1A-CO-Cterm1042-Intron (the Intron is underlined and in italics):

TGCCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAAT TTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTG ATGACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAA

AGGCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCA TACGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAAT AAGAAAGATTCATGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTG

ACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTAC

TAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGA TTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTC

CCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCT

TTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCT

GACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGA

ATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTG

GACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTT

GGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCC

AAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGAC

CTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGAT

CTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAA

GACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACATATTCA

TCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGA

ACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGA

CTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACC

CTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGT

AGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGT

TTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGC

AAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAGGA

TGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCA

AGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTC

ACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCAG

TAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATG

TACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGT

TCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGA

CAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAAA

AACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTT

CAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTATG

ATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATCT

GAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTC

ACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATC

GGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTC

TTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATAC

GGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATT

CGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTT

TGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGC

TTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCG

GCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGAT

TGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTG

AATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTT

CTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATA

GCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTT

TCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGA

CGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGG

AGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGAC

CTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTC

ACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGA TGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATC ACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCG AGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCT ACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACAT GATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTA TGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCG AAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAAA (SEQ ID NO: 62)

hSCN1A-CO-Nterm947-Intron (the Intron is underlined and in italics):

ATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACT CGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGC TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCT GAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAG GAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGA CCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGG TGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTA TATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGT TTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGAT TGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAAT CCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAA CGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAAT CGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTG GCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGC AACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAG TGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGA AGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCAT CGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCAC CAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAA CTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCT ACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTT TGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTA AAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGG GCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATC AGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAA TCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGC ATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGC TGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAA CAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGG AAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAA AGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAG GAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAAT TAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAA AACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTT CACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAG GATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGA
TAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGA AGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATT CCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTT AGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGG GAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTG TCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCT TCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCC CTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTA TTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGA CCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCA TTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTT CACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTA CTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTC TTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTT CAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCT CATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGT TGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGG CACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCG AGTGGATAGAAACTATGTGGGAC (SEQ ID NO:63)

hSCN1A-CO-Cterm1051-Intron (the Intron is underlined and in italics):

TGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTATTCATGATGGTCAT GGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCA TTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTT CAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAAT CTATGAATTCATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAA TCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCATGCATGTCGAACCAT ACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCAC AAGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTT TTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCT AATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATC ATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACC GTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTT CAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCA AGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACC TGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGT GTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAA CAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTT CGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGA GGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGAATATGCAG ACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGT ATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTG TCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAGCGAACTAG GCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTC TCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACC TTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATT

ATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACACAACTAC AGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGTTTGAAGT

TGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGA

CAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATG

GATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGA

AGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATCATCTTTG

GCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATTTCAATC

AGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGAACAGAA

GAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCT

ATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAG

ACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGTTACGAT

GATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTAGCCGAA

TTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAACTGATTA

GTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCGTTGTGG

TCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGTACTTCG

TCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAATTCTCA

GGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCATGATGT

CACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTTTATATA

TGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCG

ATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCTTTCAAA

TTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACAGTAAA

CCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGA

CTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCT

TTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTG

CTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATGTTTTAC

GAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTTGAGAA

GCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAA

CAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACCGAATCC

ACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAG

AAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCGAATCCT

AGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAAGAGGA

AGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAA

CTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCT

AATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTCCAT

TACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACG

ACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATGA

GAAGGCCAAAGGCAAA (SEQ ID NO: 64)

Human SCN1A:

ATGGAGCAAACAGTGCTTGTACCACCAGGACCTGACAGCTTCAACTTCTTCAC

CAGAGAATCTCTTGCGGCTATTGAAAGACGCATTGCAGAAGAAAAGGCAAAG

AATCCCAAACCAGACAAAAAAGATGACGACGAAAATGGCCCAAAGCCAAATA

GTGACTTGGAAGCTGGAAAGAACCTTCCATTTATTTATGGAGACATTCCTCCA

GAGATGGTGTCAGAGCCCCTGGAGGACCTGGACCCCTACTATATCAATAAGAA

AACTTTTATAGTATTGAATAAAGGGAAGGCCATCTTCCGGTTCAGTGCCACCTC

TGCCCTGTACATTTTAACTCCCTTCAATCCTCTTAGGAAAATAGCTATTAAGAT

TTTGGTACATTCATTATTCAGCATGCTAATTATGTGCACTATTTTGACAAACTG

TGTGTTTATGACAATGAGTAACCCTCCTGATTGGACAAAGAATGTAGAATACA

CCTTCACAGGAATATATACTTTTGAATCACTTATAAAAATTATTGCAAGGGGAT TCTGTTTAGAAGATTTTACTTTCCTTCGGGATCCATGGAACTGGCTCGATTTCA

CTGTCATTACATTTGCGTACGTCACAGAGTTTGTGGACCTGGGCAATGTCTCGG

CATTGAGAACATTCAGAGTTCTCCGAGCATTGAAGACGATTTCAGTCATTCCA

GGCCTGAAAACCATTGTGGGAGCCCTGATCCAGTCTGTGAAGAAGCTCTCAGA

TGTAATGATCCTGACTGTGTTCTGTCTGAGCGTATTTGCTCTAATTGGGCTGCA

GCTGTTCATGGGCAACCTGAGGAATAAATGTATACAATGGCCTCCCACCAATG

CTTCCTTGGAGGAACATAGTATAGAAAAGAATATAACTGTGAATTATAATGGT

ACACTTATAAATGAAACTGTCTTTGAGTTTGACTGGAAGTCATATATTCAAGAT

TCAAGATATCATTATTTCCTGGAGGGTTTTTTAGATGCACTACTATGTGGAAAT

AGCTCTGATGCAGGCCAATGTCCAGAGGGATATATGTGTGTGAAAGCTGGTAG

AAATCCCAATTATGGCTACACAAGCTTTGATACCTTCAGTTGGGCTTTTTTGTC

CTTGTTTCGACTAATGACTCAGGACTTCTGGGAAAATCTTTATCAACTGACATT

ACGTGCTGCTGGGAAAACGTACATGATATTTTTTGTATTGGTCATTTTCTTGGG

CTCATTCTACCTAATAAATTTGATCCTGGCTGTGGTGGCCATGGCCTACGAGGA

ACAGAATCAGGCCACCTTGGAAGAAGCAGAACAGAAAGAGGCCGAATTTCAG

CAGATGATTGAACAGCTTAAAAAGCAACAGGAGGCAGCTCAGCAGGCAGCAA

CGGCAACTGCCTCAGAACATTCCAGAGAGCCCAGTGCAGCAGGCAGGCTCTCA

GACAGCTCATCTGAAGCCTCTAAGTTGAGTTCCAAGAGTGCTAAGGAAAGAAG

AAATCGGAGGAAGAAAAGAAAACAGAAAGAGCAGTCTGGTGGGGAAGAGAA

AGATGAGGATGAATTCCAAAAATCTGAATCTGAGGACAGCATCAGGAGGAAA

GGTTTTCGCTTCTCCATTGAAGGGAACCGATTGACATATGAAAAGAGGTACTC

CTCCCCACACCAGTCTTTGTTGAGCATCCGTGGCTCCCTATTTTCACCAAGGCG

AAATAGCAGAACAAGCCTTTTCAGCTTTAGAGGGCGAGCAAAGGATGTGGGA

TCTGAGAACGACTTCGCAGATGATGAGCACAGCACCTTTGAGGATAACGAGAG

CCGTAGAGATTCCTTGTTTGTGCCCCGACGACACGGAGAGAGACGCAACAGCA

ACCTGAGTCAGACCAGTAGGTCATCCCGGATGCTGGCAGTGTTTCCAGCGAAT

GGGAAGATGCACAGCACTGTGGATTGCAATGGTGTGGTTTCCTTGGTTGGTGG

ACCTTCAGTTCCTACATCGCCTGTTGGACAGCTTCTGCCAGAGGTGATAATAGA

TAAGCCAGCTACTGATGACAATGGAACAACCACTGAAACTGAAATGAGAAAG

AGAAGGTCAAGTTCTTTCCACGTTTCCATGGACTTTCTAGAAGATCCTTCCCAA

AGGCAACGAGCAATGAGTATAGCCAGCATTCTAACAAATACAGTAGAAGAAC

TTGAAGAATCCAGGCAGAAATGCCCACCCTGTTGGTATAAATTTTCCAACATA

TTCTTAATCTGGGACTGTTCTCCATATTGGTTAAAAGTGAAACATGTTGTCAAC

CTGGTTGTGATGGACCCATTTGTTGACCTGGCCATCACCATCTGTATTGTCTTA

AATACTCTTTTCATGGCCATGGAGCACTATCCAATGACGGACCATTTCAATAAT

GTGCTTACAGTAGGAAACTTGGTTTTCACTGGGATCTTTACAGCAGAAATGTTT

CTGAAAATTATTGCCATGGATCCTTACTATTATTTCCAAGAAGGCTGGAATATC

TTTGACGGTTTTATTGTGACGCTTAGCCTGGTAGAACTTGGACTCGCCAATGTG

GAAGGATTATCTGTTCTCCGTTCATTTCGATTGCTGCGAGTTTTCAAGTTGGCA

AAATCTTGGCCAACGTTAAATATGCTAATAAAGATCATCGGCAATTCCGTGGG

GGCTCTGGGAAATTTAACCCTCGTCTTGGCCATCATCGTCTTCATTTTTGCCGT

GGTCGGCATGCAGCTCTTTGGTAAAAGCTACAAAGATTGTGTCTGCAAGATCG

CCAGTGATTGTCAACTCCCACGCTGGCACATGAATGACTTCTTCCACTCCTTCC

TGATTGTGTTCCGCGTGCTGTGTGGGGAGTGGATAGAGACCATGTGGGACTGT

ATGGAGGTTGCTGGTCAAGCCATGTGCCTTACTGTCTTCATGATGGTCATGGTG

ATTGGAAACCTAGTGGTCCTGAATCTCTTTCTGGCCTTGCTTCTGAGCTCATTT

AGTGCAGACAACCTTGCAGCCACTGATGATGATAATGAAATGAATAATCTCCA

AATTGCTGTGGATAGGATGCACAAAGGAGTAGCTTATGTGAAAAGAAAAATAT

ATGAATTTATTCAACAGTCCTTCATTAGGAAACAAAAGATTTTAGATGAAATT

AAACCACTTGATGATCTAAACAACAAGAAAGACAGTTGTATGTCCAATCATAC AGCAGAAATTGGGAAAGATCTTGACTATCTTAAAGATGTAAATGGAACTACAA

GTGGTATAGGAACTGGCAGCAGTGTTGAATACATTATTGATGAAAGTGATTAC

ATGTCATTCATAAACAACCCCAGTCTTACTGTGACTGTACCAATTGCTGTAGGA

GAATCTGACTTTGAAAATTTAAACACGGAAGACTTTAGTAGTGAATCGGATCT

GGAAGAAAGCAAAGAGAAACTGAATGAAAGCAGTAGCTCATCAGAAGGTAGC

ACTGTGGACATCGGCGCACCTGTAGAAGAACAGCCCGTAGTGGAACCTGAAG

AAACTCTTGAACCAGAAGCTTGTTTCACTGAAGGCTGTGTACAAAGATTCAAG

TGTTGTCAAATCAATGTGGAAGAAGGCAGAGGAAAACAATGGTGGAACCTGA

GAAGGACGTGTTTCCGAATAGTTGAACATAACTGGTTTGAGACCTTCATTGTTT

TCATGATTCTCCTTAGTAGTGGTGCTCTGGCATTTGAAGATATATATATTGATC

AGCGAAAGACGATTAAGACGATGTTGGAATATGCTGACAAGGTTTTCACTTAC

ATTTTCATTCTGGAAATGCTTCTAAAATGGGTGGCATATGGCTATCAAACATAT

TTCACCAATGCCTGGTGTTGGCTGGACTTCTTAATTGTTGATGTTTCATTGGTCA

GTTTAACAGCAAATGCCTTGGGTTACTCAGAACTTGGAGCCATCAAATCTCTC

AGGACACTAAGAGCTCTGAGACCTCTAAGAGCCTTATCTCGATTTGAAGGGAT

GAGGGTGGTTGTGAATGCCCTTTTAGGAGCAATTCCATCCATCATGAATGTGCT

TCTGGTTTGTCTTATATTCTGGCTAATTTTCAGCATCATGGGCGTAAATTTGTTT

GCTGGCAAATTCTACCACTGTATTAACACCACAACTGGTGACAGGTTTGACAT

CGAAGACGTGAATAATCATACTGATTGCCTAAAACTAATAGAAAGAAATGAG

ACTGCTCGATGGAAAAATGTGAAAGTAAACTTTGATAATGTAGGATTTGGGTA

TCTCTCTTTGCTTCAAGTTGCCACATTCAAAGGATGGATGGATATAATGTATGC

AGCAGTTGATTCCAGAAATGTGGAACTCCAGCCTAAGTATGAAGAAAGTCTGT

ACATGTATCTTTACTTTGTTATTTTCATCATCTTTGGGTCCTTCTTCACCTTGAA

CCTGTTTATTGGTGTCATCATAGATAATTTCAACCAGCAGAAAAAGAAGTTTG

GAGGTCAAGACATCTTTATGACAGAAGAACAGAAGAAATACTATAATGCAAT

GAAAAAATTAGGATCGAAAAAACCGCAAAAGCCTATACCTCGACCAGGAAAC

AAATTTCAAGGAATGGTCTTTGACTTCGTAACCAGACAAGTTTTTGACATAAGC

ATCATGATTCTCATCTGTCTTAACATGGTCACAATGATGGTGGAAACAGATGA

CCAGAGTGAATATGTGACTACCATTTTGTCACGCATCAATCTGGTGTTCATTGT

GCTATTTACTGGAGAGTGTGTACTGAAACTCATCTCTCTACGCCATTATTATTT

TACCATTGGATGGAATATTTTTGATTTTGTGGTTGTCATTCTCTCCATTGTAGGT

ATGTTTCTTGCCGAGCTGATAGAAAAGTATTTCGTGTCCCCTACCCTGTTCCGA

GTGATCCGTCTTGCTAGGATTGGCCGAATCCTACGTCTGATCAAAGGAGCAAA

GGGGATCCGCACGCTGCTCTTTGCTTTGATGATGTCCCTTCCTGCGTTGTTTAA

CATCGGCCTCCTACTCTTCCTAGTCATGTTCATCTACGCCATCTTTGGGATGTCC

AACTTTGCCTATGTTAAGAGGGAAGTTGGGATCGATGACATGTTCAACTTTGA

GACCTTTGGCAACAGCATGATCTGCCTATTCCAAATTACAACCTCTGCTGGCTG

GGATGGATTGCTAGCACCCATTCTCAACAGTAAGCCACCCGACTGTGACCCTA

ATAAAGTTAACCCTGGAAGCTCAGTTAAGGGAGACTGTGGGAACCCATCTGTT

GGAATTTTCTTTTTTGTCAGTTACATCATCATATCCTTCCTGGTTGTGGTGAACA

TGTACATCGCGGTCATCCTGGAGAACTTCAGTGTTGCTACTGAAGAAAGTGCA

GAGCCTCTGAGTGAGGATGACTTTGAGATGTTCTATGAGGTTTGGGAGAAGTT

TGATCCCGATGCAACTCAGTTCATGGAATTTGAAAAATTATCTCAGTTTGCAGC

TGCGCTTGAACCGCCTCTCAATCTGCCACAACCAAACAAACTCCAGCTCATTG

CCATGGATTTGCCCATGGTGAGTGGTGACCGGATCCACTGTCTTGATATCTTAT

TTGCTTTTACAAAGCGGGTTCTAGGAGAGAGTGGAGAGATGGATGCTCTACGA

ATACAGATGGAAGAGCGATTCATGGCTTCCAATCCTTCCAAGGTCTCCTATCA

GCCAATCACTACTACTTTAAAACGAAAACAAGAGGAAGTATCTGCTGTCATTA

TTCAGCGTGCTTACAGACGCCACCTTTTAAAGCGAACTGTAAAACAAGCTTCC

TTTACGTACAATAAAAACAAAATCAAAGGTGGGGCTAATCTTCTTATAAAAGA AGACATGATAATTGACAGAATAAATGAAAACTCTATTACAGAAAAAACTGATC TGACCATGTCCACTGCAGCTTGTCCACCTTCCTATGACCGGGTGACAAAGCCA ATTGTGGAAAAACATGAGCAAGAAGGCAAAGATGAAAAAGCCAAAGGGA (SEQ ID NO: 65)

Beta-Globin Minimal Promoter (pBGmin/minBGlobin/minBGprom):

GGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTC

TG (SEQ ID NO:3)

MinCMV Promoter:

GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTCGTTTAGTGAAC

CGTCAGATCGCCTGG (SEQ ID NO:4)
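
The MinCMV promoter above contains a SacI recognition sequence (GAGCTC), and the entry that follows (SEQ ID NO: 5) is described as having the SacI site removed. The following is a minimal Python sketch, not part of the original disclosure, of how the two listed sequences could be compared to confirm this. The function name `differences` and the variable names are hypothetical and chosen for illustration only.

```python
# Sequences copied from the listing: SEQ ID NO:4 (MinCMV) and
# SEQ ID NO:5 (mutated minCMV, SacI site removed).
MIN_CMV = (
    "GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTCGTTTAGTGAAC"
    "CGTCAGATCGCCTGG"
)
MIN_CMV_MUT = (
    "GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTGGTTTAGTGAAC"
    "CGTCAGATCGCCTGG"
)
SACI_SITE = "GAGCTC"  # SacI recognition sequence


def differences(a, b):
    """Return (0-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = differences(MIN_CMV, MIN_CMV_MUT)
print("SacI site present in MinCMV:        ", SACI_SITE in MIN_CMV)
print("SacI site present in mutated minCMV:", SACI_SITE in MIN_CMV_MUT)
print("Mismatches (position, original, mutated):", diffs)
```

A position-by-position comparison is sufficient here because the two listed sequences have the same length. Sequences of different lengths would require an alignment step instead.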

Mutated minCMV Promoter (SacI RE site removed):

GAGGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTGGTTTAGTGAAC

CGTCAGATCGCCTGG (SEQ ID NO: 5)

minRho Promoter:

GATTCAGCCGGGAGCTTAGGGAGGGGAGGTCACTTCATAAGGGCCTGGGGGG GGAGTTGGAGCCACGAGTCGTCCAGCCGGAGCCCCGTGTGGCTGAGCTCCGGC CTCAGAAGCATCCCC (SEQ ID NO:6)

minRho* Promoter:

GATTCAGCCGGGAGCTTAGGGAGGGGAGGTCACTTCATAAGGGCTTGGGG GGGGAGTTGGAGCCACGAGTCGTCCAGCCGGAGCCCCGTGTGGCTGTGCTC CGGCCTCAGAAGCATCCCC (SEQ ID NO: 7)

Hsp68 minimal Promoter (proHsp68):

CAGGAACATCCAAACTGAGCAGCCGGGGTCCCCCCCACCCCCCACCCCGCCCC ACGCGGCAACTTTGAGCCTGTGCTGGGACAGAGCCTCTAGTTCCTAAATTAGT CCATGAGGTCAGAGGCAGCACTGCCATTGTAACGCGATTGGAGAGGATCACGT

CACCGGACACGCCCCCAGGCATCTCCCTGGGTCTCCTAAACTTGGCGGGGAGA AGTTTTAGCCCTTAAGTTTTAGCCTTTAACCCCCATATTCAGAACTGTGCGAGT TGGCGAAACCCCACAAATCACAACAAACTGTACACAACACCGAGCTAGAGGT GATCTTTCTTGTCCATTCCACACAGGCCTTAGTAATGCGTCGCCATAGCAACAG TGTCACTAGTAGCACCAGCACTTCCCCACACCCTCCCCCTCAGGAATCCGTACT CTCCAGTGAACCCCAGAAACCTCTGGAGAGTTCTGGACAAGGGCGGAACCCAC

AACTCCGATTACTCAAGGGAGGCGGGGAAGCTCCACCAGACGCGAAACTGCT

GGAAGATTCCTGGCCCCAAGGCCTCCTCCGGCTCGCTGATTGGCCCAGCGGAG AGTGGGCGGGGCCGGTGAAGACTCCTTAAAGGCGCAGGGCGGCGAGCAGGTC ACCAGACGCTGACAGCTACTCAGAACCAAATCTGGTTCCATCCAGAGACAAGC

GAAGACAAGAGAAGCAGAGCGAGCGGCGCGTTCCCGATCCTCGGCCAGGACC AGCCTTCCCCAGAGCATCCCTGCCGCGGAGCGCAACCTTCCCAGGAGCATCCC TGCCGCGGAGCGCAACTTTCCCCGGAGCATCCACGCCGCGGAGCGCAGCCTTC

CAGAAGCAGAGCGCGGCGCC (SEQ ID NO: 8)

SYFP2:

ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGA GCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAG GGCGATGCCACCTACGGCAAGCTGACCCTGAAGCTGATCTGCACCACCGGCAA

GCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGGGCTACGGCGTGCAGT GCTTCGCCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC

ATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAA

CTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC

ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACA

AGCTGGAGTACAACTACAACAGCCACAACGTCTATATCACCGCCGACAAGCA

GAAGAACGGCATCAAGGCCAACTTCAAGATCCGCCACAACATCGAGGACGGC

GGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCC

CCGTGCTGCTGCCCGACAACCACTACCTGAGCTACCAGTCCAAGCTGAGCAAA

GACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGC

CGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAA (SEQ ID NO: 9)

EGFP:

ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGA

GCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAG

GGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAA

GCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGT

GCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC

ATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAA

CTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC

ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACA

AGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAG

AAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCA

GCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCC

GTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGA

CCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCG

GGATCACTCTCGGCATGGACGAGCTGTACAAGTAA (SEQ ID NO: 10)

mScarlet:

ATGGTCTCCAAAGGAGAAGCGGTCATTAAAGAGTTCATGAGGTTCAAGGTTCA

TATGGAAGGCTCCATGAATGGTCATGAGTTCGAGATTGAAGGGGAGGGTGAG

GGGAGACCTTATGAGGGCACTCAGACAGCGAAATTGAAGGTGACAAAGGGAG

GACCTCTCCCGTTCAGTTGGGACATATTGTCACCGCAATTTATGTATGGTTCTA

GAGCCTTCACTAAGCACCCTGCCGACATCCCAGATTACTACAAGCAATCCTTCC

CTGAGGGCTTTAAGTGGGAGAGAGTAATGAATTTTGAAGATGGCGGGGCAGTC

ACAGTAACACAAGATACATCCCTGGAAGATGGAACACTTATCTACAAAGTTAA

GCTCAGAGGAACGAATTTTCCACCGGACGGTCCAGTGATGCAAAAAAAAACA

ATGGGTTGGGAAGCATCTACAGAGCGACTGTACCCTGAAGACGGTGTGCTGAA

GGGGGACATCAAAATGGCCCTGCGACTTAAGGATGGAGGGCGCTATTTGGCAG

ATTTCAAGACTACTTACAAAGCCAAAAAGCCTGTACAAATGCCTGGAGCTTAC

AACGTGGATAGGAAGCTTGATATTACCAGTCACAATGAAGATTATACAGTGGT

AGAACAATATGAACGCTCAGAAGGTCGCCACAGCACTGGAGGCATGGATGAG

TTGTACAAG (SEQ ID NO: 66)

3xNLS:

GATCCAAAGAAGAAAAGGAAAGTTGATCCCAAAAAGAAGAGGAAAGTAGATC

CAAAAAAGAAGCGAAAAGTAGGGTACAAGAAG (SEQ ID NO: 67)

Optimized Flp recombinase (FlpO):

ATGGCTCCTAAGAAGAAGAGGAAGGTGATGAGCCAGTTCGACATCCTGTGCAA GACCCCCCCCAAGGTGCTGGTGCGGCAGTTCGTGGAGAGATTCGAGAGGCCCA

GCGGCGAGAAGATCGCCAGCTGTGCCGCCGAGCTGACCTACCTGTGCTGGATG

ATCACCCACAACGGCACCGCCATCAAGAGGGCCACCTTCATGAGCTACAACAC

CATCATCAGCAACAGCCTGAGCTTCGACATCGTGAACAAGAGCCTGCAGTTCA

AGTACAAGACCCAGAAGGCCACCATCCTGGAGGCCAGCCTGAAGAAGCTGAT

CCCCGCCTGGGAGTTCACCATCATCCCTTACAACGGCCAGAAGCACCAGAGCG

ACATCACCGACATCGTGTCCAGCCTGCAGCTGCAGTTCGAGAGCAGCGAGGAG

GCCGACAAGGGCAACAGCCACAGCAAGAAGATGCTGAAGGCCCTGCTGTCCG

AGGGCGAGAGCATCTGGGAGATCACCGAGAAGATCCTGAACAGCTTCGAGTA

CACCAGCAGGTTCACCAAGACCAAGACCCTGTACCAGTTCCTGTTCCTGGCCA

CATTCATCAACTGCGGCAGGTTCAGCGACATCAAGAACGTGGACCCCAAGAGC

TTCAAGCTGGTGCAGAACAAGTACCTGGGCGTGATCATTCAGTGCCTGGTGAC

CGAGACCAAGACAAGCGTGTCCAGGCACATCTACTTTTTCAGCGCCAGAGGCA

GGATCGACCCCCTGGTGTACCTGGACGAGTTCCTGAGGAACAGCGAGCCCGTG

CTGAAGAGAGTGAACAGGACCGGCAACAGCAGCAGCAACAAGCAGGAGTACC

AGCTGCTGAAGGACAACCTGGTGCGCAGCTACAACAAGGCCCTGAAGAAGAA

CGCCCCCTACCCCATCTTCGCTATCAAGAACGGCCCTAAGAGCCACATCGGCA

GGCACCTGATGACCAGCTTTCTGAGCATGAAGGGCCTGACCGAGCTGACAAAC

GTGGTGGGCAACTGGAGCGACAAGAGGGCCTCCGCCGTGGCCAGGACCACCT

ACACCCACCAGATCACCGCCATCCCCGACCACTACTTCGCCCTGGTGTCCAGGT

ACTACGCCTACGACCCCATCAGCAAGGAGATGATCGCCCTGAAGGACGAGACC

AACCCCATCGAGGAGTGGCAGCACATCGAGCAGCTGAAGGGCAGCGCCGAGG

GCAGCATCAGATACCCCGCCTGGAACGGCATCATCAGCCAGGAGGTGCTGGAC

TACCTGAGCAGCTACATCAACAGGCGGATCTGA (SEQ ID NO: 11)

Improved Cre recombinase (iCre):

ATGGTGCCCAAGAAGAAGAGGAAAGTCTCCAACCTGCTGACTGTGCACCAAA

ACCTGCCTGCCCTCCCTGTGGATGCCACCTCTGATGAAGTCAGGAAGAACCTG

ATGGACATGTTCAGGGACAGGCAGGCCTTCTCTGAACACACCTGGAAGATGCT

CCTGTCTGTGTGCAGATCCTGGGCTGCCTGGTGCAAGCTGAACAACAGGAAAT

GGTTCCCTGCTGAACCTGAGGATGTGAGGGACTACCTCCTGTACCTGCAAGCC

AGAGGCCTGGCTGTGAAGACCATCCAACAGCACCTGGGCCAGCTCAACATGCT

GCACAGGAGATCTGGCCTGCCTCGCCCTTCTGACTCCAATGCTGTGTCCCTGGT

GATGAGGAGAATCAGAAAGGAGAATGTGGATGCTGGGGAGAGAGCCAAGCA

GGCCCTGGCCTTTGAACGCACTGACTTTGACCAAGTCAGATCCCTGATGGAGA

ACTCTGACAGATGCCAGGACATCAGGAACCTGGCCTTCCTGGGCATTGCCTAC

AACACCCTGCTGCGCATTGCCGAAATTGCCAGAATCAGAGTGAAGGACATCTC

CCGCACCGATGGTGGGAGAATGCTGATCCACATTGGCAGGACCAAGACCCTG

GTGTCCACAGCTGGTGTGGAGAAGGCCCTGTCCCTGGGGGTTACCAAGCTGGT

GGAGAGATGGATCTCTGTGTCTGGTGTGGCTGATGACCCCAACAACTACCTGT

TCTGCCGGGTCAGAAAGAATGGTGTGGCTGCCCCTTCTGCCACCTCCCAACTG

TCCACCCGGGCCCTGGAAGGGATCTTTGAGGCCACCCACCGCCTGATCTATGG

TGCCAAGGATGACTCTGGGCAGAGATACCTGGCCTGGTCTGGCCACTCTGCCA

GAGTGGGTGCTGCCAGGGACATGGCCAGGGCTGGTGTGTCCATCCCTGAAATC

ATGCAGGCTGGTGGCTGGACCAATGTGAACATTGTGATGAACTACATCAGAAA

CCTGGACTCTGAGACTGGGGCCATGGTGAGGCTGCTCGAGGATGGGGACTAA

(SEQ ID NO: 12)

SP10 insulator (SP10ins):

GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG (SEQ ID NO:13)

3xSP10ins:

GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAA

GCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAAGCT ACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG (SEQ ID NO: 14)
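
The 3xSP10ins entry above is named as a triple of the SP10 insulator (SEQ ID NO: 13). As a hedged illustration, the short Python sketch below checks whether one listed sequence consists of exact back-to-back copies of another. The helper `tandem_copies` is a hypothetical name introduced here, and the sequences are copied from the two entries above.

```python
# SEQ ID NO:13 (single SP10 insulator) and SEQ ID NO:14 (3xSP10ins),
# copied from the listing.
SP10INS = "GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG"
SP10INS_3X = (
    "GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAA"
    "GCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAGGAAGCT"
    "ACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG"
)


def tandem_copies(unit, seq):
    """Return n if seq is exactly n back-to-back copies of unit, else None."""
    if not unit or len(seq) % len(unit) != 0:
        return None
    n = len(seq) // len(unit)
    return n if unit * n == seq else None


print("Copies of SP10ins in 3xSP10ins:", tandem_copies(SP10INS, SP10INS_3X))
```

The check is deliberately strict: any spacer or linker bases between repeats would make it return `None`, so it confirms only an exact, uninterrupted tandem arrangement.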

3XFLAG-version 1:

GACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGATTACAAGG

ATGACGATGACAAG (SEQ ID NO: 68)

3XFLAG-version 2:

GACTACAAAGACCATGACGGAGATTATAAAGATCATGACATCGATTACAAGG ATGACGATGACAAG (SEQ ID NO: 15)

2xHA:

TACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGATGTACCCGA

CTATGCCGGCAGT (SEQ ID NO: 69)

10aa:

TCCGGACTCAGATCTGGAGGCTCCGGAGGC (SEQ ID NO: 16)

H2B:

CCAGAGCCAGCGAAGTCTGCTCCCGCCCCGAAAAAGGGCTCCAAGAAGGCGG

TGACTAAGGCGCAGAAGAAAGGCGGCAAGAAGCGCAAGCGCAGCCGCAAGG AGAGCTATTCCATCTATGTGTACAAGGTTCTGAAGCAGGTCCACCCTGACACC GGCATTTCGTCCAAGGCCATGGGCATCATGAATTCGTTTGTGAACGACATTTTC

GAGCGCATCGCAGGAGAGGCTTCCCGCCTGGCGCATTACAACAAGCGCTCGAC CATCACCTCCCGGGAGATCCAGACGGCCGTGCGCCTGCTGCTGCCTGGGGAGT TGGCCAAGCACGCCGTGTCCGAGGGTACTAAGGCCATCACCAAGTACACCAGC GCTAAGTAA (SEQ ID NO: 17)

WPRE3:

ATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACT

ATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGC TATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTT CTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGC

TCGGCTGTTGGGCACTGACAATTCCGTGG (SEQ ID NO: 18)

WPRE:

GCTTATCGATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTA

TTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTT GTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCC TGGTTGCTGTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTG GTGTGCACTGTGTTTGCTGACGCAACCCCCACTGGTTGGGGCATTGCCACCAC CTGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGCGGA ACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCA CTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTCCTTTCCTTGGCTGCTCG CCTATGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGG CCCTCAATCCAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTC TTCCGCGTCTTCGCCTTCGCCCTCAGACGAGTCGGATCTCCCTTTGGGCCGCCT CCCCGCATCGATACCG (SEQ ID NO: 19)

BGHpA:

CGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTT CCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAA ATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGG GCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATG (SEQ ID NO:20)

hGHpA:

ACGGGTGGCATCCCTGTGACCCCTCCCCAGTGCCTCTCCTGGCCCTGGAAGTTG CCACTCCAGTGCCCACCAGCCTTGTCCTAATAAAATTAAGTTGCATCATTTTGT CTGACTAGGTGTCCTTCTATAATATTATGGGGTGGAGGGGGGTGGTATGGAGC AAGGGGCAAGTTGGGAAGACAACCTGTAGGGCCTGCGGGGTCTATTGGGAAC CAAGCTGGAGTGCAGTGGCACAATCTTGGCTCACTGCAATCTCCGCCTCCTGG GTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTTGTTGGGATTCCAGGCATGC ATGACCAGGCTCAGCTAATTTTTGTTTTTTTGGTAGAGACGGGGTTTCACCATA TTGGCCAGGCTGGTCTCCAACTCCTAATCTCAGGTGATCTACCCACCTTGGCCT CCCAAATTGCTGGGATTACAGGCGTGAACCACTGCTCCCTTCCCTGTCCTT (SEQ ID NO:21)

SV40pA:

TGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTA TTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATA AACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAG ATGTGGGAGGTTTTTTAAA (SEQ ID NO:70)

ShortPolyA:

CAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTTTTTGTGTG (SEQ ID NO:71)

IRES2:

CCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGC CGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCTTTTGGCAATG TGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTT CCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTT CCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCA GCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTAT AAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATA GTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAA GGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCA CATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAAC CACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATATGGCCACAACC (SEQ ID NO: 72)

P2A:

GGCAGCGGCGCCACCAACTTCAGCCTGCTGAAGCAGGCCGGCGACGTGGAGG

AGAACCCCGGCCCCGGAGCTAGCGGA (SEQ ID NO:22)
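
The P2A entry above is given as a nucleotide sequence, whereas the T2A, E2A and F2A entries that follow are given as amino acid sequences. The sketch below, offered only as an illustration, translates the P2A nucleotide sequence so that it can be read on the same footing. It assumes the standard genetic code and a reading frame beginning at the first listed base; neither assumption is stated in the listing. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
# Assumption: the listed P2A sequence is read in frame from its first base.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = "".join(dna.split()).upper()  # drop whitespace from line wraps
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO:22 (P2A), copied from the listing.
P2A_DNA = (
    "GGCAGCGGCGCCACCAACTTCAGCCTGCTGAAGCAGGCCGGCGACGTGGAGG"
    "AGAACCCCGGCCCCGGAGCTAGCGGA"
)

p2a_peptide = translate(P2A_DNA)
print(p2a_peptide)
print("Contains NPGP motif:", "NPGP" in p2a_peptide)
```

Under these assumptions the translation begins with a GSG linker and contains the NPGP motif, which the T2A, E2A and F2A peptide entries also share.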

T2A:

(GSG)EGRGSLLTCGDVEENPGP (SEQ ID NO:23)

E2A:

(GSG)QCTNYALLKLAGDVESNPGPP (SEQ ID NO:24)

F2A:

(GSG)VKQTLNFDLLKLAGDVESNPGP (SEQ ID NO:25)

Exemplary Plasmid Backbone 1 - Left ITR:

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCGTCGGGC GACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAGAGAGGGAGTG GCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:26)

Exemplary Plasmid Backbone 1 - Right ITR:

CCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAAAGCCC GGGCGTCGGGCGACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCGCGCAG AGAGGGAGTGGCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:27)

Exemplary Plasmid Backbone 2 - Left ITR:

CATGTCCTGCAGGCAGCTGCGCGCTCGCTCGCTCACTGAGGCCGCCCGGGCAA AGCCCGGGCGTCGGGCGACCTTTGGTCGCCCGGCCTCAGTGAGCGAGCGAGCG CGCAGAGAGGGAGTGGCCAACTCCATCACTAGGGGTTCCT (SEQ ID NO:28)

Exemplary Plasmid Backbone 2 - Right ITR:

AGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCA CTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCC TCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTG (SEQ ID NO:29)

PHP.eB capsid:

MAADGYLPDWLEDNLSEGIREWWALKPGAPQPKANQQHQDNARGLVLPGYKYL GPGNGLDKGEPVNAADAAALEHDKAYDQQLKAGDNPYLKYNHADAEFQERLKE DTSFGGNLGRAVFQAKKRLLEPLGLVEEAAKTAPGKKRPVEQSPQEPDSSAGIGK

SGAQPAKKRLNFGQTGDTESVPDPQPIGEPPAAPSGVGSLTMASGGGAPVADNNE GADGVGSSSGNWHCDSQWLGDRVITTSTRTWALPTYNNHLYKQISNSTSGGSSN DNAYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQVKE

VTDNNGVKTIANNLTSTVQVFTDSDYQLPYVLGSAHEGCLPPFPADVFMIPQYGY LTLNDGSQAVGRSSFYCLEYFPSQMLRTGNNFQFSYEFENVPFHSSYAHSQSLDRL

MNPLIDQYLYYLSKTINGSGQNQQTLKFSVAGPSNMAVQGRNYIPGPSYRQQRVS TTVTQNNNSEFAWPGASSWALNGRNSLMNPGPAMASHKEGEDRFFPLSGSLIFGK QGTGRDNVDADKVMITNEEEIKTTNPVATESYGQVATNHQSDGTLAVPFKAQAQ

TGWVQNQGILPGMVWQDRDVYLQGPIWAKIPHTDGNFHPSPLMGGFGMKHPPPQ ILIKNTPVPADPPTAFNKDKLNSFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTS NYYKSNNVEFAVNTEGVYSEPRPIGTRYLTRNL (SEQ ID NO: 30)

AAV9 VP1 capsid protein:

MAADGYLPDWLEDNLSEGIREWWALKPGAPQPKANQQHQDNARGLVLPGYKYL GPGNGLDKGEPVNAADAAALEHDKAYDQQLKAGDNPYLKYNHADAEFQERLKE DTSFGGNLGRAVFQAKKRLLEPLGLVEEAAKTAPGKKRPVEQSPQEPDSSAGIGK SGAQPAKKRLNFGQTGDTESVPDPQPIGEPPAAPSGVGSLTMASGGGAPVADNNE GADGVGSSSGNWHCDSQWLGDRVITTSTRTWALPTYNNHLYKQISNSTSGGSSN

DNAYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQVKE VTDNNGVKTIANNLTSTVQVFTDSDYQLPYVLGSAHEGCLPPFPADVFMIPQYGY LTLNDGSQAVGRSSFYCLEYFPSQMLRTGNNFQFSYEFENVPFHSSYAHSQSLDRL MNPLIDQYLYYLSKTINGSGQNQQTLKFSVAGPSNMAVQGRNYIPGPSYRQQRVS TTVTQNNNSEFAWPGASSWALNGRNSLMNPGPAMASHKEGEDRFFPLSGSLIFGK

QGTGRDNVDADKVMITNEEEIKTTNPVATESYGQVATNHQSAQAQAQTGWVQN QGILPGMVWQDRDVYLQGPIWAKIPHTDGNFHPSPLMGGFGMKHPPPQILIKNTP VPADPPTAFNKDKLNSFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTSNYYKSN NVEFAVNTEGVYSEPRPIGTRYLTRNL (SEQ ID NO:31)

tet-Transactivator version 2 (tTA2):

ATGTCTAGACTGGACAAGAGCAAAGTCATAAACTCTGCTCTGGAATTACTCAA TGAAGTCGGTATCGAAGGCCTGACGACAAGGAAACTCGCTCAAAAGCTGGGA GTTGAGCAGCCTACCCTGTACTGGCACGTGAAGAACAAGCGGGCCCTGCTCGA TGCCCTGGCAATCGAGATGCTGGACAGGCATCATACCCACTTCTGCCCCCTGG AAGGCGAGTCATGGCAAGACTTTCTGCGGAACAACGCCAAGTCATTCCGCTGT

GCTCTCCTCTCACATCGCGACGGGGCTAAAGTGCATCTCGGCACCCGCCCAAC AGAGAAACAGTACGAAACCCTGGAAAATCAGCTCGCGTTCCTGTGTCAGCAAG GCTTCTCCCTGGAGAACGCACTGTACGCTCTGTCCGCCGTGGGCCACTTTACAC TGGGCTGCGTATTGGAGGATCAGGAGCATCAAGTAGCAAAAGAGGAAAGAGA GACACCTACCACCGATTCTATGCCCCCACTTCTGAGACAAGCAATTGAGCTGTT

CGACCATCAGGGAGCCGAACCTGCCTTCCTTTTCGGCCTGGAACTAATCATATG TGGCCTGGAGAAACAGCTAAAGTGCGAAAGCGGCGGGCCGGCCGACGCCCTT GACGATTTTGACTTAGACATGCTCCCAGCCGATGCCCTTGACGACTTTGACCTT GATATGCTGCCTGCTGACGCTCTTGACGATTTTGACCTTGACATGCTCCCCGGG TAA (SEQ ID NO:32)

GTPase HRas [Homo sapiens]:

MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDIL DTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPM VLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQH KLRKLNPPDESGPGCMSCKCVLS (SEQ ID NO:33)

CN3252 (4586 bp between ITRs)

GCGGCCGCACGCGTGGTACCCTAAATAAAGATGGCTTTTTAGTATTAAAAGTG GAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGG CTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGA CGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGC

TGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAA GAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTA CATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGAGCTCGGGCTG GGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGGGATCC

ACCATGGTCTACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGA

TGTACCCGACTATGCCGGCAGTATGGAGCAAACAGTTTTGGTCCCTCCGGGAC

CAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGCCGCCATTGAGAGGCGC

ATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTC

TATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAA

TCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAAGGCTAAGAATCCAAAA

CCTGACAAGAAAGACGACGACGAAAACGGACCCAAACCTAACTCAGATCTCG

AAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATCCCTCCAGAAATGGTT

TCAGAACCTCTAGAAGATCTCGATCCATACTATATCAATAAAAAGACCTTCAT

CGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTA

TATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCA

TAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAAATTGTGTCTTTAT

GACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAATACACGTTCACTG

GAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTG

AGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTA

CGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAA

CATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATACCAGGATTGAAA

ACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGAT

CCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATG

GGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAATGCGAGCCTTGA

GGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATGGGACCCTCATAA

ACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAGGATTCCAGATAT

CATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGAT

GCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAA

CTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGG

CTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGC

TGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTAC

CTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCA

AGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATT

GAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTG

CATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCA

AGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGAC

GGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAG

ACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGCAAAGGATTCAG

ATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATATTCCTCACCAC

ATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAGACGAAATTCC

AGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAGGCTCAGAAAA

TGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAGAGCAGGCGAG

ACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACAGCAACCTTAG

CCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTAATGGCAAGA

TGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTGGACCTTCAG

TTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACTACTGAGACT

GAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGATTTTTTGGA

AGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCTGACAAACA

CCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTTGGTACAAG TTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGAAAGTCAAG

CACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTATAACGATA

TGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGATGACTGAT

CATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCATCTTTACT

GCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACTTTCAAGAA

GGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTGAATTGGGC

TTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTCTCCGGGT

ATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGATTATCG

GAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCATAGTA

TTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGGACTG

TGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAACGATT

TCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAGAAA

CTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTATTC

ATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTG

TTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAG

ATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGT

CAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAAAACAGAAG

ATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCATG

TCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCG

GAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAA

CGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAG

AAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGAC

CACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGA

GAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATC

AACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTG

CTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGC

TTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCC

ACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCT

GTTGGGCACTGACAATTCCGTGGTGTTTATTTGTGAAATTTGTGATGCTATTGC

TTTATTTGTAACCATCTAGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATT

TGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATT

TTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAACGGACCGAGCG

GCCGC (SEQ ID NO:73)
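
The CN3252 sequence above incorporates elements that are listed individually earlier, such as the beta-globin minimal promoter (SEQ ID NO: 3) and the 2xHA tag (SEQ ID NO: 69). As a minimal sketch, and not a description of any tool used by the applicant, the Python below locates named elements within a construct by exact substring search. To keep the example short it uses only an excerpt of CN3252 spanning the promoter-to-tag junction, copied from the listing; the names `ELEMENTS`, `CN3252_EXCERPT` and `locate` are hypothetical.

```python
# Element sequences copied from the listing.
ELEMENTS = {
    "pBGmin": "GGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTC" "TG",
    "2xHA": (
        "TACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGATGTACCCGA"
        "CTATGCCGGCAGT"
    ),
    "SP10ins": "GAAGCTACCCCTAACACACTATTCTACACACAGAAAATGCTCTTCACTAG",
}

# Excerpt only (not the full SEQ ID NO:73): four consecutive line fragments
# of CN3252 around the promoter and the start of the coding sequence.
CN3252_EXCERPT = (
    "CATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGAGCTCGGGCTG"
    "GGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGGGATCC"
    "ACCATGGTCTACCCGTATGATGTCCCGGATTACGCTGGCAGCTACCCATACGA"
    "TGTACCCGACTATGCCGGCAGTATGGAGCAAACAGTTTTGGTCCCTCCGGGAC"
)


def locate(elements, construct):
    """Map each element name to its (start, end) span, or None if absent.

    Spans are 0-based and half-open; only the first exact match is reported.
    """
    spans = {}
    for name, seq in elements.items():
        start = construct.find(seq)
        spans[name] = (start, start + len(seq)) if start >= 0 else None
    return spans


hits = locate(ELEMENTS, CN3252_EXCERPT)
for name, span in hits.items():
    print(name, span)
```

Exact matching is the simplest choice and is enough for elements that were inserted unchanged. It would miss elements carrying point mutations, and it reports nothing for elements, such as SP10ins here, that do not occur in the searched excerpt.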

CN3254 (4091 bp between ITRs)

GCGGCCGCACGCGTGGTACCCTAAATAAAGATGGCTTTTTAGTATTAAAAGTG

GAAGAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGG

CTACATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGCTAAATAA

AGATGGCTTTTTAGTATTAAAAGTGGAAGAAAATTACAGGTAATTATCTTTGA

CGGTAAAAACGCTGTAATCAGCGGGCTACATGAAAAATTACTCTAATTATGGC

TGCATTTAAGAGAATGGCTAAATAAAGATGGCTTTTTAGTATTAAAAGTGGAA

GAAAATTACAGGTAATTATCTTTGACGGTAAAAACGCTGTAATCAGCGGGCTA

CATGAAAAATTACTCTAATTATGGCTGCATTTAAGAGAATGGAGCTCGGGCTG

GGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGGGATCC

ACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGA

TATCGGAGTCGAGAAAGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGT

CCAATTGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACTACCTT AAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTACTAGCAGC

TACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTA

TTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCT

CCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTACATGTC

TTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATC

TGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGG

AATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGT

GGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTT

TGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGC

CAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGA

CCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGA

TCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCAACGAA

AGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACATATTC

ATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACG

AACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTG

ACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAAC

CCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAG

TAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCG

TTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGG

CAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAGG

ATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGC

AAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTT

CACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCA

GTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATAT

GTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTG

TTCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGG

ACAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAA

AAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTT

TCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTAT

GATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATC

TGAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTT

CACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAAT

CGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTT

CTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATA

CGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAAT

TCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGT

TTGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCG

CTTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTC

GGCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGA

TTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGT

GAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCT

TCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATAT

AGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACT

TTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGA

CGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGG

AGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGAC CTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTC

ACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGA

TGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATC

ACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCG

AGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCT

ACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACAT

GATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTA

TGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCG

AAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAAAGCCGGCG

ACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGATTACAAGGA

TGACGATGACAAGTAATGATATCATAATCAACCTCTGGATTACAAAATTTGTG

AAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACG

CTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTC

CTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTG

CCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGG

TGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTA

TTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATA

AACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAG

ATGTGGGAGGTTTTTTAAACGGACCGAGCGGCCGC (SEQ ID NO: 74)

CN3683 (4528 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA

GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC

GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC

ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC

ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC

GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG

CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC

GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC

GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA

GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA

GCTAGCGCTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCT

GGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGT

TTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGC

CGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTA

CCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAG

CTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAA

GGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAA

CCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATC

CCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAAT

AAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCT

ACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTA

AGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAA

ATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAAT

ACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGG

GGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACT

TCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGT CTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATA

CCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTC

AGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTC

CAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAA

TGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATG

GGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAG

GATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGA

AATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGG

AAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCT

ATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGAC

ACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTT

GGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGA

GGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTT

CAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAG

CTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTT

TCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACG

GAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGA

GAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGC

AAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATA

TTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAG

ACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAG

GCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAG

AGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACA

GCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTA

ATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTG

GACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACT

ACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGA

TTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCT

GACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTT

GGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGA

AAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTA

TAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGA

TGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCA

TCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACT

TTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTG

AATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTC

TCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGA

TTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCA

TAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGG

ACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAAC

GATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAG

AAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTTACGGTA

TTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGT

TGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGACGACAACG

AGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTAC

GTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAAAACAGAA GATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAAGATTCAT

GTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCG

GAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAAC

GGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAGA

AGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGACC

ACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGAG

AGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATCA

ACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGC

TCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCT

TCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCA

CGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTG

TTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGT

GTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 75)

CN3684 (4033 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA

GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC

GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC

ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC

ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC

GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG

CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC

GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC

GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA

GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA

GCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGG

TACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTGA

AAAACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACCACAGAGATAGGC

AAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCAC

AGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTT

GGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTT

CATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGAC

GAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCC

ATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTC

CGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCA

TCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGT

CGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTC

AACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGG

TGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGAC

GTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATC

TACATCGACCAACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGG

TTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTT

ACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACG

TCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTA

TTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGG

TTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCAT

TATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGT GTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACACAACTACAGGAGA

TAGATTTGATATTGAGGATGTAAACAACCACACCGACTGTTTGAAGTTGATAG

AGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGACAATGT

CGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATGGATGG

ACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTAT

GAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCAT

TCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATTTCAATCAGCAGA

AAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGAACAGAAGAAATA

CTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTA

GACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTG

TTTGATATATCTATTATGATTCTGATATGTCTGAATATGGTTACGATGATGGTT

GAGACTGATGATCAATCTGAATACGTTACGACGATACTTAGCCGAATTAACTT

GGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAACTGATTAGTTTAAG

GCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTT

TCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCA

ACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATC

AAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCA

GCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTTTATATATGCGATCT

TCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCGATGACATG

TTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCTTTCAAATTACCACGT

CAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGAT

TGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAA

TCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTT

GTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAG

GAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTG

GGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTTGAGAAGCTCTCAC

AGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTAC

AACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTT

GATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGA

CGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAG

TGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCT

GCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAACTGTTAA

GCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGC

TGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTCCATTACAGAA

AAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTC

ACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCA

AAGGCAAAGCCGGCGACTACAAAGACCATGACGGAGACTATAAAGATCATGA

CATCGATTACAAGGATGACGATGACAAGTAATGATATCATAATCAACCTCTGG

ATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTA

CGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTAT

GGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAA

CTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACT

GACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTT

TTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 76)

CN3251 (Cassette length 5956 bp, no ITRs)

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAG TTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGC

CTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTT

CCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT

ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGC

CCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC

ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCT

ATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTT

TGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTT

TTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCAT

TGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCT

CTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGACT

CACTATAGGGAGACCCAAGCTGGCTAGCCACCATGGTCTACCCGTATGATGTC

CCGGATTACGCTGGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTAT

GGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCG

GGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCT

ACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTAT

TCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTC

CCACAGGAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAA

ACGGACCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATC

TATGGTGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCA

TACTATATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTT

CCGGTTTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGC

AAGATTGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGT

ACAATCCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACC

AAGAACGTAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAA

GATAATCGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTG

GAATTGGCTTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGA

TCTTGGCAACGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAA

CCATAAGTGTCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGT

GTAAAGAAGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTC

GCACTCATCGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCA

ATGGCCACCAACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATC

ACTGTTAACTATAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTG

GAAATCCTACATTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGA

CGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATAT

GTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACAT

TTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAA

TTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGT

TCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTC

GCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGA

AAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGC

TGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTG

CAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAG

TCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAA

TCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGG

ACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACT TATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTC

ACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAG

GGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTT

TTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGC

GAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAG

CTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCG

TCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGC

CGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTC

CATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCT

ATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAA

GTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCA

CCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTT

GTCGACCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATG

GAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTG

GTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGAC

CCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACA

CTTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGA

AGTTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAAC

ATGCTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATT

GGTGTTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGG

GAAGTCATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGA

GGTGGCACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTG

TGGCGAGTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCG

ATGTGCCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTG

AATTTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCA

CTGATGACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCAC

AAAGGCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTT

CATACGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATA

ATAAGAAAGATTCATGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATAC

GGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTA

TACCGTCGATAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATA

ATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATA

CGGGCAACCAAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAAT

AGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCAT

AATGAAGGCCTCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCT

TGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGT

CTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTC

CTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGA

AGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACC

CTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAA

GCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTG

TGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAAC

AAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGG

GCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAG

GCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATAT

GGCCACAACCCGCGGCCGCCACCATGGTGAGCAAGGGCGAGGAGCTGTTCAC CGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT

TCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTG

AAGCTGATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGAC

CACCCTGGGCTACGGCGTGCAGTGCTTCGCCCGCTACCCCGACCACATGAAGC

AGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACC

ATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGA

GGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAG

GACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACG

TCTATATCACCGCCGACAAGCAGAAGAACGGCATCAAGGCCAACTTCAAGATC

CGCCACAACATCGAGGACGGCGGCGTGCAGCTCGCCGACCACTACCAGCAGA

ACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGC

TACCAGTCCAAGCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCT

GCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACA

AGTAAGAATTCCACCACACTGGACTAGTGGATCCGAGCTCGGTACCAAGCTTA

AGTTTAAACCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTT

GTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTC

CTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT

ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACA

ATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGC (SEQ ID NO: 77)

CN3253 (Cassette length 5486 bp, no ITRs)

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAG

TTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGC

CTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTT

CCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTT

ACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGC

CCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTAC

ATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCT

ATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTT

TGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTT

TTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCAT

TGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGC

TCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGAC

TCACTATAGGGAGACCCAAGCTGGCTAGCCACCATGGTCAAGATCATCTCCAG

GAAGTCTCTGGGTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACA

ATTTTCTCCTGAAAAACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACC

ACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACAA

GTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCT

TTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTT

TTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGA

AGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTC

ACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATAC

GGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAAC

GAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGA

GGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCA

CGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGT

CGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACA TAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTG

GCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGA

ATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATG

GGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTT

CCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAG

CGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCC

GCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGA

GCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATAT

TCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACA

CAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGT

TTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCA

ACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCA

AAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTG

CAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATC

ATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATT

TCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGA

ACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAA

AAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGT

AACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGT

TACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTA

GCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAAC

TGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCG

TTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGT

ACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAA

TTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCA

TGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTT

TATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGG

GAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCT

TTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACA

GTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAG

GGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATA

ATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTT

CTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATG

TTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTT

GAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACA

GCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACC

GAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT

CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG

AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACA

AGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCA

AACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGG

TGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGA

ACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCC

TCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCA

AGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAG

ACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGAGCC CGGGCAGTTCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTG GAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCT

TTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCT AGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAA GGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCC

TTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAG CCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGT GAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACA

AGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGG CCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGG CCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATATG GCCACAACCCGCGGCCGCCACCATGGTCTCCAAAGGAGAAGCGGTCATTAAA GAGTTCATGAGGTTCAAGGTTCATATGGAAGGCTCCATGAATGGTCATGAGTT CGAGATTGAAGGGGAGGGTGAGGGGAGACCTTATGAGGGCACTCAGACAGCG AAATTGAAGGTGACAAAGGGAGGACCTCTCCCGTTCAGTTGGGACATATTGTC ACCGCAATTTATGTATGGTTCTAGAGCCTTCACTAAGCACCCTGCCGACATCCC AGATTACTACAAGCAATCCTTCCCTGAGGGCTTTAAGTGGGAGAGAGTAATGA

ATTTTGAAGATGGCGGGGCAGTCACAGTAACACAAGATACATCCCTGGAAGAT GGAACACTTATCTACAAAGTTAAGCTCAGAGGAACGAATTTTCCACCGGACGG TCCAGTGATGCAAAAAAAAACAATGGGTTGGGAAGCATCTACAGAGCGACTG TACCCTGAAGACGGTGTGCTGAAGGGGGACATCAAAATGGCCCTGCGACTTAA GGATGGAGGGCGCTATTTGGCAGATTTCAAGACTACTTACAAAGCCAAAAAGC CTGTACAAATGCCTGGAGCTTACAACGTGGATAGGAAGCTTGATATTACCAGT CACAATGAAGATTATACAGTGGTAGAACAATATGAACGCTCAGAAGGTCGCC ACAGCACTGGAGGCATGGATGAGTTGTACAAGAGCGCTGATCCAAAGAAGAA AAGGAAAGTTGATCCCAAAAAGAAGAGGAAAGTAGATCCAAAAAAGAAGCG

AAAAGTAGGGTACAAGAAGTGAGTTTAAACCGCTGATCAGCCTCGACTGTGCC TTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTG GAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCA TTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCA AGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTC TATGGC (SEQ ID NO: 78)

CN3677 (4583 bp between ITRs)

GCGGCCGCACGCGTAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGT GCGCACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGC GCGCGCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCC

CCCGCAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCC AGCCGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCAT CTGCGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGG AGGAGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGATCGACTGG ATCCTTCGAAGCTAGCGCTGCCACCATGGTCTACCCGTATGATGTCCCGGATTA CGCTGGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAA CAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTC TTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCA GCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTC

CAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGG AAAAGGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACC

CAAACCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTG

ATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATA

TCAATAAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTT

TCTGCTACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTG

CGATTAAGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCC

TTACAAATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACG

TAGAATACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCG

CCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGC

TTGACTTCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAA

CGTGTCTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTG

TCATACCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAG

CTTTCAGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCG

GGCTCCAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCA

ACAAATGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTA

TAATGGGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACA

TTCAGGATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTG

CGGAAATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAG

CAGGAAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTT

TCCTATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGC

TGACACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCT

TTTTGGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATA

CGAGGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGA

ATTTCAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAA

GCAGCTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAG

GCTTTCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGG

AACGGAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAG

AAGAGAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAG

ACGCAAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAAC

GATATTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCAC

CGAGACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGAT

GTAGGCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAA

TGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGG

AACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCC

TGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGT

AGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAA

CCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCT

ATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTC

AATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTC

CTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTG

GCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCT

GGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTA

TCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCAC

TGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTA

CTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTG

GTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAG ACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCAT

CAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGG

CTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCAT

ACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCAC

ATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGT

GGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGCCTT

ACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTTATTT

CTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGATGAC

GACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAGGCGT

TGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATACGAA

AACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAAGAAA

GATTCATGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTT

CCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGA

TAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAG

AACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACC

AAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAATAGATGAGAT

ATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATC

AAAGAGACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATG

AAAGAGACCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCC

TTGAAACCCAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCT

GAATGGAAACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGT

AGCTATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTT

AACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATC

ATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTT

AGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAG

GGGCTCGGCTGTTGGGCACTGACAATTCCGTGGAATAAAAGATCTTTATTTTCA

TTAGATCTGTGTGTTGGTTTTTTGTGTGCGGACCGAGCGGCCGC (SEQ ID NO: 79)

CN3678 (4088 bp between ITRs)

GCGGCCGCACGCGTAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGT

GCGCACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCG

CGCGCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCC

CGCAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAG

CCGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG

CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG

AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGATCGACTGGATCC

TTCGAAGCTAGCGCTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGGGT

ACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTGAAA

AACGGACTCGTGGCGTCCAATTGCATGTCGAACCATACCACAGAGATAGGCAA

GGACCTTGACTACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAG

GTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGG

ATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA

CCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGA

GCGATTACATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGC

CGTAGGAGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATC

AGACTTGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGG

GCAGCACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCT GAGGAAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTC

AAGTGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCT

CCGCAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGT

TTTCATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGAC

CAACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATA

CATATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTAT

TTCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGT

CATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTCA

GAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGA

GAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCT

CGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCA

GGCAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATATTGAG

GATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGC

AAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTC

ACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTACGCTGCAGT

AGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTA

CCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCA

TTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAG

ACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTA

GGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGG

CATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTG

ATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATAC

GTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCG

AATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAA

CATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAAT

TGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCC

GAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGC

TTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTT

TTGGTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAAC

GGGAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGA

TCTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT

TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATC

TGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTAC

ATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAA

ATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGA

GATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGA

ATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCA

CAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGAC

CGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT

CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG

AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACAA

GAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCAAA

CGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGGTGG

TGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGAACTC

CATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATA

CGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCAAGGATG AGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAGACTATAA

AGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGATATCAAAGAG

ACCGGTTCACTGTGACAGTAAAAGAGACCGGTTCACTGTGAGAATGAAAGAGA

CCGGTTCACTGTGATCGGAAAAGAGACCGGTTCACTGTGAGCGGCCTTGAAACC

CAGCAGACAATGTAGCTCAGTAGAAACCCAGCAGACAATGTAGCTGAATGGAA

ACCCAGCAGACAATGTAGCTTCGGAGAAACCCAGCAGACAATGTAGCTATAAT

CAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTG

CTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCT

TCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCAC

GGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTT

GGGCACTGACAATTCCGTGGAATAAAAGATCTTTATTTTCATTAGATCTGTGTGT

TGGTTTTTTGTGTGCGGACCGAGCGGCCGC (SEQ ID NO: 80)

CN4541 (4790 bp between ITRs)

GCGGCCGCACGCGTGAAATCATAAATGCTGAGGGTAGTCTGCCTCAGGTACAC

ACTGAGAAACTGCTTTAATGTAACCTGACCCACGGTTATTAGTGAAAATATCA

CTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCTGCTTTAATGGAAAAGATC

CGGGTTCACACTAATCAGGCCCAACGGAAGGCCATATTAGCAATTTGGCAGGT

ACCCGAGGGCCATACCTAATCTGCATAAAATGAAGCAGATTGCAACCGCCCTC

ATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCATAAAATCTCAGAGGGA

GAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGCCTTTATTTTCATTCTTT

TTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAGAAGAATAACAAAAT

AAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAATGTTTAATTAAAAA

AAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCTCTAGTATACTTTC

AAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGTTTTGTGAAGATACGCGAGC

TCGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCT

GGGATCCAGATCTTTCGAAGCTAGCGCTACCGGTCGCCACCATGGTCTACCCG

TATGATGTCCCGGATTACGCTGGCAGCTACCCATACGATGTACCCGACTATGC

CGGCAGTATGGAGCAAACAGTTTTGGTCCCTCCGGGACCAGACAGTTTCAATT

TCTTTACTCGGGAGAGTCTTGCCGCCATTGAGAGGCGCATAGCTGAGGTAAGT

ACTAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAG

GCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTC

TTATCTTCCTCCCACAGGAAAAGGCTAAGAATCCAAAACCTGACAAGAAAGAC

GACGACGAAAACGGACCCAAACCTAACTCAGATCTCGAAGCTGGAAAGAATC

TCCCATTCATCTATGGTGATATCCCTCCAGAAATGGTTTCAGAACCTCTAGAAG

ATCTCGATCCATACTATATCAATAAAAAGACCTTCATCGTTCTGAACAAAGGA

AAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTATATTCTCACACCATTTA

ATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCATAGTCTGTTCAGTATGC

TGATTATGTGTACAATCCTTACAAATTGTGTCTTTATGACTATGTCTAACCCGC

CGGATTGGACCAAGAACGTAGAATACACGTTCACTGGAATCTATACGTTCGAG

TCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTGAGGATTTCACTTTCCTC

CGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTACGTTCGCTTACGTTACT

GAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAACATTCAGAGTGCTTAG

AGCACTTAAAACCATAAGTGTCATACCAGGATTGAAAACGATCGTGGGAGCTC

TGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGATCCTTACTGTCTTCTGTC

TTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATGGGTAACCTCAGAAACA

AATGCATTCAATGGCCACCAACAAATGCGAGCCTTGAGGAACATAGCATAGA AAAGAATATCACTGTTAACTATAATGGGACCCTCATAAACGAAACCGTGTTCG

AATTTGACTGGAAATCCTACATTCAGGATTCCAGATATCATTATTTTCTTGAGG

GCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAATGTCCTG

AAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATACACATCT

TTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACAAGACT

TTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTATATG

ATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGATAC

TTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAAGA

AGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAAA

CAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAG

AGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAAT

TGAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCA

GAAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAG

TGAATCAGAGGACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAA

ATAGATTGACTTATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGT

ATACGCGGGTCACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCA

TTCCGGGGAAGGGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATG

AGCATTCCACTTTTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCA

CGAAGACATGGCGAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCA

GTAGAATGCTAGCTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATT

GTAATGGGGTCGTCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTG

GACAATTGCTGCCGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACG

TTCTTCAAGCTTCCATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCA

AAGAGCTATGTCTATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGG

AGTCACGCCAGAAGTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGA

TTTGGGATTGTTCACCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCG

TAATGGATCCTTTTGTCGACCTGGCTATAACGATATGTATCGTCCTGAACACAC

TCTTCATGGCTATGGAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTA

CCGTGGGTAATCTGGTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGA

TTATTGCAATGGACCCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATG

GTTTTATCGTCACACTTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGC

TCTCAGTTCTTAGAAGTTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCT

GGCCTACTTTGAACATGCTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTT

GGCAATCTGACATTGGTGTTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGA

ATGCAGTTGTTTGGGAAGTCATACAAGGACTGTGTGTGCAAGATAGCGTCCGA

CTGTCAACTTCCGAGGTGGCACATGAACGATTTCTTTCATTCATTCCTCATTGT

GTTTCGGGTCCTCTGTGGCGAGTGGATAGAAACTATGTGGGACTGTATGGAAG

TAGCTGGGCAGGCGATGTGCCTTACGGTATTCATGATGGTCATGGTCATCGGA

AATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCG

ATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTTCAGATAGCT

GTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAATCTATGAATT

CATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAATCAAACCCC

TTGATGATCTCAATAATAAGAAAGATTCATGTCTCAGTTATGACACAGAAATC

TTGACGGTGGAATACGGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCAT

AGAGTGTACGGTGTATACCGTCGATAAGAACGGATTTGTCTACACACAGCCTA

TCGCACAATGGCATAATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAG GACGGATCAATCATACGGGCAACCAAAGACCACAAGTTTATGACAACAGATG

GACAGATGTTGCCAATAGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAA

GTGGATGGTCTGCCATAATGATATCATAATCAACCTCTGGATTACAAAATTTGT

GAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATAC

GCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCT

CCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCT

GCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTG

GTGTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTT

ATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAAT

AAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGA

GATGTGGGAGGTTTTTTAAACGGACCGAGCGGCCGC (SEQ ID NO: 81)

CN4542 (4295 bp between ITRs)

GCGGCCGCACGCGTGAAATCATAAATGCTGAGGGTAGTCTGCCTCAGGTACAC

ACTGAGAAACTGCTTTAATGTAACCTGACCCACGGTTATTAGTGAAAATATCA

CTTTTGTTGTTACCTTATTCCCAACAAATTCATTTCTGCTTTAATGGAAAAGATC

CGGGTTCACACTAATCAGGCCCAACGGAAGGCCATATTAGCAATTTGGCAGGT

ACCCGAGGGCCATACCTAATCTGCATAAAATGAAGCAGATTGCAACCGCCCTC

ATCTTTTTTATTTTTAAACTGGTTTTTGAAGCAGAGCATAAAATCTCAGAGGGA

GAGACAGAAGATGCTAGTGCATACATTTTCCTTCATGCCTTTATTTTCATTCTTT

TTGCACAAACCATCTTCCTGAATGGCTGTTTACCTAAAGAAGAATAACAAAAT

AAAAGGTGCTAGGAAATGGAGTAGGCAGAGATCACAAATGTTTAATTAAAAA

AAAAAAAAGTCATGTACTTTCATAGATATTCACAATCCTCTCTAGTATACTTTC

AAATCAGTTTTAATTTCAGTTTAGTGTTTTTATGTTTTGTGAAGATACGCGAGC

TCGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCT

GGGATCCAGATCTTTCGAAGCTAGCGCTACCGGTCGCCACCATGGTCAAGATC

ATCTCCAGGAAGTCTCTGGGTACACAGAATGTCTACGATATCGGAGTCGAGAA

AGACCACAATTTTCTCCTGAAAAACGGACTCGTGGCGTCCAATTGCATGTCGA

ACCATACCACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGT

ACCACAAGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACC

ATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCT

AGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCT

GTAGAGAAGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCC

GTCCCTCACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCT

CAATACGGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAG

TTGAACGAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCC

CGTCGAGGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGT

GTTTCACGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAA

GAGGGTCGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGT

CGAACATAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGG

TGCTTTGGCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTA

TGCTGGAATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCC

TGAAATGGGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGG

CTCGATTTCCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTC

GGATATAGCGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAG

GCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCAC

TGTTGGGAGCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTG GCTGATATTCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTG

CATTAACACAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACA

CCGACTGTTTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGT

AAAAGTCAACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGC

CACATTCAAAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACG

TAGAGTTGCAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTA

ATTTTTATCATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCA

TCGACAATTTCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATG

ACAGAGGAACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAA

AGCCCCAAAAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTC

GACTTCGTAACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTG

AATATGGTTACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGAC

GATACTTAGCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGT

ACTTAAACTGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTT

TGATTTCGTTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATA

GAAAAGTACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATC

GGACGAATTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTT

CGCTCTCATGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTG

GTAATGTTTATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGG

GAGGTGGGAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGAT

CTGTCTCTTTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGAT

TCTCAACAGTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCAT

CTGTAAAGGGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCT

ACATTATAATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGG

AAAATTTTTCTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGAT

TTTGAGATGTTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTT

ATGGAATTTGAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAA

TCTTCCACAGCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGT

CTGGGGACCGAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCT

TGGGCGAGTCTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATT

CATGGCTTCGAATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAA

AAGAAAACAAGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGG

CACTTGCTCAAACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAA

AATAAAAGGTGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAA

TCAATGAGAACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCC

TGTCCTCCCTCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACA

AGAGGGCAAGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCA

TGACGGAGACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAG

TAATGATATCATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGG

TATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCT

TTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAAT

CCTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCT

GGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTTATTTGTGAA

ATTTGTGATGCTATTGCTTTATTTGTAACCATCTAGCTTTATTTGTGAAATTTGT

GATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAAC AACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTT

TAAACGGACCGAGCGGCCGC (SEQ ID NO: 82)

CN4217 (4249 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGAG

TGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGACG

ACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGCAT

CCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGCACT

GCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGCGCCA

CCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCGCAAAC

TCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCCGGACC

GCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGCGCTGCG

GCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGAGTCGTGT

CGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAAGCTAGCG

CTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCTGGCAGCT

ACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGTTTTGGTCC

CTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGCCGCCATTGA

GAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGCT

TTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTT

TGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAAGGCTAAGAATCC

AAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAACCTAACTCAGATC

TCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATCCCTCCAGAAATGG

TTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAATAAAAAGACCTTCA

TCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCTACTTCTGCTCTCTA

TATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTAAGATACTGGTGCAT

AGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAAATTGTGTCTTTATGA

CTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAATACACGTTCACTGGAA

TCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGGGGTTCTGTCTTGAGGA

TTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACTTCACCGTTATTACGTTC

GCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGTCTGCACTCAGAACATTCA

GAGTGCTTAGAGCACTTAAAACCATAAGTGTCATACCAGGATTGAAAACGATCG

TGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTCAGATGTAATGATCCTTACTG

TCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTCCAGCTGTTTATGGGTAACCT

CAGAAACAAATGCATTCAATGGCCACCAACAAATGCGAGCCTTGAGGAACATA

GCATAGAAAAGAATATCACTGTTAACTATAATGGGACCCTCATAAACGAAACC

GTGTTCGAATTTGACTGGAAATCCTACATTCAGGATTCCAGATATCATTATTTTC

TTGAGGGCTTCTTGGACGCACTTTTGTGCGGAAATTCAAGTGATGCTGGTCAAT

GTCCTGAAGGTTATATGTGTGTTAAAGCAGGAAGAAACCCAAACTACGGATAC

ACATCTTTCGATACATTTTCTTGGGCTTTCCTATCTCTTTTTCGGCTTATGACACA

AGACTTTTGGGAAAATTTGTATCAGCTGACACTCCGAGCGGCTGGAAAAACTTA

TATGATCTTCTTCGTTCTTGTAATCTTTTTGGGATCCTTCTACCTCATCAATTTGA

TACTTGCAGTTGTCGCTATGGCATACGAGGAGCAAAATCAAGCAACGCTAGAA

GAAGCGGAGCAGAAAGAGGCTGAATTTCAACAGATGATTGAGCAATTGAAGAA

ACAACAGGAAGCTGCACAACAAGCAGCTACTGCTACTGCATCTGAACATTCTAG

AGAGCCAAGTGCAGCTGGAAGGCTTTCTGATAGTTCAAGTGAAGCATCTAAATT

GAGTTCTAAGTCAGCAAAGGAACGGAGAAATAGACGGAAAAAACGAAAGCAG

AAGGAGCAATCTGGAGGAGAAGAGAAGGACGAAGACGAGTTTCAAAAAAGTG AATCAGAGGACTCAATTAGACGCAAAGGATTCAGATTTAGTATCGAAGGAAAT

AGATTGACTTATGAAAAACGATATTCCTCACCACATCAGTCACTCCTGAGTATA

CGCGGGTCACTCTTTTCACCGAGACGAAATTCCAGAACTTCACTCTTCTCATTCC

GGGGAAGGGCTAAGGATGTAGGCTCAGAAAATGATTTCGCAGACGATGAGCAT

TCCACTTTTGAAGATAATGAGAGCAGGCGAGACAGTCTCTTTGTACCACGAAGA

CATGGCGAAAGAAGGAACAGCAACCTTAGCCAGACTAGTCGGTCCAGTAGAAT

GCTAGCTGTATTCCCTGCTAATGGCAAGATGCATTCCACCGTTGATTGTAATGG

GGTCGTCTCGTTAGTAGGTGGACCTTCAGTTCCTACCTCACCGGTTGGACAATTG

CTGCCGGAGGGAACCACTACTGAGACTGAAATGAGAAAACGACGTTCTTCAAG

CTTCCATGTGTCTATGGATTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTAT

GTCTATAGCTTCAATCCTGACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCA

GAAGTGCCCTCCTTGTTGGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGT

TCACCTTATTGGCTGAAAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTT

TTGTCGACCTGGCTATAACGATATGTATCGTCCTGAACACACTCTTCATGGCTAT

GGAGCATTATCCGATGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCT

GGTTTTCACTGGCATCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGAC

CCCTACTACTACTTTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACAC

TTTCTTTGGTTGAATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAG

TTTCAGACTTCTCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATG

CTCATCAAGATTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTG

TTGGCTATCATAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGT

CATACAAGGACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGC

ACATGAACGATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGA

GTGGATAGAAACTATGTGGGACTGTATGGAAGTAGCTGGGCAGGCGATGTGTCT

CAGTTATGACACAGAAATCTTGACGGTGGAATACGGGTTTCTTCCGATCGGAAA

GATTGTTGAGGAGCGCATAGAGTGTACGGTGTATACCGTCGATAAGAACGGATT

TGTCTACACACAGCCTATCGCACAATGGCATAATAGAGGAGAACAAGAAGTCTT

CGAATATTGTTTGGAGGACGGATCAATCATACGGGCAACCAAAGACCACAAGT

TTATGACAACAGATGGACAGATGTTGCCAATAGATGAGATATTTGAGAGGGGA

CTTGATCTCAAGCAAGTGGATGGTCTGCCATAATGATATCATAATCAACCTCTG

GATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTA

CGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTAT

GGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAA

CTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACT

GACAATTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTT

TTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO:83)

CN4218 (4312 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA

GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC

GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC

ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC

ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC

GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG

CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGC

CGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG

CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG

AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGA

AGCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGG

GTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTG

AAAAACGGACTCGTGGCGTCCAATTGCCTTACGGTATTCATGATGGTCATGGT

CATCGGAAATCTTGTTGTATTGAATTTATTTCTCGCGTTGTTGTTGAGTTCATTT

TCCGCCGATAATTTGGCTGCCACTGATGACGACAACGAGATGAATAATCTTCA

GATAGCTGTAGACCGGATGCACAAAGGCGTTGCCTACGTCAAACGAAAAATCT

ATGAATTCATACAGCAATCCTTCATACGAAAACAGAAGATTCTGGATGAAATC

AAACCCCTTGATGATCTCAATAATAAGAAAGATTCATGCATGTCGAACCATAC

CACAGAGATAGGCAAGGACCTTGACTACCTTAAAGACGTGAACGGTACCACA

AGTGGAATAGGCACAGGTAAGTACTAGCAGCTACAATCCAGCTACCATTCTGC

TTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCT

TTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGCTCCTCTGTAGAGA

AGTACATCATAGACGAGAGCGATTACATGTCTTTCATCAACAACCCGTCCCTC

ACTGTCACCGTTCCCATCGCCGTAGGAGAATCTGACTTCGAGAATCTCAATAC

GGAGGATTTCAGCTCCGAATCAGACTTGGAGGAATCAAAGGAGAAGTTGAAC

GAAAGTTCAAGTTCATCCGAGGGCAGCACCGTGGACATAGGCGCCCCCGTCGA

GGAACAACCTGTAGTCGAGCCTGAGGAAACTTTGGAACCCGAAGCGTGTTTCA

CGGAGGGGTGTGTTCAACGCTTCAAGTGTTGCCAAATTAACGTTGAAGAGGGT

CGTGGAAAACAATGGTGGAACCTCCGCAGGACCTGTTTCCGGATCGTCGAACA

TAATTGGTTCGAGACGTTCATAGTTTTCATGATCTTGCTTTCATCTGGTGCTTTG

GCATTCGAGGATATCTACATCGACCAACGAAAGACCATAAAAACTATGCTGGA

ATATGCAGACAAGGTTTTCACATACATATTCATCCTTGAAATGCTCCTGAAATG

GGTAGCGTATGGTTACCAGACTTATTTCACGAACGCATGGTGCTGGCTCGATTT

CCTGATTGTCGACGTCTCCCTGGTGTCATTGACTGCTAACGCACTCGGATATAG

CGAACTAGGCGCTATTAAGAGTCTCAGAACCCTGAGAGCATTGAGGCCCCTCC

GCGCGCTCTCTCGGTTTGAGGGAATGAGAGTAGTCGTTAATGCACTGTTGGGA

GCGATACCTTCCATTATGAACGTGCTTCTCGTTTGTCTCATCTTCTGGCTGATAT

TCTCTATTATGGGTGTGAACTTGTTCGCAGGCAAATTTTACCACTGCATTAACA

CAACTACAGGAGATAGATTTGATATTGAGGATGTAAACAACCACACCGACTGT

TTGAAGTTGATAGAGAGAAACGAGACCGCAAGATGGAAGAATGTAAAAGTCA

ACTTCGACAATGTCGGCTTTGGATATCTTTCACTGCTGCAAGTAGCCACATTCA

AAGGATGGATGGACATTATGTACGCTGCAGTAGATTCCCGAAACGTAGAGTTG

CAACCGAAGTATGAAGAAAGTTTGTATATGTACCTCTACTTCGTAATTTTTATC

ATCTTTGGCTCATTCTTCACACTTAACCTGTTCATTGGTGTAATCATCGACAATT

TCAATCAGCAGAAAAAGAAATTTGGTGGACAAGACATCTTCATGACAGAGGA

ACAGAAGAAATACTATAATGCAATGAAAAAACTAGGGTCCAAAAAGCCCCAA

AAACCTATTCCTAGACCGGGCAACAAGTTTCAAGGCATGGTTTTCGACTTCGT

AACTAGACAGGTGTTTGATATATCTATTATGATTCTGATATGTCTGAATATGGT

TACGATGATGGTTGAGACTGATGATCAATCTGAATACGTTACGACGATACTTA

GCCGAATTAACTTGGTATTCATTGTTCTTTTCACGGGCGAATGTGTACTTAAAC

TGATTAGTTTAAGGCACTATTATTTCACAATCGGTTGGAACATTTTTGATTTCG

TTGTGGTCATACTTTCCATTGTTGGCATGTTTCTTGCTGAATTGATAGAAAAGT

ACTTCGTCAGTCCAACACTTTTCCGAGTTATACGGCTTGCCCGAATCGGACGAA

TTCTCAGGCTAATCAAAGGTGCTAAAGGAATTCGTACACTGCTTTTCGCTCTCA

TGATGTCACTGCCAGCTCTTTTCAACATCGGTTTGTTACTATTTTTGGTAATGTT TATATATGCGATCTTCGGCATGAGTAATTTCGCTTATGTTAAACGGGAGGTGG GAATCGATGACATGTTTAATTTTGAGACATTCGGCAATTCTATGATCTGTCTCT

TTCAAATTACCACGTCAGCTGGATGGGACGGATTGCTTGCTCCGATTCTCAACA GTAAACCGCCCGATTGCGACCCTAACAAAGTGAATCCGGGTTCATCTGTAAAG GGAGACTGCGGAAATCCGAGCGTCGGTATCTTCTTTTTCGTCTCCTACATTATA ATTTCTTTCCTTGTTGTCGTGAACATGTATATAGCTGTGATCTTGGAAAATTTTT CTGTTGCTACTGAGGAATCCGCAGAACCACTTTCAGAAGACGATTTTGAGATG TTTTACGAAGTTTGGGAGAAGTTTGATCCTGACGCTACACAGTTTATGGAATTT GAGAAGCTCTCACAGTTCGCAGCTGCCCTGGAGCCTCCGTTGAATCTTCCACA GCCTAACAAGTTACAACTGATTGCGATGGACCTGCCAATGGTGTCTGGGGACC GAATCCACTGCCTTGATATACTCTTTGCTTTCACAAAAAGGGTCTTGGGCGAGT

CTGGAGAAATGGACGCCCTCAGAATACAGATGGAGGAACGATTCATGGCTTCG AATCCTAGCAAAGTGTCTTATCAACCCATCACTACGACTCTTAAAAGAAAACA

AGAGGAAGTGTCTGCTGTCATTATCCAGCGAGCATATAGACGGCACTTGCTCA AACGAACTGTTAAGCAAGCCAGTTTCACCTACAATAAAAACAAAATAAAAGG TGGTGCTAATTTGCTGATTAAAGAGGACATGATTATCGACAGAATCAATGAGA

ACTCCATTACAGAAAAAACCGATCTCACTATGTCAACAGCAGCCTGTCCTCCC TCATACGACCGTGTCACTAAACCTATAGTCGAAAAACATGAACAAGAGGGCA

AGGATGAGAAGGCCAAAGGCAAAGCCGGCGACTACAAAGACCATGACGGAG ACTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAGTAATGATAT

CATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAAC TATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATG CTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTAGT TCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGG CTCGGCTGTTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTTATTTTCATT AGATCTGTGTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID

NO: 84)

CN4642 (4222 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC

ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG

CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGCC GGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTGC GCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGGA GTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGAA GCTAGCGCTACCGGTGCCACCATGGTCTACCCGTATGATGTCCCGGATTACGCT GGCAGCTACCCATACGATGTACCCGACTATGCCGGCAGTATGGAGCAAACAGT TTTGGTCCCTCCGGGACCAGACAGTTTCAATTTCTTTACTCGGGAGAGTCTTGC CGCCATTGAGAGGCGCATAGCTGAGGTAAGTACTAGCAGCTACAATCCAGCTA CCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAG

CTAGGCCCTTTTGCTAATCATGTTCATACCTCTTATCTTCCTCCCACAGGAAAA GGCTAAGAATCCAAAACCTGACAAGAAAGACGACGACGAAAACGGACCCAAA CCTAACTCAGATCTCGAAGCTGGAAAGAATCTCCCATTCATCTATGGTGATATC CCTCCAGAAATGGTTTCAGAACCTCTAGAAGATCTCGATCCATACTATATCAAT AAAAAGACCTTCATCGTTCTGAACAAAGGAAAGGCGATTTTCCGGTTTTCTGCT ACTTCTGCTCTCTATATTCTCACACCATTTAATCCACTTCGCAAGATTGCGATTA AGATACTGGTGCATAGTCTGTTCAGTATGCTGATTATGTGTACAATCCTTACAA ATTGTGTCTTTATGACTATGTCTAACCCGCCGGATTGGACCAAGAACGTAGAAT ACACGTTCACTGGAATCTATACGTTCGAGTCTCTTATTAAGATAATCGCCAGGG GGTTCTGTCTTGAGGATTTCACTTTCCTCCGCGATCCGTGGAATTGGCTTGACT TCACCGTTATTACGTTCGCTTACGTTACTGAGTTCGTTGATCTTGGCAACGTGT CTGCACTCAGAACATTCAGAGTGCTTAGAGCACTTAAAACCATAAGTGTCATA CCAGGATTGAAAACGATCGTGGGAGCTCTGATACAGAGTGTAAAGAAGCTTTC AGATGTAATGATCCTTACTGTCTTCTGTCTTTCCGTATTCGCACTCATCGGGCTC CAGCTGTTTATGGGTAACCTCAGAAACAAATGCATTCAATGGCCACCAACAAA TGCGAGCCTTGAGGAACATAGCATAGAAAAGAATATCACTGTTAACTATAATG GGACCCTCATAAACGAAACCGTGTTCGAATTTGACTGGAAATCCTACATTCAG GATTCCAGATATCATTATTTTCTTGAGGGCTTCTTGGACGCACTTTTGTGCGGA AATTCAAGTGATGCTGGTCAATGTCCTGAAGGTTATATGTGTGTTAAAGCAGG AAGAAACCCAAACTACGGATACACATCTTTCGATACATTTTCTTGGGCTTTCCT ATCTCTTTTTCGGCTTATGACACAAGACTTTTGGGAAAATTTGTATCAGCTGAC ACTCCGAGCGGCTGGAAAAACTTATATGATCTTCTTCGTTCTTGTAATCTTTTT GGGATCCTTCTACCTCATCAATTTGATACTTGCAGTTGTCGCTATGGCATACGA GGAGCAAAATCAAGCAACGCTAGAAGAAGCGGAGCAGAAAGAGGCTGAATTT CAACAGATGATTGAGCAATTGAAGAAACAACAGGAAGCTGCACAACAAGCAG CTACTGCTACTGCATCTGAACATTCTAGAGAGCCAAGTGCAGCTGGAAGGCTT TCTGATAGTTCAAGTGAAGCATCTAAATTGAGTTCTAAGTCAGCAAAGGAACG GAGAAATAGACGGAAAAAACGAAAGCAGAAGGAGCAATCTGGAGGAGAAGA GAAGGACGAAGACGAGTTTCAAAAAAGTGAATCAGAGGACTCAATTAGACGC AAAGGATTCAGATTTAGTATCGAAGGAAATAGATTGACTTATGAAAAACGATA TTCCTCACCACATCAGTCACTCCTGAGTATACGCGGGTCACTCTTTTCACCGAG ACGAAATTCCAGAACTTCACTCTTCTCATTCCGGGGAAGGGCTAAGGATGTAG GCTCAGAAAATGATTTCGCAGACGATGAGCATTCCACTTTTGAAGATAATGAG AGCAGGCGAGACAGTCTCTTTGTACCACGAAGACATGGCGAAAGAAGGAACA GCAACCTTAGCCAGACTAGTCGGTCCAGTAGAATGCTAGCTGTATTCCCTGCTA ATGGCAAGATGCATTCCACCGTTGATTGTAATGGGGTCGTCTCGTTAGTAGGTG 
GACCTTCAGTTCCTACCTCACCGGTTGGACAATTGCTGCCGGAGGGAACCACT ACTGAGACTGAAATGAGAAAACGACGTTCTTCAAGCTTCCATGTGTCTATGGA TTTTTTGGAAGACCCGTCACAGCGCCAAAGAGCTATGTCTATAGCTTCAATCCT GACAAACACCGTAGAGGAGTTGGAGGAGTCACGCCAGAAGTGCCCTCCTTGTT GGTACAAGTTCTCCAACATCTTCCTGATTTGGGATTGTTCACCTTATTGGCTGA

AAGTCAAGCACGTTGTTAACCTCGTCGTAATGGATCCTTTTGTCGACCTGGCTA TAACGATATGTATCGTCCTGAACACACTCTTCATGGCTATGGAGCATTATCCGA TGACTGATCATTTTAACAATGTGCTTACCGTGGGTAATCTGGTTTTCACTGGCA TCTTTACTGCAGAAATGTTTCTTAAGATTATTGCAATGGACCCCTACTACTACT TTCAAGAAGGATGGAATATTTTTGATGGTTTTATCGTCACACTTTCTTTGGTTG AATTGGGCTTGGCAAATGTAGAGGGGCTCTCAGTTCTTAGAAGTTTCAGACTTC TCCGGGTATTCAAGCTTGCTAAGAGCTGGCCTACTTTGAACATGCTCATCAAGA TTATCGGAAACAGTGTTGGCGCCCTTGGCAATCTGACATTGGTGTTGGCTATCA TAGTATTCATCTTCGCGGTTGTGGGAATGCAGTTGTTTGGGAAGTCATACAAGG

ACTGTGTGTGCAAGATAGCGTCCGACTGTCAACTTCCGAGGTGGCACATGAAC

GATTTCTTTCATTCATTCCTCATTGTGTTTCGGGTCCTCTGTGGCGAGTGGATAG

AAACTATGTGGGACTGTCTCAGTTATGACACAGAAATCTTGACGGTGGAATAC

GGGTTTCTTCCGATCGGAAAGATTGTTGAGGAGCGCATAGAGTGTACGGTGTA

TACCGTCGATAAGAACGGATTTGTCTACACACAGCCTATCGCACAATGGCATA

ATAGAGGAGAACAAGAAGTCTTCGAATATTGTTTGGAGGACGGATCAATCATA

CGGGCAACCAAAGACCACAAGTTTATGACAACAGATGGACAGATGTTGCCAAT

AGATGAGATATTTGAGAGGGGACTTGATCTCAAGCAAGTGGATGGTCTGCCAT

AATGATATCATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGT

ATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTT

TGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATC

CTGGTTAGTTCTTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTG

GACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGCAATAAAAGATCTTT

ATTTTCATTAGATCTGTGTGTTGGTTTTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID NO:85)

CN4643 (4339 bp between ITRs)

GCGGCCGCACGCGTTTAATTAAGTGTCTAGACTGCAGAGGGCCCTGCGTATGA

GTGCAAGTGGGTTTTAGGACCAGGATGAGGCGGGGTGGGGGTGCCTACCTGAC

GACCGACCCCGACCCACTGGACAAGCACCCAACCCCCATTCCCCAAATTGCGC

ATCCCCTATCAGAGAGGGGGAGGGGAAACAGGATGCGGCGAGGCGCGTGCGC

ACTGCCAGCTTCAGCACCGCGGACAGTGCCTTCGCCCCCGCCTGGCGGCGCGC

GCCACCGCCGCCTCAGCACTGAAGGCGCGCTGACGTCACTCGCCGGTCCCCCG

CAAACTCCCCTTCCCGGCCACCTTGGTCGCGTCCGCGCCGCCGCCGGCCCAGC

CGGACCGCACCACGCGAGGCGCGAGATAGGGGGGCACGGGCGCGACCATCTG

CGCTGCGGCGCCGGCGACTCAGCGCTGCCTCAGTCTGCGGTGGGCAGCGGAGG

AGTCGTGTCGTGCCTGAGAGCGCAGTCGAGAAACCGGCTAGAGGATCCTTCGA

AGCTAGCGCTACCGGTGCCACCATGGTCAAGATCATCTCCAGGAAGTCTCTGG

GTACACAGAATGTCTACGATATCGGAGTCGAGAAAGACCACAATTTTCTCCTG

AAAAACGGACTCGTGGCGTCCAATTGTATGGAAGTAGCTGGGCAGGCGATGTG

CCTTACGGTATTCATGATGGTCATGGTCATCGGAAATCTTGTTGTATTGAATTT

ATTTCTCGCGTTGTTGTTGAGTTCATTTTCCGCCGATAATTTGGCTGCCACTGAT

GACGACAACGAGATGAATAATCTTCAGATAGCTGTAGACCGGATGCACAAAG

GCGTTGCCTACGTCAAACGAAAAATCTATGAATTCATACAGCAATCCTTCATA

CGAAAACAGAAGATTCTGGATGAAATCAAACCCCTTGATGATCTCAATAATAA

GAAAGATTCATGCATGTCGAACCATACCACAGAGATAGGCAAGGACCTTGACT

ACCTTAAAGACGTGAACGGTACCACAAGTGGAATAGGCACAGGTAAGTACTA

GCAGCTACAATCCAGCTACCATTCTGCTTTTATTCTATGGTTGGGATAAGGCTG

GATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTTAT

CTTCCTCCCACAGGCTCCTCTGTAGAGAAGTACATCATAGACGAGAGCGATTA

CATGTCTTTCATCAACAACCCGTCCCTCACTGTCACCGTTCCCATCGCCGTAGG

AGAATCTGACTTCGAGAATCTCAATACGGAGGATTTCAGCTCCGAATCAGACT

TGGAGGAATCAAAGGAGAAGTTGAACGAAAGTTCAAGTTCATCCGAGGGCAG

CACCGTGGACATAGGCGCCCCCGTCGAGGAACAACCTGTAGTCGAGCCTGAGG

AAACTTTGGAACCCGAAGCGTGTTTCACGGAGGGGTGTGTTCAACGCTTCAAG

TGTTGCCAAATTAACGTTGAAGAGGGTCGTGGAAAACAATGGTGGAACCTCCG CAGGACCTGTTTCCGGATCGTCGAACATAATTGGTTCGAGACGTTCATAGTTTT

CATGATCTTGCTTTCATCTGGTGCTTTGGCATTCGAGGATATCTACATCGACCA

ACGAAAGACCATAAAAACTATGCTGGAATATGCAGACAAGGTTTTCACATACA

TATTCATCCTTGAAATGCTCCTGAAATGGGTAGCGTATGGTTACCAGACTTATT

TCACGAACGCATGGTGCTGGCTCGATTTCCTGATTGTCGACGTCTCCCTGGTGT

CATTGACTGCTAACGCACTCGGATATAGCGAACTAGGCGCTATTAAGAGTCTC

AGAACCCTGAGAGCATTGAGGCCCCTCCGCGCGCTCTCTCGGTTTGAGGGAAT

GAGAGTAGTCGTTAATGCACTGTTGGGAGCGATACCTTCCATTATGAACGTGC

TTCTCGTTTGTCTCATCTTCTGGCTGATATTCTCTATTATGGGTGTGAACTTGTT

CGCAGGCAAATTTTACCACTGCATTAACACAACTACAGGAGATAGATTTGATA

TTGAGGATGTAAACAACCACACCGACTGTTTGAAGTTGATAGAGAGAAACGA

GACCGCAAGATGGAAGAATGTAAAAGTCAACTTCGACAATGTCGGCTTTGGAT

ATCTTTCACTGCTGCAAGTAGCCACATTCAAAGGATGGATGGACATTATGTAC

GCTGCAGTAGATTCCCGAAACGTAGAGTTGCAACCGAAGTATGAAGAAAGTTT

GTATATGTACCTCTACTTCGTAATTTTTATCATCTTTGGCTCATTCTTCACACTT

AACCTGTTCATTGGTGTAATCATCGACAATTTCAATCAGCAGAAAAAGAAATT

TGGTGGACAAGACATCTTCATGACAGAGGAACAGAAGAAATACTATAATGCA

ATGAAAAAACTAGGGTCCAAAAAGCCCCAAAAACCTATTCCTAGACCGGGCA

ACAAGTTTCAAGGCATGGTTTTCGACTTCGTAACTAGACAGGTGTTTGATATAT

CTATTATGATTCTGATATGTCTGAATATGGTTACGATGATGGTTGAGACTGATG

ATCAATCTGAATACGTTACGACGATACTTAGCCGAATTAACTTGGTATTCATTG

TTCTTTTCACGGGCGAATGTGTACTTAAACTGATTAGTTTAAGGCACTATTATT

TCACAATCGGTTGGAACATTTTTGATTTCGTTGTGGTCATACTTTCCATTGTTGG

CATGTTTCTTGCTGAATTGATAGAAAAGTACTTCGTCAGTCCAACACTTTTCCG

AGTTATACGGCTTGCCCGAATCGGACGAATTCTCAGGCTAATCAAAGGTGCTA

AAGGAATTCGTACACTGCTTTTCGCTCTCATGATGTCACTGCCAGCTCTTTTCA

ACATCGGTTTGTTACTATTTTTGGTAATGTTTATATATGCGATCTTCGGCATGA

GTAATTTCGCTTATGTTAAACGGGAGGTGGGAATCGATGACATGTTTAATTTTG

AGACATTCGGCAATTCTATGATCTGTCTCTTTCAAATTACCACGTCAGCTGGAT

GGGACGGATTGCTTGCTCCGATTCTCAACAGTAAACCGCCCGATTGCGACCCT

AACAAAGTGAATCCGGGTTCATCTGTAAAGGGAGACTGCGGAAATCCGAGCG

TCGGTATCTTCTTTTTCGTCTCCTACATTATAATTTCTTTCCTTGTTGTCGTGAA

CATGTATATAGCTGTGATCTTGGAAAATTTTTCTGTTGCTACTGAGGAATCCGC

AGAACCACTTTCAGAAGACGATTTTGAGATGTTTTACGAAGTTTGGGAGAAGT

TTGATCCTGACGCTACACAGTTTATGGAATTTGAGAAGCTCTCACAGTTCGCAG

CTGCCCTGGAGCCTCCGTTGAATCTTCCACAGCCTAACAAGTTACAACTGATTG

CGATGGACCTGCCAATGGTGTCTGGGGACCGAATCCACTGCCTTGATATACTC

TTTGCTTTCACAAAAAGGGTCTTGGGCGAGTCTGGAGAAATGGACGCCCTCAG

AATACAGATGGAGGAACGATTCATGGCTTCGAATCCTAGCAAAGTGTCTTATC

AACCCATCACTACGACTCTTAAAAGAAAACAAGAGGAAGTGTCTGCTGTCATT

ATCCAGCGAGCATATAGACGGCACTTGCTCAAACGAACTGTTAAGCAAGCCAG

TTTCACCTACAATAAAAACAAAATAAAAGGTGGTGCTAATTTGCTGATTAAAG

AGGACATGATTATCGACAGAATCAATGAGAACTCCATTACAGAAAAAACCGA

TCTCACTATGTCAACAGCAGCCTGTCCTCCCTCATACGACCGTGTCACTAAACC

TATAGTCGAAAAACATGAACAAGAGGGCAAGGATGAGAAGGCCAAAGGCAA

AGCCGGCGACTACAAAGACCATGACGGAGACTATAAAGATCATGACATCGAT TACAAGGATGACGATGACAAGTAATGATATCATAATCAACCTCTGGATTACAA

AATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATG

TGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTC

ATTTTCTCCTCCTTGTATAAATCCTGGTTAGTTCTTGCCACGGCGGAACTCATC

GCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAA

TTCCGTGGCAATAAAAGATCTTTATTTTCATTAGATCTGTGTGTTGGTTTTTTGT

GTGGTGCGGACCGAGCGGCCGC (SEQ ID NO: 86)
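The listed constructs are wrapped across many lines, with occasional spaces inside a line and the SEQ ID tag sometimes split over two lines. The following is a minimal sketch, not part of the filing, of how a reader might normalize one entry before comparing its length with the stated "bp between ITRs" figure. The function names and the short example string are hypothetical. The flanking motif GCGGCCGC is taken from the first and last eight bases of the listed constructs (it matches the NotI recognition sequence). Whether the stated bp figure counts those flanking bases is not specified in the text, so the sketch reports the length without asserting a match.

```python
import re

# Matches "(SEQ ID NO:83)", "(SEQ ID NO: 86)", or a tag split across lines.
SEQ_TAG = re.compile(r"\(\s*SEQ\s+ID\s+NO\s*[:.]?\s*(\d+)\s*\)", re.IGNORECASE)


def normalize_listing(raw: str):
    """Return (sequence, seq_id) for one wrapped listing entry.

    Whitespace and the SEQ ID tag are removed. Any character other than
    A, C, G, or T raises ValueError, which flags stray extraction debris.
    """
    match = SEQ_TAG.search(raw)
    seq_id = int(match.group(1)) if match else None
    body = SEQ_TAG.sub("", raw)
    sequence = re.sub(r"\s+", "", body).upper()
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sequence, seq_id


def flanked_by(sequence: str, site: str = "GCGGCCGC") -> bool:
    """True if the sequence both starts and ends with the given motif."""
    return sequence.startswith(site) and sequence.endswith(site)


# Hypothetical, heavily shortened entry used only to exercise the functions.
example = """
GCGGCCGCACGCGTTTAATTAA

TTTTGTGTGGTGCGGACCGAGCGGCCGC (SEQ ID

NO: 84)
"""

seq, seq_id = normalize_listing(example)
print(seq_id, len(seq), flanked_by(seq))
```

Applied to a full entry, the reported length can then be compared by hand with the figure in that entry's heading.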

Claims

WHAT IS CLAIMED IS:

1. A system to express one or more coding sequences, the system comprising:

(i) a first expression construct comprising: a first portion of a polynucleotide sequence of a gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding an N-fragment of a split intein (N-intein), at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit;

(ii) a second expression construct comprising: a second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, and a polynucleotide sequence encoding a C-fragment of the split intein (C-intein), at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit; and

(iii) optionally, a third expression construct comprising a polynucleotide sequence encoding a degron; wherein if the third expression construct is not included in the system, the first expression construct, the second expression construct, or both further comprises a polynucleotide sequence encoding a degron, located if within the first expression construct at the 3’ end relative to the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit, or located if within the second expression construct at the 5’ end relative to the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit.

2. The system of claim 1, wherein the system comprises the third expression construct.

3. The system of claim 1, wherein the first and/or the second expression construct further comprises an enhancer sequence and a promoter sequence, and optionally an intron having a polynucleotide sequence of SEQ ID NO: 107.

4. The system of claim 1, wherein the gene encoding the sodium channel alpha subunit is selected from the group consisting of SCN1A, SCN2A, SCN3A, SCN4A, SCN5A, SCN8A, SCN9A, SCN10A, SCN11A, and SCN7A; preferably the sodium channel alpha subunit comprising sodium channel protein type 1 subunit alpha, or the gene encoding the sodium channel alpha subunit comprises SCN1A.

5. The system of claim 1, wherein the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the N-intein - the polynucleotide sequence encoding the degron.

6. The system of claim 1, wherein the first expression construct comprises the polynucleotide sequence encoding the degron, and the first expression construct comprises from 5’ to 3’: the first portion of the polynucleotide sequence of the SCN1A - the polynucleotide sequence encoding the degron - the polynucleotide sequence encoding the N-intein.

7. The system of claim 1, wherein the second expression construct comprises the polynucleotide sequence encoding the degron, and when expressed, the degron is two or more amino acid residues at the N-terminus relative to a protein product encoded by the second portion of the polynucleotide sequence of the SCN1A.

8. The system of claim 1, wherein the second expression construct comprises the polynucleotide sequence encoding the degron, and the second expression construct comprises from 5’ to 3’: a first portion of the polynucleotide sequence encoding the C-intein - the polynucleotide sequence encoding the degron - a second portion of the polynucleotide sequence encoding the C-intein - the second portion of the polynucleotide sequence of the SCN1A, wherein when expressed, a protein product of the first portion of the polynucleotide sequence encoding the C-intein and a protein product of the second portion of the polynucleotide sequence encoding the C-intein together form the C-intein.

9. The system of any one of claims 1-8, wherein when the first and the second expression constructs are expressed, a protein product of the first portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit and a protein product of the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit are linked, via a peptide bond between the C-terminus of the first portion’s protein product and the N-terminus of the second portion’s protein product, to reconstitute the sodium channel alpha subunit.

10. The system of any one of claims 1-8, wherein the degron has an amino acid sequence selected from the group consisting of: MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO: 90), RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91), ACKNWFSSLSHFVIHL (SEQ ID NO: 92), and GSLIIFIIL (SEQ ID NO:93).

11. The system of claim 5, wherein the degron has an amino acid sequence of RPGSTSPFAPSATDLPSMPEPALTSR (SEQ ID NO:91) or ACKNWFSSLSHFVIHL (SEQ ID NO: 92).

12. The system of claim 7, wherein the degron has an amino acid sequence of MSCAQESITSLYKKAGSENLYFQ (SEQ ID NO:88), MSCAQES (SEQ ID NO:90), or GSLIIFIIL (SEQ ID NO:93).

13. The system of any one of claims 1-8, wherein the sodium channel protein type 1 subunit alpha is sodium channel protein type 1 subunit alpha isoform 2, having amino acid residue numbering according to NCBI accession number NP_001340878.1 and optionally having an amino acid substitution of A1056T, or an isoform variant thereof; and wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-1049 and residues 1050-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2, wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-956 and residues 957-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2, or wherein the first and the second portions of the polynucleotide sequence of the SCN1A encode residues 1-947 and residues 948-1998 of the sodium channel protein type 1 subunit alpha isoform 2, respectively, or residue segments of the isoform variant when sequence aligned to the sodium channel protein type 1 subunit alpha isoform 2.

14. The system of claim 1, wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm1049 (SEQ ID NO: 59), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm949 (SEQ ID NO: 60); wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm956 (SEQ ID NO: 61), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1042 (SEQ ID NO: 62); or wherein the first portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Nterm947 (SEQ ID NO: 63), and the second portion of the polynucleotide sequence of the SCN1A comprises a polynucleotide sequence of hSCN1A-CO-Cterm1051 (SEQ ID NO: 64).

15. The system of any one of claims 1-8, wherein the split intein comprises consensus fast intein (Cfa); the degron is a polypeptide being 5-30 amino-acid residues in length or preferably 9-26 amino-acid residues in length; and a polypeptide product encoded by the second portion of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit starts with a cysteine, serine, or threonine residue.

16. The system of any one of claims 1-8, wherein the polynucleotide sequence encoding the N-intein comprises a polynucleotide sequence of Cfa-N (SEQ ID NO:57), and the polynucleotide sequence encoding the C-intein comprises a polynucleotide sequence of Cfa-C (SEQ ID NO:58).

17. The system of any one of claims 1-8, wherein the first expression construct and the second expression construct independently further comprise the promoter sequence selected from a minBglobin promoter having a polynucleotide sequence of SEQ ID NO:3, an hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 52, or a CMV promoter having a polynucleotide sequence of SEQ ID NO:53; optionally a shortened hSyn1 promoter having a polynucleotide sequence of SEQ ID NO: 54.

18. The system of claim 3, wherein the enhancer sequence is configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a targeted central nervous system cell type; preferably the targeted central nervous system cell type being GABAergic, glutamatergic, or both cell types.

19. The system of claim 3, wherein the enhancer sequence is set forth in SEQ ID NO: 2 (DLX2.0) or has a concatemerized core having a polynucleotide sequence of SEQ ID NO: 1, or the targeted central nervous system cell type comprises a GABAergic interneuron; or wherein the enhancer sequence is set forth in SEQ ID NO: 55 (eHGT_078h), or the targeted central nervous system cell type comprises a glutamatergic neuron.

20. The system of claim 3, wherein the first expression construct, the second expression construct, or both independently further comprise a miRNA binding site sequence, configured for targeted expression of the first, the second, or both portions, respectively, of the polynucleotide sequence of the gene encoding the sodium channel alpha subunit within a selected central nervous system cell type.

21. The system of claim 20, wherein the miRNA binding site sequence is set forth in SEQ ID NO: 56 (4x2C miRNA binding site) or SEQ ID NO:87 (8x2C miRNA binding site), or the selected central nervous system cell type comprises a pan-GABAergic neuron.

22. An artificial expression construct, comprising the first, the second or the third expression construct of the system of any one of claims 1-21, wherein the artificial expression construct is associated with a capsid that crosses the blood brain barrier.

23. The artificial expression construct of claim 22, wherein the capsid comprises PHP.eB, AAV-BR1, AAV-PHP.S, AAV-PHP.B, or AAV-PPS.

24. An administrable composition, comprising: the artificial expression construct of claim 22; and a pharmaceutically acceptable excipient.

25. A transgenic cell comprising the system of any one of claims 1-21, wherein optionally the transgenic cell comprises a GABAergic neuron or a glutamatergic neuron.

26. A method of rescuing voltage-gated sodium channel function within a targeted population of cells, the method comprising: administering a therapeutically effective amount of the system of any one of claims 1-21 to a sample or subject comprising the targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells.

27. The method of claim 26, wherein the subject has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.

28. The method of claim 26, wherein the subject is a pediatric patient having Dravet syndrome.

29. A method of administering a system of expression constructs to a subject in need thereof, comprising administering a therapeutically effective amount of the system of any one of claims 1-21 to a sample or subject comprising the targeted population of cells, and inducing expression of the first expression construct and the second expression construct of the system to reconstitute a sodium channel alpha subunit, thereby rescuing voltage-gated sodium channel function within the targeted population of cells, wherein the subject in need thereof has a sodium channelopathy, optionally comprising Dravet syndrome, myoclonic seizures, myoclonic astatic epilepsy (MAE), intractable childhood epilepsy with generalized tonic-clonic seizures, simple febrile seizures, generalized epilepsy and febrile seizures plus (GEFS+), migrating partial seizures of infancy, Lennox-Gastaut syndrome, or West syndrome.
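The numeric limits recited in claims 10, 13, and 15 can be cross-checked mechanically. The following is an illustrative sketch only and forms no part of the claims. It takes the residue ranges from claim 13, the degron sequences from claim 10, and the length ranges and permitted first residues from claim 15. All function and variable names are hypothetical, and the short peptide passed to the last helper is an invented placeholder, not a sequence from this application.

```python
# Full length of isoform 2 implied by the residue ranges in claim 13.
FULL_LENGTH = 1998

# (first portion, second portion) as 1-based inclusive residue ranges, claim 13.
SPLITS = {
    "1049/1050": ((1, 1049), (1050, 1998)),
    "956/957": ((1, 956), (957, 1998)),
    "947/948": ((1, 947), (948, 1998)),
}

# Degron amino acid sequences keyed by SEQ ID NO, as recited in claim 10.
DEGRONS = {
    88: "MSCAQESITSLYKKAGSENLYFQ",
    90: "MSCAQES",
    91: "RPGSTSPFAPSATDLPSMPEPALTSR",
    92: "ACKNWFSSLSHFVIHL",
    93: "GSLIIFIIL",
}


def tiles(first, second, total=FULL_LENGTH):
    """True if the two ranges cover residues 1..total with no gap or overlap."""
    return first[0] == 1 and second[0] == first[1] + 1 and second[1] == total


def portion_lengths(first, second):
    """Residue counts of the first and second portions."""
    return first[1] - first[0] + 1, second[1] - second[0] + 1


def length_in_range(peptide, low, high):
    """True if the peptide length lies in the inclusive range [low, high]."""
    return low <= len(peptide) <= high


def starts_with_cys_ser_thr(peptide):
    """True if the first residue is C, S, or T, as claim 15 requires."""
    return peptide[:1] in {"C", "S", "T"}


for name, (first, second) in SPLITS.items():
    print(name, tiles(first, second), portion_lengths(first, second))

for seq_id, peptide in DEGRONS.items():
    print(seq_id, len(peptide),
          length_in_range(peptide, 5, 30),
          length_in_range(peptide, 9, 26))

# Invented placeholder peptide, used only to exercise the helper.
print(starts_with_cys_ser_thr("CLTVW"))
```

Under these inputs each split tiles the full 1998 residues, and the portion lengths (1049 and 949, 956 and 1042, 947 and 1051) agree with the numeric suffixes of the construct names for SEQ ID NOs: 59-64 in claim 14. All five degrons fall within the 5-30 residue range; the 7-residue degron of SEQ ID NO: 90 lies outside the preferred 9-26 range.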